Designing APIs that survive real-world load
Versioning, idempotency, pagination contracts, and error shapes that keep mobile, partner, and batch clients stable when traffic spikes.
Integrations fail quietly when APIs are optimistic about clients. Mobile apps retry on flaky networks. ERP jobs replay overnight batches. Partners cache IDs for months. Your contract must assume duplication, partial failure, and old binaries in the wild.
Make writes safe to repeat
POST requests that create money movement, inventory, or tickets should accept an Idempotency-Key header and deduplicate server-side within a defined window. On replay, return the same resource representation so clients do not have to branch on accidental doubles.
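The replay behavior described above can be sketched as follows. This is a minimal in-memory illustration, not a production design: the `IdempotencyStore` class, the 24-hour window, the 422 response for a reused key with a different payload, and the `create_payment` handler are all assumptions made for the example. A real service would keep this state in a shared store and guard against concurrent requests carrying the same key.

```python
import time
from dataclasses import dataclass, field


@dataclass
class IdempotencyStore:
    """In-memory dedupe store (assumption: a real service uses a shared database or cache)."""
    window_seconds: float = 24 * 3600  # assumed dedupe window
    _entries: dict = field(default_factory=dict)

    def run_once(self, key, request_body, handler, now=None):
        """Run handler(request_body) at most once per key within the window."""
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is not None and now - entry["stored_at"] < self.window_seconds:
            if entry["request_body"] != request_body:
                # Same key, different payload: reject rather than guess intent.
                return 422, {"type": "idempotency_key_reused"}
            # Replay: return the stored response unchanged.
            return entry["status"], entry["response"]
        status, response = handler(request_body)
        self._entries[key] = {
            "stored_at": now,
            "request_body": request_body,
            "status": status,
            "response": response,
        }
        return status, response


def create_payment(body):
    """Hypothetical write handler; counts calls so duplicates are visible."""
    create_payment.calls += 1
    return 201, {"id": f"pay_{create_payment.calls}", "amount": body["amount"]}


create_payment.calls = 0

store = IdempotencyStore()
first = store.run_once("key-123", {"amount": 500}, create_payment)
replay = store.run_once("key-123", {"amount": 500}, create_payment)
```

Here `first` and `replay` are the same status and body, and the handler runs only once, which is the property a retrying mobile client depends on.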
Pagination is part of the product
Offset pagination breaks under concurrent writes: rows inserted or deleted between requests shift every later offset, so clients see duplicates or skip records. Cursor-based lists with stable sort keys (created_at, id) keep exports and infinite scroll honest. Document maximum page sizes and rate limits in the schema, not a PDF appendix.
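A rough sketch of cursor pagination over the (created_at, id) sort key is shown below. It filters an in-memory list for clarity; the `MAX_PAGE_SIZE` value, the opaque base64 cursor encoding, and the function names are assumptions for illustration. Against a database that supports row-value comparison, the same idea is usually expressed as `WHERE (created_at, id) > (:created_at, :id) ORDER BY created_at, id LIMIT :limit`.

```python
import base64
import json

MAX_PAGE_SIZE = 100  # assumed documented maximum


def encode_cursor(created_at, row_id):
    """Pack the last-seen sort key into an opaque token."""
    raw = json.dumps([created_at, row_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    created_at, row_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return created_at, row_id


def list_page(rows, limit, cursor=None):
    """Return (items, next_cursor) ordered by (created_at, id)."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))
    if cursor is not None:
        after = decode_cursor(cursor)
        # Keep only rows strictly after the last row the client saw.
        ordered = [r for r in ordered if (r["created_at"], r["id"]) > after]
    items = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = (
        encode_cursor(items[-1]["created_at"], items[-1]["id"]) if has_more else None
    )
    return items, next_cursor
```

Because the cursor names a position in the sort order rather than a row count, a row inserted earlier in the list between two requests does not cause the next page to repeat or skip anything.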
- Explicit error codes with machine-readable type fields.
- Problem+json (or equivalent) for validation failures with field paths.
- 429 responses that include Retry-After.
- Deprecation headers months, not days, before you remove fields.
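The error shapes in this list could look roughly like the sketch below. The `type`, `title`, `status`, and `detail` members and the `application/problem+json` media type come from the Problem Details format; the `errors` list with JSON Pointer paths, the `retry_after_seconds` member, the helper names, and the `api.example.com` URIs are illustrative assumptions.

```python
import json


def problem_response(status, type_uri, title, detail=None, extra=None, headers=None):
    """Build (status, headers, body) for an application/problem+json reply."""
    body = {"type": type_uri, "title": title, "status": status}
    if detail:
        body["detail"] = detail
    body.update(extra or {})  # extension members, e.g. per-field errors
    all_headers = {"Content-Type": "application/problem+json"}
    all_headers.update(headers or {})
    return status, all_headers, json.dumps(body)


def validation_problem(field_errors):
    """field_errors maps a JSON Pointer such as '/items/0/quantity' to a message."""
    errors = [{"pointer": p, "detail": m} for p, m in field_errors.items()]
    return problem_response(
        422,
        "https://api.example.com/problems/validation-error",  # hypothetical URI
        "Request body failed validation",
        extra={"errors": errors},
    )


def rate_limited_problem(retry_after_seconds):
    """429 with the wait time in both the header and the body."""
    return problem_response(
        429,
        "https://api.example.com/problems/rate-limited",  # hypothetical URI
        "Too many requests",
        extra={"retry_after_seconds": retry_after_seconds},
        headers={"Retry-After": str(retry_after_seconds)},
    )
```

Clients can then switch on the stable `type` value instead of parsing human-readable messages, and map each `pointer` back to a form field.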
Load testing belongs in CI: synthetic mixes of read-heavy dashboards and bursty write batches. We catch N+1 query explosions and lock contention before launch week, not after the press release.
Public APIs are integration contracts measured in years. Breaking changes strand mobile apps in app-store review, stall partner certifications, and turn your Slack into a wall of "why did this 404".
Versioning without shame
Prefer additive changes inside a major version. When you must break, ship parallel routes (/v2/invoices) with sunset headers and a published timeline. Never silently change field types—clients deserialize into strongly typed models that will not forgive ambiguous JSON.
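As one possible way to attach sunset headers to a route that is being replaced, the sketch below keeps a small registry of retiring route prefixes. The `/v1/invoices` prefix, the 2026 date, and the `lifecycle_headers` helper are hypothetical; the `Sunset` header carries an HTTP-date, and `successor-version` is a registered link relation.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical registry: route prefix -> (sunset time, successor route).
SUNSETS = {
    "/v1/invoices": (
        datetime(2026, 6, 30, 23, 59, 59, tzinfo=timezone.utc),  # assumed date
        "/v2/invoices",
    ),
}


def lifecycle_headers(path):
    """Return retirement headers for routes that have a published sunset date."""
    for prefix, (sunset_at, successor) in SUNSETS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return {
                # HTTP-date in GMT, as the Sunset header expects.
                "Sunset": format_datetime(sunset_at, usegmt=True),
                "Link": f'<{successor}>; rel="successor-version"',
            }
    return {}
```

Merging these headers into every response from the old route gives client developers a machine-readable timeline long before the route starts returning errors.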
Rate limits are product behavior
Document burst vs sustained limits. Return structured bodies on 429 with retry guidance. For partners running batch jobs, offer bulk endpoints or async job APIs instead of letting them hammer list endpoints with offset pagination.
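The difference between burst and sustained limits can be illustrated with a token bucket, which is one common way to implement such limits (the article does not say which algorithm is used). The class name, parameters, and numbers below are assumptions: `burst` is the bucket capacity and `sustained_per_sec` is the refill rate.

```python
import math


class TokenBucket:
    """Token bucket: capacity sets the burst, refill rate sets the sustained limit."""

    def __init__(self, burst, sustained_per_sec, now=0.0):
        self.burst = burst
        self.sustained_per_sec = sustained_per_sec
        self.tokens = float(burst)  # start full so a fresh client can burst
        self.updated_at = now

    def check(self, now):
        """Return (allowed, retry_after_seconds) for one request at time `now`."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.burst, self.tokens + elapsed * self.sustained_per_sec)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0
        # Seconds until one whole token is available, rounded up for Retry-After.
        wait = (1 - self.tokens) / self.sustained_per_sec
        return False, math.ceil(wait)


bucket = TokenBucket(burst=3, sustained_per_sec=0.5)
results = [bucket.check(now=0.0) for _ in range(4)]
```

With these numbers the first three requests pass immediately and the fourth is refused with a two-second wait, which is the value a 429 body and `Retry-After` header would report.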
Webhooks done safely
- Sign payloads with rotating secrets.
- Expose delivery IDs for deduplication on the consumer side.
- Replay tooling for support—not raw re-fire without guards.
- Backoff and dead-letter queues on your outbound worker.
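The signing and deduplication points above might be sketched as follows, using HMAC-SHA256 from the Python standard library. The `timestamp.body` signing format, the five-minute tolerance, and the `Consumer` class are assumptions for illustration, not a description of any specific provider's scheme. Accepting a list of secrets lets the old and new secret overlap during rotation.

```python
import hashlib
import hmac


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """Sign timestamp + body so a captured payload cannot be replayed much later."""
    msg = timestamp.encode() + b"." + body
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def verify(secrets, timestamp, body, signature, now, tolerance=300):
    """Accept any currently valid secret so rotation needs no flag day."""
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    if abs(now - sent_at) > tolerance:
        return False
    return any(
        hmac.compare_digest(sign(s, timestamp, body), signature) for s in secrets
    )


class Consumer:
    """Hypothetical receiver that dedupes on the delivery ID."""

    def __init__(self, secrets):
        self.secrets = secrets
        self.seen = set()
        self.processed = []

    def receive(self, delivery_id, timestamp, body, signature, now):
        if not verify(self.secrets, timestamp, body, signature, now):
            return 401
        if delivery_id in self.seen:
            return 200  # duplicate delivery: acknowledge, do not reprocess
        self.seen.add(delivery_id)
        self.processed.append(body)
        return 200
```

Returning 200 for a duplicate matters: the sender's retry loop stops, while the consumer's side effects still happen only once.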
Contract testing in CI
Consumer-driven contract tests or OpenAPI diff gates catch accidental breaking changes before merge. Pair them with synthetic monitors that hit critical paths from outside your VPC—the view integrators actually see.
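To make the idea of a diff gate concrete, here is a deliberately simplified sketch. It compares two hand-written `{field: type}` dictionaries rather than real OpenAPI documents, and the function name and schema shape are assumptions; a real gate would use a dedicated OpenAPI diff tool. The rule it encodes matches the additive-change policy: new fields pass, removed fields and changed types fail.

```python
def breaking_changes(old_schema, new_schema, path=""):
    """Compare simplified {field: type-or-nested-dict} response schemas."""
    problems = []
    for name, old_type in old_schema.items():
        where = f"{path}/{name}"
        if name not in new_schema:
            problems.append(f"removed field {where}")
            continue
        new_type = new_schema[name]
        if isinstance(old_type, dict) and isinstance(new_type, dict):
            # Recurse into nested objects.
            problems.extend(breaking_changes(old_type, new_type, where))
        elif old_type != new_type:
            problems.append(f"changed type of {where}: {old_type} -> {new_type}")
    # Fields that exist only in new_schema are additive and not reported.
    return problems


old = {"id": "string", "total": "integer", "customer": {"email": "string"}}
new = {"id": "string", "total": "string", "customer": {}, "currency": "string"}
report = breaking_changes(old, new)
```

A CI step could fail the build whenever `report` is non-empty and print its entries as the review comment.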
API reviews at Triaxo include a "bad client" pass: duplicate submits, aborted uploads, clock-skewed JWTs, and oversized payloads. Designing for those upfront is cheaper than emergency hotfixes when a partner onboarding goes live.



