
API Rate Limiting Explained: Why It Exists and How to Work With It

Rate limiting rejects your requests once you exceed the quota an API allows in a given time window. Here's how it works under the hood, how to handle 429 errors gracefully, and how to implement rate limiting in your own APIs.

By Priya Shah · Senior Software Engineer · January 12, 2026 · Updated February 10, 2026 · 7 min read


Frequently Asked Questions

What HTTP status code does rate limiting return?

429 Too Many Requests is the standard status code for rate limiting, defined in RFC 6585. The response may include a Retry-After header giving either the number of seconds to wait or an HTTP-date (expressed in GMT) after which you can retry. Many APIs also return rate limit information in every response as headers: X-RateLimit-Limit (total requests allowed), X-RateLimit-Remaining (requests left in the current window), and X-RateLimit-Reset (when the window resets). These X-RateLimit-* names are a widespread convention rather than a formal standard, so check each API's documentation: X-RateLimit-Reset is often a Unix timestamp, but some APIs send the number of seconds until reset instead. Reading these headers proactively lets you slow down before hitting the limit rather than after.
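As a rough illustration of reading these headers, here is a minimal Python sketch. The function names `retry_after_seconds` and `remaining_fraction` are made up for this example, the headers are assumed to arrive as a plain dict with the exact names shown above, and real HTTP clients usually expose case-insensitive header lookups instead.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After value into a non-negative delay in seconds.

    Accepts delta-seconds ("120") or an HTTP-date
    ("Wed, 21 Oct 2026 07:28:00 GMT"). Returns None if unparseable.
    """
    now = now or datetime.now(timezone.utc)
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means no further waiting is needed.
    return max(0.0, (when - now).total_seconds())


def remaining_fraction(headers):
    """Fraction of the quota still available, or None if headers are missing or malformed."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        return None
    if limit <= 0:
        return None
    return remaining / limit
```

One way to use this: call `remaining_fraction` after each response and start spacing out requests once it drops below a threshold you choose, such as 0.1.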

What is the difference between fixed window and sliding window rate limiting?

Fixed window rate limiting counts requests within fixed time buckets: for example, 100 requests per minute, where the counter resets at :00 seconds on the clock. The problem is that you can send 100 requests at 0:59 and 100 more at 1:01, effectively making 200 requests in two seconds. Sliding window rate limiting counts requests within a rolling time window (100 requests in the last 60 seconds, recalculated continuously). This is fairer and prevents the boundary burst, but an exact implementation needs more memory because it stores a timestamp for each recent request. Token bucket and leaky bucket are related algorithms: a token bucket permits short bursts up to a fixed capacity while enforcing an average rate, and a leaky bucket drains requests at a constant rate, smoothing bursts out.
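The boundary burst described above can be reproduced with a small Python sketch. The class names and the `count_allowed` helper are illustrative only, timestamps are passed in explicitly (in seconds) rather than read from a clock, and the sliding version is a simple timestamp log, which is one of several ways to implement a rolling window.

```python
from collections import deque


class FixedWindowLimiter:
    """Counts requests per fixed bucket; the counter resets at each boundary."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.current_bucket = None
        self.count = 0

    def allow(self, now):
        bucket = int(now // self.window)
        if bucket != self.current_bucket:
            # A new window has started, so the counter starts over.
            self.current_bucket = bucket
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLogLimiter:
    """Keeps one timestamp per accepted request within the rolling window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()

    def allow(self, now):
        # Drop timestamps that have aged out of the rolling window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


def count_allowed(limiter):
    # 100 requests at t=59s, then 100 more at t=61s.
    times = [59.0] * 100 + [61.0] * 100
    return sum(limiter.allow(t) for t in times)
```

With a limit of 100 per 60 seconds, the fixed window version accepts all 200 requests in this scenario, while the sliding log accepts only the first 100. The deque also shows where the extra memory goes: it holds up to one entry per allowed request.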

How should I handle rate limit errors in production code?

The standard pattern is exponential backoff with jitter. When you receive a 429, wait a short period and retry. If you get another 429, wait twice as long. Repeat up to a maximum number of attempts. Add random jitter (a small random delay) to prevent multiple clients from synchronizing their retries and hammering the API simultaneously when the window resets. If the response includes a Retry-After header, wait at least that long before retrying. Implement a circuit breaker that stops retrying after N consecutive failures and alerts your team, because persistent rate limiting usually means you need to reduce request volume or upgrade your API tier.
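A minimal Python sketch of that retry pattern follows. Everything named here is hypothetical: `RateLimited` stands in for however your HTTP client signals a 429, `send` is a caller-supplied function that performs one request, and the 1-second base, 60-second cap, and 25% jitter are arbitrary starting points rather than recommended values. It covers backoff, jitter, and Retry-After; a full circuit breaker with alerting is left out.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception a request function raises on a 429 response."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after  # seconds to wait, or None if no header


def backoff_delay(attempt, base=1.0, cap=60.0, retry_after=None, rng=random.random):
    """Delay before retrying after failed attempt number `attempt` (0-based).

    Doubles the base delay per attempt up to `cap`, adds up to 25% random
    jitter, and never returns less than `retry_after` when it is given.
    """
    exp = min(cap, base * (2 ** attempt))
    delay = exp * (1 + 0.25 * rng())
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


def call_with_backoff(send, max_attempts=5, sleep=time.sleep, **delay_options):
    """Call send() until it succeeds or max_attempts is reached."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                # Out of attempts: re-raise so the caller can alert or back off further.
                raise
            sleep(backoff_delay(attempt, retry_after=err.retry_after, **delay_options))
```

Passing `sleep` and `rng` as parameters keeps the function easy to test without real delays. In production you would leave the defaults in place.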


Priya Shah

Senior Software Engineer · 9+ years experience

Priya has nine years of experience building distributed systems and developer tooling at two B2B SaaS companies. She writes about APIs, JSON/JWT workflows, regex, DevOps, and the small utilities that make debugging faster at 2am.


Tags: api, rate-limiting, http, backend, developer