# HTTP 429 Too Many Requests: Rate Limiting
Learn what 429 Too Many Requests means, how rate limiting works, and how to handle API throttling in your applications.
**TL;DR:** 429 Too Many Requests means you've hit the server's rate limit. Wait before retrying, honoring the `Retry-After` header if present.
## What is 429 Too Many Requests?
A 429 error indicates you’ve exceeded the allowed number of requests in a time window. It’s the server’s way of protecting itself from overload or abuse.
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1737277421

{"error": "Rate limit exceeded", "retry_after": 60}
```
## Rate Limit Headers
| Header | Meaning |
| ----------------------- | -------------------------------- |
| `Retry-After` | Seconds to wait (or date) |
| `X-RateLimit-Limit` | Max requests per window |
| `X-RateLimit-Remaining` | Requests left in window |
| `X-RateLimit-Reset` | Unix timestamp when limit resets |
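These headers can be read straight off a fetch `Response`. A minimal sketch (the `readRateLimitInfo` helper name is ours, not part of any library):

```javascript
// Illustrative helper: extract rate-limit info from response headers.
function readRateLimitInfo(headers) {
  const retryAfter = headers.get('Retry-After')
  return {
    limit: Number(headers.get('X-RateLimit-Limit')),
    remaining: Number(headers.get('X-RateLimit-Remaining')),
    // X-RateLimit-Reset is a Unix timestamp in seconds
    resetAt: new Date(Number(headers.get('X-RateLimit-Reset')) * 1000),
    // Retry-After may be absent; here we assume the seconds form
    retryAfterSec: retryAfter !== null ? Number(retryAfter) : null
  }
}
```

Note that `Retry-After` can also be an HTTP date; production code should handle both forms.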
## Handling 429 in Code
```javascript
async function fetchWithRateLimit(url) {
  const response = await fetch(url)
  if (response.status === 429) {
    const retryAfter = response.headers.get('Retry-After')
    const delay = retryAfter ? parseInt(retryAfter, 10) * 1000 : 60000
    console.log(`Rate limited. Waiting ${delay}ms...`)
    await new Promise((r) => setTimeout(r, delay))
    return fetchWithRateLimit(url) // Retry
  }
  return response
}
```

### Exponential Backoff

```javascript
async function fetchWithBackoff(url, attempt = 0) {
  const response = await fetch(url)
  if (response.status === 429 && attempt < 5) {
    const delay = Math.pow(2, attempt) * 1000 // 1s, 2s, 4s, 8s, 16s
    await new Promise((r) => setTimeout(r, delay))
    return fetchWithBackoff(url, attempt + 1)
  }
  return response
}
```
## Implementing Rate Limiting
### Express.js with express-rate-limit
```javascript
import rateLimit from 'express-rate-limit'

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window
  standardHeaders: true, // Return rate limit info in headers
  handler: (req, res) => {
    res.status(429).json({
      error: 'Too many requests',
      // resetTime is a Date; report seconds until the window resets
      retry_after: Math.ceil((req.rateLimit.resetTime - Date.now()) / 1000)
    })
  }
})

app.use('/api', limiter)
```

### Per-User Limits

```javascript
const userLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 30,
  keyGenerator: (req) => req.user?.id || req.ip,
  skip: (req) => req.user?.isPremium // Skip rate limiting for premium users
})
```

## Rate Limiting Strategies
| Strategy | Description |
|---|---|
| Fixed Window | X requests per minute (simple, can burst at edges) |
| Sliding Window | Smoother, considers recent history |
| Token Bucket | Allows bursts up to bucket size |
| Leaky Bucket | Constant rate, queues excess |
## Best Practices

- **Always include `Retry-After`** - helps clients know when to retry
- **Use standard headers** - `X-RateLimit-*` for transparency
- **Set different limits per tier** - free vs. paid users
- **Limit by the appropriate key** - IP, API key, or user ID
- **Document your limits** - include them in your API docs
## Related

- 503 Service Unavailable - server overloaded
- `Retry-After` header - when to retry
## Rate Limiting Algorithms and Choosing the Right One
The choice of rate limiting algorithm affects both the user experience and the server’s protection against abuse. Fixed window counting is the simplest: count requests in each time window (e.g., per minute) and reset the counter at the window boundary. The downside is that a client can make double the allowed requests by sending half at the end of one window and half at the start of the next.
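A fixed window counter fits in a few lines. This is an illustrative in-memory sketch; `allowFixedWindow` and its `Map` store are our own names, not a library API:

```javascript
// Illustrative fixed-window counter: key -> { windowStart, count }
const windows = new Map()

function allowFixedWindow(key, limit, windowMs, now = Date.now()) {
  const bucket = windows.get(key)
  if (!bucket || now - bucket.windowStart >= windowMs) {
    // New window: reset the counter at the boundary
    windows.set(key, { windowStart: now, count: 1 })
    return true
  }
  bucket.count += 1
  return bucket.count <= limit
}
```

The boundary-burst weakness is visible here: a counter reset lets a client spend a full `limit` right before and right after the window edge.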
Sliding window counting fixes this by tracking the exact timestamp of each request and counting only requests within the last N seconds. This provides smoother rate limiting but requires more memory to store individual timestamps.
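A sliding window log might look like the following sketch, keeping per-key timestamps in memory (names are illustrative, not a library API):

```javascript
// Illustrative sliding-window log: key -> array of request timestamps
const requestLog = new Map()

function allowSlidingWindow(key, limit, windowMs, now = Date.now()) {
  const timestamps = requestLog.get(key) ?? []
  // Keep only timestamps still inside the window
  const recent = timestamps.filter((t) => now - t < windowMs)
  if (recent.length >= limit) {
    requestLog.set(key, recent)
    return false
  }
  recent.push(now)
  requestLog.set(key, recent)
  return true
}
```

The memory cost noted above is the `recent` array: up to `limit` timestamps per key must be retained.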
Token bucket allows controlled bursting. The bucket fills at a constant rate (e.g., 10 tokens per second) up to a maximum capacity (e.g., 100 tokens). Each request consumes one token. A client that has been idle can burst up to the bucket capacity, then is limited to the refill rate. This is the most user-friendly algorithm for APIs where occasional bursts are legitimate.
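A token bucket can be sketched like this; `capacity` and `refillPerSec` are the bucket size and refill rate from the description above, and the function names are ours:

```javascript
// Illustrative token bucket: key -> { tokens, lastRefill }
const buckets = new Map()

function allowTokenBucket(key, capacity, refillPerSec, now = Date.now()) {
  const b = buckets.get(key) ?? { tokens: capacity, lastRefill: now }
  // Refill at a constant rate, capped at the bucket capacity
  const elapsedSec = (now - b.lastRefill) / 1000
  b.tokens = Math.min(capacity, b.tokens + elapsedSec * refillPerSec)
  b.lastRefill = now
  buckets.set(key, b)
  if (b.tokens < 1) return false
  b.tokens -= 1 // Each request consumes one token
  return true
}
```

An idle client starts with a full bucket, so it can burst `capacity` requests at once before being throttled to `refillPerSec`.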
For most web APIs, token bucket or sliding window provides the best balance of protection and user experience. Always document your rate limiting algorithm and limits in your API documentation so clients can implement appropriate retry logic.
## Frequently Asked Questions

### What does 429 Too Many Requests mean?
A 429 error means you've sent too many requests in a given time period. The server is rate limiting you to prevent abuse or overload.
### How do I fix a 429 error?
Wait before retrying (check Retry-After header), reduce request frequency, implement exponential backoff, or request a higher rate limit.
### What is the Retry-After header?
`Retry-After` tells you how long to wait before sending another request. Its value can be a number of seconds (`120`) or an HTTP date (`Mon, 19 Jan 2026 09:00:00 GMT`).
### How do I implement rate limiting?
Track requests per client (by IP or API key) using a sliding window or token bucket algorithm, and return 429 when the limit is exceeded.