
# HTTP 429 Too Many Requests: Rate Limiting

Learn what 429 Too Many Requests means, how rate limiting works, and how to handle API throttling in your applications.


TL;DR: 429 Too Many Requests means you hit the rate limit. Wait and retry, honoring the `Retry-After` header.

## What is 429 Too Many Requests?

A 429 error indicates you’ve exceeded the allowed number of requests in a time window. It’s the server’s way of protecting itself from overload or abuse.

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1737277421

{"error": "Rate limit exceeded", "retry_after": 60}
```

## Rate Limit Headers

| Header                  | Meaning                          |
| ----------------------- | -------------------------------- |
| `Retry-After`           | Seconds to wait (or date)        |
| `X-RateLimit-Limit`     | Max requests per window          |
| `X-RateLimit-Remaining` | Requests left in window          |
| `X-RateLimit-Reset`     | Unix timestamp when limit resets |
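When these headers are present, a client can throttle itself before it ever receives a 429. A minimal sketch (the function name `rateLimitDelayMs` is illustrative, and it assumes the `X-RateLimit-*` names above; some APIs use the newer `RateLimit-*` names instead):

```javascript
// Compute how long to pause based on the rate limit headers.
// Returns 0 while requests remain in the current window.
function rateLimitDelayMs(headers, nowMs = Date.now()) {
  const remaining = Number(headers.get('X-RateLimit-Remaining'))
  const reset = Number(headers.get('X-RateLimit-Reset')) // Unix seconds

  // Headers missing or requests still available: no need to wait.
  if (Number.isNaN(remaining) || remaining > 0) return 0

  // Window exhausted: wait until the reset timestamp.
  return Math.max(0, reset * 1000 - nowMs)
}
```

With `fetch`, you would call it as `rateLimitDelayMs(response.headers)` after each request and sleep for the returned number of milliseconds before sending the next one.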

## Handling 429 in Code

```javascript
async function fetchWithRateLimit(url) {
  const response = await fetch(url)

  if (response.status === 429) {
    const retryAfter = response.headers.get('Retry-After')
    const seconds = retryAfter ? Number(retryAfter) : NaN
    const delay = Number.isFinite(seconds) ? seconds * 1000 : 60000 // fall back to 60s if missing or a date

    console.log(`Rate limited. Waiting ${delay}ms...`)
    await new Promise((r) => setTimeout(r, delay))

    return fetchWithRateLimit(url) // Retry
  }

  return response
}

```

### Exponential Backoff

```javascript

async function fetchWithBackoff(url, attempt = 0) {
  const response = await fetch(url)

  if (response.status === 429 && attempt < 5) {
    const delay = Math.pow(2, attempt) * 1000 // 1s, 2s, 4s, 8s, 16s
    await new Promise((r) => setTimeout(r, delay))
    return fetchWithBackoff(url, attempt + 1)
  }

  return response
}
```
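One caveat with plain exponential backoff: if many clients hit the limit at the same moment, they all retry in lockstep and collide again. Adding random jitter spreads the retries out. A minimal sketch of the "full jitter" variant (the name `jitteredDelayMs` and the defaults are illustrative):

```javascript
// Full jitter: wait a random amount between 0 and the exponential ceiling,
// capped so the delay never grows without bound.
function jitteredDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt))
  return Math.floor(Math.random() * ceiling)
}
```

Swap this in for the fixed `Math.pow(2, attempt) * 1000` line above to randomize each client's retry schedule.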

## Implementing Rate Limiting

### Express.js with express-rate-limit

```javascript
import rateLimit from 'express-rate-limit'

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window
  standardHeaders: true, // Return rate limit info in headers
  handler: (req, res) => {
    res.status(429).json({
      error: 'Too many requests',
      retry_after: Math.ceil((req.rateLimit.resetTime.getTime() - Date.now()) / 1000)
    })
  }
})

app.use('/api', limiter)

```

### Per-User Limits

```javascript
const userLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 30,
  keyGenerator: (req) => req.user?.id || req.ip,
  skip: (req) => req.user?.isPremium // Skip for premium users
})
```

## Rate Limiting Strategies

| Strategy       | Description                                        |
| -------------- | -------------------------------------------------- |
| Fixed Window   | X requests per minute (simple, can burst at edges) |
| Sliding Window | Smoother, considers recent history                 |
| Token Bucket   | Allows bursts up to bucket size                    |
| Leaky Bucket   | Constant rate, queues excess                       |

## Best Practices

1. **Always include `Retry-After`** - helps clients know when to retry
2. **Use standard headers** - `X-RateLimit-*` for transparency
3. **Different limits per tier** - free vs. paid users
4. **Limit by appropriate key** - IP, API key, or user ID
5. **Document your limits** - include them in your API docs

## Rate Limiting Algorithms and Choosing the Right One

The choice of rate limiting algorithm affects both the user experience and the server’s protection against abuse. Fixed window counting is the simplest: count requests in each time window (e.g., per minute) and reset the counter at the window boundary. The downside is that a client can make double the allowed requests by sending half at the end of one window and half at the start of the next.
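The fixed-window counter described above fits in a few lines. This is an in-memory sketch (the name `makeFixedWindowLimiter` is illustrative; state is lost on restart and not shared across servers, so production setups typically back this with Redis or similar):

```javascript
// Fixed window: count requests per key in the current window and
// reset the counter at each window boundary.
function makeFixedWindowLimiter(limit, windowMs) {
  const counters = new Map() // key -> { windowStart, count }

  return function allow(key, nowMs = Date.now()) {
    const windowStart = Math.floor(nowMs / windowMs) * windowMs
    const entry = counters.get(key)

    if (!entry || entry.windowStart !== windowStart) {
      // First request in a fresh window.
      counters.set(key, { windowStart, count: 1 })
      return true
    }
    entry.count += 1
    return entry.count <= limit // false -> respond with 429
  }
}
```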

Sliding window counting fixes this by tracking the exact timestamp of each request and counting only requests within the last N seconds. This provides smoother rate limiting but requires more memory to store individual timestamps.
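A sliding-window log can be sketched the same way (the name `makeSlidingWindowLimiter` is illustrative). Note how the per-key timestamp array is what costs the extra memory:

```javascript
// Sliding window log: keep a timestamp per request and count only
// those that fall inside the last windowMs.
function makeSlidingWindowLimiter(limit, windowMs) {
  const log = new Map() // key -> array of request timestamps

  return function allow(key, nowMs = Date.now()) {
    // Drop timestamps that have aged out of the window.
    const timestamps = (log.get(key) || []).filter((t) => t > nowMs - windowMs)

    if (timestamps.length >= limit) {
      log.set(key, timestamps)
      return false // respond with 429
    }
    timestamps.push(nowMs)
    log.set(key, timestamps)
    return true
  }
}
```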

Token bucket allows controlled bursting. The bucket fills at a constant rate (e.g., 10 tokens per second) up to a maximum capacity (e.g., 100 tokens). Each request consumes one token. A client that has been idle can burst up to the bucket capacity, then is limited to the refill rate. This is the most user-friendly algorithm for APIs where occasional bursts are legitimate.
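A token bucket boils down to one piece of state per client: the token count, lazily refilled whenever a request arrives. A minimal single-client sketch (the name `makeTokenBucket` and the `startMs` parameter are illustrative):

```javascript
// Token bucket: refillPerSec tokens accrue continuously up to capacity;
// each request spends one token. An idle client can burst up to capacity.
function makeTokenBucket(capacity, refillPerSec, startMs = Date.now()) {
  let tokens = capacity
  let last = startMs

  return function allow(nowMs = Date.now()) {
    // Refill based on elapsed time, never exceeding capacity.
    tokens = Math.min(capacity, tokens + ((nowMs - last) / 1000) * refillPerSec)
    last = nowMs

    if (tokens >= 1) {
      tokens -= 1
      return true
    }
    return false // respond with 429
  }
}
```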

For most web APIs, token bucket or sliding window provides the best balance of protection and user experience. Always document your rate limiting algorithm and limits in your API documentation so clients can implement appropriate retry logic.

## Frequently Asked Questions

### What does 429 Too Many Requests mean?

A 429 error means you've sent too many requests in a given time period. The server is rate limiting you to prevent abuse or overload.

### How do I fix a 429 error?

Wait before retrying (check Retry-After header), reduce request frequency, implement exponential backoff, or request a higher rate limit.

### What is the Retry-After header?

`Retry-After` tells you how long to wait before sending another request. It can be a number of seconds (`120`) or an HTTP-date (`Mon, 19 Jan 2026 09:00:00 GMT`).
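Because the header has two forms, clients should handle both. A minimal sketch (the name `retryAfterMs` is illustrative) that normalizes either form into a wait in milliseconds:

```javascript
// Parse Retry-After: either delta-seconds ("120") or an HTTP-date
// ("Mon, 19 Jan 2026 09:00:00 GMT"). Returns null if absent or unparseable.
function retryAfterMs(value, nowMs = Date.now()) {
  if (value == null) return null

  // Delta-seconds form: all digits.
  if (/^\d+$/.test(value.trim())) return Number(value.trim()) * 1000

  // HTTP-date form: wait until that moment (never a negative delay).
  const date = Date.parse(value)
  return Number.isNaN(date) ? null : Math.max(0, date - nowMs)
}
```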

### How do I implement rate limiting?

Track requests per client (by IP or API key) using a sliding window or token bucket algorithm, and return 429 when the limit is exceeded.
