短.be

Rate Limiting

A mechanism that caps the number of requests to an API or service within a given time window. Protects servers and ensures fair usage.

Nov 25, 2025 · About 1 min read

Security

Rate limiting caps the number of requests an API or service accepts within a defined time window. The service enforces rules such as "60 requests per minute" or "1,000 requests per day", and requests that exceed the limit receive an HTTP 429 (Too Many Requests) response.
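As a rough illustration of a rule like "60 requests per minute", the sketch below counts requests per client within the current window and returns the status code a server would send. The class name `FixedWindowLimiter`, the `check` method, and the in-memory dictionary are assumptions made for this example, not part of any particular service.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each `window_seconds` window.

    Hypothetical in-memory sketch; a real service would usually keep
    these counters in shared storage.
    """

    def __init__(self, limit=60, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so behavior can be checked without sleeping
        self.state = {}     # client_id -> (window index, request count)

    def check(self, client_id):
        """Return 200 if the request is allowed, 429 if it exceeds the limit."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self.state.get(client_id, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            return 429  # Too Many Requests
        self.state[client_id] = (window, count + 1)
        return 200
```

Each client is tracked separately, so one client reaching the limit does not affect the others.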

Rate limiting serves three purposes: server protection (preventing service outages from request floods), fairness (stopping individual users from monopolizing resources), and cost management (controlling pay-per-use cloud billing).

URL shortening services apply rate limiting in several areas: URL shortening API request limits (preventing spam-like mass generation), access limits on shortened URLs (mitigating DDoS attacks), and login attempt limits on dashboards (blocking brute-force attacks).

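Four main rate limiting algorithms exist. Fixed window counts requests in fixed intervals and resets the counter at each boundary, which can let through a burst of up to twice the limit around a boundary. Sliding window continuously counts requests within the most recent time frame, which avoids that boundary effect. Token bucket replenishes tokens at a steady rate up to a maximum capacity and consumes one per request, so short bursts are allowed while tokens remain. Leaky bucket processes requests at a constant rate, queuing any excess.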
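The token bucket is perhaps the easiest of these to sketch in a few lines. The example below is a minimal single-process version, assuming one request costs one token. The names `TokenBucket`, `rate`, `capacity`, and `allow` are chosen for illustration only.

```python
import time


class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum stored tokens (the burst size)
        self.clock = clock              # injectable clock, useful for testing
        self.tokens = float(capacity)   # assumption: the bucket starts full
        self.last_refill = clock()

    def allow(self):
        """Consume one token and return True, or return False if the bucket is empty."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `rate=1` and `capacity=2`, for instance, two back-to-back requests pass, a third is refused, and another is allowed one second later. A leaky bucket would instead queue the third request and release it at the fixed rate.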

Developers working with rate-limited APIs should check response headers (the widely used but non-standardized X-RateLimit-Limit and X-RateLimit-Remaining, plus the standard Retry-After) and adjust request frequency based on remaining quota. When receiving a 429 response, wait for the duration specified in the Retry-After header before retrying. If the header is absent, the recommended approach is exponential backoff: double the wait after each failed attempt, ideally with some random jitter so that many clients do not retry at the same moment.
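One way a client might combine the Retry-After header with exponential backoff is sketched below. The helper names `retry_delay` and `call_with_retries`, the `send` callable returning `(status_code, headers)`, and the default `base` and `cap` values are all assumptions for illustration. The sketch handles only the seconds form of Retry-After, not the HTTP-date form, and expects the header key with exactly that capitalization.

```python
import random
import time


def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)  # the server stated how long to wait
    # No usable hint: exponential backoff with jitter, capped at `cap` seconds.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers_dict).
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return 429  # still rate limited after the final attempt
```

Capping the delay keeps a long run of failures from producing very long waits, and the jitter spreads out retries from clients that were throttled at the same time.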

FAQ

What should I do when I hit a rate limit?
Check the Retry-After header in the HTTP 429 response and wait the specified number of seconds before retrying. If you frequently hit limits, space out your requests or consider upgrading to a higher-tier plan.
What are typical rate limit values?
It varies by service, but free plans commonly allow 10 to 60 requests per minute, while paid plans offer 100 to 1,000 per minute. For URL shortening APIs, 100 to 500 shortenings per hour is a typical ceiling.
How do I implement rate limiting on my own API?
Common approaches include Nginx's limit_req module, built-in features of API gateways (like AWS API Gateway), or a custom token bucket implementation using Redis. For small-scale services, Nginx configuration alone is often sufficient.
