Building an In-Memory Rate Limiter in Next.js
Rate limiting is a critical technique for protecting APIs from abuse, ensuring fair usage, and maintaining application stability. This article explains how rate limiters work and walks through building an in-memory rate limiter in Next.js.
1. What Is a Rate Limiter?
A Rate Limiter is a mechanism used in software systems to control the number of requests a client can send to a server within a defined time interval. Its primary purpose is to protect backend services from overload, abuse, or misuse while ensuring fair access for legitimate users. Rate limiting can be applied at different levels, such as per IP address, per authenticated user, per API key, or even per specific endpoint. Common examples include:
- 100 requests per minute per IP address
- 10 login attempts per minute per user account
- 5 API calls per second per access token
1.1 How Does a Rate Limiter Work?
At a high level, a rate limiter tracks request counts over time and enforces predefined limits. The general flow looks like this:
- A request arrives at the server or API gateway
- The system identifies the client (IP address, user ID, API key, or token)
- The rate limiter retrieves the client’s recent request history from memory, cache, or storage
- The number of requests is compared against the configured limit for the current time window
- If the limit is exceeded:
  - The request is rejected with an HTTP 429 Too Many Requests response
  - Optional headers (e.g., Retry-After) inform the client when they can retry
- If the request is within the allowed limit, it is forwarded to the application for processing
1.2 Benefits of Using a Rate Limiter
Implementing a rate limiter provides both security and operational advantages:
- Prevents abuse such as brute-force attacks and automated scraping
- Mitigates the impact of denial-of-service (DoS and DDoS) attacks
- Ensures fair usage so one client cannot starve others of resources
- Reduces infrastructure and operational costs by avoiding unnecessary load
- Protects sensitive endpoints such as authentication, payments, and password resets
- Improves overall system reliability and stability under high traffic
1.3 Different Types of Rate Limiters
Rate limiters can be implemented using different algorithms, each with its own trade-offs in terms of accuracy, complexity, and performance.
1.3.1 Fixed Window
The fixed window algorithm divides time into discrete intervals (e.g., one minute). A client is allowed a fixed number of requests within each window.
- Easy to implement and understand
- Low memory and computational overhead
- Can cause traffic spikes at window boundaries (burst problem)
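To make the boundary burst concrete, here is a minimal sketch of a fixed-window counter whose windows are aligned to the clock. The names createFixedWindow and isAllowed, and the injectable now parameter, are illustrative assumptions and not part of any library.

```javascript
// Fixed-window counter aligned to clock boundaries (illustrative sketch).
function createFixedWindow(limit, windowMs) {
  const counters = new Map(); // identifier -> { windowId, count }

  return function isAllowed(identifier, now = Date.now()) {
    const windowId = Math.floor(now / windowMs); // which window "now" falls in
    const entry = counters.get(identifier);

    if (!entry || entry.windowId !== windowId) {
      // First request in a new window: start counting from 1.
      counters.set(identifier, { windowId, count: 1 });
      return true;
    }
    if (entry.count >= limit) return false;
    entry.count += 1;
    return true;
  };
}

// Limit of 3 requests per 1000 ms window; the timestamps straddle t = 1000.
const fixed = createFixedWindow(3, 1000);
const burstTimes = [900, 950, 990, 1000, 1010, 1020];
const burstResults = burstTimes.map((t) => fixed("client-a", t));
console.log(burstResults); // [ true, true, true, true, true, true ]
```

All six requests are accepted within 120 ms, twice the nominal limit, because the counter resets at the boundary.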
1.3.2 Sliding Window
The sliding window approach tracks request timestamps and calculates usage over a rolling time period rather than fixed intervals.
- Provides smoother and more accurate rate limiting
- Prevents burst traffic at window edges
- Requires more memory and computation to track timestamps
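One common way to realize this is the "log" variant, which stores one timestamp per accepted request. The sketch below assumes that variant; createSlidingWindow and the injectable now parameter are hypothetical names chosen for illustration.

```javascript
// Sliding-window log: keep the timestamps of accepted requests per identifier.
function createSlidingWindow(limit, windowMs) {
  const logs = new Map(); // identifier -> array of accepted timestamps

  return function isAllowed(identifier, now = Date.now()) {
    const cutoff = now - windowMs;
    // Drop timestamps that have left the rolling window.
    const recent = (logs.get(identifier) || []).filter((t) => t > cutoff);

    const allowed = recent.length < limit;
    if (allowed) recent.push(now);
    logs.set(identifier, recent);
    return allowed;
  };
}

// Limit of 3 requests per rolling 1000 ms.
const sliding = createSlidingWindow(3, 1000);
const slidingTimes = [900, 950, 990, 1000, 1010, 1901];
const slidingResults = slidingTimes.map((t) => sliding("client-a", t));
console.log(slidingResults); // [ true, true, true, false, false, true ]
```

The requests at t = 1000 and t = 1010 are rejected because three requests already fall inside the preceding second; the request at t = 1901 passes once the one at t = 900 has aged out. The memory cost is visible in the per-identifier timestamp array.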
1.3.3 Token Bucket
In the token bucket algorithm, tokens are added to a bucket at a fixed rate. Each request consumes one or more tokens.
- Allows short bursts of traffic while enforcing an average rate
- Widely used in APIs and networking systems
- Simple to implement with in-memory or distributed caches
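A minimal in-memory sketch of the idea follows. It refills lazily, based on the time elapsed since the last call, rather than with a timer; createTokenBucket, tryConsume, and the parameter names are assumptions made for this example.

```javascript
// Token bucket with lazy refill (illustrative sketch).
function createTokenBucket(capacity, refillPerSecond) {
  const buckets = new Map(); // identifier -> { tokens, lastRefill }

  return function tryConsume(identifier, now = Date.now(), cost = 1) {
    let bucket = buckets.get(identifier);
    if (!bucket) {
      bucket = { tokens: capacity, lastRefill: now }; // new clients start full
      buckets.set(identifier, bucket);
    }

    // Add tokens in proportion to elapsed time, capped at capacity.
    const elapsedSeconds = (now - bucket.lastRefill) / 1000;
    bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSeconds * refillPerSecond);
    bucket.lastRefill = now;

    if (bucket.tokens < cost) return false;
    bucket.tokens -= cost;
    return true;
  };
}

// Burst of up to 3 requests, then 1 request per second on average.
const consume = createTokenBucket(3, 1);
const tokenTimes = [0, 0, 0, 0, 1000, 1000];
const tokenResults = tokenTimes.map((t) => consume("client-a", t));
console.log(tokenResults); // [ true, true, true, false, true, false ]
```

The first three requests drain the bucket in one burst; after one second a single token has been refilled, so exactly one more request is accepted.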
1.3.4 Leaky Bucket
The leaky bucket algorithm processes requests at a constant rate, regardless of how fast they arrive.
- Smooths traffic and prevents sudden spikes
- Requests may be queued or dropped if the bucket overflows
- Useful when consistent processing speed is required
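The sketch below shows one way to express this, assuming the "meter" variant that drops overflowing requests instead of queueing them. The names createLeakyBucket and tryAdd are hypothetical.

```javascript
// Leaky bucket used as a meter: the level drains at a constant rate and each
// accepted request raises it by 1 (illustrative sketch, overflow is dropped).
function createLeakyBucket(capacity, leakPerSecond) {
  const buckets = new Map(); // identifier -> { level, lastLeak }

  return function tryAdd(identifier, now = Date.now()) {
    let bucket = buckets.get(identifier);
    if (!bucket) {
      bucket = { level: 0, lastLeak: now };
      buckets.set(identifier, bucket);
    }

    // Drain at a constant rate for the time elapsed since the last call.
    const leaked = ((now - bucket.lastLeak) / 1000) * leakPerSecond;
    bucket.level = Math.max(0, bucket.level - leaked);
    bucket.lastLeak = now;

    if (bucket.level + 1 > capacity) return false; // bucket would overflow
    bucket.level += 1;
    return true;
  };
}

// Capacity of 2, draining 1 request per second.
const addToBucket = createLeakyBucket(2, 1);
const leakyTimes = [0, 0, 0, 500, 1000];
const leakyResults = leakyTimes.map((t) => addToBucket("client-a", t));
console.log(leakyResults); // [ true, true, false, false, true ]
```

A queueing variant would hold the overflowing requests and release them at the leak rate instead of returning false.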
2. Building an In-Memory Rate Limiter in Next.js
Before implementing the rate limiter, you need a basic Next.js project set up. You can create a new Next.js application using the official starter by running npx create-next-app and following the prompts. Once the project is created, navigate into the project directory and start the development server. The examples in this article use the Pages Router, where any file placed in the api folder under pages becomes a backend API route alongside your frontend code; if you selected the App Router during setup, the file layout for route handlers differs. This makes it easy to implement and test server-side logic, such as rate limiting, without additional setup.
2.1 Rate Limiter Utility
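// lib/rateLimiter.js

// Module-level store: identifier -> { count, startTime }.
// It lives in this process's memory, so it is lost on restart and is
// shared by every limiter created from this module.
const rateLimitMap = new Map();

// The "= {}" default lets callers use rateLimiter() with no arguments.
export function rateLimiter({ limit = 10, windowMs = 60 * 1000 } = {}) {
  return (identifier) => {
    const now = Date.now();
    const record = rateLimitMap.get(identifier);

    // First request from this identifier, or its previous window has
    // expired: start a new window.
    if (!record || now - record.startTime >= windowMs) {
      rateLimitMap.set(identifier, { count: 1, startTime: now });
      return { allowed: true };
    }

    // Still inside the current window and the limit has been reached.
    if (record.count >= limit) {
      return {
        allowed: false,
        // Seconds until the current window ends, rounded up.
        retryAfter: Math.ceil((windowMs - (now - record.startTime)) / 1000),
      };
    }

    record.count++;
    return { allowed: true };
  };
}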
2.1.1 Code Explanation
This code implements a simple in-memory rate limiter using a JavaScript Map to track requests per identifier (such as a user ID or IP). It follows the fixed window strategy from section 1.3.1, with each identifier's window starting at its first request. The rateLimiter function accepts configuration options for the maximum number of allowed requests (limit) and the time window in milliseconds (windowMs), and returns a function that checks whether a given identifier is allowed to proceed. Each time the returned function is called, it records the current time and looks up the identifier's request record. If no record exists, it initializes one and allows the request. If a record exists and the request falls within the same time window, it either blocks the request because the limit has already been reached, returning a retryAfter value in seconds, or increments the count and allows it. If the time window has expired, the limiter resets the count and start time for that identifier and allows the request.
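One thing the utility does not do is remove records for identifiers that never return, so the Map can grow over time. The following is a minimal sketch of a cleanup pass, assuming records have the { count, startTime } shape used above; pruneExpired is a hypothetical helper, not part of the original utility, and it takes the Map as an argument.

```javascript
// Remove records whose window has fully elapsed (hypothetical helper).
function pruneExpired(rateLimitMap, windowMs, now = Date.now()) {
  let removed = 0;
  for (const [identifier, record] of rateLimitMap) {
    if (now - record.startTime >= windowMs) {
      rateLimitMap.delete(identifier); // deleting during Map iteration is safe
      removed += 1;
    }
  }
  return removed;
}

// Demo with two records and a 60-second window, evaluated at t = 61000 ms.
const demoMap = new Map([
  ["203.0.113.5", { count: 4, startTime: 0 }],
  ["203.0.113.9", { count: 1, startTime: 50000 }],
]);
const removedCount = pruneExpired(demoMap, 60000, 61000);
console.log(removedCount, [...demoMap.keys()]); // 1 [ '203.0.113.9' ]
```

In a long-running Node.js process, one option is to call such a helper periodically with setInterval; on serverless platforms the process may not live long enough for that to matter.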
2.2 Using the Rate Limiter in a Next.js API Route
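// pages/api/hello.js
import { rateLimiter } from "../../lib/rateLimiter";

// Created once at module load so the limiter is reused across requests.
const limiter = rateLimiter({ limit: 5, windowMs: 60 * 1000 });

export default function handler(req, res) {
  // x-forwarded-for may hold a comma-separated chain ("client, proxy1, proxy2");
  // the first entry is the original client. Only trust this header when the
  // app runs behind a proxy that sets it.
  const forwarded = req.headers["x-forwarded-for"];
  const ip =
    (Array.isArray(forwarded) ? forwarded[0] : forwarded)?.split(",")[0].trim() ||
    req.socket.remoteAddress;

  const result = limiter(ip);

  if (!result.allowed) {
    // Standard header telling clients how many seconds to wait.
    res.setHeader("Retry-After", String(result.retryAfter));
    return res.status(429).json({
      message: "Too many requests",
      retryAfter: result.retryAfter,
    });
  }

  res.status(200).json({ message: "Request successful" });
}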
2.2.1 Code Explanation
This API route applies the previously defined rate limiter to incoming requests in a Next.js API handler. It imports the rateLimiter utility and configures it to allow a maximum of 5 requests per IP address within a 60-second window. Inside the handler, the client's IP address is determined using the x-forwarded-for header (for proxied requests) or the socket's remote address as a fallback. Keep in mind that a client can set x-forwarded-for itself, so the value is only reliable when a trusted proxy in front of the application overwrites it. The limiter is then called with this IP as the identifier. If the limiter determines the request is not allowed, the handler responds with HTTP status 429 (Too Many Requests) and includes a retryAfter value indicating when the client can try again. If the request is allowed, the handler responds with a 200 status and a success message.
2.3 Code Run and Code Output
Once the API route is running, you can test the rate limiter by repeatedly calling the /api/hello endpoint (for example, using a browser, curl, or Postman). For the first five requests made from the same IP address within one minute, the API responds successfully as shown below:
HTTP/1.1 200 OK
{
"message": "Request successful"
}
After the request limit is exceeded (more than 5 requests within 60 seconds from the same IP), the rate limiter blocks further requests and returns an error response indicating that the client has sent too many requests:
HTTP/1.1 429 Too Many Requests
{
"message": "Too many requests",
"retryAfter": 42
}
The retryAfter value represents the number of seconds the client should wait before making another request. Once the time window expires, the counter resets automatically, and the client can again make successful requests to the API.
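As a worked example of where a value such as 42 comes from, the sketch below pulls the limiter's retryAfter formula into a standalone function. The name secondsUntilReset and the sample timestamps are assumptions for illustration.

```javascript
// Same arithmetic the limiter uses for retryAfter, isolated for illustration.
function secondsUntilReset(startTime, now, windowMs) {
  return Math.ceil((windowMs - (now - startTime)) / 1000);
}

// Window started at t = 0 ms; a blocked request arrives 18 seconds later.
console.log(secondsUntilReset(0, 18000, 60000)); // 42

// Just before the window ends the value rounds up to 1, never down to 0.
console.log(secondsUntilReset(0, 59001, 60000)); // 1
```

Rounding up with Math.ceil means a client that waits the advertised number of seconds always lands in a fresh window.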
3. Conclusion
An in-memory rate limiter is a simple and effective way to protect APIs in small-scale or single-instance Next.js applications. It is easy to implement, fast, and requires no external dependencies. However, its counters live in a single process's memory, so they are lost on restart and are not shared between instances. For production systems running on serverless platforms or multiple instances, a distributed solution using Redis or similar storage is strongly recommended. Understanding rate limiting fundamentals allows you to build more secure, scalable, and resilient applications, starting simple and evolving as your system grows.

