beprodready
Build Your Own Rate Limiter

Stage 3: Sliding Window Log

stage 3 of 3 · ~30 min · runs in your browser

Objective:Implement the most accurate rate limiting algorithm and understand why it's too expensive for most production systems — and what "sliding window counter" does differently to get 99% of the accuracy at 1% of the memory cost.

sliding window logsorted set / deque per userO(n) memory per userfixed window vs sliding window tradeoffRedis ZSET for timestamps

The sliding window log eliminates the boundary burst problem by keeping a precise log of request timestamps and counting how many fall within the window ending at now.

Algorithm:

  1. Store every request timestamp in a sorted log per user
  2. On each new request, remove timestamps older than now - window_seconds
  3. Count remaining timestamps — if count < limit, allow and add now

Why it's accurate: At any point in time, the count reflects exactly how many requests happened in the last window_seconds — no boundary matters.

Why it's expensive: Memory scales with request rate × window size × users. At 1000 rps per user with a 60s window, each user's log holds 60,000 timestamps. With 1M users, that's 60B entries in memory. This is why Redis ZSETs are used — they can be paginated, but the memory problem remains.

Redis implementation (for context):

# ZADD rate:{user_id} {now} {now}              # add current timestamp
# ZREMRANGEBYSCORE rate:{user_id} 0 {now - window}  # evict old
# ZCARD rate:{user_id}                          # count within window
# EXPIRE rate:{user_id} {window}                # auto-cleanup

Implement create_sliding_window_log(limit, window_seconds) returning:

  • allow(user_id, current_time) — return True if within limit, False otherwise. current_time is a Unix timestamp (float).
  • count(user_id, current_time) — return the number of requests in the current sliding window (for testing and observability).
Why this matters in production

Cloudflare, Figma, and systems where precision matters use sliding window log (or approximations of it). Understanding WHY it's more accurate than fixed window — and why it's impractical at billions of requests — gives you the mental model to evaluate any rate limiting system in a design interview.

tests (6)

allows first request

allows up to the limit in a window

allows after old timestamps expire

no boundary burst

different users are independent

count reflects live window

create_sliding_window_log.py· Pyodide
First run loads Pyodide (~10s)