Design, Security Analysis, and Evaluation of Endpoint-Aware Token-Bucket Rate Limiting for Web APIs Using Database-Configured Policies
A common recommendation for preventing brute-force authentication, credential stuffing, scraping, and resource exhaustion is rate limiting, a standard control for API availability and security. Uncertain semantics under horizontal scaling, proxy-induced client-IP ambiguity, unsafe identity binding, ambiguous policy resolution, canonicalization gaps in endpoint matching, and unbounded in-memory state are some of the reasons why deployments frequently fail. In this paper, we propose an endpoint-aware mechanism that uses token buckets with bounded bucket state to enforce per-endpoint RPS limits that are configured in a database and cached for low-latency lookup. The mechanism offers robust endpoint normalization (including an explicit UNKNOWN bucket for unmatched paths), deterministic policy precedence, strict identity validation, trusted-proxy boundaries for client IP extraction, optional per-endpoint cost weights, and operational controls for safe defaults and cache reload. We propose adversarial test suites and microbenchmarks for reproducible evaluation and structure a security analysis by attack surfaces (identity, endpoint normalization, state management, configuration, and distributed enforcement). Additionally, we compare the mechanism to commonly used gateway/edge systems and shared-state primitives, and place it within related work on distributed algorithms, adaptive controllers, overload control, and API rate-limit adoption patterns.