feat: bot mitigation proxy & health endpoint by thedaviddias · Pull Request #248 · thedaviddias/ux-patterns-for-developers

thedaviddias · 2026-02-10T23:35:25Z

Summary

Next.js 16 proxy (proxy.ts) blocks abusive bots (DotBot, GPTBot, AI scrapers, headless browsers, vulnerability scanners) and rate-limits requests per IP + route category
Health endpoint (/api/health) runs on Edge runtime for zero-cold-start uptime monitoring — replaces polling GET / which triggered full SSR
Sentry fix: hoisted @sentry/* packages in .npmrc to resolve @sentry/node-core module-not-found errors that caused 500s on all SSR pages

Rate limits

Route	Limit
`/proxy/api/event`, `/api/event`	10 req/min
`/api/*`	20 req/min
All pages	30 req/min

Bot handling

Type	Action
Good bots (Googlebot, Bingbot, etc.)	Pass through
Bad bots (DotBot, GPTBot, scrapers)	403 Forbidden
Empty user-agent	403 Forbidden
Vulnerability scan paths (`/wp-admin`, `/.env`)	403 Forbidden

Test plan

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added service health monitoring endpoint.
- Implemented bot detection and blocking to protect against malicious traffic.
- Introduced rate limiting per user/route to prevent abuse and ensure service stability.

Block abusive crawlers hitting GET requests (DotBot, GPTBot, etc.) and rate-limit per IP to prevent Plausible 429s. Add /api/health edge endpoint for uptime monitoring without SSR overhead. Fix Sentry module resolution by hoisting @sentry/* packages in .npmrc. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel · 2026-02-10T23:35:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
ux-patterns-for-developers	Ready	Preview, Comment	Feb 10, 2026 11:36pm

coderabbitai · 2026-02-10T23:35:48Z

📝 Walkthrough

Walkthrough

This pull request introduces a health check API endpoint, a bot detection and rate limiting utility module, and a proxy middleware that integrates these components to filter requests based on bot classification and rate limit thresholds by IP and route category.

Changes

Cohort / File(s)	Summary
Health Check Endpoint `apps/web/app/api/health/route.ts`	New edge-runtime GET handler that responds with status and timestamp, including cache-busting and indexing-prevention headers.
Bot Detection & Rate Limiting `apps/web/lib/bot-detection.ts`	New utility module providing bot classification (good/bad/suspicious/human) via user-agent pattern matching, IP extraction from headers, and in-memory rate limiting with periodic cleanup and route-based rate limit categories.
Proxy Middleware `apps/web/proxy.ts`	New middleware that detects bots, blocks bad bots with 403, rate limits requests per IP and route category returning 429 when exceeded, and passes through prefetch and good-bot requests to the application.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Proxy
    participant BotDetection
    participant RateLimiter
    participant NextApp as Next.js App
    
    Client->>Proxy: HTTP Request
    Proxy->>BotDetection: detectBot(userAgent, pathname)
    BotDetection-->>Proxy: BotDetectionResult
    
    alt Bad Bot Detected
        Proxy-->>Client: 403 Forbidden
    else Good Bot or Human
        Proxy->>Proxy: Check if prefetch request
        alt Prefetch Request
            Proxy-->>NextApp: NextResponse.next()
            NextApp-->>Client: Response
        else Non-prefetch
            Proxy->>BotDetection: getClientIP(request)
            BotDetection-->>Proxy: IP Address
            Proxy->>BotDetection: getRouteCategory(pathname)
            BotDetection-->>Proxy: Category & Limit
            Proxy->>RateLimiter: checkRateLimit(key, limit)
            alt Rate Limit Exceeded
                RateLimiter-->>Proxy: false
                Proxy-->>Client: 429 Too Many Requests
            else Rate Limit OK
                RateLimiter-->>Proxy: true
                Proxy-->>NextApp: NextResponse.next()
                NextApp-->>Client: Response
            end
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: bot mitigation proxy & health endpoint' accurately summarizes the main changes: adding a bot mitigation proxy and a health endpoint, which align with the core objectives of the PR.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/bot-mitigation-health-endpoint

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 00558730eb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-02-10T23:39:13Z

apps/web/lib/bot-detection.ts

+	const goodMatch = userAgent.match(GOOD_BOTS);
+	if (goodMatch) {
+		return { isBot: true, botType: "good", botName: goodMatch[0] };


Check bad patterns before returning good-bot classification

detectBot returns as soon as GOOD_BOTS matches, so a user-agent like Googlebot HeadlessChrome is classified as good and then bypasses all enforcement via the proxy fast-path. Because BAD_BOTS is evaluated later, any mixed UA containing an allowlisted token can evade the intended bot/scanner blocking; evaluate bad signatures first (or treat mixed matches as bad) to avoid this bypass.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-02-10T23:39:13Z

apps/web/lib/bot-detection.ts

+	return (
+		forwardedFor?.split(",")[0]?.trim() || realIp || cfConnectingIp || "unknown"
+	);


Derive rate-limit IP from trusted proxy headers

The rate-limit key prioritizes the first value in x-forwarded-for, which is often client-controllable or prependable in many proxy setups; an attacker can vary that value per request to bypass per-IP throttling. Use a trusted ingress-provided IP source (or only parse x-forwarded-for at a known trusted hop boundary) before building the limiter key.

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@apps/web/lib/bot-detection.ts`:
- Around line 12-13: The SUSPICIOUS_PATHS regex currently includes
/.well-known/security.txt which blocks a standardized disclosure location;
update the SUSPICIOUS_PATHS constant (the regex assigned to SUSPICIOUS_PATHS) to
remove the \.well-known\/security\.txt alternative, or add an explicit
allow-list check for the path before applying SUSPICIOUS_PATHS (e.g., explicitly
permit the exact path "/.well-known/security.txt" in the request-path handling
code) so security.txt is not treated as suspicious.

🧹 Nitpick comments (4)

apps/web/lib/bot-detection.ts (3)

23-46: Good-bot classification takes priority over suspicious-path check — spoofed UAs bypass path blocking.

If a request carries a spoofed Googlebot UA and hits /wp-admin, it will be classified as "good" and pass through without the suspicious-path check ever running. This is a common trade-off, but worth noting. To harden this, you could check suspicious paths before (or independently of) the good-bot match, or add reverse-DNS verification for search engine bots at a later stage.

48-56: Fallback IP "unknown" collapses all unidentified clients into one rate-limit bucket.

If none of the IP headers are present, every request maps to the same rate-limit key (unknown:<category>), causing all such clients to share a single quota. On platforms like Vercel or Cloudflare this is unlikely, but if this ever runs behind a different reverse proxy that doesn't set these headers, legitimate users will be collectively throttled.

Consider logging a warning or using request.ip (available in Next.js middleware) as an additional fallback.

58-78: In-memory rate limiter and setInterval are ineffective in serverless/edge environments.

Two concerns here:

Ephemeral state: The Map is local to each invocation's memory. In serverless environments (e.g., Vercel), each cold start creates a fresh Map, so rate limits won't persist across invocations. The rate limiter will only be effective for warm instances that handle multiple requests.

Top-level setInterval: In a long-running Node.js server this works fine, but in serverless/edge runtimes, the interval either never fires (short-lived isolates) or keeps the process reference alive unnecessarily. The comment on Line 58 says "Node.js runtime" but the proxy may run in edge context depending on deployment.

If the goal is best-effort rate limiting for warm instances, this is acceptable as-is. For stricter enforcement, consider an external store (e.g., Vercel KV, Upstash Redis) or at minimum document the limitation. You could also replace the setInterval with a lazy cleanup approach inside checkRateLimit itself (e.g., prune stale entries every N calls).

apps/web/proxy.ts (1)

24-37: X-Blocked-Reason header discloses blocking rationale to the client.

This is helpful for debugging but also informs attackers why they were blocked, making it easier to adjust their approach (e.g., switching UA, avoiding probe paths). Consider removing these headers in production or restricting them to internal/debug builds.

coderabbitai · 2026-02-10T23:39:45Z

apps/web/lib/bot-detection.ts

+const SUSPICIOUS_PATHS =
+	/^\/(wp-admin|wp-login|wp-content|wp-includes|\.env|\.git|phpmyadmin|phpinfo|administrator|cgi-bin|\.aws|\.well-known\/security\.txt)/i;


thedaviddias and others added 2 commits February 10, 2026 18:32

style: apply biome formatting fixes

0055873

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel bot deployed to Preview February 10, 2026 23:36 View deployment

chatgpt-codex-connector bot reviewed Feb 10, 2026

View reviewed changes

coderabbitai bot reviewed Feb 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: bot mitigation proxy & health endpoint#248

feat: bot mitigation proxy & health endpoint#248
thedaviddias wants to merge 2 commits intomainfrom
feat/bot-mitigation-health-endpoint

thedaviddias commented Feb 10, 2026 •

edited by coderabbitai bot

Loading

vercel bot commented Feb 10, 2026 •

edited

Loading

coderabbitai bot commented Feb 10, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

chatgpt-codex-connector bot left a comment

chatgpt-codex-connector bot Feb 10, 2026

chatgpt-codex-connector bot Feb 10, 2026

coderabbitai bot left a comment

coderabbitai bot Feb 10, 2026

Labels

1 participant

		const SUSPICIOUS_PATHS =
		/^\/(wp-admin\|wp-login\|wp-content\|wp-includes\|\.env\|\.git\|phpmyadmin\|phpinfo\|administrator\|cgi-bin\|\.aws\|\.well-known\/security\.txt)/i;

Uh oh!

Conversation

thedaviddias commented Feb 10, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Rate limits

Bot handling

Test plan

Summary by CodeRabbit

vercel bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

coderabbitai bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

chatgpt-codex-connector bot Feb 10, 2026

Choose a reason for hiding this comment

chatgpt-codex-connector bot Feb 10, 2026

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Feb 10, 2026

Choose a reason for hiding this comment

Labels

1 participant

thedaviddias commented Feb 10, 2026 •

edited by coderabbitai bot

Loading

vercel bot commented Feb 10, 2026 •

edited

Loading

coderabbitai bot commented Feb 10, 2026 •

edited

Loading