Agent/MCP Audit Sprint

Self-owned dogfood audit report

jackjin1997/sentinel

No-execution dogfood sample for a public autonomous incident-response agent. The review is based on commit 2ebf5a5363db4ee95483ef3ecbae8e0842550131. It is not a commissioned audit, private vulnerability disclosure, or certification.

Target: Sentinel AI incident agent
Validation: Static source review + scanner
Scanner Score: 54/100 heuristic score
Execution: No target code run

Scope

Out of scope: live Sentinel deployments, model provider accounts, Bright Data account behavior, private telemetry, Vultr host state, Cloudflare account state, and unpublished branches.

Executive Summary

sentinel is a useful dogfood target because it is a real autonomous incident-response agent: a browser UI starts a server-side run, multiple model vendors participate in four phases, tools can query internal mock telemetry and external web data, and results stream back to the operator over SSE.

The repo already includes several good defensive choices for a demo-grade agent: request bodies are capped, client disconnects abort provider calls, slow consumers are cut off, tool schemas restrict common inputs, phase output is bounded, and the vendor-status tool uses a fixed vendor allowlist.

The main production risk is the missing operator boundary around a cost-bearing, credential-backed agent endpoint. Before Sentinel is reused outside a controlled demo, it needs authentication, rate/concurrency limits, output redaction, deployment hardening, and CI gates that prove those controls do not regress.

Boundary Map

Area | Evidence | Risk Notes
Browser entry | app/page.tsx:119-124 | The UI posts { incidentId } directly to /api/agent and reads streamed events.
Agent API | app/api/agent/route.ts:19-73 | The route validates body size and JSON, then starts runIncidentAgent. No auth, rate limit, origin policy, or concurrency cap is visible.
SSE output | app/api/agent/route.ts:52-56, lib/agent.ts:8-15 | Tool calls, tool results, text deltas, errors, and final report objects are serialized to the client.
Model credentials | lib/agent.ts:31-55, .env.local.example | Qwen, Anthropic, Google, and Bright Data credentials are environment backed.
External web tools | lib/tools/brightdata.ts:83-200 | Vendor status is allowlisted; public postmortem search and GitHub commit lookups can still spend quota and return untrusted web content.
Deployment | scripts/deploy-vultr.sh:47-88 | The demo script copies .env.local, creates a root-managed systemd service, and exposes HTTP :80.
Validation gates | package.json | dev, build, and start exist, but no test, lint, typecheck, or CI workflow was visible.

Findings

High: Public agent run endpoint needs auth, quota, and concurrency boundaries

Evidence: app/page.tsx:119-124 starts a run with a plain POST to /api/agent. app/api/agent/route.ts:19-73 accepts any request with a small JSON body and invokes runIncidentAgent, which can call multiple LLM providers and tools across phases.

Recommended fix: add an auth gate before runIncidentAgent, then enforce per-IP or per-token rate limits, a global concurrency cap, and a per-run budget ceiling.
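The rate-limit and concurrency parts of that fix could look roughly like the sketch below. It assumes a single server process with in-memory state; FixedWindowLimiter, ConcurrencyGate, admitRun, and the limits shown are hypothetical names and values, not code from the sentinel repo. An auth check and a per-run budget ceiling would sit next to this in the route handler, before runIncidentAgent is called.

```typescript
// Hypothetical helpers for illustration; not taken from the sentinel repo.
// In-memory state only works when there is a single server process.

/** Fixed-window request counter keyed by client identity (IP or token). */
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if the request is allowed, false if the key is over its limit. */
  allow(key: string, now: number = Date.now()): boolean {
    const w = this.windows.get(key);
    if (w === undefined || now - w.start >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}

/** Global cap on the number of agent runs in flight at once. */
class ConcurrencyGate {
  private active = 0;
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  tryAcquire(): boolean {
    if (this.active >= this.max) return false;
    this.active += 1;
    return true;
  }

  /** Call when a run finishes or is aborted. */
  release(): void {
    if (this.active > 0) this.active -= 1;
  }
}

// Illustrative limits: 3 runs per client per minute, 2 runs in flight globally.
const limiter = new FixedWindowLimiter(3, 60_000);
const gate = new ConcurrencyGate(2);

/** Decide whether to start a run; returns an HTTP-style status code. */
function admitRun(clientKey: string, now: number = Date.now()): 200 | 429 | 503 {
  if (!limiter.allow(clientKey, now)) return 429; // per-client rate limit hit
  if (!gate.tryAcquire()) return 503; // global concurrency cap hit
  return 200; // caller must call gate.release() when the run ends
}
```

A multi-instance deployment would need shared state (for example a database or cache) in place of the in-memory maps, and the release call belongs in a finally block so that aborted runs free their slot.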

Medium: Tool results and errors need centralized redaction before streaming

Evidence: AgentEvent includes tool-result and error payloads with unknown content, and the API route streams JSON.stringify(event) directly to the client.

Recommended fix: add a single sanitizeAgentEvent(event) layer before SSE serialization and test redaction for tokens, cookies, signed URLs, query strings, session IDs, and provider error text.
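A minimal sketch of such a layer follows. The AgentEvent type here is a simplified stand-in (the real type in lib/agent.ts may differ), and the regex patterns and key list are illustrative assumptions rather than a complete redaction policy. The sanitizer walks the event recursively, blanks values stored under sensitive-looking keys, and scrubs token-like substrings and URL query strings from every string.

```typescript
// Simplified stand-in for the repo's AgentEvent; the real shape may differ.
type AgentEvent = { type: string; [key: string]: unknown };

const REDACTED = "[REDACTED]";

// Illustrative patterns only; a production list needs review and its own tests.
const PATTERNS: Array<[RegExp, string]> = [
  // Bearer tokens in headers or provider error text.
  [/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, `Bearer ${REDACTED}`],
  // API-key-like strings with an "sk-" prefix.
  [/\bsk-[A-Za-z0-9_-]{8,}/g, REDACTED],
  // URL query strings, which may carry signatures or session IDs.
  [/(https?:\/\/[^\s?#"']+)\?[^\s#"']*/g, `$1?${REDACTED}`],
];

// Object keys whose values are dropped entirely, whatever they contain.
const SENSITIVE_KEYS =
  /^(authorization|cookie|set-cookie|api[-_]?key|token|session[-_]?id)$/i;

function redactString(s: string): string {
  return PATTERNS.reduce((acc, [re, replacement]) => acc.replace(re, replacement), s);
}

function redactValue(value: unknown): unknown {
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map(redactValue);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = SENSITIVE_KEYS.test(k) ? REDACTED : redactValue(v);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}

/** Returns a redacted copy of the event; the input is not mutated. */
function sanitizeAgentEvent(event: AgentEvent): AgentEvent {
  return redactValue(event) as AgentEvent;
}
```

With this in place, the route would serialize sanitizeAgentEvent(event) instead of the raw event, so every event type passes through one choke point.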

Medium: External web tools need a stricter tool policy for production telemetry

Evidence: fetchVendorStatus uses a vendor enum, while searchPublicPostmortems accepts an LLM-provided query and fetchGithubRecentCommits accepts an LLM-provided public repo name.

Recommended fix: label each tool along four axes: internal or external, read-only or write-capable, cost-bearing or not, and prompt-injection exposed or not. Include source URL, fetch time, fallback status, and freshness in external tool results.
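One way to express that policy is a typed table plus a provenance wrapper, sketched below. The three external tool names come from the evidence above; the flag values assigned to them, the queryMockTelemetry entry, and the helper names (wrapExternalResult, isFresh, isToolAllowed) are assumptions made for illustration and would need to be checked against the actual tools.

```typescript
// Hypothetical policy shape; flag values below are illustrative guesses.
type ToolPolicy = {
  scope: "internal" | "external";
  access: "read-only" | "write-capable";
  costBearing: boolean;
  promptInjectionExposed: boolean;
};

const EXTERNAL_READ: ToolPolicy = {
  scope: "external",
  access: "read-only",
  costBearing: true,
  promptInjectionExposed: true,
};

// A Map avoids accidental hits on inherited object keys such as "constructor".
const TOOL_POLICIES = new Map<string, ToolPolicy>([
  ["fetchVendorStatus", EXTERNAL_READ],
  ["searchPublicPostmortems", EXTERNAL_READ],
  ["fetchGithubRecentCommits", EXTERNAL_READ],
  // Hypothetical name for an internal mock-telemetry tool.
  ["queryMockTelemetry", {
    scope: "internal",
    access: "read-only",
    costBearing: false,
    promptInjectionExposed: false,
  }],
]);

/** Provenance envelope for anything fetched from outside the system. */
type ExternalToolResult<T> = {
  data: T;
  sourceUrl: string;
  fetchedAt: string; // ISO 8601 timestamp
  usedFallback: boolean; // true if canned or cached data replaced a live fetch
  maxAgeMs: number; // how long the result counts as fresh
};

function wrapExternalResult<T>(
  data: T,
  sourceUrl: string,
  usedFallback: boolean,
  maxAgeMs: number,
  now: Date = new Date(),
): ExternalToolResult<T> {
  return { data, sourceUrl, fetchedAt: now.toISOString(), usedFallback, maxAgeMs };
}

function isFresh(result: ExternalToolResult<unknown>, now: Date = new Date()): boolean {
  return now.getTime() - Date.parse(result.fetchedAt) <= result.maxAgeMs;
}

/** Default-deny check: unknown tools are rejected, external tools need opt-in. */
function isToolAllowed(name: string, allowExternal: boolean): boolean {
  const policy = TOOL_POLICIES.get(name);
  if (policy === undefined) return false;
  if (policy.scope === "external" && !allowExternal) return false;
  return true;
}
```

Keeping the table in code rather than in documentation lets a unit test fail when a new tool is registered without a policy entry.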

Medium: Demo deploy scripts need production hardening notes

Evidence: scripts/deploy-vultr.sh:47-88 syncs code to /opt/sentinel, copies .env.local, creates a systemd service, and exposes HTTP :80. scripts/add-cf-tunnel.sh can publish a trycloudflare URL.

Recommended fix: add demo-only warnings, run as a non-root service user, use a secret manager or locked-down env file, require HTTPS/auth proxy for public deployments, and document log retention.

Low: Release gates are too thin for a security-sensitive agent

Evidence: package.json exposes only dev, build, and start; no test, lint, or typecheck scripts were visible.

Recommended fix: add CI for install, typecheck, lint, unit tests, build, and scanner output generation.

Positive Signals

Priority Fix Plan

  1. Put /api/agent behind auth and add rate, concurrency, and spend controls.
  2. Add centralized SSE event redaction with tests for tool results and provider errors.
  3. Add a tool policy table for internal/external, cost-bearing, prompt-exposed, and fallback-capable tools.
  4. Harden deploy docs and scripts for non-root service execution, HTTPS, firewall, auth proxy, and secret handling.
  5. Add CI with typecheck, lint, unit tests, build, and scanner output generation.
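The spend control in item 1 could be enforced per run with a small budget object, sketched here under stated assumptions: RunBudget, BudgetExceededError, and the two limits (tool calls and model tokens) are hypothetical and not present in the repo, and how token usage is reported depends on each provider.

```typescript
// Hypothetical per-run budget; names and limits are illustrative only.
class BudgetExceededError extends Error {}

class RunBudget {
  private toolCalls = 0;
  private tokens = 0;
  private readonly maxToolCalls: number;
  private readonly maxTokens: number;

  constructor(maxToolCalls: number, maxTokens: number) {
    this.maxToolCalls = maxToolCalls;
    this.maxTokens = maxTokens;
  }

  /** Call before each tool invocation; throws once the cap would be exceeded. */
  chargeToolCall(): void {
    if (this.toolCalls + 1 > this.maxToolCalls) {
      throw new BudgetExceededError("tool call budget exceeded");
    }
    this.toolCalls += 1;
  }

  /** Call with the token usage reported after each model response. */
  chargeTokens(n: number): void {
    if (this.tokens + n > this.maxTokens) {
      throw new BudgetExceededError("token budget exceeded");
    }
    this.tokens += n;
  }

  remaining(): { toolCalls: number; tokens: number } {
    return {
      toolCalls: this.maxToolCalls - this.toolCalls,
      tokens: this.maxTokens - this.tokens,
    };
  }
}
```

A run would create one RunBudget at the start, pass it to every phase, and treat a thrown BudgetExceededError as a signal to abort the run and emit a final error event.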

Example Validation Commands

node tools/agent-mcp-audit.mjs /path/to/sentinel --json
node tools/agent-mcp-audit.mjs /path/to/sentinel --sarif > agent-mcp-audit.sarif
bun run typecheck
bun run lint
bun test
bun run build

What the Paid Sprint Adds

The paid sprint would go deeper than this public dogfood sample: implementation-ready patches for auth and rate limiting, sanitizer tests, a deployment-mode threat table, a CI workflow, an agent tool policy, and a concise launch handoff for the repo owner.