Self-owned dogfood audit report
jackjin1997/sentinel
No-execution dogfood sample for a public autonomous incident-response agent. The review is based on commit 2ebf5a5363db4ee95483ef3ecbae8e0842550131. It is not a commissioned audit, private vulnerability disclosure, or certification.
Executive Summary
Sentinel is a useful dogfood target because it is a real autonomous incident-response agent: a browser UI starts a server-side run, multiple model vendors participate across four phases, tools can query internal mock telemetry and external web data, and results stream back to the operator over SSE.
The repo already includes several good defensive choices for a demo-grade agent: request bodies are capped, client disconnects abort provider calls, slow consumers are cut off, tool schemas restrict common inputs, phase output is bounded, and the vendor-status tool uses a fixed vendor allowlist.
The main production risk is the missing operator boundary around a cost-bearing, credential-backed agent endpoint. Before Sentinel is reused outside a controlled demo, it needs authentication, rate/concurrency limits, output redaction, deployment hardening, and CI gates that prove those controls do not regress.
Findings
High: Public agent run endpoint needs auth, quota, and concurrency boundaries
Evidence: app/page.tsx:119-124 starts a run with a plain POST to /api/agent. app/api/agent/route.ts:19-73 accepts any request with a small JSON body and invokes runIncidentAgent, which can call multiple LLM providers and tools across phases.
Recommended fix: add an auth gate before runIncidentAgent, then enforce per-IP or per-token rate limits, a global concurrency cap, and a per-run budget ceiling.
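The recommended gate could be sketched as a small in-memory class that the route consults before calling runIncidentAgent. This is only an illustration under stated assumptions: RunGate, GateOptions, GateDecision, the token set, and every limit value are hypothetical names that do not exist in the repo, and a multi-instance deployment would need a shared store rather than process memory. The per-run budget ceiling is not shown.

```typescript
// Hypothetical sketch: none of these names exist in the sentinel repo.
interface GateOptions {
  validTokens: Set<string>;   // assumed: operator tokens loaded from config
  maxRunsPerWindow: number;   // per-token quota inside one window
  windowMs: number;           // length of the fixed quota window
  maxConcurrentRuns: number;  // global cap across all tokens
}

interface GateDecision {
  ok: boolean;
  status: 200 | 401 | 429;
  reason: string;
  release: () => void;        // no-op when the run was rejected
}

class RunGate {
  private opts: GateOptions;
  private active = 0;
  private windows = new Map<string, { start: number; count: number }>();

  constructor(opts: GateOptions) {
    this.opts = opts;
  }

  private reject(status: 401 | 429, reason: string): GateDecision {
    return { ok: false, status, reason, release: () => {} };
  }

  // `now` is injectable so the window logic can be tested deterministically.
  tryAcquire(token: string | null, now: number = Date.now()): GateDecision {
    // 1. Auth: reject before any quota state is touched.
    if (token === null || !this.opts.validTokens.has(token)) {
      return this.reject(401, "missing or unknown token");
    }
    // 2. Per-token fixed-window rate limit.
    const prev = this.windows.get(token);
    const win =
      prev !== undefined && now - prev.start < this.opts.windowMs
        ? prev
        : { start: now, count: 0 };
    if (win.count >= this.opts.maxRunsPerWindow) {
      return this.reject(429, "rate limit exceeded");
    }
    // 3. Global concurrency cap.
    if (this.active >= this.opts.maxConcurrentRuns) {
      return this.reject(429, "too many concurrent runs");
    }
    win.count += 1;
    this.windows.set(token, win);
    this.active += 1;
    let released = false;
    const release = () => {
      // Idempotent, so calling it from both abort and finally paths is safe.
      if (!released) {
        released = true;
        this.active -= 1;
      }
    };
    return { ok: true, status: 200, reason: "accepted", release };
  }
}
```

In a route handler, the caller would invoke tryAcquire with the presented token, return the decision's status when ok is false, and call release() in a finally block once the run ends or the client disconnects.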
Medium: Tool results and errors need centralized redaction before streaming
Evidence: AgentEvent includes tool-result and error payloads with unknown content, and the API route streams JSON.stringify(event) directly to the client.
Recommended fix: add a single sanitizeAgentEvent(event) layer before SSE serialization and test redaction for tokens, cookies, signed URLs, query strings, session IDs, and provider error text.
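A minimal sketch of such a layer follows. The real AgentEvent shape was not inspected in detail, so the sketch assumes only that events are plain JSON-serializable objects with a type field. The key list and regex rules are illustrative assumptions, not a complete redaction policy, and would need tuning against real provider errors and tool output.

```typescript
// Hypothetical sketch: the key list and patterns are assumptions, not a
// vetted policy. Events are assumed to be plain JSON-serializable data.
const SENSITIVE_KEYS =
  /^(authorization|cookie|set-cookie|token|api[-_]?key|session[-_]?id|password|secret)$/i;

const STRING_RULES: Array<[RegExp, string]> = [
  // Bearer tokens embedded in error text or echoed headers.
  [/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [REDACTED]"],
  // Strings shaped like "sk-..." API keys.
  [/\bsk-[A-Za-z0-9_-]{8,}/g, "[REDACTED_KEY]"],
  // URL query strings, which often carry signatures or session IDs.
  [/(https?:\/\/[^\s?#"']+)\?[^\s#"']*/g, "$1?[REDACTED_QUERY]"],
];

function redactString(input: string): string {
  return STRING_RULES.reduce((acc, [pattern, replacement]) => acc.replace(pattern, replacement), input);
}

function redactValue(value: unknown, depth: number = 0): unknown {
  // Depth cap guards against pathological nesting in tool output.
  if (depth > 8) return "[TRUNCATED]";
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map((item) => redactValue(item, depth + 1));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      // Sensitive keys lose their whole value; other values are scanned recursively.
      out[key] = SENSITIVE_KEYS.test(key) ? "[REDACTED]" : redactValue(inner, depth + 1);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}

// Returns a redacted deep copy; the input event is not mutated.
function sanitizeAgentEvent<T extends { type: string }>(event: T): T {
  return redactValue(event) as T;
}
```

With this shape, the route would serialize sanitizeAgentEvent(event) instead of the raw event, which keeps redaction in one place and makes it testable without a running stream.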
Medium: External web tools need a stricter tool policy for production telemetry
Evidence: fetchVendorStatus uses a vendor enum, while searchPublicPostmortems accepts an LLM-provided query and fetchGithubRecentCommits accepts an LLM-provided public repo name.
Recommended fix: classify each tool along explicit policy axes: internal or external, read-only or write-capable, cost-bearing or not, and prompt-injection exposed or not. Include source URL, fetch time, fallback status, and freshness in external tool results.
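One way to express this is a policy table keyed by tool name plus a provenance envelope for external results. The sketch below is hypothetical: ToolPolicy, TOOL_POLICIES, ExternalToolResult, and wrapExternalResult are invented names, and the classification values assigned to the three tools are illustrative guesses rather than findings from the review.

```typescript
// Hypothetical sketch: types, names, and classifications are assumptions.
interface ToolPolicy {
  scope: "internal" | "external";
  access: "read-only" | "write-capable";
  costBearing: boolean;
  promptInjectionExposed: boolean; // true when results contain untrusted text
}

const TOOL_POLICIES: Record<string, ToolPolicy> = {
  fetchVendorStatus: {
    scope: "external", access: "read-only", costBearing: false, promptInjectionExposed: true,
  },
  searchPublicPostmortems: {
    scope: "external", access: "read-only", costBearing: true, promptInjectionExposed: true,
  },
  fetchGithubRecentCommits: {
    scope: "external", access: "read-only", costBearing: false, promptInjectionExposed: true,
  },
};

interface ExternalToolResult<T> {
  data: T;
  sourceUrl: string;
  fetchedAt: string;     // ISO 8601 time of the fetch
  usedFallback: boolean; // true when cached or canned data replaced a live fetch
  ageSeconds: number;    // freshness: fetch time minus the source's own timestamp
}

function wrapExternalResult<T>(
  tool: string,
  data: T,
  meta: { sourceUrl: string; sourceTimestamp: Date; usedFallback: boolean },
  now: Date = new Date(),
): ExternalToolResult<T> {
  const policy = TOOL_POLICIES[tool];
  // Unregistered tools fail closed rather than streaming unlabeled data.
  if (policy === undefined || policy.scope !== "external") {
    throw new Error(`no external tool policy registered for "${tool}"`);
  }
  const ageMs = now.getTime() - meta.sourceTimestamp.getTime();
  return {
    data,
    sourceUrl: meta.sourceUrl,
    fetchedAt: now.toISOString(),
    usedFallback: meta.usedFallback,
    ageSeconds: Math.max(0, Math.round(ageMs / 1000)),
  };
}
```

Keeping the policy in one table lets the agent loop or a test assert that every registered tool has a classification, and lets the UI show operators when a result came from a fallback or stale source.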
Medium: Demo deploy scripts need production hardening notes
Evidence: scripts/deploy-vultr.sh:47-88 syncs code to /opt/sentinel, copies .env.local, creates a systemd service, and exposes HTTP :80. scripts/add-cf-tunnel.sh can publish a trycloudflare URL.
Recommended fix: add demo-only warnings, run as a non-root service user, use a secret manager or locked-down env file, require HTTPS/auth proxy for public deployments, and document log retention.
Low: Release gates are too thin for a security-sensitive agent
Evidence: package.json exposes only dev, build, and start; no test, lint, or typecheck scripts were visible.
Recommended fix: add CI for install, typecheck, lint, unit tests, build, and scanner output generation.