Implemented in v1
Partial / simplified in v1
Future work, not in v1
SMCP
Secure Model Context Protocol
Proposes protocol-level security for MCP: unified identity, mutual authentication, security context propagation, fine-grained policy enforcement, and comprehensive audit logging.
arxiv.org/abs/2602.01129 →
Paper proposes
Unified digital identity for all entities (users, agents, servers, tools) with cryptographically anchored credentials and a Trusted Component Registry.
Implemented
Unified Identity struct for SSO users, API key holders, and agents. All resolve to one type through the auth resolver. Agent credentials include code_hash for tamper detection. Registry is the agents: section in config.
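As a rough illustration of "all resolve to one type", the sketch below maps three credential kinds onto a single Identity value. Everything except identity_type-style naming, code_hash, and the agents: registry idea is an assumption made for the example: the credential dict shape, the API_KEYS and AGENTS tables, and the resolve function are hypothetical and not taken from the gateway's code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Identity:
    identity_type: str               # "sso_user", "api_key", or "agent"
    name: str
    roles: Tuple[str, ...]
    code_hash: Optional[str] = None  # populated for agents only


# Hypothetical in-memory stand-ins for the API key config and the agents: section.
API_KEYS = {"key-123": {"name": "ci-bot", "roles": ["deployer"]}}
AGENTS = {
    "agent-tok": {"name": "triage-agent", "roles": ["reader"], "code_hash": "sha256:abc"},
}


def resolve(credential: dict) -> Optional[Identity]:
    """Map any supported credential to one Identity, or None if it is unknown."""
    kind = credential.get("kind")
    if kind == "sso":
        # Assumes the SSO layer already validated the session and supplied groups.
        return Identity("sso_user", credential["email"], tuple(credential.get("groups", ())))
    if kind == "api_key" and credential.get("key") in API_KEYS:
        entry = API_KEYS[credential["key"]]
        return Identity("api_key", entry["name"], tuple(entry["roles"]))
    if kind == "agent" and credential.get("token") in AGENTS:
        entry = AGENTS[credential["token"]]
        return Identity("agent", entry["name"], tuple(entry["roles"]), entry["code_hash"])
    return None
```

Downstream policy and audit code can then depend on Identity alone, without branching on how the caller authenticated.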
Paper proposes
Mutual authentication protocol with challenge-response (mutual TLS, Ed25519 signatures over nonces) establishing bidirectional assurance.
Partial
Gateway verifies callers (bearer token + code hash), but does not perform mutual auth with upstream MCP servers. v1 trusts upstream servers over TLS. Mutual TLS with upstreams is future work.
Paper proposes
Security context propagation: signed, structured context (session ID, call chain, delegator chain, risk level, data sensitivity, timestamp, nonce) on every request, protected via AEAD.
Partial
The delegation chain is propagated via the X-SMCP-Delegation-Chain header, with depth tracking and allowed_delegators enforcement. In v1 the chain is trusted as received; it is not cryptographically signed. Nonce-based replay protection is future work.
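A minimal sketch of how depth tracking and allowed_delegators enforcement could look. The header encoding is not specified above, so this assumes a comma-separated list of agent names with the most recent delegator last. The AGENTS table, MAX_DELEGATION_DEPTH constant, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical agent registry; allowed_delegators lists who may delegate to each agent.
AGENTS: Dict[str, dict] = {
    "planner": {"allowed_delegators": []},
    "researcher": {"allowed_delegators": ["planner"]},
}
MAX_DELEGATION_DEPTH = 2  # assumed limit, for illustration only


def parse_chain(header_value: str) -> List[str]:
    """Split an assumed comma-separated X-SMCP-Delegation-Chain value."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def check_delegation(caller: str, header_value: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a caller presenting the given chain."""
    chain = parse_chain(header_value)
    if len(chain) > MAX_DELEGATION_DEPTH:
        return False, "delegation depth exceeded"
    if chain:
        allowed = AGENTS.get(caller, {}).get("allowed_delegators", [])
        # Only the immediate delegator (last entry) is checked against the caller's list.
        if chain[-1] not in allowed:
            return False, "delegator not allowed"
    return True, "ok"
```

Because the chain is unsigned in v1, a check like this only constrains honest or gateway-mediated callers; it does not stop a caller from forging the header.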
Paper proposes
Fine-grained policy enforcement with PDP/PEP separation, attribute-based access control, and runtime obligations (redaction, rate limiting, downgrade strategies).
Implemented
Role-based + delegation-depth policy engine with glob matching (slack:*, github:read_*). Roles derived from SSO groups, API key config, or agent config. Per-tool allow/deny with explicit-deny-wins semantics. Rate limiting in Phase 4.
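The glob matching and explicit-deny-wins semantics can be sketched with Python's standard fnmatch module. The POLICY layout, the role names, and the default-deny behavior when nothing matches are assumptions for this example; only the pattern style (slack:*, github:read_*) comes from the description above.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Hypothetical policy table keyed by role.
POLICY: Dict[str, Dict[str, List[str]]] = {
    "developer": {"allow": ["slack:*", "github:read_*"], "deny": []},
    "contractor": {"allow": ["github:read_*"], "deny": ["slack:*"]},
}


def decide(roles: Iterable[str], tool: str) -> str:
    """Return "allow" or "deny" for a tool name such as "github:read_issues"."""
    matched_allow = False
    for role in roles:
        rules = POLICY.get(role, {})
        # Any matching deny pattern ends evaluation immediately, whatever other roles allow.
        if any(fnmatchcase(tool, pattern) for pattern in rules.get("deny", [])):
            return "deny"
        if any(fnmatchcase(tool, pattern) for pattern in rules.get("allow", [])):
            matched_allow = True
    # Assumed default: deny when no allow pattern matched.
    return "allow" if matched_allow else "deny"
```

With this shape, a caller holding both roles is denied slack:post_message, because the contractor deny rule overrides the developer allow rule.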
Paper proposes
Comprehensive audit logging across the full agent-tool interaction lifecycle.
Implemented
Structured JSON audit log with identity_type, caller name, role, delegation chain, tool, decision, deny reason, and latency. Supports stdout, file, and OpenTelemetry export.
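For illustration, one audit entry might be serialized as below. Only identity_type is a field name given above; the other key names (caller, role, delegation_chain, tool, decision, deny_reason, latency_ms) and the audit_line and emit helpers are assumed spellings for this sketch.

```python
import json
import sys
from typing import List, Optional


def audit_line(identity_type: str, caller: str, role: str, delegation_chain: List[str],
               tool: str, decision: str, deny_reason: Optional[str] = None,
               latency_ms: float = 0.0) -> str:
    """Serialize one audit record as a single JSON line."""
    record = {
        "identity_type": identity_type,
        "caller": caller,
        "role": role,
        "delegation_chain": delegation_chain,
        "tool": tool,
        "decision": decision,
        "deny_reason": deny_reason,  # None (JSON null) when the call was allowed
        "latency_ms": latency_ms,
    }
    return json.dumps(record, sort_keys=True)


def emit(line: str, stream=sys.stdout) -> None:
    """Write the record to stdout by default; a file object works the same way."""
    stream.write(line + "\n")
```

One JSON object per line keeps the stdout and file sinks trivially parseable by log shippers.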
Paper proposes
AEAD message protection (AES-256-GCM or ChaCha20-Poly1305) ensuring confidentiality, authenticity, and replay resistance for all application-layer messages.
Future
v1 relies on TLS for transport encryption. The reverse-proxy model doesn't require application-layer message encryption if TLS is properly configured between all hops.
Security
Malicious Agent Skills in the Wild
First labeled dataset of malicious agent skills: 98,380 skills analyzed from community registries, 157 confirmed malicious with 632 vulnerabilities. Two archetypes: Data Thieves (credential exfiltration) and Agent Hijackers (instruction manipulation).
arxiv.org/abs/2602.06547 →
Paper finds
Data Thieves: exfiltrate credentials through supply chain techniques. Hidden reverse shells in skill.md files, reading ~/.ssh/id_rsa, connecting to attacker-controlled servers.
Implemented
Tool description scanner (Phase 4) with regex patterns for sensitive file access, credential exfiltration, and hidden instructions. It directly uses the paper's 14 static analysis pattern categories. Tools below the reputation threshold are blocked.
Paper finds
Agent Hijackers: subvert agent decision-making through instruction manipulation hidden in tool descriptions and parameter hints.
Implemented
Scanner patterns include ignore.*previous.*instructions, system.*prompt, and prompt injection signatures. Any tool matching these patterns is flagged and blocked before the agent sees it.
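A minimal sketch of such a scan, using the two patterns quoted above plus one sensitive-file pattern based on the ~/.ssh/id_rsa example from the Data Thieves finding. The category names, the case-insensitive and dot-matches-newline flags, and the scan function are assumptions; the real scanner covers more categories than these three.

```python
import re
from typing import Dict, List

# Category names are hypothetical; the first two regexes are quoted from the text above.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "instruction_override": re.compile(r"ignore.*previous.*instructions", re.I | re.S),
    "system_prompt": re.compile(r"system.*prompt", re.I | re.S),
    "sensitive_file": re.compile(r"~/\.ssh/id_rsa"),
}


def scan(description: str) -> List[str]:
    """Return the sorted names of every pattern category the description matches."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(description))


def is_blocked(description: str) -> bool:
    """A tool is blocked if its description matches any category."""
    return bool(scan(description))
```

Patterns this broad will also flag benign descriptions that merely mention a system prompt, so a deployment would likely need an admin review path for false positives.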
Paper finds
Multi-phase kill chains: malicious skills average 4.03 vulnerabilities across a median of 3 kill chain phases. Attacks are deliberate and layered, not incidental.
Partial
Gateway blocks individual tool calls based on static analysis, but does not correlate multi-step attack chains across sequential calls within a session. Cross-call behavioral pattern detection is future work.
Paper finds
Brand impersonation at scale: a single actor accounts for 54.1% of confirmed malicious skills through templated fake tools mimicking popular packages.
Implemented
The gateway model rules out this kind of impersonation structurally. Only admin-registered MCP servers exist in smcp.yaml, so developers cannot install arbitrary community tools. The gateway is an allowlist, not an open registry.
Paper finds
Minimal vetting in community registries: skills execute with user privileges and are distributed through registries with no security review process.
Implemented
The gateway takes the opposite model: only admin-approved servers are allowed. The credential_mode system further scopes access per server (gateway-managed, passthrough, or optional). Each server's tools can be filtered by policy, hiding tools the caller's role can't use.
Framework
The 4C Framework for Agentic AI Security
Organizes agentic security risks across four interdependent dimensions inspired by societal governance: Core (system integrity), Connection (communication + trust), Cognition (beliefs + goals), and Compliance (institutional governance).
arxiv.org/abs/2602.01942 →
Core layer
Protect the agent's “digital body”: the integrity of its runtime, code, infrastructure, and the environment it operates in. Prevent tampering with the agent itself.
Implemented
Agent code_hash verification on every request. The gateway compares the agent's reported hash against the registered value. If the agent's code has been tampered with, the hash won't match and the request is rejected.
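The comparison step could look like the sketch below. The hash algorithm and encoding are not stated above, so SHA-256 with a "sha256:" prefix is an assumption, as are the function names. hmac.compare_digest is used only to keep the comparison constant-time.

```python
import hashlib
import hmac


def compute_code_hash(source: bytes) -> str:
    """Hash the agent's code bytes (assumed format: "sha256:" + hex digest)."""
    return "sha256:" + hashlib.sha256(source).hexdigest()


def verify_code_hash(reported: str, registered: str) -> bool:
    """Return True only if the reported hash equals the registered value."""
    return hmac.compare_digest(reported.encode("utf-8"), registered.encode("utf-8"))
```

Since the agent reports its own hash, a check like this detects accidental or unsophisticated modification; an agent that has been fully compromised could still report the registered value.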
Connection layer
How agents communicate, coordinate, and influence one another. Trust establishment between entities. Preventing unauthorized agent-to-agent interaction.
Implemented
Delegation chain tracking via X-SMCP-Delegation-Chain header. allowed_delegators whitelist controls which agents can spawn which. max_delegation_depth limits how far trust can propagate. The gateway is the Connection enforcement point.
Cognition layer
How beliefs, goals, and plans are formed and updated. Protect against goal hijacking, reasoning manipulation, and unintended behavioral drift.
Partial
Tool description scanning catches instruction manipulation (the Agent Hijacker pattern from the Malicious Skills paper). But the gateway does not inspect the agent's internal reasoning or detect goal drift. Cognition-layer protection is better addressed by runtime monitors like TrajAD, not a proxy.
Compliance layer
How agent behavior stays within ethical, legal, and institutional boundaries. Governance, audit trails, and institutional accountability.
Implemented
The entire policy engine + audit system implements Compliance. Every tool call is logged with caller identity, delegation chain, tool name, and allow/deny decision. Policies are YAML files, git-versioned, code-reviewed. SSO ties every action to a real person.
Coverage summary
Implemented
10
of 15 paper points fully addressed in v1 design
Partial
4
simplified for v1, full version in future releases
Future
1
application-layer encryption; TLS is sufficient for the proxy model