Security infrastructure for the agent era
Four pillars that turn “we gave the bot an API key and hope for the best” into enforceable, verifiable policy.
Unified LLM API
One OpenAI-compatible endpoint for every model and provider.
Point your existing OpenAI SDK at the KeyForge base URL and every model becomes a string parameter — GPT-5.4, Claude Sonnet 4.6, Gemini 2.5 Flash and more, streamed or buffered, with JSON mode and tool calls. No per-provider SDKs, no request-shape translation in your agent code, no vendor lock-in baked into your prompts.
agent → vk_ key → KeyForge gateway → [OpenAI | Anthropic | Google | …]
Virtual Keys
Agents hold vk_ tokens. Real provider keys stay encrypted in the vault.
A vk_ key has no mathematical relationship to any provider credential. At request time the gateway resolves it to a policy — allowed models, remaining quota, spend cap, rate limit, expiry — and injects the real key server-side for the outbound call. A compromised agent leaks a capped, revocable token, not your provider account. Revocation is instant and surgical: kill one key, every other agent keeps running.
vk_a7f3… → policy { quota: 20k, cap: $50, rpm: 120 } → sk_… (vault, server-side only)
Rate-Limit Resilience
Key pools auto-shuffle on 429s so agents never stall.
Provider rate limits are enforced per key — so a pool of keys is a pool of independent rate-limit budgets against the same capacity. When a provider returns 429, KeyForge rotates the request onto the next healthy key in your pool and replays it: same provider, same model, same behavior. No exponential-backoff stalls, no silent fallback to a different model that breaks step seven of your pipeline.
429 on key #2 → shuffle → key #3 (same provider/model) → 200 OK
Tamper-Evident Audit Chain
HMAC hash-chained logs — cryptographic proof of what your agents did.
Every request writes an audit entry containing the model, tokens, cost, status, and timestamp. The entry is signed with HMAC-SHA256 and chained to the previous entry’s hash. Insert, delete, or edit any record and every hash after it breaks — tampering is mathematically provable, not just “unlikely.” After the 2026 LiteLLM supply-chain compromise, append-only is no longer enough: verify, don’t trust.
h₁ = HMAC(GENESIS + entry₁) → h₂ = HMAC(h₁ + entry₂) → h₃ = HMAC(h₂ + entry₃) → …