AI Cost Firewall
Documentation for the OpenAI-compatible gateway that reduces wasted LLM spend with exact and semantic caching.
- Quickstart and Docker Compose setup
- Configuration reference
- Redis, Qdrant, Prometheus, and Grafana
- Semantic cache lifecycle and diagnostics