Proxy & Performance
A high-performance gateway that connects your tools to any AI provider with minimal overhead.
- 15K+ requests/sec
- <20ms latency overhead
- 99.9% uptime
What It Does
The iri proxy sits between your development tools and AI providers. It handles routing, tracking, and policy enforcement—all in the background.
You get visibility into every request, budget controls that actually work, and the freedom to use any AI model from a single endpoint.
Works With
- Cursor: AI-powered code editor
- Claude Code: Anthropic's CLI tool
- Windsurf: AI development environment
- Any OpenAI client: SDKs, scripts, apps
AI Providers
- OpenAI: GPT-4o, o3, GPT-5
- Anthropic: Claude 4.5, Claude 4
- Google: Gemini 3, Gemini 2
Use any model from any provider. Switch between them without changing your code.
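In practice, "without changing your code" means only the model string differs between providers: every request goes to the same OpenAI-style endpoint. A minimal sketch of that idea, assuming iri exposes the standard `/chat/completions` path under the base URL shown below (the key and model names are placeholders):

```python
import json
import urllib.request

IRI_BASE = "https://api.iri.ai/v1"  # iri base URL from the setup sections below

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request routed through the proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{IRI_BASE}/chat/completions",
        data=body,
        headers={"Authorization": "Bearer your-iri-key",
                 "Content-Type": "application/json"},
    )

# Only the model string changes between providers:
req_openai = build_request("gpt-4o", "Hello")
req_claude = build_request("claude-4.5", "Hello")
assert req_openai.full_url == req_claude.full_url  # same endpoint either way
```

The proxy inspects the model name and routes to the matching provider, so switching from GPT to Claude is a one-string change.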
Cursor Setup
- Open Cursor Settings
- Go to Models → OpenAI API Key
- Set Base URL to: https://api.iri.ai/v1
- Enter your iri API key (from your dashboard)
- Select any model—including Claude models
That's it. All your Cursor requests now go through iri with full tracking and controls.
Claude Code Setup
Set one environment variable:
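Claude Code reads the `ANTHROPIC_BASE_URL` environment variable. The value below assumes iri serves an Anthropic-compatible API at the same base URL used elsewhere on this page; confirm the exact endpoint in your dashboard.

```shell
# Point Claude Code at the iri proxy (assumed endpoint; check your dashboard)
export ANTHROPIC_BASE_URL="https://api.iri.ai/v1"
```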
Add this to your shell profile (~/.zshrc or ~/.bashrc) to make it permanent.
SDK / Script Setup
For Python, JavaScript, or any OpenAI-compatible SDK:
# Python
from openai import OpenAI

client = OpenAI(base_url="https://api.iri.ai/v1", api_key="your-iri-key")

# JavaScript
import OpenAI from "openai";

const client = new OpenAI({ baseURL: "https://api.iri.ai/v1", apiKey: "your-iri-key" });
Performance
Benchmark Results
What this means: The proxy adds less than 20ms to your requests. Since AI responses typically take 500ms to 10+ seconds, the proxy overhead is at most about 4% of total request time, and usually far less—essentially invisible.
With 15K+ requests per second capacity, a single proxy instance can handle thousands of concurrent users without breaking a sweat.
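The overhead figures above follow directly from the numbers: 20ms added to a 500ms response is the worst case, and the fraction shrinks as responses get longer.

```python
proxy_overhead_ms = 20  # worst-case added latency from the benchmark above

for response_ms in (500, 2000, 10000):
    total_ms = response_ms + proxy_overhead_ms
    share = proxy_overhead_ms / total_ms
    # e.g. 500ms response -> 3.8%, 2000ms -> 1.0%, 10000ms -> 0.2%
    print(f"{response_ms}ms response: overhead = {share:.1%}")
```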
Common Questions
Does it slow down my requests?
Not noticeably. The proxy adds less than 20ms of overhead, which is negligible compared to typical AI response times (500ms to 10+ seconds). You won't notice any difference.
Can I use Claude models in Cursor?
Yes. The proxy handles the translation automatically. Just select a Claude model in Cursor and it works.
What if iri goes down?
We maintain 99.9% uptime with redundant infrastructure. If needed, you can switch back to direct API calls instantly—just remove the base URL override.
Is my data secure?
Yes. We don't store your prompts or responses. Request metadata (tokens, timing, model) is used for tracking and billing only.