
Proxy & Performance

A high-performance gateway that connects your tools to any AI provider with minimal overhead.

15K+

Requests/sec

<20ms

Latency overhead

99.9%

Uptime

What It Does

The iri proxy sits between your development tools and AI providers. It handles routing, tracking, and policy enforcement—all in the background.

Your Tool → iri → AI Provider

You get visibility into every request, budget controls that actually work, and the freedom to use any AI model from a single endpoint.

Works With

Cursor

AI-powered code editor

Claude Code

Anthropic's CLI tool

Windsurf

AI development environment

Any OpenAI Client

SDKs, scripts, apps

AI Providers

OpenAI

GPT-4o, o3, GPT-5

Anthropic

Claude 4.5, Claude 4

Google

Gemini 3, Gemini 2

Use any model from any provider. Switch between them without changing your code.
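The routing idea can be sketched in a few lines of Python. This is an illustration of the concept only, not iri's actual implementation, and the model-name prefixes are assumptions:

```python
# Illustrative sketch: one endpoint accepts any model name and
# forwards the request to the matching provider.
PROVIDER_BY_PREFIX = {
    "gpt-": "openai",
    "o3": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
}

def route(model: str) -> str:
    """Return the upstream provider for a given model name."""
    for prefix, provider in PROVIDER_BY_PREFIX.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unknown model: {model}")

print(route("gpt-4o"))            # openai
print(route("claude-sonnet-4-5")) # anthropic
print(route("gemini-2.0-flash"))  # google
```

Because the endpoint and request format never change, swapping providers is just a different `model` string.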

01

Cursor Setup

  1. Open Cursor Settings
  2. Go to Models → OpenAI API Key
  3. Set Base URL to: https://api.iri.ai/v1
  4. Enter your iri API key (from your dashboard)
  5. Select any model—including Claude models

That's it. All your Cursor requests now go through iri with full tracking and controls.

02

Claude Code Setup

Set one environment variable:

export ANTHROPIC_BASE_URL=https://api.iri.ai

Add this to your shell profile (~/.zshrc or ~/.bashrc) to make it permanent.

03

SDK / Script Setup

For Python, JavaScript, or any OpenAI-compatible SDK:

# Python

from openai import OpenAI

client = OpenAI(base_url="https://api.iri.ai/v1", api_key="your-iri-key")

# JavaScript

import OpenAI from "openai";

const client = new OpenAI({ baseURL: "https://api.iri.ai/v1", apiKey: "your-iri-key" });
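If you'd rather not touch code at all, the official OpenAI Python SDK (and recent versions of the Node SDK) also read the base URL and key from environment variables. A minimal sketch, with a placeholder key:

```python
import os

# Point existing scripts at iri without changing their code.
# "your-iri-key" is a placeholder for your real dashboard key.
os.environ["OPENAI_BASE_URL"] = "https://api.iri.ai/v1"
os.environ["OPENAI_API_KEY"] = "your-iri-key"

# Any OpenAI() client constructed after this picks up the proxy endpoint.
print(os.environ["OPENAI_BASE_URL"])  # https://api.iri.ai/v1
```

Set the same two variables in your shell profile to apply them to every script on your machine.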

Performance

Benchmark Results

Throughput: 15,360 requests/sec
Average latency: 13-17 ms
P99 latency: <60 ms

What this means: The proxy adds less than 20ms to your requests. Since AI responses typically take 500ms to 10+ seconds, the proxy overhead is less than 3% of total request time—essentially invisible.

With 15K+ requests per second capacity, a single proxy instance can handle thousands of concurrent users without breaking a sweat.
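The overhead claim is straightforward arithmetic: with a 13-17ms average proxy latency against typical 500ms-10s model responses, the proxy's share of total request time stays under 3%. A quick sketch of the calculation:

```python
def overhead_pct(proxy_ms: float, response_ms: float) -> float:
    """Proxy latency as a percentage of total request time."""
    return 100 * proxy_ms / (proxy_ms + response_ms)

# 15 ms average proxy latency, per the benchmark above.
print(round(overhead_pct(15, 500), 1))     # 2.9  (fast response, worst case)
print(round(overhead_pct(15, 10_000), 2))  # 0.15 (long response)
```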

Common Questions

Does it slow down my requests?

No. The proxy adds less than 20ms overhead, which is negligible compared to AI response times (500ms-10s). You won't notice any difference.

Can I use Claude models in Cursor?

Yes. The proxy handles the translation automatically. Just select a Claude model in Cursor and it works.

What if iri goes down?

We maintain 99.9% uptime with redundant infrastructure. If needed, you can switch back to direct API calls instantly—just remove the base URL override.

Is my data secure?

Yes. We don't store your prompts or responses. Request metadata (tokens, timing, model) is used for tracking and billing only.

Next Steps