About iri

AI costs money.
We help you spend it wisely.

Enterprise AI spending will hit $2T in 2026. Most companies have no idea where it goes. We built iri to fix that.

Source: IDC Worldwide AI Spending Guide, 2024

What we do

One product.
Three problems solved.

01

Visibility

See every API call, every user, every dollar—in real time. No more surprise bills.

02

Control

Set budgets that actually stop spending. Per user, per team, per model. Enforced before the API call.

03

Optimization

Route expensive requests to cheaper models when quality doesn't matter. Cache repeated queries. Save 30-50%.

How it works

Swap your API endpoint. Done.

# Before

OPENAI_BASE_URL=https://api.openai.com/v1

# After

OPENAI_BASE_URL=https://api.iri.ai/v1

Works with Cursor, Claude Code, Windsurf, and any OpenAI-compatible client.

50+

Models supported

<20ms

Latency overhead

15K+

Requests/sec

99.9%

Uptime SLA

Infrastructure

Built for speed.
Engineered for scale.

Our proxy handles over 15,000 requests per second with less than 20 milliseconds of overhead. That's less than 3% of a typical AI response time—essentially invisible.

Works seamlessly with Cursor, Claude Code, Windsurf, and any tool that speaks the OpenAI protocol. Use Claude models in Cursor. Use GPT in your Anthropic SDK. One endpoint, every provider.

Our principles

How we build

Prevention first

A limit that stops spending is better than a report about what went wrong. We enforce budgets before the API call, not after.

Day-one useful

If it takes a week to set up, it gets abandoned. You should see value in the first five minutes.

No lock-in

Standard OpenAI-compatible API. Remove iri anytime—your code doesn't change.

Honest pricing

See what it costs before you sign up. No "contact sales" games. Free tier included.

The name

iri — named after a cat who watches everything and doesn't tolerate waste. Lowercase, always.

Start tracking today

Free tier. No credit card. Setup in minutes.