Documentation

Learn iri AI

Everything you need to reduce your AI API costs by up to 40% with intelligent caching, compression, and smart routing.

Quick Start

Get started in 5 minutes

Create your account, configure your organization, and make your first optimized API call.

Topics

Overview

What is iri AI?

iri AI is an optimization layer for AI APIs. It sits between your application and providers like OpenAI, Anthropic, and Google, automatically reducing costs through intelligent caching, prompt compression, and model routing.

Smart Caching

Identical or semantically similar requests return cached responses instantly at zero cost.
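As a rough illustration of the caching idea, the sketch below shows an exact-match response cache in Python. It is not iri AI's implementation: the names ResponseCache, cache_key, and call_provider are hypothetical, and it only handles requests that are identical after simple normalization. Matching semantically similar requests would need something more, such as embedding similarity, which is not shown here.

```python
import hashlib
import json
from typing import Callable, Dict


def cache_key(model: str, prompt: str) -> str:
    """Build a stable key from the model name and a normalized prompt."""
    # Crude normalization (an assumption): collapse whitespace, ignore case.
    normalized = " ".join(prompt.split()).lower()
    payload = json.dumps({"model": model, "prompt": normalized}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResponseCache:
    """Hypothetical in-memory cache placed in front of a provider call."""

    def __init__(self, call_provider: Callable[[str, str], str]) -> None:
        # call_provider stands in for a real provider request (assumption).
        self._call_provider = call_provider
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def complete(self, model: str, prompt: str) -> str:
        key = cache_key(model, prompt)
        if key in self._store:
            # Cache hit: no provider call is made.
            self.hits += 1
            return self._store[key]
        # Cache miss: call the provider once and remember the answer.
        self.misses += 1
        response = self._call_provider(model, prompt)
        self._store[key] = response
        return response
```

With this sketch, a second request that differs only in spacing or letter case is served from memory, so the provider is called once. A production cache would also need expiry and size limits.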

Prompt Compression

Automatically reduce token count while preserving semantic meaning.
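To make the idea concrete, here is a deliberately simple Python sketch of prompt compression. It is a stand-in and not the technique iri AI uses: the function compress_prompt is hypothetical, and it only removes blank lines, redundant whitespace, and repeated lines. Real compression that preserves meaning is considerably more involved.

```python
def compress_prompt(prompt: str) -> str:
    """Drop blank lines, extra whitespace, and repeated lines from a prompt."""
    seen = set()
    kept = []
    for line in prompt.splitlines():
        # Collapse runs of spaces and tabs inside the line.
        collapsed = " ".join(line.split())
        # Skip empty lines and lines that already appeared earlier.
        if not collapsed or collapsed in seen:
            continue
        seen.add(collapsed)
        kept.append(collapsed)
    return "\n".join(kept)
```

Dropping repeated lines is only safe when repetition carries no meaning, so a sketch like this would not suit prompts that contain code or structured data.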

Intelligent Routing

Route simple queries to cost-effective models without sacrificing quality.
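One way to picture routing is a small rule that classifies each prompt before choosing a model. The Python sketch below is an assumption for illustration only: the model names small-model and large-model, the keyword list, and the word threshold are all hypothetical, and the actual routing logic in iri AI is not described in this documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str
    reason: str


# Hypothetical phrases that suggest a prompt needs a stronger model.
COMPLEX_HINTS = ("prove", "step by step", "analyze", "refactor")


def choose_model(prompt: str, max_simple_words: int = 40) -> Route:
    """Pick a model tier from simple heuristics on the prompt text."""
    text = prompt.lower()
    if any(hint in text for hint in COMPLEX_HINTS):
        return Route("large-model", "complexity keyword")
    if len(text.split()) > max_simple_words:
        return Route("large-model", "long prompt")
    return Route("small-model", "short, simple prompt")
```

Under these rules a short factual question goes to the cheaper tier, while long prompts or prompts asking for analysis go to the larger one. Keyword heuristics are easy to reason about but coarse, so a real router would likely use a learned classifier or quality feedback.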

Typical Savings

Response Caching: 30-50%
Prompt Compression: 10-20%
Model Routing: 20-40%

Combined savings typically range from 20-40% depending on usage patterns.
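To see why the combined figure depends on usage patterns, the Python sketch below estimates overall savings from a simple cost model. The model is an assumption, not iri AI's published method: cached requests cost nothing, compression shrinks the cost of the remaining requests, and a share of those requests is routed to a cheaper model. The function name and all input values are hypothetical.

```python
def estimate_combined_savings(
    cache_hit_rate: float,
    compression_rate: float,
    routed_share: float,
    routed_discount: float,
) -> float:
    """Return the estimated fraction of spend saved (0.0 to 1.0)."""
    inputs = {
        "cache_hit_rate": cache_hit_rate,
        "compression_rate": compression_rate,
        "routed_share": routed_share,
        "routed_discount": routed_discount,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    # Fraction of the original spend that remains after each step.
    remaining = 1.0 - cache_hit_rate              # cache hits cost nothing
    remaining *= 1.0 - compression_rate           # fewer tokens per request
    remaining *= 1.0 - routed_share * routed_discount  # cheaper model for some
    return 1.0 - remaining
```

For example, with a 15% cache hit rate, 10% compression, and 30% of requests routed to a model that costs half as much, the estimate is about 35%. A workload with few repeated requests would have a lower hit rate and land closer to the bottom of the range.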

Need help? Contact support@iri-ai.com