
Optimization

Configure caching, compression, and routing to reduce API costs.

Admin Only: Only organization administrators can modify optimization settings. Members benefit from the applied optimizations.

Optimization Methods

Smart Caching: 30-50%

Reuse responses for identical or similar requests

Compression: 10-20%

Reduce token count while preserving meaning

Smart Routing: 20-40%

Route simple queries to cost-effective models

Security Scanning

Detect sensitive data before sending

Smart Caching

When iri receives a request, it saves the response. Later requests that are identical or semantically similar return the cached response immediately, without an API call.

Example

Request 1: "What is the capital of France?"
API call made, response cached
Request 2: "what's the capital of france"
Cache hit, instant response, $0 cost

Settings

Enable Caching

Turn response caching on or off

Cache TTL

Duration to keep cached responses (default: 24 hours)

Semantic Matching

Match similar questions, not just exact text
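
The three settings above can be illustrated with a minimal sketch. This is not iri's implementation: the `ResponseCache` class, its parameters, and the 0.9 threshold are hypothetical, and plain string similarity from Python's `difflib` stands in for real semantic matching, which would need a stronger similarity measure.

```python
import re
import time
from difflib import SequenceMatcher
from typing import Callable, Dict, Optional, Tuple


def normalize(prompt: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", prompt.lower())
    return " ".join(cleaned.split())


class ResponseCache:
    """Hypothetical cache illustrating TTL and similarity matching."""

    def __init__(
        self,
        ttl_seconds: float = 24 * 60 * 60,   # default TTL from the docs: 24 hours
        semantic_matching: bool = True,
        threshold: float = 0.9,              # assumed similarity cutoff
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self.semantic_matching = semantic_matching
        self.threshold = threshold
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, prompt: str, response: str) -> None:
        """Store a response under the normalized prompt with a timestamp."""
        self._entries[normalize(prompt)] = (self._clock(), response)

    def get(self, prompt: str) -> Optional[str]:
        """Return a cached response, or None on a miss."""
        now = self._clock()
        # Evict entries older than the TTL before looking anything up.
        self._entries = {
            key: entry
            for key, entry in self._entries.items()
            if now - entry[0] < self.ttl_seconds
        }
        key = normalize(prompt)
        if key in self._entries:
            return self._entries[key][1]
        if self.semantic_matching:
            # Stand-in for semantic matching: character-level similarity.
            for cached_key, (_, response) in self._entries.items():
                if SequenceMatcher(None, key, cached_key).ratio() >= self.threshold:
                    return response
        return None


cache = ResponseCache()
cache.put("What is the capital of France?", "Paris")
print(cache.get("what's the capital of france"))  # similar wording: cache hit
print(cache.get("How do I reset my password?"))   # unrelated: cache miss
```

String similarity can also match prompts that differ in a single important word, so a loose threshold risks returning the wrong cached answer. This is one reason to monitor cache hits after enabling Semantic Matching.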

Prompt Compression

Compression reduces token count by removing redundant content while preserving semantic meaning. Fewer tokens mean lower cost.

Compression Levels

Light: 10-15%

Removes whitespace and filler words. Minimal quality impact.

Balanced: 15-25%

Summarizes verbose sections. Good for most use cases.

Aggressive: 25-40%

Maximum compression. May affect complex task quality.

Start with Balanced for general use. Only use Aggressive for simple, repetitive tasks.
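
As a rough illustration of the Light level, the sketch below collapses whitespace and drops filler words. The filler-word list and the function names are assumptions made for this example, and word count is only a crude proxy for token count. The Balanced and Aggressive levels involve summarization and are not shown.

```python
# Hypothetical filler-word list; a real compressor would use its own.
FILLER_WORDS = {"actually", "basically", "just", "really", "very"}


def compress_light(prompt: str) -> str:
    """Collapse whitespace and drop filler words, ignoring case and punctuation."""
    words = prompt.split()  # split() also discards repeated whitespace
    kept = [w for w in words if w.strip(".,!?;:").lower() not in FILLER_WORDS]
    return " ".join(kept)


def estimated_savings(original: str, compressed: str) -> float:
    """Fraction of words removed, used here as a stand-in for token savings."""
    before = len(original.split())
    after = len(compressed.split())
    return 0.0 if before == 0 else 1 - after / before


original = "I  really just   want a   very short summary."
compressed = compress_light(original)
print(compressed)
print(f"{estimated_savings(original, compressed):.0%} fewer words")
```

The example sentence is deliberately heavy on filler, so its savings are higher than a typical prompt would show.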

Smart Routing

Smart Routing analyzes request complexity and routes simple queries to less expensive models while preserving quality for complex tasks.

Example Rules

Complexity < 30%: GPT-3.5-turbo
Complexity 30-60%: Claude-3-haiku
Complexity > 60%: Requested model

Monitor response quality after enabling routing. Adjust thresholds if needed.
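
The example rules can be expressed as a small lookup, sketched below. The `route` function and its constants are hypothetical, and the complexity score is taken as an input because this page does not describe how it is computed. The sketch assumes a score of exactly 60 falls in the middle tier.

```python
# Thresholds mirror the example rules; adjust them to match your own settings.
SIMPLE_MAX = 30   # scores below this go to the least expensive model
MEDIUM_MAX = 60   # scores up to and including this go to the mid-tier model


def route(complexity: float, requested_model: str) -> str:
    """Pick a model for a request given its complexity score (0-100)."""
    if not 0 <= complexity <= 100:
        raise ValueError("complexity must be between 0 and 100")
    if complexity < SIMPLE_MAX:
        return "GPT-3.5-turbo"
    if complexity <= MEDIUM_MAX:
        return "Claude-3-haiku"
    # Complex requests keep the model the caller asked for.
    return requested_model


for score in (10, 45, 85):
    print(score, route(score, "requested-model"))
```

Keeping the thresholds as named constants makes them easy to adjust if response quality drops after routing is enabled.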

Security Scanning

Security Scanning checks each request for sensitive data before it is sent, to prevent accidental exposure.

Detection Types

  • API keys and tokens
  • Passwords and secrets
  • Credit card numbers
  • Social Security numbers
  • Personal email addresses and phone numbers

Response Options

Warn Only

Log the finding but allow the request

Block Request

Reject requests containing sensitive data

Redact

Replace sensitive data with [REDACTED] and continue
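
The three response options can be sketched with a few regular expressions. The patterns, the `scan` function, and the `SensitiveDataError` class are simplified assumptions for illustration. They cover only some of the detection types listed above and are not the rules iri applies.

```python
import re
from typing import List, Tuple

# Simplified, hypothetical patterns; production detectors are far more thorough.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){12,15}\d\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


class SensitiveDataError(ValueError):
    """Raised when a request is blocked because of sensitive data."""


def scan(text: str, action: str = "warn") -> Tuple[str, List[str]]:
    """Apply one response option and return (text to send, finding names)."""
    if action not in ("warn", "block", "redact"):
        raise ValueError(f"unknown action: {action}")
    findings = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    if not findings or action == "warn":
        # Warn Only: report the findings but leave the request unchanged.
        return text, findings
    if action == "block":
        raise SensitiveDataError(f"request blocked, found: {', '.join(findings)}")
    # Redact: replace every match and let the request continue.
    for pattern in PATTERNS.values():
        text = pattern.sub("[REDACTED]", text)
    return text, findings


print(scan("My SSN is 123-45-6789", action="redact"))
```

Returning the findings alongside the text lets the caller log what was detected regardless of which option is active.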

Configuration

  1. Navigate to your organization
  2. Click the Optimization tab
  3. Configure each feature according to your needs
  4. Click Save Changes

Changes apply immediately to all API calls made through your organization.
