Developers · Public API · 39 endpoints · OpenAI-compatible
Thirty-nine endpoints behind one prefix. Chat, search, evaluate, generate, run governed tools. Same RBAC, audit trail, and guardrails as the UI.
OpenAI-compatible drop-in. SSE streaming. Per-key rate limits. Audit logged.
39 endpoints · Streaming (SSE) · OpenAI-compatible · Per-key rate limits
Most enterprises end up with two AI stacks: one for internal users (chat, agents, RAG) and one for customer-facing apps (developer SDKs, raw model APIs). Different policies. Different audit. Different bills. The customer-facing one usually has the weakest controls because it was built fastest.
Internal agents go through guardrails, audit, eval gates. Customer-facing apps hit a raw provider API. The compliance review only covers one of them.
Internal AI billing rolls up neatly. The customer-facing API bill comes in separately, with different unit economics and different blast radius. Engineering finds out at month-end.
When something goes wrong, you have to correlate logs across two systems. The customer who saw a bad response and the internal user who saw a similar one are in different stores. Investigation costs hours.
Thirty-nine endpoints across nine capability groups: chat, agents, sessions, knowledge, evaluations, prompt optimisation, data flywheel, tools and models, analytics. Each one runs through the same services that power the UI - so policy, permissions, and audit behave identically whether the call came from a browser or a build pipeline.
01 · Agents
Talk to them, list them, manage them, promote them.
02 · Knowledge + LLM
Permission-aware retrieval and multi-model completions.
03 · Evaluation
CI/CD quality gates without logging into a UI.
04 · Documents + Tools
Generative work, delivered as a file.
Every endpoint sits under /public/v1. Versioned. Documented in OpenAPI 3.1. Imports as a typed client into Python, Node, Go, and Java.
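If you want to inspect that surface programmatically, a minimal sketch follows - note the spec URL (/public/v1/openapi.json) is an assumed path for illustration, not one documented above:

import httpx

# Hypothetical spec location; point this at wherever your deployment
# actually serves the OpenAPI 3.1 document
spec = httpx.get(
    "https://api.your-org.katonic.ai/public/v1/openapi.json",
    headers={"Authorization": "Bearer kapi_live_..."},
).json()

print(spec["info"]["title"], "·", len(spec["paths"]), "paths")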
A real chat call to a published agent. Authentication with a key. Permission inheritance from the user_email header. Streaming Server-Sent Events as the agent thinks, retrieves, and responds. Final payload includes citations, retrieval log, and trace ID for the audit row.
1 · REQUEST
$ curl -N https://api.your-org.katonic.ai/public/v1/chat \
  -H "Authorization: Bearer kapi_live_..." \
  -H "X-User-Email: sarah@your-org" \
  -H "Content-Type: application/json" \
  -d '{
        "agent_id": "hr-assistant",
        "messages": [{ "role": "user", "content": "What is the remote work policy?" }],
        "stream": true
      }'
WHAT THIS DOES
· Authenticates with a published API key
· Carries Sarah's identity for permission inheritance
· Streams the response via SSE (stream: true)
2 · STREAMING · SSE
agent.start
trace_id: 7c3f...
tool.call
knowledge_search · 'remote work policy'
tool.result
3 results · permissions filtered for Sarah
guardrails.check
input ✓ · 8/8 rails passed
completion.delta
"Our current remote work policy is"
completion.delta
" documented in HR-2024-08..."▌
completion.delta
" Up to 3 days per week with..."
guardrails.check
output ✓ · 8/8 rails passed
agent.complete
tokens: 247 · cost: $0.0042
3 · FINAL · agent.complete
{ "trace_id": "7c3f9a2b...", "answer": "Our current...", "citations": [ { "doc_id": "HR-2024-08", "section": "2.1", "page": 3 } ], "usage": { "tokens": 247, "cost_usd": 0.0042, "latency_ms": 2247 }, "audit": { "rbac_check": "pass", "guardrails": "8/8 pass", "logged": true }}SAME GUARANTEES AS THE UI
· trace_id links to the audit log row
· citations only from docs Sarah can read
· guardrails fired before output left platform
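Consuming that stream from code is a few lines. A minimal sketch, assuming standard SSE framing (event:/data: line pairs) and that completion.delta events carry a text field - both assumptions, since the exact payload shapes aren't shown above; error handling omitted:

import json
import httpx

url = "https://api.your-org.katonic.ai/public/v1/chat"
headers = {
    "Authorization": "Bearer kapi_live_...",
    "X-User-Email": "sarah@your-org",
}
payload = {
    "agent_id": "hr-assistant",
    "messages": [{"role": "user", "content": "What is the remote work policy?"}],
    "stream": True,
}

# Stream the response and dispatch on the typed SSE event names
with httpx.stream("POST", url, headers=headers, json=payload, timeout=None) as resp:
    event = None
    for line in resp.iter_lines():
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            data = json.loads(line.split(":", 1)[1].strip())
            if event == "completion.delta":
                print(data.get("text", ""), end="", flush=True)  # assumed field name
            elif event == "agent.complete":
                print("\ntrace_id:", data.get("trace_id"))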
Three things every call gets, automatically
Every grounded claim points back to a document, page, and section. No hallucinations to defend in a customer escalation.
Every call gets a trace ID linking the request to the audit log, the policy decisions, the retrieval log, and the cost meter.
SSE by default for chat. Tool calls, retrieval steps, and completion deltas come through as separate events your UI can render progressively.
Existing app on the OpenAI SDK? Point base_url at /public/v1, swap the API key, and the same code now routes through Katonic. Multi-provider routing, BYOK, guardrails, audit, and tier-based model selection come along - no rewrite required.
- BEFORE · DIRECT TO OPENAI
from openai import OpenAI

client = OpenAI(api_key="sk-...")

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
WHAT YOU LOSE
· No multi-provider routing
· No guardrails on input/output
· No audit log · no permission check
· No team budget enforcement
+ AFTER · KATONIC PUBLIC API
from openai import OpenAI

client = OpenAI(
    api_key="kapi_live_...",
    base_url="https://api.your-org.katonic.ai/public/v1",
)

resp = client.chat.completions.create(
    model="balanced",  # tier, not model
    messages=[{"role": "user", "content": "Hello"}],
)
WHAT YOU GET
· Multi-provider routing (140+ providers)
· Guardrails on input + output (8 rails)
· Audit log · trace ID · permission check
· Team budgets · BYOK enforcement
You can pass a tier name like "balanced" or a specific model like "claude-sonnet-4". Tiers route to whichever model is currently the best fit per your config; specific model names give you direct control. Both work.
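Reusing the client from the snippet above, pinning a specific model is just a different model string:

resp = client.chat.completions.create(
    model="claude-sonnet-4",  # explicit model: bypasses tier routing
    messages=[{"role": "user", "content": "Hello"}],
)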
Agent definitions live in your git repo. Pull request changes trigger an eval run. Merge to main promotes to test. Tag a release to promote to prod. The same CI/CD discipline you use for your product, applied to AI - including the eval gates that block bad versions from shipping.
name: Agent CI
on:
  pull_request:
    paths: ['agents/**']
  push:
    branches: [main]
    tags: ['v*']
jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      # Validate the agent config
      - run: katonic agents validate ./agents/hr.yml
      # Run all 7 evaluators against the test suite
      - run: |
          curl -X POST $KAT_URL/public/v1/evaluations/run \
            -H "Authorization: Bearer $KAT_KEY" \
            -d '{"agent": "hr-assistant", "suite": "hr-eval-v3"}'
      # Compare against the last known-good baseline
      - run: katonic eval compare $RUN_ID --baseline=prod
  promote:
    needs: evaluate
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/v')
    steps:
      - run: katonic agents promote hr-assistant --env=prod
        # Eval gate enforced server-side · blocks if scores fail
validate
Agent config valid
evaluations/run
127 cases · 7 evaluators · 98% pass
evaluations/compare
vs prod baseline · +0.04 ↑ improvement
promote
Eval gate ✓ · promoted to prod
Eval gates run on the platform, not in your CI. Even if a developer skips the eval step in their workflow, the promote endpoint still calls the gate. Bad versions cannot reach production by being clever with YAML.
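To see that guarantee from the API side, here is a hedged sketch of hitting the promote step directly. The endpoint path and response handling below are assumptions for illustration; the documented route is the katonic agents promote CLI shown above:

import httpx

# Hypothetical endpoint path, shown only to illustrate the server-side gate
resp = httpx.post(
    "https://api.your-org.katonic.ai/public/v1/agents/hr-assistant/promote",
    headers={"Authorization": "Bearer kapi_live_..."},
    json={"env": "prod"},
)
if resp.status_code == 200:
    print("eval gate passed · promoted")
else:
    # The gate lives server-side, so a failing eval blocks this call too
    print("blocked by eval gate:", resp.status_code)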
Manage keys at /studio/public-api. Browse the 39 endpoints with live curl examples. Copy a working snippet for Python, TypeScript, or curl. The console below is what your developers see on day one.
Developer Hub
39 endpoints · 9 groups · OpenAPI 3.1 · OpenAI-compatible at /public/v1
Active keys
5
1 rotating
Calls today
284,128
across all keys
Errors / 1k
0.4
0.04% error rate
P95 latency
412ms
across all endpoints
| Name | Key prefix | Scopes | Team | Rate (used) | Last used | Status |
|---|---|---|---|---|---|---|
| Customer chat widget · prod | kapi_live_8a3f...e92 | chat · search · completions | Customer Success | 120/min · 84% | 2m ago | ● ACTIVE |
| Mobile app backend | kapi_live_2f1c...c47 | chat · completions · models | Mobile team | 60/min · 23% | 12m ago | ● ACTIVE |
| GitHub Actions · CI | kapi_live_9d4b...b83 | evaluations · agents/* | Engineering | 10/min · 0% | 3h ago | ● ACTIVE |
| Internal Slack bot | kapi_live_6c2a...a18 | chat · search · tools | Engineering | 30/min · 45% | 1m ago | ● ACTIVE |
| Mobile app backend (rotating) | kapi_live_b71d...d05 | chat · completions · models | Mobile team | 60/min · 0% | never | ○ ROTATING |
What every key gives you, by construction
Each key only calls endpoints you grant. The CI key can promote agents but cannot chat. The chat widget can chat but cannot promote.
Calls per minute, per hour, per day. Auto-throttle when exceeded. The chat widget can't accidentally DoS the platform (see the client-side sketch after this list).
Issue a new key, deploy it, revoke the old one. The platform accepts both during the window. Audit logs stay continuous.
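That auto-throttle is worth handling gracefully in clients. A minimal backoff sketch, assuming the platform signals a tripped limit with HTTP 429 and a Retry-After header - conventional behaviour, but not confirmed above:

import time
import httpx

client = httpx.Client(
    base_url="https://api.your-org.katonic.ai/public/v1",
    headers={"Authorization": "Bearer kapi_live_..."},
)

def post_with_backoff(path: str, payload: dict) -> httpx.Response:
    # Retry after the window the server reports; 429 + Retry-After is assumed
    while True:
        resp = client.post(path, json=payload)
        if resp.status_code != 429:
            return resp
        time.sleep(float(resp.headers.get("Retry-After", "1")))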
How fast can I integrate?
Five minutes for a hello world. Bring your existing OpenAI client, point base_url at /public/v1, get back streaming responses with citations and audit trace IDs. No new SDK to learn. Typed clients available for Python, Node, Go, Java if you want them.
Can I render the streaming UX?
Yes. Every chat call streams Server-Sent Events with typed event names: agent.start, tool.call, completion.delta, agent.complete. Render thinking states, tool calls, and citations as they arrive. The reference UI in the platform Workroom uses this exact same stream.
What about rate limits and key management?
Per-key limits at minute, hour, day granularity. Scope each key to specific endpoints. Bind keys to teams for budget attribution. Rotate without downtime - the platform accepts both old and new keys during your migration window.
Same controls as the UI?
Yes. Every endpoint runs through the same RBAC, guardrails, audit, and policy layers as the UI. The only difference is what's calling - browser session or API key. Audit log includes the key ID so you know exactly which integration triggered each event.
Fast to start. Painful to govern.
Apps call OpenAI / Anthropic / Google directly. No platform features. Each integration reinvents permissions, audit, retries, rate limits. Compliance review finds you the moment you scale.
Months of platform engineering.
Build the OpenAI-compatible router. Build BYOK key management. Build streaming. Build audit. Build rate limits. Maintain it forever. Half a year before a single feature ships.
39 endpoints. Same governance. Day one.
Every UI capability has a URL. Drop-in OpenAI compatibility. SSE streaming with typed events. Per-key scopes, rate limits, and team budgets. Audit trace IDs link to the same logs as the UI.
Most enterprises end up with two AI stacks. Internal users get the curated version with audit and guardrails. Customer-facing apps get raw API calls because that ships in two weeks. The compliance review only sees one of them. The customer escalation comes from the other. The platform's job is to make sure that does not happen - one set of services, two surfaces, same controls.
Sandbox access in 24 hours. Comes pre-loaded with a sample agent, an API key with chat scope, and a cURL command in the docs that you paste into your terminal to make your first call.
Then point your existing OpenAI client at /public/v1 and you've migrated.
