Supervised fine-tuning via NVIDIA NeMo and reinforcement learning via OpenPipe ART - in one workspace, on your GPUs.
Both deploy straight to the AI Gateway. Zero data egress.
Fine-tuning sounds simple on paper. In practice it means provisioning GPUs, picking a framework, wiring up a training loop, hosting the artifact, and deploying behind a router that handles tokens and rate limits. Then doing the whole thing again when you want to try a different approach. By the time the first model is live, the team has burned weeks before even evaluating whether the fine-tune actually helps.
GPU procurement is the gate
Get the GPUs first. Then schedule them. Then make sure prod doesn't get starved when training kicks off. Most teams stop right here.
SFT teaches knowledge; RL shapes behaviour. They use different tools, different formats, different deployment paths. Most platforms pick one and tax you for the other.
After the artifact lands, you still need a serving runtime, a routing layer, a billing dimension, and a way for agents to actually call the new model. Months of plumbing.
This is what your team sees at /studio/fine-tuning. Two tabs - Supervised (NeMo) and RL (OpenPipe) - with the same shape. Hyperparameter form on top, job list with live status below. No notebook, no DevOps ticket, no GPU procurement form.
Fine-tuning
Train custom models on your data. Two paths: Supervised (NeMo) for knowledge, RL (OpenPipe) for behaviour.
Dataset Path / URI
Base Model
Learning Rate
Epochs
Batch Size
| Job ID | Model | Status | Created | Actions |
|---|---|---|---|---|
| ft_nemo_a3f8c2 | llama-3.1-8b | ● deployed | Mar 14, 2024 | |
| ft_nemo_b9d4e1 | llama-3.1-8b | ● completed | Mar 18, 2024 | |
| ft_nemo_c2e7a9 | qwen-2.5-7b | ● training 67% | Mar 22, 2024 | |
| ft_nemo_d8f3b6 | llama-3.1-8b | ● queued | Mar 22, 2024 | |
| ft_nemo_e1a5c7 | mistral-7b-v3 | ● failed | Mar 20, 2024 | |
/studio/fine-tuning renders this in your sandbox today, including the two-tab structure, the live job table with status colors, and the deploy-to-Gateway button.

Most teams default to whichever method they read about first. The pick should follow the problem. SFT teaches a model what the answer looks like. RL teaches a model how to behave when there's no single right answer - just better and worse outcomes measured by reward.
SFT - best when you have labeled data.
You have input/output pairs. Customer support ticket → ideal response. Document → summary in your house style. Medical chart → SOAP note format.

RL - best when you have outcomes, not answers.
You have multi-turn agent trajectories with reward signals. Sales conversations that closed vs didn't. Tool-call sequences that resolved vs failed. Long horizons where the right next step depends on what came before.
The fine-tuning loop runs end to end inside the platform. Data preparation, training, deployment, and evaluation all land in the same workspace. No exporting weights to a notebook, no separate hosting layer, no cron job watching for completion.
JSONL for SFT (input/output pairs) or trajectory JSON for RL (turns + reward). Datasets versioned in the platform's data lake. Quality gates check for size, balance, leakage.
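As a concrete sketch, here's roughly what one record of each format could look like. The field names ("input", "output", "turns", "reward") are illustrative assumptions, not the platform's published schema.

```python
import json

# Hypothetical SFT record: one JSONL line per input/output pair.
sft_record = {
    "input": "Customer ticket: My invoice shows a duplicate charge for March.",
    "output": "Thanks for flagging this. I've confirmed the duplicate charge and issued a refund.",
}

# Hypothetical RL trajectory record: a list of turns plus a scalar reward.
rl_record = {
    "turns": [
        {"role": "user", "content": "I need to change my flight."},
        {"role": "assistant", "content": "I can help - what's your booking ID?"},
    ],
    "reward": 1.0,  # e.g. 1.0 if the conversation resolved, 0.0 if it didn't
}

# JSONL is just one JSON object per line.
with open("sft_dataset.jsonl", "w") as f:
    f.write(json.dumps(sft_record) + "\n")
```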
Choose from the registered open-weight models in your AI Gateway. Default hyperparams cover 80% of cases - learning rate, epochs, batch size. Override for advanced tuning.
Job lands on the KAI scheduler with your team's GPU quota. Spot priority for non-critical, guaranteed for production fine-tunes. No external GPU procurement needed.
Live status updates: queued → running → training → completed. Loss curves and validation metrics surface in real time. Failed jobs include log access and root-cause hints.
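For teams that would rather drive this from code than the UI, here is a minimal sketch of submitting and watching a job, assuming a REST API exists. The endpoint paths, payload fields, and response shape are all assumptions for illustration, not a documented API.

```python
import time
import requests

BASE = "https://platform.internal"  # hypothetical in-VPC endpoint

# Submit a supervised job; hyperparameter names mirror the form fields above.
job = requests.post(f"{BASE}/api/fine-tuning/jobs", json={
    "method": "sft",                    # or "rl"
    "base_model": "llama-3.1-8b",
    "dataset": "sft_dataset.jsonl@v3",  # pinned dataset version
    "learning_rate": 2e-5,
    "epochs": 3,
    "batch_size": 8,
}).json()

# Poll the lifecycle: queued -> running -> training -> completed.
while True:
    status = requests.get(f"{BASE}/api/fine-tuning/jobs/{job['id']}").json()
    print(status["status"], status.get("progress"))
    if status["status"] in ("completed", "failed"):
        break
    time.sleep(30)
```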
One click registers the fine-tuned model with the AI Gateway. It becomes selectable from the model dropdown in any agent config. Same auth, same routing, same audit.
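If the Gateway exposes an OpenAI-compatible chat interface (an assumption for illustration, along with the URL and token handling), calling the fine-tune from an agent is a one-line model swap:

```python
import os
import requests

# Hypothetical: the fine-tuned model is addressable by the job ID from the
# table above, behind the same auth and routing as every other model.
resp = requests.post(
    "https://gateway.internal/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['TEAM_TOKEN']}"},
    json={
        "model": "ft_nemo_a3f8c2",  # the deployed fine-tune
        "messages": [{"role": "user", "content": "Summarize ticket #4821."}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```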
Run the model against your eval datasets before promoting. Eval gates can block promotion if scores drop. Rollback re-points the active version - one click.
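The gate logic itself is simple. A sketch of the idea, with hypothetical metric names - the platform's actual gate criteria may differ:

```python
def gate_promotion(candidate: dict, baseline: dict, tolerance: float = 0.0) -> bool:
    """Block promotion if any metric regresses past the tolerance.

    candidate/baseline map metric name -> score, e.g. {"accuracy": 0.91}.
    """
    return all(
        candidate.get(metric, 0.0) >= score - tolerance
        for metric, score in baseline.items()
    )

# Promote only if the fine-tune holds or beats the current model everywhere.
if gate_promotion({"accuracy": 0.91, "format_adherence": 0.97},
                  {"accuracy": 0.89, "format_adherence": 0.98}):
    print("promote")
else:
    print("blocked - the active version stays where it is")
```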
Can I bring my own framework?
Yes. The fine-tuning page exposes NeMo and OpenPipe ART as default paths because most teams want one of those two. If you have a custom training script, you can run it as a job in a Code Builder workspace with your team's GPU quota. The deploy-to-Gateway step works the same way.
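The contract for a custom script is the same one the built-in paths follow: read a versioned dataset, train, write weights to an output directory. A hypothetical skeleton - the env var names and paths are assumptions, not documented conventions:

```python
import json
import os

# Hypothetical conventions: the job runner mounts the pinned dataset and an
# output directory for the artifact.
dataset_path = os.environ.get("DATASET_PATH", "/data/sft_dataset.jsonl")
output_dir = os.environ.get("OUTPUT_DIR", "/output")

with open(dataset_path) as f:
    examples = [json.loads(line) for line in f]

# ... your framework of choice trains here ...
print(f"training on {len(examples)} examples")

# Write final weights/config under output_dir so the deploy-to-Gateway step
# can register the artifact like any NeMo or OpenPipe output.
os.makedirs(output_dir, exist_ok=True)
```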
How do I version datasets?
Datasets are first-class objects with versions. Upload a JSONL or trajectory file, the platform versions it, and every fine-tuning job records the dataset hash it trained on. Reproducible runs by default. The dataset object also tracks who can read/write it via the same RBAC as agents and knowledge.
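The fingerprint a job records is just a content hash of the file. A minimal sketch - sha256 here is an assumption about the hash the platform uses:

```python
import hashlib

def dataset_hash(path: str) -> str:
    """Content hash of a dataset file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

print(dataset_hash("sft_dataset.jsonl"))
```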
What about GPU isolation?
Jobs run under the KAI scheduler with team-level quotas and per-job priority. Production agents have guaranteed GPU; fine-tuning lands on spot priority by default but can be promoted. Failed or cancelled jobs free their GPUs immediately. No leaked tenants.
Where does the data go?
Nowhere. The dataset, the training job, the resulting weights, and the deployment all stay inside your infrastructure. NeMo and OpenPipe run in your VPC. The Gateway routes to local model servers. Zero data egress is architectural, not a setting.
Free-form. Slow. Hard to share.
Spin up a Colab or SageMaker notebook, write the training loop, host it manually, wire up a router. Each fine-tune is a one-off project. Re-running it next quarter means remembering the exact recipe.
Easy to start. Locked in.
Send your training data to the model vendor's hosted fine-tuning API. They train, they host, they price. Your data leaves your infrastructure. The fine-tune is opaque, non-portable, and tied to one model family.
Two paths. One workspace. Your GPUs.
Supervised via NeMo or RL via OpenPipe ART. Both run on your infrastructure. Both deploy to the AI Gateway with one click. Datasets versioned, jobs tracked, weights yours to export.
Most platforms force a choice: ship your data to a vendor and lose control, or build the whole fine-tuning stack yourself and lose six months. We chose a third option: integrate the two best open-source fine-tuning frameworks - NeMo for supervised, OpenPipe ART for RL - and run them on the GPUs you already have. The data never moves. The weights are yours. The deployment is one click. Fine-tuning should be a feature of the platform, not a separate project.
Ready to get started?
Deploy sovereign AI on your infrastructure - in weeks, not months.
Sandbox access in 24 hours. Comes with a sample dataset (1,000 customer support tickets), GPU quota for one fine-tune, and a pre-configured base model. From dataset upload to deployed model in an afternoon.
Then bring your own data and run for real.
