6.4 Fine-Tuning and Model Customization

Key Takeaways

  • Fine-tuning trains an existing model on your custom dataset to improve performance for specific tasks, domains, or output formats.
  • Azure OpenAI supports fine-tuning of GPT-4o, GPT-4o mini, and GPT-3.5 Turbo models with custom training data in JSONL format.
  • Training data format: each line is a JSON object with a messages array containing system, user, and assistant roles.
  • Fine-tuning is recommended when prompt engineering alone is insufficient — for consistent output format, domain-specific knowledge, or reducing prompt length.
  • Fine-tuned models are deployed as new deployments and billed at a higher per-token rate than base models, plus an hourly hosting charge while the deployment exists.
Last updated: March 2026

Fine-Tuning and Model Customization

Quick Answer: Fine-tuning customizes a model on your data for specific tasks. Use JSONL format with system/user/assistant messages. Fine-tune when prompt engineering is insufficient: for consistent output format, domain knowledge, or reducing prompt length. Minimum 10 examples, recommended 50-100+.

When to Fine-Tune vs. When NOT To

Fine-Tune When:

  • You need a consistent output format that prompt engineering can't reliably produce
  • You have domain-specific knowledge that the base model doesn't have
  • You need to reduce prompt length (move few-shot examples into the model)
  • You need to match a specific style or tone consistently
  • Cost optimization: Fine-tuned models can produce good results with shorter prompts

DON'T Fine-Tune When:

  • Prompt engineering can achieve the desired result
  • RAG can provide the needed context
  • You have fewer than 10 quality examples
  • You need the model to learn new facts (use RAG instead)
  • You need real-time knowledge (fine-tuning is a point-in-time snapshot)

Training Data Format (JSONL)

Each line is a complete training example:

{"messages": [{"role": "system", "content": "You are a customer support agent for Contoso. Respond professionally and concisely."}, {"role": "user", "content": "How do I reset my password?"}, {"role": "assistant", "content": "To reset your password: 1) Go to contoso.com/reset 2) Enter your email 3) Click 'Reset Password' 4) Check your email for the reset link. The link expires in 24 hours."}]}
{"messages": [{"role": "system", "content": "You are a customer support agent for Contoso. Respond professionally and concisely."}, {"role": "user", "content": "What is your refund policy?"}, {"role": "assistant", "content": "Our refund policy allows returns within 30 days of purchase for a full refund. Items must be in original condition. To start a return, visit contoso.com/returns or call 1-800-CONTOSO."}]}
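Before uploading, it is worth validating every line of the JSONL file locally, since a single malformed example can fail the job. The sketch below (`validate_line` is an illustrative helper, not part of any SDK) checks that each line parses as JSON, that every message has a known role and string content, and that the example ends with the assistant's target response:

```python
import json

KNOWN_ROLES = {"system", "user", "assistant"}

def validate_line(line: str) -> bool:
    """Return True if a JSONL line is a well-formed training example."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return False
    messages = obj.get("messages")
    if not isinstance(messages, list) or len(messages) < 2:
        return False
    roles = [m.get("role") for m in messages]
    # Every message needs a known role and string content
    if any(r not in KNOWN_ROLES for r in roles):
        return False
    if any(not isinstance(m.get("content"), str) for m in messages):
        return False
    # Each example must end with the assistant's target response
    return roles[-1] == "assistant"
```

Running this over the file before upload catches format problems early, instead of after the job has been queued.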

Training Data Requirements

| Requirement | Minimum | Recommended |
|---|---|---|
| Examples | 10 | 50-100+ |
| Format | JSONL (one JSON object per line) | Consistent format across examples |
| Quality | Correct, representative examples | Gold-standard human-written responses |
| File size | No minimum | Up to 512 MB |
| Validation set | Optional (auto-split if not provided) | 10-20% of training data |
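If you prefer to supply an explicit validation file rather than rely on the automatic split, a shuffled split along these lines works. `split_examples` and `write_jsonl` are illustrative helpers, not SDK functions; the 15% fraction is just one choice inside the recommended 10-20% band:

```python
import json
import random

def split_examples(examples, validation_fraction=0.15, seed=42):
    """Shuffle examples and split into (training, validation) lists."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def write_jsonl(path, examples):
    """Write each example as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")
```

Write the two lists to separate files (e.g. `training_data.jsonl` and `validation_data.jsonl`) and upload both.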

Fine-Tuning Workflow

Step 1: Upload Training Data

# Set up an authenticated client, then upload the training file
from openai import AzureOpenAI

# Endpoint, key, and API version are read from the AZURE_OPENAI_ENDPOINT,
# AZURE_OPENAI_API_KEY, and OPENAI_API_VERSION environment variables
client = AzureOpenAI()

training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune"
)
print(f"File ID: {training_file.id}")

Step 2: Create Fine-Tuning Job

# Start fine-tuning
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": "auto",
        "learning_rate_multiplier": "auto"
    }
)
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")

Step 3: Monitor Training

# Check job status
job = client.fine_tuning.jobs.retrieve(job.id)
print(f"Status: {job.status}")

# List events
events = client.fine_tuning.jobs.list_events(job.id)
for event in events.data:
    print(f"{event.created_at}: {event.message}")
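Rather than re-running the status check by hand, the retrieve call can be wrapped in a small polling loop. This is a generic sketch: `fetch_status` is any zero-argument callable, e.g. `lambda: client.fine_tuning.jobs.retrieve(job.id).status`, and the terminal state names follow the fine-tuning job lifecycle:

```python
import time

TERMINAL_STATES = {"succeeded", "failed", "cancelled"}

def wait_for_job(fetch_status, poll_seconds=30, timeout_seconds=3600):
    """Poll fetch_status() until it returns a terminal state or we time out."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("fine-tuning job did not finish in time")
```

Fine-tuning jobs often take tens of minutes or longer, so a generous poll interval and timeout are appropriate.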

Step 4: Deploy the Fine-Tuned Model

Once training completes, deploy the fine-tuned model:

az cognitiveservices account deployment create \
    --name my-openai-service \
    --resource-group rg-ai-prod \
    --deployment-name my-custom-model \
    --model-name <fine-tuned-model-id> \
    --model-format OpenAI \
    --sku-name Standard \
    --sku-capacity 10
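Once deployed, the fine-tuned model is called like any other chat deployment. The detail worth remembering is that `model` must be the deployment name you chose (`my-custom-model` above), not the fine-tuned model ID. A minimal sketch, with `build_request` as a hypothetical helper that assembles the request arguments:

```python
def build_request(deployment_name, system_prompt, user_message):
    """Build chat-completion kwargs targeting the fine-tuned deployment."""
    return {
        # The deployment name from the az CLI step, not the model ID
        "model": deployment_name,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }

# response = client.chat.completions.create(**build_request(
#     "my-custom-model",
#     "You are a customer support agent for Contoso.",
#     "How do I reset my password?",
# ))
```

Use the same system prompt as in the training data; a fine-tuned model performs best when inference-time messages mirror the format it was trained on.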

Hyperparameters

| Parameter | Description | Default |
|---|---|---|
| n_epochs | Number of times to iterate over the training data | auto (typically 3-4) |
| batch_size | Number of examples per training step | auto |
| learning_rate_multiplier | Scaling factor for the learning rate | auto |

On the Exam: Questions about fine-tuning typically test: (1) when to fine-tune vs. use prompt engineering or RAG, (2) the JSONL training data format, and (3) the fine-tuning workflow (upload → create job → monitor → deploy). Know that fine-tuned models cost more per token than base models.

Evaluation

After fine-tuning, evaluate the model:

  1. Training loss curve: Should decrease and stabilize (not increase)
  2. Validation loss: Should track training loss (large gap = overfitting)
  3. Human evaluation: Test with real-world queries not in the training set
  4. A/B comparison: Compare fine-tuned model vs. base model with prompt engineering
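For the A/B comparison, it helps to score both models on the same held-out queries with an automatic metric before (or alongside) human review. If the goal of fine-tuning was consistent JSON output, a simple compliance rate works as a first pass; `json_compliance_rate` is an illustrative helper, not from any library:

```python
import json

def json_compliance_rate(responses):
    """Fraction of model responses that parse as valid JSON."""
    if not responses:
        return 0.0
    ok = 0
    for text in responses:
        try:
            json.loads(text)
            ok += 1
        except json.JSONDecodeError:
            pass
    return ok / len(responses)
```

Run the same query set through the base model (with your best prompt) and the fine-tuned model, then compare the two rates; the comparison only justifies the fine-tuning cost if the gap is meaningful.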
Test Your Knowledge

What file format is required for Azure OpenAI fine-tuning training data?

Test Your Knowledge

A company wants their AI to always respond in a specific JSON format with domain-specific terminology. Prompt engineering produces inconsistent formats. What should they do?

Test Your Knowledge

When should you use RAG instead of fine-tuning to add domain knowledge?
