6.4 Fine-Tuning and Model Customization
Key Takeaways
- Fine-tuning trains an existing model on your custom dataset to improve performance for specific tasks, domains, or output formats.
- Azure OpenAI supports fine-tuning of GPT-4o, GPT-4o mini, and GPT-3.5 Turbo models with custom training data in JSONL format.
- Training data format: each line is a JSON object with a messages array containing system, user, and assistant roles.
- Fine-tuning is recommended when prompt engineering alone is insufficient — for consistent output format, domain-specific knowledge, or reducing prompt length.
- Fine-tuned models must be deployed as separate, new deployments and are billed at a higher per-token rate than base models.
Fine-Tuning and Model Customization
Quick Answer: Fine-tuning customizes a model on your data for specific tasks. Use JSONL format with system/user/assistant messages. Fine-tune when prompt engineering is insufficient: for consistent output format, domain knowledge, or reducing prompt length. Minimum 10 examples, recommended 50-100+.
When to Fine-Tune vs. When NOT To
Fine-Tune When:
- You need a consistent output format that prompt engineering can't reliably produce
- You have domain-specific knowledge that the base model doesn't have
- You need to reduce prompt length (move few-shot examples into the model)
- You need to match a specific style or tone consistently
- Cost optimization: Fine-tuned models can produce good results with shorter prompts
DON'T Fine-Tune When:
- Prompt engineering can achieve the desired result
- RAG can provide the needed context
- You have fewer than 10 quality examples
- You need the model to learn new facts (use RAG instead)
- You need real-time knowledge (fine-tuning is a point-in-time snapshot)
Training Data Format (JSONL)
Each line is a complete training example:
{"messages": [{"role": "system", "content": "You are a customer support agent for Contoso. Respond professionally and concisely."}, {"role": "user", "content": "How do I reset my password?"}, {"role": "assistant", "content": "To reset your password: 1) Go to contoso.com/reset 2) Enter your email 3) Click 'Reset Password' 4) Check your email for the reset link. The link expires in 24 hours."}]}
{"messages": [{"role": "system", "content": "You are a customer support agent for Contoso. Respond professionally and concisely."}, {"role": "user", "content": "What is your refund policy?"}, {"role": "assistant", "content": "Our refund policy allows returns within 30 days of purchase for a full refund. Items must be in original condition. To start a return, visit contoso.com/returns or call 1-800-CONTOSO."}]}
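Examples like the two above can also be generated programmatically. The sketch below (function name, file name, and the shortened answer text are illustrative) builds examples in the required chat format and writes one JSON object per line:

```python
import json

def make_example(system: str, user: str, assistant: str) -> dict:
    """Build one training example in the chat format shown above."""
    return {"messages": [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
        {"role": "assistant", "content": assistant},
    ]}

# One JSON object per line -- no enclosing array, no blank lines between examples
examples = [
    make_example(
        "You are a customer support agent for Contoso. Respond professionally and concisely.",
        "How do I reset my password?",
        "To reset your password, go to contoso.com/reset and follow the emailed link.",
    ),
]
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```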
Training Data Requirements
| Requirement | Minimum | Recommended |
|---|---|---|
| Examples | 10 | 50-100+ |
| Format | JSONL (one JSON per line) | Consistent format across examples |
| Quality | Correct, representative examples | Gold-standard human-written responses |
| File size | No minimum | Up to 512 MB |
| Validation set | Optional (auto-split if not provided) | 10-20% of training data |
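A quick pre-upload check can catch most format problems before a fine-tuning job fails. The following sketch (a hypothetical helper, not part of any SDK) verifies that each line parses as JSON, that every example carries the three expected roles, and that the minimum example count from the table above is met:

```python
import json

REQUIRED_ROLES = {"system", "user", "assistant"}

def validate_jsonl(path: str, minimum: int = 10) -> list:
    """Return a list of problems found in a fine-tuning JSONL file (empty = OK)."""
    problems = []
    count = 0
    with open(path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines rather than flagging them
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {n}: not valid JSON")
                continue
            roles = [m.get("role") for m in obj.get("messages", [])]
            if not REQUIRED_ROLES.issubset(roles):
                problems.append(f"line {n}: missing one of {sorted(REQUIRED_ROLES)}")
            count += 1
    if count < minimum:
        problems.append(f"only {count} examples; minimum is {minimum}, 50-100+ recommended")
    return problems
```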
Fine-Tuning Workflow
Step 1: Upload Training Data
# Create an Azure OpenAI client (endpoint and key are placeholders)
from openai import AzureOpenAI
client = AzureOpenAI(
    azure_endpoint="https://my-openai-service.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-08-01-preview"
)

# Upload the training file
with open("training_data.jsonl", "rb") as f:
    training_file = client.files.create(file=f, purpose="fine-tune")
print(f"File ID: {training_file.id}")
Step 2: Create Fine-Tuning Job
# Start fine-tuning
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": "auto",
        "learning_rate_multiplier": "auto"
    }
)
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")
Step 3: Monitor Training
# Check job status
job = client.fine_tuning.jobs.retrieve(job.id)
print(f"Status: {job.status}")

# List events
events = client.fine_tuning.jobs.list_events(job.id)
for event in events.data:
    print(f"{event.created_at}: {event.message}")
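Rather than re-running the status check by hand, a simple polling loop can wait for the job to finish. This sketch assumes the documented terminal job statuses (succeeded, failed, cancelled) and takes the already-configured client as a parameter; the 60-second interval is an arbitrary choice:

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def is_terminal(status: str) -> bool:
    """True once a fine-tuning job can no longer change state."""
    return status in TERMINAL_STATUSES

def wait_for_job(client, job_id: str, poll_seconds: int = 60) -> str:
    """Poll until the job reaches a terminal status; returns the final status."""
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        print(f"Status: {job.status}")
        if is_terminal(job.status):
            return job.status
        time.sleep(poll_seconds)
```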
Step 4: Deploy the Fine-Tuned Model
Once training completes, deploy the fine-tuned model:
az cognitiveservices account deployment create \
--name my-openai-service \
--resource-group rg-ai-prod \
--deployment-name my-custom-model \
--model-name <fine-tuned-model-id> \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 10
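Once deployed, the fine-tuned model is called like any other chat deployment, passing the deployment name as `model`. A minimal sketch, assuming the `my-custom-model` deployment name from the CLI example and a `client` already configured for your resource; keeping the system prompt identical to the one used in training generally gives the best results:

```python
def build_messages(system_prompt: str, user_text: str) -> list:
    """Assemble the request messages; keep the system prompt identical to training."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

def ask_custom_model(client, user_text: str) -> str:
    """Call the fine-tuned deployment; `client` is a configured AzureOpenAI instance."""
    response = client.chat.completions.create(
        model="my-custom-model",  # the deployment name, not the base model name
        messages=build_messages(
            "You are a customer support agent for Contoso. Respond professionally and concisely.",
            user_text,
        ),
    )
    return response.choices[0].message.content
```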
Hyperparameters
| Parameter | Description | Default |
|---|---|---|
| n_epochs | Number of times to iterate over training data | auto (typically 3-4) |
| batch_size | Number of examples per training step | auto |
| learning_rate_multiplier | Scaling factor for the learning rate | auto |
On the Exam: Questions about fine-tuning typically test: (1) when to fine-tune vs. use prompt engineering or RAG, (2) the JSONL training data format, and (3) the fine-tuning workflow (upload → create job → monitor → deploy). Know that fine-tuned models cost more per token than base models.
Evaluation
After fine-tuning, evaluate the model:
- Training loss curve: Should decrease and stabilize (not increase)
- Validation loss: Should track training loss (large gap = overfitting)
- Human evaluation: Test with real-world queries not in the training set
- A/B comparison: Compare fine-tuned model vs. base model with prompt engineering
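The loss-curve checks in the first two bullets can be automated once you export the per-epoch losses. A minimal sketch, where the 0.3 gap tolerance is an illustrative threshold you would tune for your own data, not an official guideline:

```python
def loss_decreased(losses) -> bool:
    """Sanity check: the training loss curve should end lower than it started."""
    return losses[-1] < losses[0]

def likely_overfitting(train_losses, val_losses, tolerance: float = 0.3) -> bool:
    """Flag overfitting when the final validation loss exceeds the final
    training loss by more than `tolerance` (an assumed, tunable threshold)."""
    return (val_losses[-1] - train_losses[-1]) > tolerance
```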
Review Questions
- What file format is required for Azure OpenAI fine-tuning training data?
- A company wants their AI to always respond in a specific JSON format with domain-specific terminology. Prompt engineering produces inconsistent formats. What should they do?
- When should you use RAG instead of fine-tuning to add domain knowledge?