5.3 Prompt Engineering
Key Takeaways
- Prompt engineering is the practice of designing effective prompts (instructions) to get the best results from generative AI models.
- A well-structured prompt includes a system message (role/behavior), user message (the actual request), and optionally few-shot examples.
- Key techniques include zero-shot (no examples), few-shot (with examples), chain-of-thought (step-by-step reasoning), and providing context/constraints.
- The system message sets the AI's persona, tone, format, and boundaries — it is the most powerful tool for controlling model behavior.
- Grounding provides the model with specific factual data to base its response on, reducing hallucinations and increasing accuracy.
Prompt Engineering
Quick Answer: Prompt engineering is designing effective instructions for generative AI models. Key techniques include system messages (set behavior), few-shot prompting (provide examples), chain-of-thought (step-by-step reasoning), and grounding (provide factual data). A well-crafted prompt can dramatically improve the quality, accuracy, and relevance of AI responses.
What Is Prompt Engineering?
Prompt engineering is the art and science of crafting effective prompts (instructions) to get desired outputs from generative AI models. Since you are not modifying the model itself, the prompt is your primary tool for controlling behavior.
Think of it this way: the generative AI model is a highly capable assistant, and your prompt is the instructions you give that assistant. Better instructions lead to better results.
Anatomy of a Prompt
Chat Completion Message Structure
Azure OpenAI uses a chat completion API with three message roles:
| Role | Purpose | Example |
|---|---|---|
| System | Set the AI's persona, behavior, constraints, and format | "You are a helpful customer service agent for a bank. Always be polite and never provide financial advice." |
| User | The actual question or request from the end user | "What are your hours of operation?" |
| Assistant | Previous AI responses (for conversation history) | "Our branches are open Monday to Friday, 9 AM to 5 PM." |
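The three roles above map directly onto the list of role-tagged messages that a chat completions API accepts. A minimal sketch in Python (the follow-up question is an illustrative addition; an actual call would pass this list to a client, which is omitted here):

```python
# Each dict mirrors one row of the table: system sets behavior, user asks,
# assistant entries carry prior turns as conversation history.
messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful customer service agent for a bank. "
            "Always be polite and never provide financial advice."
        ),
    },
    # Prior turns supply conversation history.
    {"role": "user", "content": "What are your hours of operation?"},
    {
        "role": "assistant",
        "content": "Our branches are open Monday to Friday, 9 AM to 5 PM.",
    },
    # The newest user request always goes last.
    {"role": "user", "content": "Are you open on Saturdays?"},
]

print([m["role"] for m in messages])
```

In a real application only the newest user message changes between turns; the system message stays fixed so the persona and constraints persist across the whole conversation.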
The System Message
The system message is the most powerful tool for controlling AI behavior. It sets:
- Persona: Who the AI should act as
- Tone: How the AI should communicate
- Format: How responses should be structured
- Constraints: What the AI should NOT do
- Knowledge: What context the AI should use
Example system message: "You are an expert Azure certification tutor. Explain concepts clearly using simple language. Always provide examples. Format answers with bullet points when listing items. If you do not know the answer, say so honestly instead of guessing. Never provide actual exam questions."
Key Prompt Engineering Techniques
1. Zero-Shot Prompting (No Examples)
Give the model a task with NO examples — rely on its pre-trained knowledge.
Prompt: "Classify the following text as positive, negative, or neutral: 'The movie was absolutely fantastic!'"
Response: "Positive"
When to use: Simple tasks where the model understands the task from the instruction alone.
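As a sketch, a zero-shot prompt is just an instruction plus the input, with no examples in between (the helper name is illustrative, not an SDK function):

```python
def zero_shot_prompt(task: str, text: str) -> str:
    """Build a zero-shot prompt: a single instruction plus the input, no examples."""
    return f"{task}: '{text}'"

prompt = zero_shot_prompt(
    "Classify the following text as positive, negative, or neutral",
    "The movie was absolutely fantastic!",
)
print(prompt)
```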
2. Few-Shot Prompting (With Examples)
Provide examples of the desired input-output pattern before the actual task.
Prompt: "Classify the sentiment of each review:
Review: 'Great product, fast shipping!' → Positive
Review: 'Terrible quality, waste of money' → Negative
Review: 'It works fine, nothing special' → Neutral
Review: 'The customer service team was incredibly helpful and resolved my issue within minutes' →"
Response: "Positive"
When to use: When you want the model to follow a specific pattern or format, or when zero-shot results are inconsistent.
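The pattern above can be sketched as a small prompt builder that lays out labeled examples before the unlabeled case (the helper name and signature are illustrative assumptions):

```python
def few_shot_prompt(
    instruction: str,
    examples: list[tuple[str, str]],
    new_input: str,
) -> str:
    """Build a few-shot prompt: instruction, labeled examples, then the new case."""
    lines = [instruction]
    for text, label in examples:
        lines.append(f"Review: '{text}' → {label}")
    # The final line is left unlabeled so the model completes the pattern.
    lines.append(f"Review: '{new_input}' →")
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment of each review:",
    [
        ("Great product, fast shipping!", "Positive"),
        ("Terrible quality, waste of money", "Negative"),
        ("It works fine, nothing special", "Neutral"),
    ],
    "The customer service team was incredibly helpful",
)
print(prompt)
```

Keeping the examples in a consistent `input → label` format matters: the model imitates whatever pattern the examples establish, including their punctuation and labels.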
3. Chain-of-Thought Prompting
Ask the model to think step by step, showing its reasoning process.
Prompt: "A store has 47 apples. They sell 23 and receive a new shipment of 35. How many apples do they have now? Think through this step by step."
Response:
"Step 1: Start with 47 apples.
Step 2: Subtract the 23 sold: 47 - 23 = 24 apples.
Step 3: Add the 35 from the new shipment: 24 + 35 = 59 apples.
Answer: The store has 59 apples."
When to use: Math problems, multi-step reasoning, complex analysis. Adding "think step by step" or "explain your reasoning" significantly improves accuracy on reasoning tasks.
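In code, chain-of-thought is often nothing more than appending the reasoning cue to an otherwise ordinary question, as in this sketch (helper name illustrative):

```python
def chain_of_thought(question: str) -> str:
    """Append a step-by-step cue to elicit visible reasoning from the model."""
    return f"{question} Think through this step by step."

prompt = chain_of_thought(
    "A store has 47 apples. They sell 23 and receive a new shipment of 35. "
    "How many apples do they have now?"
)
print(prompt)

# Sanity check of the arithmetic the model is expected to walk through:
# 47 - 23 + 35 = 59.
assert 47 - 23 + 35 == 59
```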
4. Grounding with Context
Provide the model with specific data or documents to base its response on, rather than relying on its pre-trained knowledge.
Prompt: "Based on the following document, answer the question.
DOCUMENT: 'Azure AI-900 exam has 40-60 questions, costs $165, and requires a score of 700/1000 to pass. The exam duration is 45 minutes.'
QUESTION: How much does the AI-900 exam cost?"
Response: "The AI-900 exam costs $165."
When to use: When accuracy is critical and you want to reduce hallucinations. The model answers from the provided data, not from its potentially outdated or incorrect pre-trained knowledge.
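The grounding prompt above follows a simple template (document first, question second) that can be sketched as a helper (the function name and DOCUMENT/QUESTION labels follow the example in the text; they are a convention, not a required API format):

```python
def grounded_prompt(document: str, question: str) -> str:
    """Build a grounded prompt: the answer should come from the supplied document."""
    return (
        "Based on the following document, answer the question.\n"
        f"DOCUMENT: '{document}'\n"
        f"QUESTION: {question}"
    )

prompt = grounded_prompt(
    "Azure AI-900 exam has 40-60 questions, costs $165, and requires a "
    "score of 700/1000 to pass. The exam duration is 45 minutes.",
    "How much does the AI-900 exam cost?",
)
print(prompt)
```

In production systems this template is typically filled by a retrieval step (as in retrieval-augmented generation), so the document portion always contains current, relevant data rather than hand-pasted text.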
Prompt Engineering Best Practices
| Best Practice | Description | Example |
|---|---|---|
| Be specific | Clearly state what you want | "Summarize in 3 bullet points" vs. "Summarize" |
| Set the format | Specify the desired output format | "Respond in JSON format" or "Use a markdown table" |
| Provide context | Give relevant background information | Include the document, data, or situation |
| Use constraints | Tell the model what NOT to do | "Do not include personal opinions" |
| Iterate | Refine your prompt based on results | Try different wordings if results are not ideal |
| Break down complex tasks | Split into smaller, sequential prompts | Step 1: summarize. Step 2: extract key dates. Step 3: format as timeline. |
Temperature and Other Parameters
| Parameter | Range | Effect | Use Case |
|---|---|---|---|
| Temperature | 0 to 2 | Controls randomness (0 = focused, 1+ = creative) | 0 for factual; 0.7 for creative |
| Top P | 0 to 1 | Controls diversity of token selection | Alternative to temperature |
| Max tokens | 1 to model limit | Maximum response length | Control response size |
| Frequency penalty | -2 to 2 | Reduces repetition of tokens | Prevent repetitive responses |
| Presence penalty | -2 to 2 | Encourages new topics | More diverse content |
On the Exam: Temperature is the most commonly tested parameter. Low temperature (0-0.3) = deterministic, factual, consistent responses. High temperature (0.7-1.0) = creative, varied, unpredictable responses. Know when to use each setting.
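The ranges in the parameter table can be checked before a request is sent; a minimal sketch (the validator and its name are illustrative, not part of any SDK):

```python
# Allowed ranges taken from the parameter table above.
PARAM_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}

def validate_params(params: dict) -> list[str]:
    """Return the names of out-of-range parameters (empty list if all valid)."""
    errors = []
    for name, value in params.items():
        if name in PARAM_RANGES:
            low, high = PARAM_RANGES[name]
            if not (low <= value <= high):
                errors.append(name)
    return errors

# Low temperature for factual, deterministic answers;
# higher temperature for creative, varied ones.
factual = {"temperature": 0.0, "top_p": 1.0}
creative = {"temperature": 0.9, "presence_penalty": 0.5}
print(validate_params(factual), validate_params(creative))
```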
Review Questions
1. What is the purpose of a system message in Azure OpenAI?
2. A developer provides three example input-output pairs before asking the model to classify a new text. What prompt engineering technique is this?
3. Which temperature setting would produce the MOST creative and varied responses?
4. What is the purpose of "grounding" in prompt engineering?
5. Adding the phrase "Think step by step" to a prompt is an example of which technique?