ISTQB CT-GenAI Cheat Sheet

GenAI Foundations

12% of exam

AI spectrum | LLM building blocks | LLM test capabilities | Tokenization + embeddings | Foundation vs reasoning | Multimodal models

Prompt Engineering

45% of exam

Prompt components | Prompting techniques | Prompting for test tasks | Zero/one/few-shot | System vs user prompt | Meta prompting

Managing GenAI Risks

20% of exam

LLM Test Infrastructure

13% of exam

LLM test architecture | RAG + agents | Fine-tuning + LLMOps | Vector database | Relational vs vector | Orchestration

Organizational Adoption

10% of exam

Shadow AI | Adoption phases | GenAI strategy | Selecting models | Prompt patterns | Evolving test roles

Quick Facts

Exam
CT-GenAI Specialist
Credential
Testing with Generative AI
Questions
40 multiple choice
Time
60 min (75 non-native)
Pass
65% (26/40)
Level
Specialist
Prerequisite
ISTQB Foundation (CTFL)
K-levels
K1, K2, K3
Weights
By syllabus teaching time
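
The pass mark and pacing follow directly from the figures above. A quick arithmetic sketch (the variable names are illustrative and not taken from the syllabus):

```python
import math

QUESTIONS = 40
PASS_PERCENT = 65
STANDARD_MINUTES = 60
NON_NATIVE_MINUTES = 75

# Minimum correct answers: 65% of 40, rounded up to a whole question.
pass_mark = math.ceil(QUESTIONS * PASS_PERCENT / 100)

# Average time budget per question, in seconds.
seconds_standard = STANDARD_MINUTES * 60 / QUESTIONS
seconds_non_native = NON_NATIVE_MINUTES * 60 / QUESTIONS

print(pass_mark)           # 26
print(seconds_standard)    # 90.0
print(seconds_non_native)  # 112.5
```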

Five Syllabus Chapters

Foundations | Prompting | Risks | Infrastructure | Adoption

Ch1: foundations | Ch2: prompting | Ch3: risks | Ch4: infrastructure | Ch5: adoption

Foundation vs Reasoning LLM

Foundation LLM

  • General-purpose base
  • Broad pretraining
  • Needs task adaptation

Reasoning LLM

  • Multi-step logic
  • Chain-of-thought
  • High-cognitive tasks

Base vs deep reasoning

AI Spectrum

Symbolic AI
Rule-based logic systems
Machine learning
Data-driven pattern models
Deep learning
Neural network layers
Generative AI
Creates new content
Foundation LLM
General-purpose pretrained base
Instruction-tuned LLM
Follows user instructions
Reasoning LLM
Multi-step chain-of-thought

AI Spectrum

Symbolic | ML | Deep | Generative

Symbolic: rules | ML: data-driven | Deep: neural nets | Generative: creates content

Chatbot vs LLM Application

AI chatbot

  • Conversational UI
  • Direct prompting
  • Ad-hoc tasks

LLM application

  • API-integrated
  • Automated tasks
  • Scalable, embedded

Chat vs integrated

LLM Building Blocks

Tokenization
Text split into tokens
Embedding
Token as numeric vector
Transformer
Attention-based architecture
Context window
Max tokens per call
Temperature
Controls output randomness
Non-determinism
Same input, varying output
SLM
Small language model
Multimodal model
Text, image, audio
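
To make the first few building blocks concrete, here is a toy sketch of tokenization, embedding, and a context-window check. It assumes a whitespace tokenizer and a made-up embedding formula; real LLMs use subword tokenizers and learned embeddings, so every function name and number below is hypothetical.

```python
def tokenize(text):
    """Toy tokenizer: lowercase and split on whitespace.
    Real LLM tokenizers use subword units, so token counts will differ."""
    return text.lower().split()


def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


def toy_embedding(token_id, dim=4):
    """Hypothetical stand-in for a learned embedding: a fixed-length numeric vector."""
    return [((token_id + 1) * (i + 1)) % 7 / 7 for i in range(dim)]


def fits_context_window(tokens, context_window):
    """True if the prompt stays within the maximum tokens allowed per call."""
    return len(tokens) <= context_window


tokens = tokenize("Verify the login page rejects the wrong password")
vocab = build_vocab(tokens)
vectors = [toy_embedding(vocab[t]) for t in tokens]

print(len(tokens), len(vocab))         # 8 tokens, 7 distinct
print(fits_context_window(tokens, 8))  # True
```

The repeated word "the" maps to a single vocabulary id, which is why the vocabulary is smaller than the token count.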

LLM Test Capabilities

Requirements analysis
Spot gaps, ambiguities
Test case creation
Generate cases, objectives
Test oracle
Generate expected results
Test data
Datasets, boundary values
Automation support
Generate, improve scripts
Result analysis
Summarize, classify anomalies
Testware creation
Plans, reports, docs

Six Prompt Components

Role | Context | Instruction | Input | Constraints | Output

Role: persona | Context: background | Instruction: task | Input: data | Constraints: limits | Output: format

System vs User Prompt

System prompt

  • Developer-defined
  • Sets rules
  • Stays constant

User prompt

  • Per-interaction input
  • Changes each turn
  • Visible request

Rules vs request

Prompting Technique Picker

  1. No examples needed: Zero-shot (Simple task)
  2. Show one example: One-shot
  3. Show several examples: Few-shot (Consistent format)
  4. Complex multi-step task: Prompt chaining (Verify each step)
  5. Improve the prompt: Meta prompting
  6. Set persistent rules: System prompt
  7. Assign a persona: Role component

Six Prompt Components

Role
Persona for model
Context
Background test information
Instruction
The task directive
Input data
Stories, code, examples
Constraints
Restrictions and rules
Output format
Expected response structure
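
One way to picture the six components is as labeled sections of a single prompt string. The sketch below is only an illustration: the `build_prompt` helper, the section labels, and the sample texts are assumptions, not a format prescribed by the syllabus.

```python
def build_prompt(role, context, instruction, input_data, constraints, output_format):
    """Assemble the six prompt components into one labeled prompt string."""
    sections = [
        ("Role", role),
        ("Context", context),
        ("Instruction", instruction),
        ("Input data", input_data),
        ("Constraints", constraints),
        ("Output format", output_format),
    ]
    return "\n\n".join(f"{label}:\n{text}" for label, text in sections)


prompt = build_prompt(
    role="You are an experienced test analyst.",
    context="We are testing the login feature of a web shop.",
    instruction="Derive test cases from the user story.",
    input_data="As a customer, I want to log in with email and password.",
    constraints="At most five test cases; no assumptions beyond the story.",
    output_format="A numbered list with steps and expected results.",
)
print(prompt)
```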

Three Core Techniques

Prompt chaining | Few-shot | Meta prompting

Chaining: step-by-step | Few-shot: examples | Meta: refine prompts

Zero-shot vs Few-shot

Zero-shot

  • No examples
  • Relies on pretraining
  • Simple tasks

Few-shot

  • Several examples
  • In-context guidance
  • Consistent format

None vs examples
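
The difference can be sketched as a single prompt builder that takes zero, one, or several worked examples. The helper name `build_shot_prompt` and the `Input:`/`Output:` layout are assumptions made for this illustration.

```python
def build_shot_prompt(instruction, examples, new_input):
    """Build a prompt; an empty examples list gives zero-shot, several give few-shot."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines += [f"Input: {example_input}", f"Output: {example_output}", ""]
    lines += [f"Input: {new_input}", "Output:"]
    return "\n".join(lines)


instruction = "Classify the defect severity as LOW, MEDIUM, or HIGH."
examples = [
    ("Typo in footer text", "LOW"),
    ("Checkout crashes on payment", "HIGH"),
]

zero_shot = build_shot_prompt(instruction, [], "Search returns stale results")
few_shot = build_shot_prompt(instruction, examples, "Search returns stale results")
print(few_shot)
```

The few-shot version shows the model the expected answer format before the real input, which is why it tends to give more consistently formatted output.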

Test Task Picker

  1. Find test-basis gaps: Test analysis
  2. Create test cases: Test design
  3. Need expected result: Test oracle
  4. Generate test inputs: Test data generation
  5. Build test scripts: Automation support
  6. Summarize test results: Result analysis

Prompting Techniques

Zero-shot
No examples given
One-shot
Single example given
Few-shot
Several examples given
Prompt chaining
Break into verified steps
Meta prompting
LLM refines its prompts
System prompt
Developer-set behavior rules
User prompt
Per-interaction input
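
Prompt chaining ("break into verified steps") can be sketched as a loop that feeds each verified output into the next prompt. The model call here is a hypothetical stub (`fake_llm`) returning canned text, and `run_chain` is an illustrative helper, not a library API.

```python
def run_chain(steps, call_llm, initial_input):
    """Run prompts in sequence, verifying each output before it feeds the next step."""
    current = initial_input
    history = []
    for template, verify in steps:
        prompt = template.format(previous=current)
        output = call_llm(prompt)
        if not verify(output):
            raise ValueError(f"Step output failed verification: {output!r}")
        history.append(output)
        current = output
    return history


def fake_llm(prompt):
    """Hypothetical stand-in for a real model call; returns canned text."""
    if prompt.startswith("List test conditions"):
        return "1. valid login\n2. invalid password"
    return "Given a registered user\nWhen they log in\nThen the dashboard is shown"


steps = [
    ("List test conditions for: {previous}", lambda out: out.strip() != ""),
    ("Write a Gherkin scenario for:\n{previous}", lambda out: "Given" in out and "Then" in out),
]
history = run_chain(steps, fake_llm, "User story: log in with email and password")
print(history[-1])
```

In practice the verification step is often a human review or an automated check; the lambdas above are deliberately simple placeholders.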

Prompting for Test Tasks

Test analysis
Conditions, coverage, defects
Test design
Cases from user stories
Regression testing
Keyword-driven scripts
Test monitoring
Metrics from test data
Boundary value
Suggested test technique
Gherkin style
Given-When-Then conditions
Prioritization
Rank by risk

NIST AI RMF

Govern | Map | Measure | Manage

Govern: culture | Map: context | Measure: analyze | Manage: respond

Hallucination vs Reasoning Error

Hallucination

  • Fabricated content
  • Unsupported facts
  • Invented criteria

Reasoning error

  • Faulty logic
  • Misread structure
  • Wrong conclusion

False fact vs bad logic

Risk Mitigation Picker

  1. Output looks wrong: Cross-verification (Check sources)
  2. Outputs keep varying: Lower temperature (Set random seed)
  3. Complex reasoning fails: Prompt chaining
  4. Handling sensitive data: Anonymize input (Data minimization)
  5. Untrusted input data: Secure environment
  6. Need AI governance: NIST AI RMF
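
As a small illustration of "anonymize input", the sketch below masks email addresses before text is sent to a model. The regular expression is a simplified assumption that catches common address shapes only; real anonymization needs to cover many more data types (names, IDs, phone numbers).

```python
import re

# Simplified email pattern; an assumption for illustration, not a full validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize(text, placeholder="<EMAIL>"):
    """Replace every email-like substring with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


raw = "Defect reported by jane.doe@example.com on the login page"
print(anonymize(raw))  # Defect reported by <EMAIL> on the login page
```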

GenAI Defects

Hallucination
Confident but wrong output
Reasoning error
Faulty logic, inference
Bias
Skewed training-data output
Cross-verification
Check known sources
Consistency check
Outputs must agree
Output testing
Run generated testware
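
A consistency check ("outputs must agree") can be approximated by running the same prompt several times and measuring how often the answers match. The sketch below works on already-collected outputs; the `consistency` helper and the sample runs are hypothetical.

```python
from collections import Counter


def consistency(outputs):
    """Return the most common answer and the share of runs that agree with it."""
    if not outputs:
        raise ValueError("need at least one output")
    normalized = [o.strip().lower() for o in outputs]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)


runs = ["PASS", "pass ", "FAIL", "Pass"]
answer, agreement = consistency(runs)
print(answer, agreement)  # pass 0.75
```

A low agreement share is a signal to review the output manually or tighten the prompt, not proof that the majority answer is correct.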

Non-Determinism Control

Temperature
Lower for consistency
Random seed
Reproducible sampling
Complete context
Reduces hallucination risk
Divide prompts
Chain into steps
Compare models
Cross-check multiple LLMs
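
The effect of temperature and a random seed can be shown with a toy sampler. This is a sketch of the general idea (temperature-scaled softmax plus seeded sampling), not the internals of any particular model or API; the tokens and logits are made up, and hosted models may not guarantee full reproducibility even with a seed.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tokens(tokens, logits, temperature, seed, n=5):
    """Sample n tokens; the same seed reproduces the same sequence."""
    rng = random.Random(seed)
    weights = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=weights, k=n)


tokens = ["pass", "fail", "blocked"]
logits = [2.0, 1.0, 0.5]

cool = softmax_with_temperature(logits, 0.2)  # strongly favors the top token
warm = softmax_with_temperature(logits, 2.0)  # spreads probability more evenly
print(round(cool[0], 3), round(warm[0], 3))
```

With the lower temperature almost all probability mass sits on the top-scoring token, so repeated runs vary less.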

Attack Vectors

Data exfiltration
Extract training data
Request manipulation
Disrupt model output
Data poisoning
Corrupt training data
Malicious code
Hidden backdoors, calls
Data exposure
Leak sensitive information
GDPR
EU data protection

AI Standards + Regulations

ISO/IEC 42001
AI management system
ISO/IEC 23053
AI systems framework
EU AI Act
Risk-tier regulation
NIST AI RMF
Govern, map, measure, manage
CO2 emissions
GenAI environmental impact

RAG vs Fine-Tuning

RAG

  • Adds retrieved context
  • Model unchanged
  • Update the index

Fine-tuning

  • Retrains weights
  • Teaches style
  • Costlier to update

Retrieve vs retrain

Architecture Picker

  1. Need fresh enterprise data: RAG (Vector database)
  2. Teach domain style: Fine-tuning (Retrain weights)
  3. Multi-step automation: LLM agent (Tool use)
  4. Structured test data: Relational database
  5. Semantic retrieval: Vector database
  6. Run in production: LLMOps

LLM Test Architecture

Front-end
Tester query interface
Back-end
Auth, retrieval, prompts
Integrated LLM
API or in-house
Relational database
Structured test data
Vector database
Semantic embedding retrieval
Post-processing
Refine raw output

Autonomous vs Semi-Autonomous

Autonomous agent

  • Minimal oversight
  • Self-directed
  • Higher risk

Semi-autonomous agent

  • Periodic human check
  • Guards critical tasks
  • Lower risk

Independent vs supervised

RAG + Agents

RAG
Retrieval plus generation
Chunking
256-512 token splits
Grounded response
Rooted in retrieval
LLM agent
Uses tools to act
Autonomous agent
Minimal human oversight
Orchestration
Multi-agent collaboration
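
The RAG flow (chunk, retrieve, ground the prompt) can be sketched end to end with toy parts. Assumptions: whitespace tokens, a chunk size of 5 instead of the 256-512 tokens listed above, and bag-of-words cosine similarity standing in for embedding search in a vector database. All helper names are hypothetical.

```python
import math
from collections import Counter


def chunk_tokens(tokens, chunk_size):
    """Split a token list into consecutive fixed-size chunks."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks):
    """Return the chunk most similar to the query."""
    query_vec = Counter(query.lower().split())
    return max(chunks, key=lambda c: cosine_similarity(query_vec, Counter(c)))


def grounded_prompt(query, chunk):
    """Build a prompt that roots the answer in the retrieved text."""
    return "Answer using only this context:\n" + " ".join(chunk) + "\n\nQuestion: " + query


document = "password must have twelve characters session expires after thirty idle minutes"
chunks = chunk_tokens(document.split(), chunk_size=5)
query = "how long until the session expires"
best = retrieve(query, chunks)
print(grounded_prompt(query, best))
```

The model itself is unchanged here; only the context placed in the prompt changes, which is the key contrast with fine-tuning.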

Fine-Tuning + LLMOps

Fine-tuning
Retrain on domain data
Overfitting
Too specialized, brittle
Opacity
Hard to explain
LLMOps
Operate LLMs in production
LLM-as-a-Service
Hosted vendor model
In-house model
Self-hosted control

Three Adoption Phases

Discovery | Initiation | Utilization

Discovery: awareness | Initiation: use cases | Utilization: full integration

Adoption + Governance

Shadow AI
Unapproved tool use
IP dispute
Unclear licensing risk
GenAI strategy
Objectives, models, compliance
Prompt pattern
Reusable prompt template
Quality gate
Review generated testware
Transparency
Disclose GenAI use

Selecting Models

Model performance
Benchmark for tasks
Fine-tuning potential
Domain adaptability
Recurring cost
Licensing plus operations
Community support
Docs and troubleshooting

Common Traps

Hallucination vs reasoning error

Hallucination = false fact | Reasoning error = bad logic

System vs user prompt

System = fixed rules | User = each request

RAG vs fine-tuning

RAG = add context | Fine-tune = retrain weights

Zero vs few-shot

Zero-shot = no examples | Few-shot = several examples

Temperature effect

Low = consistent output | High = varied output

Fluent vs correct

Fluent = reads well | Correct = actually true

Relational vs vector DB

Relational = structured data | Vector = semantic retrieval

Chatbot vs application

Chatbot = conversational | Application = API-integrated

Last Minute

  1. 40 questions, 60 minutes
  2. 75 minutes if non-native
  3. Pass 65%, 26 of 40
  4. K1, K2, K3 only
  5. CTFL Foundation required first
  6. Five chapters; prompting is largest
  7. Plausible output can be wrong
  8. Hallucination = false fact
  9. Lower temperature for consistency
  10. RAG adds context, no retrain
  11. Fine-tuning retrains model weights
  12. Know NIST AI RMF functions
  13. Anonymize sensitive data always
  14. Lifetime certificate, no renewal