GenAI Foundations
12%of exam
Prompt Engineering
45%of exam
Managing GenAI Risks
20%of exam
LLM Test Infrastructure
13%of exam
Organizational Adoption
10%of exam
Quick Facts
- Exam
- CT-GenAI Specialist
- Credential
- Testing with Generative AI
- Questions
- 40 multiple choice
- Time
- 60 min (75 non-native)
- Pass
- 65% (26/40)
- Level
- Specialist
- Prerequisite
- ISTQB Foundation (CTFL)
- K-levels
- K1, K2, K3
- Weights
- By syllabus teaching time
Five Syllabus Chapters
Foundations | Prompting | Risks | Infrastructure | Adoption
Foundation vs Reasoning LLM
Foundation LLM
- General-purpose base
- Broad pretraining
- Needs task adaptation
Reasoning LLM
- Multi-step logic
- Chain-of-thought
- High-cognitive tasks
Base vs deep reasoning
AI Spectrum
- Symbolic AI
- Rule-based logic systems
- Machine learning
- Data-driven pattern models
- Deep learning
- Neural network layers
- Generative AI
- Creates new content
- Foundation LLM
- General-purpose pretrained base
- Instruction-tuned LLM
- Follows user instructions
- Reasoning LLM
- Multi-step chain-of-thought
AI Spectrum
Symbolic | ML | Deep | Generative
Chatbot vs LLM Application
AI chatbot
- Conversational UI
- Direct prompting
- Ad-hoc tasks
LLM application
- API-integrated
- Automated tasks
- Scalable, embedded
Chat vs integrated
LLM Building Blocks
- Tokenization
- Text split into tokens
- Embedding
- Token as numeric vector
- Transformer
- Attention-based architecture
- Context window
- Max tokens per call
- Temperature
- Controls output randomness
- Non-determinism
- Same input, varying output
- SLM
- Small language model
- Multimodal model
- Text, image, audio
LLM Test Capabilities
- Requirements analysis
- Spot gaps, ambiguities
- Test case creation
- Generate cases, objectives
- Test oracle
- Generate expected results
- Test data
- Datasets, boundary values
- Automation support
- Generate, improve scripts
- Result analysis
- Summarize, classify anomalies
- Testware creation
- Plans, reports, docs
Six Prompt Components
Role | Context | Instruction | Input | Constraints | Output
System vs User Prompt
System prompt
- Developer-defined
- Sets rules
- Stays constant
User prompt
- Per-interaction input
- Changes each turn
- Visible request
Rules vs request
Prompting Technique Picker
- No examples needed→Zero-shot(Simple task)
- Show one example→One-shot
- Show several examples→Few-shot(Consistent format)
- Complex multi-step task→Prompt chaining(Verify each step)
- Improve the prompt→Meta prompting
- Set persistent rules→System prompt
- Assign a persona→Role component
Six Prompt Components
- Role
- Persona for model
- Context
- Background test information
- Instruction
- The task directive
- Input data
- Stories, code, examples
- Constraints
- Restrictions and rules
- Output format
- Expected response structure
Three Core Techniques
Prompt chaining | Few-shot | Meta prompting
Zero-shot vs Few-shot
Zero-shot
- No examples
- Relies on pretraining
- Simple tasks
Few-shot
- Several examples
- In-context guidance
- Consistent format
None vs examples
Test Task Picker
- Find test-basis gaps→Test analysis
- Create test cases→Test design
- Need expected result→Test oracle
- Generate test inputs→Test data generation
- Build test scripts→Automation support
- Summarize test results→Result analysis
Prompting Techniques
- Zero-shot
- No examples given
- One-shot
- Single example given
- Few-shot
- Several examples given
- Prompt chaining
- Break into verified steps
- Meta prompting
- LLM refines its prompts
- System prompt
- Developer-set behavior rules
- User prompt
- Per-interaction input
Prompting for Test Tasks
- Test analysis
- Conditions, coverage, defects
- Test design
- Cases from user stories
- Regression testing
- Keyword-driven scripts
- Test monitoring
- Metrics from test data
- Boundary value
- Suggested test technique
- Gherkin style
- Given-When-Then conditions
- Prioritization
- Rank by risk
NIST AI RMF
Govern | Map | Measure | Manage
Hallucination vs Reasoning Error
Hallucination
- Fabricated content
- Unsupported facts
- Invented criteria
Reasoning error
- Faulty logic
- Misread structure
- Wrong conclusion
False fact vs bad logic
Risk Mitigation Picker
- Output looks wrong→Cross-verification(Check sources)
- Outputs keep varying→Lower temperature(Set random seed)
- Complex reasoning fails→Prompt chaining
- Handling sensitive data→Anonymize input(Data minimization)
- Untrusted input data→Secure environment
- Need AI governance→NIST AI RMF
GenAI Defects
- Hallucination
- Confident but wrong output
- Reasoning error
- Faulty logic, inference
- Bias
- Skewed training-data output
- Cross-verification
- Check known sources
- Consistency check
- Outputs must agree
- Output testing
- Run generated testware
Non-Determinism Control
- Temperature
- Lower for consistency
- Random seed
- Reproducible sampling
- Complete context
- Reduces hallucination risk
- Divide prompts
- Chain into steps
- Compare models
- Cross-check multiple LLMs
Attack Vectors
- Data exfiltration
- Extract training data
- Request manipulation
- Disrupt model output
- Data poisoning
- Corrupt training data
- Malicious code
- Hidden backdoors, calls
- Data exposure
- Leak sensitive information
- GDPR
- EU data protection
AI Standards + Regulations
- ISO/IEC 42001
- AI management system
- ISO/IEC 23053
- AI systems framework
- EU AI Act
- Risk-tier regulation
- NIST AI RMF
- Govern, map, measure, manage
- CO2 emissions
- GenAI environmental impact
RAG vs Fine-Tuning
RAG
- Adds retrieved context
- Model unchanged
- Update the index
Fine-tuning
- Retrains weights
- Teaches style
- Costlier to update
Retrieve vs retrain
Architecture Picker
- Need fresh enterprise data→RAG(Vector database)
- Teach domain style→Fine-tuning(Retrain weights)
- Multi-step automation→LLM agent(Tool use)
- Structured test data→Relational database
- Semantic retrieval→Vector database
- Run in production→LLMOps
LLM Test Architecture
- Front-end
- Tester query interface
- Back-end
- Auth, retrieval, prompts
- Integrated LLM
- API or in-house
- Relational database
- Structured test data
- Vector database
- Semantic embedding retrieval
- Post-processing
- Refine raw output
Autonomous vs Semi-Autonomous
Autonomous agent
- Minimal oversight
- Self-directed
- Higher risk
Semi-autonomous agent
- Periodic human check
- Guards critical tasks
- Lower risk
Independent vs supervised
RAG + Agents
- RAG
- Retrieval plus generation
- Chunking
- 256-512 token splits
- Grounded response
- Rooted in retrieval
- LLM agent
- Uses tools to act
- Autonomous agent
- Minimal human oversight
- Orchestration
- Multi-agent collaboration
Fine-Tuning + LLMOps
- Fine-tuning
- Retrain on domain data
- Overfitting
- Too specialized, brittle
- Opacity
- Hard to explain
- LLMOps
- Operate LLMs in production
- LLM-as-a-Service
- Hosted vendor model
- In-house model
- Self-hosted control
Three Adoption Phases
Discovery | Initiation | Utilization
Adoption + Governance
- Shadow AI
- Unapproved tool use
- IP dispute
- Unclear licensing risk
- GenAI strategy
- Objectives, models, compliance
- Prompt pattern
- Reusable prompt template
- Quality gate
- Review generated testware
- Transparency
- Disclose GenAI use
Selecting Models
- Model performance
- Benchmark for tasks
- Fine-tuning potential
- Domain adaptability
- Recurring cost
- Licensing plus operations
- Community support
- Docs and troubleshooting
Common Traps
Hallucination vs reasoning error
Hallucination = false fact ≠ Reasoning error = bad logic
System vs user prompt
System = fixed rules ≠ User = each request
RAG vs fine-tuning
RAG = add context ≠ Fine-tune = retrain weights
Zero vs few-shot
Zero-shot = no examples ≠ Few-shot = several examples
Temperature effect
Low = consistent output ≠ High = varied output
Fluent vs correct
Fluent = reads well ≠ Correct = actually true
Relational vs vector DB
Relational = structured data ≠ Vector = semantic retrieval
Chatbot vs application
Chatbot = conversational ≠ Application = API-integrated
Last Minute
- 1.40 questions, 60 minutes
- 2.75 minutes if non-native
- 3.Pass 65%, 26 of 40
- 4.K1, K2, K3 only
- 5.CTFL Foundation required first
- 6.Five chapters; prompting is largest
- 7.Plausible output can be wrong
- 8.Hallucination = false fact
- 9.Lower temperature for consistency
- 10.RAG adds context, no retrain
- 11.Fine-tuning retrains model weights
- 12.Know NIST AI RMF functions
- 13.Anonymize sensitive data always
- 14.Lifetime certificate, no renewal
Explore More ISTQB Certifications
Continue into nearby exams from the same family. Each card keeps practice questions, study guides, flashcards, videos, and articles in one place.
More From This Family
Videos and articles for deeper review.
