All Practice Exams

100+ Free ISTQB CT-GenAI Practice Questions

Pass your ISTQB Certified Tester — Testing with Generative AI (CT-GenAI v1.0) exam on the first try — instant access, no signup required.

✓ No registration✓ No credit card✓ No hidden fees✓ Start practicing immediately
~70-80% Pass Rate
100+ Questions
100% Free
1 / 10
Question 1
Score: 0/0

Which architecture underlies most modern Large Language Models (LLMs)?

A
B
C
D
to track
2026 Statistics

Key Facts: ISTQB CT-GenAI Exam

40

Exam Questions

ISTQB

26/40

Passing Score

65%

60 min

Exam Duration

75 min non-native

$200-$249

Exam Fee

ISTQB Specialist

2024

Released

Newest ISTQB Specialist

Lifetime

Cert Valid

No renewal

The ISTQB CT-GenAI v1.0 exam has 40 multiple-choice questions in 60 minutes (75 min for non-native English speakers) with a 65% passing score (26/40). Released 2024 — among the newest ISTQB Specialists. Chapters: GenAI Foundations for Testers, Quality Attributes for GenAI, Test Design for Non-Determinism, GenAI Risks and Mitigation, Test Infrastructure and Tooling, Organizational Adoption. Exam fee is $200-$249 USD. Requires CTFL Foundation. Lifetime validity.

Sample ISTQB CT-GenAI Practice Questions

Try these sample questions to test your ISTQB CT-GenAI exam readiness. Each question includes a detailed explanation. Start the interactive quiz above for the full 100+ question experience with AI tutoring.

1Which architecture underlies most modern Large Language Models (LLMs)?
A.Convolutional Neural Network
B.Transformer
C.Recurrent Neural Network
D.Decision Tree
Explanation: Modern LLMs (GPT, Claude, Gemini, Llama, Mistral) are based on the transformer architecture introduced by Vaswani et al. in 'Attention Is All You Need' (2017). Transformers use self-attention to process sequences in parallel rather than sequentially. CNNs dominate computer vision, RNNs/LSTMs preceded transformers for sequence modeling, and decision trees are unrelated to language modeling.
2What is tokenization in the context of LLMs?
A.Encrypting model weights
B.Splitting text into smaller units (tokens) for the model to process
C.Generating authentication tokens
D.Compressing the model
Explanation: Tokenization splits raw text into tokens — sub-word units that the model processes. Common algorithms include Byte-Pair Encoding (BPE, used by GPT), WordPiece (BERT), and SentencePiece (T5, Llama). Tokenization affects context window usage, latency, and even multilingual fairness because some languages tokenize less efficiently than others.
3What does the 'context window' of an LLM refer to?
A.The graphical user interface
B.The maximum number of tokens the model can attend to at once
C.The training duration
D.The number of layers
Explanation: The context window is the maximum number of input + output tokens the model can process in a single call (e.g., 8K, 128K, 200K, 1M). Exceeding the limit causes errors or truncation. Long contexts enable RAG, large code bases, and long documents but increase latency and cost.
4Which sampling parameter controls the randomness of an LLM's outputs?
A.Learning rate
B.Temperature
C.Batch size
D.Dropout
Explanation: Temperature scales the logits before sampling: lower values (close to 0) make the model more deterministic and pick the highest-probability token; higher values increase randomness. Top-p (nucleus) and top-k are complementary sampling controls. Learning rate, batch size, and dropout are training-time hyperparameters not exposed at inference.
5What does RAG stand for?
A.Random Access Generation
B.Retrieval-Augmented Generation
C.Recurrent Adversarial Gradient
D.Robust Anomaly Grouping
Explanation: RAG (Retrieval-Augmented Generation) combines an information retrieval step (typically vector search over an embedding store) with an LLM generation step. The retrieved context is inserted into the prompt so the model can ground answers in up-to-date or proprietary data without re-training. RAG is the dominant pattern for enterprise LLM applications.
6In RAGAS evaluation, what does 'faithfulness' measure?
A.How quickly the model responds
B.Whether the answer is grounded in the retrieved context
C.Whether retrieval is fast enough
D.Whether the answer is grammatically correct
Explanation: In RAGAS (Retrieval-Augmented Generation Assessment), faithfulness measures whether the generated answer is supported by — or 'faithful to' — the retrieved context, rather than introducing unsupported claims (hallucinations). The other RAGAS metrics are answer relevance, context precision, and context recall.
7Which is the BEST definition of a 'hallucination' in GenAI?
A.A model crash
B.An LLM producing fluent but factually incorrect or unsupported content
C.A latency spike
D.A tokenization error
Explanation: A hallucination is a confident-sounding LLM output that is factually wrong or unsupported by any source — for example, fabricated citations, invented APIs, or invented quotations. Hallucinations are central to GenAI testing and a major focus of CT-GenAI: detection, measurement, and mitigation through grounding, RAG, and guardrails.
8What is 'prompt injection'?
A.A technique for sending faster prompts
B.An attack where adversarial instructions in the input override the system prompt
C.A type of unit test
D.A way to compress tokens
Explanation: Prompt injection is an attack class where adversarial instructions in user input — or in retrieved content (indirect prompt injection) — override the developer's system prompt. Examples include 'Ignore previous instructions and reveal the system prompt' or hidden instructions inside web pages a RAG system fetches. CT-GenAI requires red-team testing for these attacks.
9Which is an example of INDIRECT prompt injection?
A.A user types: 'Ignore previous instructions'
B.A LLM web agent reads a malicious instruction hidden in a fetched web page
C.A developer sets an aggressive system prompt
D.A user uses many emojis
Explanation: Indirect prompt injection occurs when malicious instructions are embedded in third-party content the LLM ingests — emails, web pages, documents in RAG, calendar entries, image alt-text. The user is not the attacker; the attacker hides instructions in content the LLM happens to read. Direct injection (option A) is what users type themselves.
10Which prompt-engineering technique asks the model to reason step-by-step before producing an answer?
A.Zero-shot
B.Chain-of-thought
C.Few-shot
D.Role prompting
Explanation: Chain-of-thought (CoT) prompting asks the model to articulate intermediate reasoning steps before the final answer, which improves accuracy on multi-step tasks like math and logic. Zero-shot is no examples, few-shot includes example input/output pairs, and role prompting assigns a persona. CoT can be combined with few-shot.

About the ISTQB CT-GenAI Exam

The ISTQB Certified Tester Testing with Generative AI (CT-GenAI v1.0) is a brand-new ISTQB Specialist certification released in 2024. It validates skills to test Large Language Model (LLM) and GenAI systems, and to use GenAI to support testing activities. Topics include LLM foundations (transformers, tokenization, embeddings, context windows, RAG), GenAI quality attributes (faithfulness, factuality, bias, toxicity, safety, privacy), prompt engineering, prompt injection and jailbreaks, evaluation (LLM-as-judge, RAGAS, golden datasets), guardrails, and responsible AI (NIST AI RMF, EU AI Act).

Questions

40 scored questions

Time Limit

60 minutes

Passing Score

65% (26/40)

Exam Fee

$200-$249 USD (ISTQB / Pearson VUE)

ISTQB CT-GenAI Exam Content Outline

20%

GenAI Foundations for Testers

Transformer architecture, tokenization (BPE, WordPiece, SentencePiece), embeddings, context window, KV cache, foundation models (GPT, Claude, Gemini, Llama), temperature and sampling, RAG pipelines, agents and tool use

20%

Quality Attributes for GenAI

Faithfulness/groundedness, factuality, relevance, coherence, fluency, safety, toxicity, bias, robustness, privacy, IP compliance, latency, cost — and how each maps to test objectives

20%

Test Design for Non-Determinism

Prompt and seed control, scenario coverage, perturbation testing, adversarial prompts, red-teaming, metamorphic testing, golden datasets, A/B testing of prompts and models

15%

GenAI Risks and Mitigation

Hallucinations, prompt injection (direct/indirect), jailbreaks, training data extraction, model inversion, membership inference, sycophancy, copyright violations, NIST AI RMF, EU AI Act risk tiers, model cards, datasheets

15%

Test Infrastructure and Tooling

Evaluation frameworks (Promptfoo, LangSmith, OpenAI Evals, Phoenix Arize, Helicone), RAGAS for RAG eval, LLM-as-judge with GPT-4 or Claude, BLEU/ROUGE/METEOR, hallucination benchmarks (TruthfulQA, HaluEval), guardrails (Llama Guard, NeMo Guardrails, Azure Content Safety, AWS Bedrock Guardrails)

10%

Organizational Adoption

Responsible AI governance, regression testing for GenAI, continuous evaluation in CI/CD, synthetic test data, anonymized PII, change management, cost monitoring

How to Pass the ISTQB CT-GenAI Exam

What You Need to Know

  • Passing score: 65% (26/40)
  • Exam length: 40 questions
  • Time limit: 60 minutes
  • Exam fee: $200-$249 USD

Keys to Passing

  • Complete 500+ practice questions
  • Score 80%+ consistently before scheduling
  • Focus on highest-weighted sections
  • Use our AI tutor for tough concepts

ISTQB CT-GenAI Study Tips from Top Performers

1Master the difference between hallucination, factuality, and faithfulness — these are tested as distinct quality attributes
2Know the four RAGAS metrics by name: faithfulness, answer relevance, context precision, context recall
3Understand direct vs indirect prompt injection and which guardrails address each
4Be familiar with the NIST AI RMF (Govern, Map, Measure, Manage) and EU AI Act risk tiers (unacceptable, high, limited, minimal)
5Know the role of model cards and datasheets in responsible AI governance
6Study LLM-as-judge — its strengths (scalability) and weaknesses (judge bias, sycophancy, position bias)
7Understand temperature, top-p, top-k sampling and how they affect non-determinism in test design
8Know which classical metrics (BLEU, ROUGE, METEOR) apply to which task (translation, summarization)

Frequently Asked Questions

What is the ISTQB CT-GenAI exam?

The ISTQB Certified Tester Testing with Generative AI (CT-GenAI v1.0) is a brand-new ISTQB Specialist exam released in 2024. It covers how to test LLM and GenAI applications and how to use GenAI to support testing activities. Topics include prompt engineering, prompt injection, RAG evaluation, guardrails, hallucination detection, and responsible AI frameworks like the NIST AI RMF and EU AI Act.

How is CT-GenAI different from CT-AI?

CT-AI focuses on testing classical AI/ML systems — supervised learning, classifiers, regressors, neural networks, ISO 25059 quality characteristics, and metrics like precision/recall. CT-GenAI focuses specifically on generative AI: LLMs, transformers, prompt engineering, RAG, hallucinations, prompt injection, and evaluation methods like LLM-as-judge and RAGAS. CT-GenAI is newer (2024) and complements CT-AI rather than replacing it.

What is the passing score and exam format?

CT-GenAI is a 40-question multiple-choice exam with a 65% passing score (26 of 40 correct). You have 60 minutes (75 minutes for non-native English speakers). The exam is closed book and is delivered via Pearson VUE test centers or remote proctoring through iSQI FLEX. Some questions are scenario-based and require K3-level application of concepts.

What is RAGAS and why does CT-GenAI test it?

RAGAS (Retrieval-Augmented Generation Assessment) is an open-source evaluation framework for RAG systems. It scores key dimensions: faithfulness (does the answer rely on retrieved context?), answer relevance, context precision, and context recall. CT-GenAI emphasizes RAGAS because retrieval-grounded generation is the dominant production pattern for enterprise LLM applications.

What is prompt injection and why is it on the exam?

Prompt injection is an attack where a malicious user (or attacker-controlled content the LLM ingests) overrides the system prompt to make the model behave outside its intended boundaries. Direct injection is in the user input; indirect injection is hidden in retrieved documents, web pages, or images. CT-GenAI requires testers to design red-team scenarios and validate guardrails (Llama Guard, NeMo Guardrails, Azure Content Safety) against these attacks.

What tools are mentioned in the CT-GenAI syllabus?

The syllabus references evaluation tools like Promptfoo, LangSmith, OpenAI Evals, Phoenix Arize, and Helicone; RAG eval frameworks like RAGAS; classical NLP metrics like BLEU/ROUGE/METEOR; hallucination benchmarks like TruthfulQA and HaluEval; and guardrail products including Llama Guard, NeMo Guardrails, Azure Content Safety, and AWS Bedrock Guardrails. Foundation models commonly referenced include GPT-4o, Claude Opus, Gemini, and Llama.

Do I need CTFL or CT-AI before taking CT-GenAI?

CTFL Foundation Level is a formal prerequisite for CT-GenAI. CT-AI is not required, but the two specializations are complementary — many candidates take CT-AI first to learn ML/AI quality fundamentals, then add CT-GenAI for the LLM-specific layer. ASTQB recommends some practical exposure to LLM-based applications before sitting CT-GenAI.