
100+ Free Alibaba ACA GenAI Practice Questions

Pass your Alibaba Cloud Certified Associate Generative AI Engineer (ACA-GenAI / GAA-C01) exam on the first try — instant access, no signup required.

✓ No registration ✓ No credit card ✓ No hidden fees ✓ Start practicing immediately

Key Facts: Alibaba ACA GenAI Exam

  • Total Questions: 50
  • Exam Duration: 90 min
  • Exam Fee: $200 USD
  • Certification Validity: 2 years
  • Difficulty Tier: Associate
  • Question Types: Multi-format (single, multi, true/false)

All figures per Alibaba Cloud.

The Alibaba ACA Generative AI Engineer (ACA-GenAI / GAA-C01) is a 50-question, 90-minute associate-level exam that costs US$200 and is valid for 2 years. Items are a mix of single-choice, multi-choice and true/false. The exam tests Qwen model selection, DashScope APIs, PAI-EAS/DLC/DSW/Lingjun, RAG with Hologres pgvector, AnalyticDB-Vector and Tair-Vector, prompt engineering, agents (AgentScope), evaluation and guardrails. Alibaba Cloud does not publish a passing score or pass rate.

Sample Alibaba ACA GenAI Practice Questions

Try these sample questions to test your Alibaba ACA GenAI exam readiness. Each question includes a detailed explanation. Start the interactive quiz above for the full 100+ question experience with AI tutoring.

1. Which architectural component introduced in the 2017 Vaswani et al. paper is the foundation of modern large language models such as Alibaba's Qwen series?
A. Recurrent neural network (RNN)
B. Transformer with self-attention
C. Long short-term memory (LSTM)
D. Convolutional neural network (CNN)
Explanation: The Transformer architecture, introduced in 'Attention Is All You Need' (Vaswani et al., 2017), replaced recurrence with self-attention so all tokens in a sequence can be processed in parallel. Virtually every modern LLM, including Qwen, GPT, Llama and Claude, is built on this design. Self-attention lets the model weigh the relevance of every other token when encoding a given token, which captures long-range dependencies better than RNNs or LSTMs.
2. In a Transformer self-attention block, what role do the Query, Key and Value projections play?
A. Query and Key compute attention weights, which are then used to combine Values into the output
B. Query produces the output, Key and Value are discarded after training
C. Query stores the vocabulary, Key stores positions, Value stores embeddings
D. All three are identical copies of the input embedding
Explanation: Self-attention computes a dot product between each Query and every Key, scales it by the square root of the head dimension and applies softmax to produce attention weights. Those weights are then used to take a weighted sum of the Value vectors. Q controls 'what am I looking for', K controls 'what do I contain' and V controls 'what do I contribute'. This is fundamental for understanding why context length and head count drive memory in Qwen and other LLMs.
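To make the mechanics concrete, here is a minimal single-head sketch in NumPy (an illustrative toy with made-up weights, not Qwen's implementation):

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, Wq, Wk, Wv):
        """Single-head self-attention over X of shape (seq_len, d_model)."""
        Q, K, V = X @ Wq, X @ Wk, X @ Wv     # project input into Query/Key/Value
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)      # Query-Key dot products, scaled
        weights = softmax(scores, axis=-1)   # attention weights per token
        return weights @ V                   # weighted sum of the Values

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))              # toy sequence: 4 tokens, dim 8
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    out = self_attention(X, Wq, Wk, Wv)      # shape (4, 8)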
3. Which Transformer variant is best suited for autoregressive text generation as used by Qwen-Chat?
A. Encoder-only (BERT-style)
B. Decoder-only with causal masking
C. Encoder-decoder (T5-style)
D. Bidirectional encoder with pooling
Explanation: Decoder-only Transformers use a causal (lower-triangular) attention mask so each token can only attend to previous tokens. This matches the autoregressive next-token prediction objective used by GPT-style models, including Qwen and Qwen-Chat. Encoder-only models like BERT are designed for classification and embedding, while encoder-decoder models like T5 are stronger for sequence-to-sequence tasks such as translation.
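The causal mask itself is simple: score-matrix entries above the diagonal are set to -inf before the softmax, so future positions receive zero attention weight. A toy sketch (illustrative only):

    import numpy as np

    seq_len = 4
    scores = np.zeros((seq_len, seq_len))          # stand-in for Q @ K.T scores
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)  # True above diagonal
    scores[future] = -np.inf                       # block attention to future tokens
    # after a row-wise softmax, row i spreads weight only over positions 0..i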
4. What does Byte Pair Encoding (BPE) do during LLM tokenization?
A. Encrypts user prompts before sending them to the model
B. Iteratively merges the most frequent adjacent character or subword pairs to build a fixed vocabulary
C. Splits text strictly on whitespace and punctuation
D. Converts each Unicode character into its ASCII equivalent
Explanation: BPE starts from individual bytes or characters and repeatedly merges the most frequent adjacent pairs until a target vocabulary size is reached. The result is a subword vocabulary where common words become single tokens and rare words break into subword pieces. Qwen uses a BPE-based tokenizer (tiktoken-style) that handles Chinese, English and code well. BPE is a purely lossless text-encoding step: it is not encryption, and byte-level BPE can represent any Unicode text without converting it to ASCII.
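The merge loop is short enough to sketch directly. This toy trainer (illustrative only, not Qwen's actual tokenizer) counts adjacent pairs and merges the most frequent one each round:

    from collections import Counter

    def merge_word(w, pair, merged):
        out, i = [], 0
        while i < len(w):
            if i < len(w) - 1 and (w[i], w[i + 1]) == pair:
                out.append(merged); i += 2
            else:
                out.append(w[i]); i += 1
        return tuple(out)

    def bpe_train(words, num_merges):
        """words: list of symbol tuples, e.g. ('l','o','w'); returns learned merges."""
        merges = []
        for _ in range(num_merges):
            pairs = Counter()
            for w in words:
                pairs.update(zip(w, w[1:]))
            if not pairs:
                break
            best = max(pairs, key=pairs.get)   # most frequent adjacent pair
            merges.append(best)
            words = [merge_word(w, best, best[0] + best[1]) for w in words]
        return merges

    print(bpe_train([('l','o','w'), ('l','o','w','e','r'), ('l','o','w','e','s','t')], 2))
    # learns ('l','o') then ('lo','w'): frequent sequences become single tokens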
5. An embedding model maps a sentence to a 1024-dimensional dense vector. Which downstream task is the most natural use for this vector?
A. Storing in a vector database for semantic similarity search
B. Decoding back into the original sentence character by character
C. Replacing the model's tokenizer at inference time
D. Serving as a system prompt
Explanation: Sentence embeddings place semantically similar text close together in the vector space, so the canonical use is similarity search: store embeddings in a vector database (Hologres pgvector, AnalyticDB-Vector, Tair-Vector, OpenSearch Vector) and retrieve nearest neighbours by cosine similarity. Embeddings are not designed to be decoded back to text, are unrelated to tokenizers and are not prompts themselves. They underpin retrieval-augmented generation (RAG) over enterprise knowledge bases.
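A brute-force version of that similarity search, assuming 1024-dimensional embeddings as in the question (a real deployment would hold the vectors in a store such as Hologres pgvector and use an approximate index instead of a full scan):

    import numpy as np

    def cosine_top_k(query_vec, doc_vecs, k=3):
        """Return indices of the k documents most similar to the query."""
        q = query_vec / np.linalg.norm(query_vec)
        d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
        sims = d @ q                     # cosine similarity of each doc to the query
        return np.argsort(-sims)[:k]     # best matches first

    rng = np.random.default_rng(1)
    docs = rng.normal(size=(1000, 1024))   # 1000 documents, 1024-dim embeddings
    query = rng.normal(size=1024)
    print(cosine_top_k(query, docs))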
6. Which statement best describes a foundation model in the context of generative AI?
A. A small task-specific classifier trained from scratch on labelled data
B. A large model pre-trained on broad data that can be adapted to many downstream tasks
C. A rule-based expert system with hand-crafted prompts
D. A model that only runs on edge devices
Explanation: Foundation models are large neural networks pre-trained on broad, diverse datasets so they capture general patterns of language, code or images. They are then adapted to specific downstream tasks via prompt engineering, fine-tuning, RAG or function calling. Qwen, GPT-4 and Claude are foundation models. The other options describe narrow classifiers, symbolic systems, or edge inference, none of which are the defining property of a foundation model.
7. According to scaling laws for LLMs (Kaplan, Chinchilla), with compute held fixed, what most improves test loss?
A. Increasing batch size only
B. Balancing model parameters and training tokens roughly proportionally
C. Always making the model larger regardless of data
D. Reducing the number of layers
Explanation: The Chinchilla paper (DeepMind 2022) refined Kaplan's earlier work and showed that for a fixed compute budget the optimal model size and training-token count grow roughly together, giving about 20 tokens per parameter. Earlier 'just scale parameters' approaches (e.g., Gopher, GPT-3) were found to be undertrained. Modern Qwen2 and similar models follow a Chinchilla-style data/parameter balance.
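The ~20 tokens-per-parameter rule of thumb is easy to apply, as in the sketch below (the compute-optimal heuristic only; in practice models are often trained well past this point to improve inference economics):

    def chinchilla_tokens(params, tokens_per_param=20):
        """Rule-of-thumb compute-optimal training tokens (Chinchilla heuristic)."""
        return params * tokens_per_param

    for p in (1.5e9, 7e9, 72e9):
        print(f"{p / 1e9:.1f}B params -> ~{chinchilla_tokens(p) / 1e12:.2f}T tokens")
    # 1.5B -> ~0.03T, 7.0B -> ~0.14T, 72.0B -> ~1.44T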
8. What is meant by 'emergent abilities' in large language models?
A. Capabilities that appear only at sufficient model scale and are not present in smaller models
B. Skills that emerge during fine-tuning but disappear at inference
C. Hard-coded heuristics shipped by the model vendor
D. Features that only work in Chinese
Explanation: Emergent abilities are tasks where performance is near-random for small models and then jumps sharply once the model crosses a scale threshold (parameters, data or compute). Examples include multi-step arithmetic, in-context learning and chain-of-thought reasoning. They are not unique to a language and are not artefacts of fine-tuning alone. Understanding emergence helps explain why the Qwen2-72B variant solves problems Qwen2-1.5B cannot.
9. Which sequence correctly describes the typical lifecycle of a chat-tuned LLM such as Qwen-Chat?
A. Fine-tuning then pre-training then RLHF
B. Pre-training then supervised fine-tuning then preference alignment
C. RLHF then BPE then pre-training
D. Quantization then pre-training then SFT
Explanation: The standard recipe is: (1) self-supervised pre-training on trillions of tokens to learn next-token prediction, (2) supervised fine-tuning (SFT, also called instruction tuning) on curated prompt-response pairs, and (3) preference alignment via RLHF or DPO to make outputs helpful, honest and harmless. Pre-training must come first because both SFT and RLHF require a competent base model.
10. In RLHF, what is the role of the reward model?
A. It generates the next token directly
B. It assigns a scalar quality score to a response that the policy is then trained to maximise via PPO
C. It encrypts the policy weights
D. It tokenizes user input
Explanation: Reinforcement Learning from Human Feedback first trains a reward model on pairs of responses ranked by human annotators. The reward model outputs a scalar quality score. The base policy LLM is then optimised, typically with Proximal Policy Optimization (PPO), to maximise that reward while a KL penalty keeps it close to the SFT model so it does not collapse. The reward model never produces user-visible text.
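Schematically, the quantity the policy maximises is the reward-model score minus a KL-style penalty against the SFT model. A toy sequence-level version with made-up numbers (real PPO works per token, with clipping and many more moving parts):

    def rlhf_objective(reward, logp_policy, logp_sft, beta=0.1):
        """Reward-model score minus a penalty for drifting away from the SFT model."""
        kl = logp_policy - logp_sft    # log-ratio of the two models on this response
        return reward - beta * kl

    # a high-reward response that has drifted somewhat from the SFT distribution
    print(rlhf_objective(reward=2.0, logp_policy=-10.0, logp_sft=-14.0))  # 2.0 - 0.1*4 = 1.6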

About the Alibaba ACA GenAI Exam

The Alibaba Cloud Certified Associate (ACA) Generative AI Engineer (exam code ACA-GenAI, also seen as GAA-C01) validates foundational engineering skills for building generative AI applications on Alibaba Cloud. The exam covers transformer and LLM fundamentals, the Tongyi Qianwen (Qwen) family, DashScope model invocation, the PAI platform (EAS, DLC, DSW, Designer, Lingjun), prompt engineering, retrieval-augmented generation (RAG), agents, evaluation, guardrails and deployment patterns. It is the entry point into Alibaba's GenAI certification track and is targeted at developers, ML engineers and solution architects.

Assessment

Multi-format: single-choice, multi-choice and true/false items

Time Limit

1 hour 30 minutes

Passing Score

Not publicly disclosed by Alibaba Cloud (associate-tier minimum)

Exam Fee

$200 USD (Alibaba Cloud)

Alibaba ACA GenAI Exam Content Outline

20%

GenAI & LLM Fundamentals

Transformer architecture, attention, encoder/decoder, BPE tokenization, embeddings, foundation models, scaling laws and emergent abilities

15%

Training Pipeline & Customisation

Pre-training, SFT/instruction tuning, RLHF, DPO, LoRA, QLoRA, prefix tuning, fine-tuning vs RAG decision framework

20%

Tongyi Qianwen (Qwen) & Alibaba GenAI Services

Qwen / Qwen2 / Qwen-VL / Qwen-Audio / Qwen-Coder, DashScope API, Model Studio (Bailian), ModelScope, Wanxiang

15%

PAI Platform for AI

PAI-EAS serving, PAI-DLC distributed training, PAI-DSW notebooks, PAI-Designer low-code, PAI-AutoML, PAI-Lingjun RDMA infrastructure

15%

Prompt Engineering, RAG & Agents

Zero/few-shot, CoT, ReAct, self-consistency, chunking, embedding models, vector DBs (Hologres pgvector, AnalyticDB, Tair, OpenSearch), reranking, AgentScope

15%

Evaluation, Guardrails & Deployment

BLEU/ROUGE/BERTScore/perplexity/LLM-as-Judge, prompt injection, jailbreak, PII redaction, quantization, vLLM, KV cache, multi-LoRA serving

How to Pass the Alibaba ACA GenAI Exam

What You Need to Know

  • Passing score: Not publicly disclosed by Alibaba Cloud (associate-tier minimum)
  • Assessment: Multi-format: single-choice, multi-choice and true/false items
  • Time limit: 1 hour 30 minutes
  • Exam fee: $200 USD

Keys to Passing

  • Complete 500+ practice questions
  • Score 80%+ consistently before scheduling
  • Focus on highest-weighted sections
  • Use our AI tutor for tough concepts

Alibaba ACA GenAI Study Tips from Top Performers

1. Master the Qwen family map: Qwen vs Qwen2, base vs Chat/Instruct, the multimodal variants Qwen-VL and Qwen-Audio, and the code-specialised Qwen-Coder
2. Memorise the PAI sub-products and what each is for: EAS = serving, DLC = distributed training, DSW = notebooks, Designer = low-code, Lingjun = RDMA infrastructure, AutoML = hyperparameter optimisation (HPO)
3. Learn the four main Alibaba Cloud vector stores (Hologres pgvector, AnalyticDB-Vector, Tair-Vector, OpenSearch Vector Search) and when to choose each
4. Practice the customisation ladder: prompt engineering -> RAG -> LoRA / QLoRA -> full fine-tuning -> continued pre-training, with the cost and use-case trade-offs
5. Drill the RAG pipeline: chunking strategies, bge embeddings, hybrid BM25+vector retrieval, bge-reranker cross-encoder, then the LLM call with citations (see the sketch after this list)
6. Understand inference economics: KV cache, continuous batching with vLLM, INT8/INT4 quantization, multi-LoRA serving and speculative decoding
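Here is the end-to-end shape of the pipeline from tip 5. Every callable (embed, vector_search, bm25_search, rerank, llm) is a hypothetical stand-in for a real component such as bge embeddings, Hologres pgvector, a bge-reranker cross-encoder and a Qwen chat model:

    def answer_with_rag(question, embed, vector_search, bm25_search, rerank, llm):
        """Illustrative RAG flow: hybrid retrieval -> rerank -> grounded LLM call."""
        q_vec = embed(question)                          # 1. embed the question
        candidates = vector_search(q_vec, k=20) + bm25_search(question, k=20)  # 2. hybrid retrieval
        top_chunks = rerank(question, candidates)[:5]    # 3. cross-encoder rerank
        context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(top_chunks))
        prompt = (f"Answer using only the sources below and cite them as [n].\n\n"
                  f"{context}\n\nQuestion: {question}")
        return llm(prompt)                               # 4. grounded generation with citations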

Frequently Asked Questions

What is the Alibaba Cloud ACA Generative AI Engineer exam?

The Alibaba Cloud Certified Associate (ACA) Generative AI Engineer exam (code ACA-GenAI, also referred to as GAA-C01) is the associate-tier credential in Alibaba's GenAI certification track. It validates the engineering skills needed to build generative AI applications on Alibaba Cloud, including selecting Qwen models, calling them through DashScope, training and serving on PAI, building RAG with Hologres or AnalyticDB-Vector, and applying prompt engineering, agents and guardrails. It is positioned as the entry point before higher-level Alibaba GenAI certifications.

How many questions are on the ACA-GenAI exam and how long is it?

The exam has 50 questions delivered in a mix of formats: single-choice, multi-choice and true/false. Candidates have 90 minutes to complete it, which works out to roughly 108 seconds per item. Multi-choice questions usually require all correct options to be selected for credit, so pacing and careful reading matter.

How much does the ACA Generative AI Engineer exam cost?

The exam fee is US$200, payable when you schedule through the Alibaba Cloud Academy portal. Pricing can vary by region and is occasionally discounted during Alibaba Cloud certification promotions. Retake fees are the full US$200 each attempt.

How long is the ACA Generative AI Engineer certification valid?

The credential is valid for 2 years from the pass date. To stay certified you must either retake the current version of the exam or pass a higher-level Alibaba Cloud GenAI certification before expiry. Recertification keeps the credential current with new Qwen models, DashScope features and PAI updates.

What topics should I focus on for the ACA-GenAI exam?

Focus on the Qwen family (when to choose Qwen-7B, Qwen-72B, Qwen-VL, Qwen-Audio, Qwen-Coder), DashScope API patterns (streaming, function calling, JSON mode), the PAI sub-products (EAS for serving, DLC for distributed training, DSW for notebooks, Designer for low-code, Lingjun for RDMA infrastructure), RAG architectures using Hologres pgvector / AnalyticDB-Vector / Tair-Vector / OpenSearch Vector Search, prompt engineering patterns (zero-shot, few-shot, CoT, ReAct, self-consistency), agents and AgentScope, evaluation metrics (BLEU, ROUGE, BERTScore, perplexity, pass@k, faithfulness), and guardrails (prompt injection, jailbreak, PII redaction, content moderation).

Do I need coding experience for the ACA-GenAI exam?

Yes. Unlike the AWS AI Practitioner foundational exam, ACA-GenAI is an engineer-track associate certification. You should be comfortable reading Python pseudocode for DashScope SDK calls, understanding training and inference workflows on PAI, configuring vector indexes in Hologres or AnalyticDB, and reasoning about LoRA/QLoRA fine-tuning and quantization trade-offs. Hands-on experience with at least one Qwen model and one Alibaba Cloud vector store is strongly recommended.
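For orientation, a minimal DashScope chat call looks roughly like the sketch below. Treat it as an assumption-laden example: model names, parameters and response fields change between SDK versions, so verify against the current DashScope documentation:

    import dashscope
    from dashscope import Generation

    dashscope.api_key = "sk-..."               # or set the DASHSCOPE_API_KEY env var

    response = Generation.call(
        model="qwen-turbo",                    # choose a Qwen variant to fit the task
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain LoRA in one sentence."},
        ],
        result_format="message",               # return chat-style message objects
    )
    print(response.output.choices[0].message.content)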

How does ACA-GenAI compare to the AWS AI Practitioner certification?

Both are entry-level GenAI certifications, but ACA-GenAI is engineer-track and AWS AIF-C01 is foundational/business-track. ACA-GenAI assumes coding ability and dives deeper into model serving, distributed training, vector databases and agent frameworks (AgentScope), all on Alibaba Cloud. AIF-C01 is non-coding and emphasises Bedrock service knowledge, responsible AI and use cases. If your stack is Alibaba Cloud, choose ACA-GenAI; if it is AWS, choose AIF-C01 (then optionally ML Engineer Associate).

How long should I study for the ACA-GenAI exam?

Most engineers study for 3-6 weeks, investing 30-60 hours total. Plan to: (1) build at least one DashScope chat application with streaming and function calling, (2) deploy a small Qwen model on PAI-EAS, (3) build a RAG pipeline with bge embeddings and Hologres pgvector, (4) experiment with one LoRA fine-tune on PAI-DLC or PAI-DSW, and (5) take 100+ practice questions and consistently score 80%+ before scheduling.