An application reads the same small set of DynamoDB items thousands of times per second and needs microsecond latency, but the data only changes a few times per hour. Which addition requires the least code change and best fits?

Amazon DynamoDB Accelerator (DAX). DAX is the DynamoDB-specific in-memory cache that delivers microsecond, eventually consistent reads with minimal code change (point the DAX client at the cluster). Memcached works but requires custom cache-aside logic. Strongly consistent reads consume more read capacity and are not served from the DAX cache. CloudFront caches HTTP/web content, not direct DynamoDB API calls.

A CPU-bound Lambda function configured at 256 MB runs for 4 seconds and is slow. The team wants to reduce duration and possibly cost. What should they try first?

Increase the memory allocation, since CPU scales with memory, and benchmark with Lambda Power Tuning. Lambda allocates CPU proportionally to memory, so raising memory often shortens a CPU-bound function's duration and can keep total cost flat or lower. Lowering memory makes it slower. Moving init inside the handler defeats connection reuse. Reserved concurrency caps capacity but does not change per-invocation CPU.

A DynamoDB table throttles on writes during a daily batch even though provisioned write capacity is well above the aggregate request rate. The partition key is the order's status ("PENDING", "SHIPPED", "DONE"). What is the best fix?

Redesign the partition key to a high-cardinality value (or add a write-sharding suffix) to spread traffic across partitions. Three status values are very low cardinality, so writes concentrate on at most three partition key values and throttle regardless of total capacity. Changing read consistency, adding an LSI, or adjusting a visibility timeout does not address a hot write partition.

Performance & Caching — Free Study Guide 2026

Pick the cache by what you are accelerating

The exam loves cache-selection questions, and the trap is choosing a cache that sits at the wrong layer. Anchor on the data you are speeding up.

Cache	Accelerates	Notes
API Gateway cache	HTTP responses	Enabled per stage, keyed by request parameters, TTL 300 s by default and up to 3600 s, billed hourly by cache size (0.5 GB-237 GB)
CloudFront	Static + dynamic content at the edge	700+ global POPs, lowers origin load and latency, honors `Cache-Control`/TTL
ElastiCache	Any data source	General-purpose in-memory; Redis/Valkey for persistence, replication, pub/sub, sorted sets; Memcached for simple, multi-threaded, horizontally scaled caching
DAX	DynamoDB reads only	Microsecond reads, eventually consistent, write-through, near-zero code change

DAX specifics

Amazon DynamoDB Accelerator (DAX) keeps an item cache (for GetItem/BatchGetItem) and a query cache (for Query/Scan), each with its own TTL. The cache returns eventually consistent data only: DAX passes a strongly consistent read through to DynamoDB and does not cache the result, so that read gains nothing from DAX. Writes are write-through: DAX writes to the table first, then updates the item cache, so it never accelerates writes. For read-heavy, repeat-read workloads it cuts both latency (from single-digit milliseconds to microseconds) and consumed read capacity. The code change is minimal: swap in the DAX client and point it at the cluster endpoint.
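The read and write paths described above can be sketched with a toy in-memory model. This is not the DAX client or its API: `ToyTable`, `ToyItemCache`, the `consistent_read` flag, and the 300-second TTL are all hypothetical names and values chosen for illustration. The sketch only shows why repeat reads stop consuming table reads while strongly consistent reads still do.

```python
import time


class ToyTable:
    """Hypothetical in-memory stand-in for the DynamoDB table."""

    def __init__(self):
        self.items = {}
        self.reads = 0  # counts reads that actually reach the table

    def get(self, key):
        self.reads += 1
        return self.items.get(key)

    def put(self, key, value):
        self.items[key] = value


class ToyItemCache:
    """Toy write-through item cache with a TTL (illustration only)."""

    def __init__(self, table, ttl_seconds=300.0, clock=time.monotonic):
        self.table = table
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key, consistent_read=False):
        if consistent_read:
            # Strongly consistent read: go to the table, do not cache.
            return self.table.get(key)
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]  # cache hit: the table is not touched
        value = self.table.get(key)  # miss or expired entry: read through
        self._entries[key] = (value, self.clock() + self.ttl)
        return value

    def put(self, key, value):
        # Write-through: write the table first, then refresh the cache.
        self.table.put(key, value)
        self._entries[key] = (value, self.clock() + self.ttl)


table = ToyTable()
cache = ToyItemCache(table)
cache.put("config#1", {"theme": "dark"})
for _ in range(1000):
    cache.get("config#1")
print(table.reads)  # 0: every repeat read was a cache hit
cache.get("config#1", consistent_read=True)
print(table.reads)  # 1: the strongly consistent read reached the table
```

In this model the write still pays the full table write, which mirrors why a write-through cache helps read-heavy workloads and not write-heavy ones.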

Lambda performance

Memory drives CPU. Lambda allocates vCPU proportionally to configured memory across the 128 MB-10,240 MB range; at ~1,769 MB a function gets the equivalent of one full vCPU. More memory can make a CPU-bound function finish faster, often at the same or lower total cost. Use AWS Lambda Power Tuning (a Step Functions state machine) to find the cost/speed sweet spot empirically.
Cold starts occur on the first invocation of a new execution environment (the init phase that loads the runtime and your imports). Provisioned concurrency pre-initializes a set number of environments so latency-sensitive paths skip cold starts; reserved concurrency both reserves and caps a function's share of the account's default limit of 1,000 concurrent executions per Region, but does not pre-warm anything.
Connection reuse: initialize SDK clients, HTTP keep-alive, and database connections outside the handler so warm invocations reuse them instead of reconnecting. For relational databases behind Lambda, add Amazon RDS Proxy to pool and share connections and avoid exhausting the database's connection limit during bursts.
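A minimal sketch of the connection-reuse point, assuming a hypothetical `FakeDbConnection` class in place of a real SDK client or database driver. The module-level object is created once per execution environment, so warm invocations of `handler` share it, while `slow_handler` shows the anti-pattern of connecting inside the handler.

```python
import itertools

_connection_ids = itertools.count(1)


class FakeDbConnection:
    """Hypothetical stand-in for an SDK client or database connection."""

    def __init__(self):
        # Each construction models one expensive connect/handshake.
        self.connection_id = next(_connection_ids)

    def query(self, sql):
        return f"conn {self.connection_id}: {sql}"


# Runs once, during the init phase of the execution environment.
CONNECTION = FakeDbConnection()


def handler(event, context):
    # Warm invocations reuse CONNECTION instead of reconnecting.
    return CONNECTION.query(event["sql"])


def slow_handler(event, context):
    # Anti-pattern: opens a new connection on every invocation.
    return FakeDbConnection().query(event["sql"])


print(handler({"sql": "SELECT 1"}, None))       # conn 1: SELECT 1
print(handler({"sql": "SELECT 1"}, None))       # conn 1: SELECT 1
print(slow_handler({"sql": "SELECT 1"}, None))  # conn 2: SELECT 1
```

With a real relational database, the per-invocation pattern is also what tends to exhaust the connection limit during bursts, which is the problem a connection pooler is meant to absorb.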

DynamoDB performance

Each partition supports up to 3,000 read capacity units (RCU) and 1,000 write capacity units (WCU) per second. Exceed either limit on a single partition and requests throttle even when table-level capacity is plentiful.

Hot partitions arise when a partition key has low cardinality or one very popular value, concentrating traffic on a few partitions. Fix with a high-cardinality partition key or write sharding (append a random or calculated suffix), not just more provisioned capacity. Adaptive capacity helps but cannot fully rescue a fundamentally skewed key.
GSI design: a Global Secondary Index serves a new access pattern with its own partition/sort keys and its own capacity. Under-provisioning a GSI's write capacity in provisioned mode throttles writes to the base table, a frequent gotcha. Project only the attributes you query to keep the index lean and cheap.
Batch and parallelism: BatchGetItem/BatchWriteItem reduce round trips; a parallel Scan with Segment/TotalSegments speeds full-table reads but burns RCU — prefer Query whenever the access pattern allows.
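Write sharding with a calculated suffix can be sketched as follows. The shard count of 10, the `STATUS#n` key format, and the helper names are assumptions for illustration, not a prescribed scheme; the idea is only that a stable hash of a high-cardinality attribute (here a hypothetical order ID) spreads one hot status value across several partition key values.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 10  # assumed value; size it to the write rate you need


def sharded_key(status: str, order_id: str, shards: int = SHARD_COUNT) -> str:
    """Return a partition key such as 'PENDING#7' using a calculated suffix."""
    digest = hashlib.sha256(order_id.encode("utf-8")).digest()
    suffix = int.from_bytes(digest[:4], "big") % shards
    return f"{status}#{suffix}"


def all_shard_keys(status: str, shards: int = SHARD_COUNT) -> list:
    """Every key a reader must query and merge to see all items for a status."""
    return [f"{status}#{i}" for i in range(shards)]


# Simulate a batch of writes that would otherwise all hit the key 'PENDING'.
orders = [f"order-{n}" for n in range(10_000)]
spread = Counter(sharded_key("PENDING", order_id) for order_id in orders)

print(len(spread))                                 # distinct key values in use
print(min(spread.values()), max(spread.values()))  # smallest and largest shard
```

The trade-off is on the read side: fetching everything for one status now means one Query per suffix, merged by the application. A calculated suffix keeps a given order on a predictable shard, whereas a random suffix would not.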

Worked example: when DAX is wrong

A leaderboard reads each player's current rank thousands of times per second but updates ranks every second and requires the latest value. DAX looks tempting for the read volume, but because it serves only eventually consistent data and ranks change every second, stale reads are unacceptable — DAX is the wrong tool here. The correct answers are strongly consistent reads on a well-sharded key, or moving the hot counter to ElastiCache for Redis with application-managed freshness. The exam tests exactly this trade-off: DAX wins on repeat reads of slowly changing data, and loses the moment strong consistency is required.

Lambda cost-vs-speed math

Lambda bills on GB-seconds (allocated memory times duration). Suppose a function runs 4 s at 256 MB (1 GB-s). Double memory to 512 MB and, because CPU doubles, a CPU-bound function may finish in ~2 s — still 1 GB-s, same cost but half the latency. Push to 1,024 MB and it might run 1 s (also 1 GB-s) while feeling four times faster to the caller. This is why "increase memory" is frequently both faster and cost-neutral, and why Lambda Power Tuning charts a U-shaped cost curve you can read off directly.
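The arithmetic above can be checked with a few lines of Python. The `estimate_duration` helper is hypothetical and bakes in a strong assumption: perfectly CPU-bound work whose duration is inversely proportional to memory. Real functions rarely scale this cleanly, and single-threaded code generally stops benefiting once it already has a full vCPU, so treat this as a model to compare against measured results, not a prediction.

```python
def gb_seconds(memory_mb: int, duration_s: float) -> float:
    """Billable compute: allocated memory in GB multiplied by duration."""
    return (memory_mb / 1024) * duration_s


def estimate_duration(base_memory_mb: int, base_duration_s: float,
                      memory_mb: int) -> float:
    """Assumes ideal CPU-bound scaling: duration shrinks as memory grows."""
    return base_duration_s * base_memory_mb / memory_mb


for memory_mb in (256, 512, 1024):
    duration = estimate_duration(256, 4.0, memory_mb)
    print(memory_mb, duration, gb_seconds(memory_mb, duration))

# Prints:
# 256 4.0 1.0
# 512 2.0 1.0
# 1024 1.0 1.0
```

If measured duration falls more slowly than this ideal line, GB-seconds rise with memory, which is the point at which adding memory starts to cost more than it saves.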

Caching strategies & API Gateway throttling

Beyond picking a layer, know two cache patterns: cache-aside (the app checks the cache, loads from the source on a miss, and then stores the result in the cache; the usual pattern with ElastiCache, whether Redis or Memcached) versus write-through (every write updates the cache, as DAX does). Cache-aside risks stale data and a thundering herd when many requests miss a cold cache at once; write-through keeps the cache fresh at the cost of write latency. Separately, API Gateway protects backends with throttling (a steady-state rate and a burst bucket) and with usage plans keyed to API keys. These shape load but are not a substitute for a cache or for authentication.
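A minimal cache-aside sketch, using a plain dictionary in place of a real cache server. The `CacheAside` class and its method names are hypothetical, and a production version would normally also set a TTL on each entry. The example shows both the saved source reads and the stale-data risk the pattern carries.

```python
class CacheAside:
    """Toy cache-aside wrapper: the application owns all cache logic."""

    def __init__(self, load_from_source):
        self._load = load_from_source  # called only on a cache miss
        self._cache = {}
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # hit: the source is not contacted
        self.misses += 1
        value = self._load(key)      # miss: load from the source
        self._cache[key] = value     # then store the result in the cache
        return value

    def invalidate(self, key):
        # The application must evict (or expire) entries when data changes.
        self._cache.pop(key, None)


source = {"user#1": "Ada"}
cache = CacheAside(source.get)

cache.get("user#1")
cache.get("user#1")
print(cache.misses)         # 1: only the first read reached the source

source["user#1"] = "Grace"  # the source changes behind the cache's back
print(cache.get("user#1"))  # Ada: stale until evicted
cache.invalidate("user#1")
print(cache.get("user#1"))  # Grace
```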

AWS Certified Developer – Associate

AWS Developer

4.2 Performance & Caching

