1.7 AWS SDK & API Best Practices
Key Takeaways
- SDKs resolve credentials via a provider chain (environment variables, shared config/credentials files, then IAM roles); prefer roles and temporary credentials over long-lived access keys.
- SDKs automatically retry throttled and transient errors using exponential backoff with jitter; treat HTTP 429, ProvisionedThroughputExceededException, ThrottlingException, and 5xx errors as retryable.
- List/Query/Scan APIs return paginated results via a continuation token (NextToken or LastEvaluatedKey); loop until none is returned, or use SDK paginators.
- Design idempotent operations so retries and at-least-once delivery do not cause duplicate side effects, using de-duplication keys, client request tokens, or conditional writes.
- AppConfig manages and safely rolls out application configuration and feature flags; retrieve plain config from SSM Parameter Store and rotating secrets from Secrets Manager at runtime.
Building Resilient Clients
The SDK best-practices items reward defensive coding: resolve credentials securely, retry transient failures correctly, page through every result, and make operations safe to repeat. These principles show up across the exam's Security (26%), Deployment (24%), and Troubleshooting (18%) domains, not just Development.
Credentials and the Provider Chain
The AWS SDK locates credentials through a default credential provider chain, evaluated in order (this is the common precedence; exact order varies slightly by SDK):
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
- Shared config/credentials files (~/.aws/credentials, named profiles)
- IAM roles: EC2 instance profile or ECS task role (served via the instance/container metadata service), or Lambda execution role (temporary credentials supplied by the Lambda runtime)
On AWS compute, attach a role so the SDK automatically retrieves and rotates temporary credentials behind the scenes. The security rule the exam enforces: never hard-code long-lived access keys in code, config, or environment variables — use roles and sts:AssumeRole for cross-account access.
Exponential Backoff and Jitter
SDKs automatically retry throttled and transient errors using exponential backoff — each retry waits longer (for example 100 ms, 200 ms, 400 ms) — plus jitter, a randomized delay that prevents many clients from retrying in lockstep and creating a synchronized retry storm. Retryable signals include HTTP 429 Too Many Requests, ProvisionedThroughputExceededException, ThrottlingException, and 5xx server errors. You can tune the max attempts and the retry mode (legacy, standard, adaptive). Do not retry non-retryable 4xx errors such as AccessDenied or ValidationException.
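The retry behaviour described above can be sketched in a few lines. This is a minimal illustration of exponential backoff with "full" jitter, not the SDK's actual implementation; `ThrottledError`, `backoff_delay`, and `call_with_retries` are hypothetical names, and the 0.1 s base and 5 s cap are assumed values.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a retryable error such as a throttling response."""


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random.random):
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_retries(operation, max_attempts=4, sleep=time.sleep, rng=random.random):
    """Call operation(), retrying only ThrottledError, up to max_attempts total calls."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait longer after each failure, randomized to avoid lockstep retries.
            sleep(backoff_delay(attempt, rng=rng))
```

Any other exception type propagates immediately, which mirrors the rule that non-retryable errors should not be retried. In boto3 you would normally rely on the built-in retry handling instead and tune it with `botocore.config.Config(retries={"max_attempts": 5, "mode": "standard"})`.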
Pagination
List-style APIs return one page of results plus a continuation token: NextToken for most services, or LastEvaluatedKey for DynamoDB Query/Scan (passed back on the next call as ExclusiveStartKey). You must loop, passing the token back on each call, until none is returned, or use a built-in paginator that handles the loop for you. Code that ignores the token silently processes only the first page (often capped near 1 MB or 1,000 items), a classic production bug where code "only sees the first thousand records."
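A minimal sketch of the manual pagination loop follows. `list_all` and its `list_page` argument are hypothetical: `list_page` stands for any function that wraps one SDK list call and returns a response-shaped dict. The sketch assumes the request and response use the same token name, which holds for NextToken-style APIs but not for DynamoDB.

```python
def list_all(list_page, items_key="Items", token_key="NextToken"):
    """Collect every item by following the continuation token until it is absent.

    list_page is a hypothetical callable wrapping one SDK list call; it accepts
    the token as an optional keyword argument and returns a response-like dict.
    """
    items = []
    token = None
    while True:
        # Only send the token once the service has given us one.
        kwargs = {token_key: token} if token else {}
        page = list_page(**kwargs)
        items.extend(page.get(items_key, []))
        token = page.get(token_key)
        if not token:  # no token means this was the last page
            return items
```

With boto3 the same loop is usually delegated to a paginator, for example `client.get_paginator("list_objects_v2").paginate(Bucket=...)`, which yields one page per iteration.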
Idempotency
Because SQS, asynchronous Lambda, EventBridge, and SDK retries can all deliver or invoke more than once, design operations to be idempotent — producing the same result whether run one time or many. Techniques:
- De-duplication keys / client request tokens (many APIs accept a clientRequestToken/idempotencyToken)
- Conditional writes (DynamoDB attribute_not_exists(pk) so an insert applies only once)
- Tracking processed message IDs in a store and skipping repeats
Without idempotency, at-least-once delivery causes duplicate side effects such as double charges, duplicate emails, or repeated orders.
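The message-ID technique can be sketched with an in-memory store. `ProcessedStore` and `handle_payment` are hypothetical names, and the dictionary is only a stand-in for a durable store with a conditional insert (for example, a DynamoDB put guarded by attribute_not_exists(pk)).

```python
class ProcessedStore:
    """In-memory stand-in for a durable store that supports a conditional insert."""

    def __init__(self):
        self._results = {}

    def put_if_absent(self, key, value):
        """Insert only if key is new; return True when the insert applied."""
        if key in self._results:
            return False
        self._results[key] = value
        return True


def handle_payment(store, message_id, amount, charge):
    """Charge at most once per message_id, even if the message arrives twice."""
    if not store.put_if_absent(message_id, amount):
        return "duplicate-skipped"  # already claimed by an earlier delivery
    charge(amount)
    return "charged"
```

This sketch claims the message ID before performing the side effect, so a redelivery can never charge twice. It deliberately leaves out what to do when `charge` fails after the claim is recorded; a real handler has to choose a recovery strategy for that case.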
AppConfig, Feature Flags, and Parameters
| Service | Stores | Key feature |
|---|---|---|
| AWS AppConfig (part of Systems Manager) | Config + feature flags | Validators, gradual rollout, automatic rollback on a CloudWatch alarm |
| SSM Parameter Store | Plain config + secrets | Free standard tier; SecureString encrypts with KMS |
| AWS Secrets Manager | Rotating secrets | Built-in automatic rotation for DB credentials, higher cost |
AppConfig deploys configuration separately from code, supports percentage-based gradual rollout, and automatically rolls back when a linked CloudWatch alarm fires. The application polls for the current configuration at runtime, so changes take effect with no redeploy. Retrieve plain config and SecureString values from Parameter Store and rotating secrets from Secrets Manager at runtime; never bake secrets into the deployment package or container image.
Parameter Store vs Secrets Manager
The two stores overlap, so the exam tests the choice. Parameter Store is free for the standard tier and ideal for plain configuration and SecureString values, but it has no built-in rotation. Secrets Manager charges per secret per month but provides native automatic rotation (with managed Lambda rotation for RDS, Redshift, and DocumentDB), cross-Region replication, and cross-account access through resource policies. Rule of thumb: if the scenario emphasizes automatic credential rotation, choose Secrets Manager; if it emphasizes free, simple config storage, choose Parameter Store.
Both encrypt SecureString/secret values with KMS, and those values are fetched at runtime, never embedded in code.
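Runtime retrieval from both stores might look like the sketch below. The helper names `get_parameter` and `get_secret` are hypothetical; the clients are assumed to come from `boto3.client("ssm")` and `boto3.client("secretsmanager")` and are passed in so the helpers can be exercised with stand-ins.

```python
def get_parameter(ssm_client, name):
    """Read one parameter; WithDecryption=True decrypts SecureString values."""
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


def get_secret(secrets_client, secret_id):
    """Read the current string value of a secret from Secrets Manager."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return response["SecretString"]
```

The calling role needs permission for these API calls and for decrypting with the relevant KMS key; without it, these are exactly the non-retryable AccessDenied errors that should be fixed in IAM rather than retried.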
X-Ray, Caching Clients, and Common Traps
Instrument applications with AWS X-Ray to trace requests across services, identify latency bottlenecks, and view a service map; the SDK can be wrapped so downstream AWS calls become subsegments automatically. Cache infrequently changing config and reuse SDK clients across warm Lambda invocations rather than recreating them per request.
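One way to cache infrequently changing config is a small time-to-live wrapper, sketched below. `TTLCache` is a hypothetical helper, the 60-second default is an assumed value, and `fetch` stands for whatever function actually reads the config.

```python
import time


class TTLCache:
    """Cache one fetched value and refresh it after ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache miss or expired entry: fetch again and restart the timer.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

In a Lambda function, creating the cache (and the SDK client it uses) at module scope rather than inside the handler lets warm invocations reuse both.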
Frequent exam traps: retrying a non-retryable 4xx like AccessDenied; forgetting pagination and processing only the first page; assuming SDK retries make code idempotent (they do not — you must design idempotency); hard-coding access keys instead of using a role; and choosing CloudFormation or a redeploy when AppConfig's runtime feature flags and gradual rollout are the intended answer.
It also helps to remember the credential-precedence order under pressure: explicit credentials passed in code win first, then environment variables, then the shared profile files, and only then the instance/container role — so a stale AWS_ACCESS_KEY_ID in the environment can silently shadow the correct role and cause confusing AccessDenied errors. For cross-account access, the resilient pattern is sts:AssumeRole to obtain temporary credentials scoped to the target account, never sharing long-lived keys between accounts.
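The shadowing problem is easier to see in a toy model of the precedence just described. `credential_source` is a hypothetical function that only models the ordering; it is not how any SDK resolves credentials internally, and the exact order varies slightly by SDK.

```python
def credential_source(explicit_keys=False, env=None, has_shared_profile=False, has_role=False):
    """Return which source wins under the common precedence order (toy model)."""
    env = env or {}
    if explicit_keys:
        return "explicit"
    if "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env:
        return "environment"
    if has_shared_profile:
        return "shared-files"
    if has_role:
        return "role"
    return None  # nothing found: a real SDK would raise a credentials error


# A stale key pair in the environment wins over the attached role.
stale_env = {"AWS_ACCESS_KEY_ID": "stale", "AWS_SECRET_ACCESS_KEY": "stale"}
winner = credential_source(env=stale_env, has_role=True)
```

Here `winner` is "environment", not "role", which is why clearing leftover AWS_ACCESS_KEY_ID values is a sensible first check when a role-based workload reports unexpected AccessDenied errors.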
When retries are tuned, the adaptive retry mode adds client-side rate limiting on top of standard backoff, which is useful for clients that frequently hit throttling, while the standard mode is the safe default for most applications. Throttled requests should be measured, not just retried blindly: surfacing a ThrottlingException rate metric tells you whether to request a quota increase, add caching, or batch calls rather than simply hammering the API harder with more retries.
A Lambda function lists thousands of objects but its code only ever sees the first ~1,000 results, and the list API response includes a NextToken. What is the bug?
A payment handler is invoked from an SQS queue with at-least-once delivery, so the same message can arrive twice. How do you prevent double charges?
A team wants to toggle a new feature on for 10% of users and automatically roll back if error rates spike, all without redeploying code. Which AWS service is designed for this?