A payments system must process transactions exactly once and in submission order per account, while still allowing different accounts to be processed in parallel. How should SQS be configured?

Use a FIFO queue with the account ID as the Message Group ID. A FIFO queue provides exactly-once processing and in-order delivery, and using the account ID as the Message Group ID makes each account's stream strictly serial while letting separate accounts run concurrently. A single shared group would force everything into one global serial line, killing throughput; Standard queues guarantee neither exactly-once processing nor ordering.

Messages that hit a code bug are being redelivered indefinitely, blocking healthy messages behind them. What is the correct remedy?

Attach a Dead-Letter Queue with maxReceiveCount set to 3. This moves a repeatedly failing poison-pill message off the main queue after three attempts, so it stops blocking healthy traffic and can be investigated later. A longer visibility timeout, FIFO, or long polling does nothing to isolate a message that always fails.

A team sees their SQS bill dominated by ReceiveMessage requests, most of which return no messages. Which change reduces cost without losing messages?

Set WaitTimeSeconds to 20 to enable long polling. Each ReceiveMessage call then waits up to 20 seconds for a message instead of returning empty immediately, collapsing the flood of empty requests that drives the bill. Retention, FIFO, and deletion speed do not change the empty-poll request pattern.

Producers send bursts far faster than a downstream database can absorb, occasionally overwhelming it. Which SQS-based design protects the database?

Buffer in SQS and scale consumers on the ApproximateNumberOfMessagesVisible metric. Driving consumer Auto Scaling from queue depth is the backpressure pattern: SQS absorbs the burst and consumers process at a controlled rate, so the database is never flooded. Writing directly defeats the buffer, a single FIFO group throttles throughput, and oversizing the database wastes money during normal load.

7.2 Advanced SQS Patterns: DLQ, Backpressure, and FIFO

Key Takeaways

A Dead-Letter Queue (DLQ) captures messages that fail after maxReceiveCount delivery attempts, isolating poison-pill messages so one bad record cannot block the whole queue.
Standard queues give at-least-once delivery and best-effort ordering at nearly unlimited throughput; FIFO queues give exactly-once processing and strict ordering, capped by default at 300 messages per second (3,000 with batches of 10).
FIFO ordering is scoped to the Message Group ID: messages sharing a group are strictly serial, while different groups run in parallel, so the group ID is your parallelism key.
Long polling (WaitTimeSeconds up to 20 seconds) sharply reduces empty ReceiveMessage responses, cutting API request charges versus short polling.
Backpressure uses queue depth (ApproximateNumberOfMessagesVisible) to drive Auto Scaling of consumers, letting SQS absorb bursts so downstream systems are never overwhelmed.

Standard vs. FIFO: The First Decision

Every Amazon SQS question starts here. Standard queues deliver at-least-once (occasional duplicates) with best-effort ordering and effectively unlimited throughput. FIFO queues guarantee exactly-once processing and strict first-in-first-out ordering, but are capped at 300 API calls per second per action (send, receive, delete): 300 messages/second unbatched, or 3,000 messages/second with batches of 10 messages per call. High throughput mode for FIFO raises these limits further, with the exact quota depending on the Region.

Feature	Standard	FIFO
Ordering	Best-effort	Strict, per Message Group ID
Delivery	At-least-once (duplicates possible)	Exactly-once
Throughput	Nearly unlimited	300/s, or 3,000/s batched
Name suffix	none	must end in `.fifo`

Exam trap: "financial transactions, no duplicates, in order per account" is FIFO. "Decouple a high-volume image pipeline, occasional reprocessing is fine" is Standard. Do not choose FIFO just because ordering sounds nice; its throughput cap can disqualify it.
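
The throughput check behind that trap is simple arithmetic. The sketch below uses only the default limits quoted above (300 calls per second per action, batches of up to 10); the function names are illustrative, and high throughput mode is deliberately left out.

```python
FIFO_CALLS_PER_SECOND = 300  # default per-action limit quoted in the text
MAX_BATCH_SIZE = 10          # messages per batched call


def fifo_max_messages_per_second(batch_size: int = 1) -> int:
    """Messages per second a default FIFO queue can carry at a given batch size."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 10")
    return FIFO_CALLS_PER_SECOND * batch_size


def fits_fifo(required_messages_per_second: int, batch_size: int = MAX_BATCH_SIZE) -> bool:
    """True if the required rate stays within the default FIFO cap."""
    return required_messages_per_second <= fifo_max_messages_per_second(batch_size)
```

A workload needing 5,000 messages per second fails `fits_fifo` even with full batching, which is the signal to consider Standard or high throughput mode.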

Dead-Letter Queues and Redrive

A Dead-Letter Queue is an ordinary SQS queue you attach to a source queue via a redrive policy. The flow:

A message is received; the consumer fails to process it.
After the visibility timeout expires, the message becomes visible again.
Once the message has been received maxReceiveCount times (for example, 3) without being deleted, SQS moves it to the DLQ instead of redelivering it again.
You alarm on DLQ depth, fix the bug, then use redrive to move messages back to the source queue.
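
The flow above can be modeled in a few lines. This is a simplified in-memory sketch, not SQS itself: the `drain` and `handler` names are hypothetical, and visibility timeouts are reduced to "the message goes back to the end of the queue".

```python
from collections import deque


def drain(messages, handler, max_receive_count=3):
    """Simplified model of a source queue with a redrive policy.

    Returns (processed, dead_lettered) lists of message bodies.
    """
    queue = deque((body, 0) for body in messages)  # (body, receive_count)
    processed, dead_lettered = [], []
    while queue:
        body, receive_count = queue.popleft()
        receive_count += 1  # one more delivery attempt
        try:
            handler(body)
            processed.append(body)  # success: the consumer deletes the message
        except Exception:
            if receive_count >= max_receive_count:
                dead_lettered.append(body)  # retries exhausted: sidelined to the DLQ
            else:
                queue.append((body, receive_count))  # becomes visible again later
    return processed, dead_lettered


def handler(body):
    """Hypothetical consumer that always fails on one malformed record."""
    if body == "malformed":
        raise ValueError("cannot parse record")
```

Running `drain(["a", "malformed", "b"], handler)` processes the two healthy records and leaves the malformed one in the dead-letter list after three attempts.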

Setting	Guidance
maxReceiveCount	Typically 3–5; too low loses transient-failure retries
DLQ retention	Set longer than the source (up to 14 days) so you have time to investigate
Queue-type match	Standard source needs a Standard DLQ; FIFO source needs a FIFO DLQ
Alarm	CloudWatch alarm on ApproximateNumberOfMessagesVisible of the DLQ
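
As a rough sketch of the settings above, the redrive policy is a JSON string stored in the source queue's `RedrivePolicy` attribute, with the keys `deadLetterTargetArn` and `maxReceiveCount`. The helper names, the example ARN, and the `-dlq` naming convention below are assumptions for illustration only.

```python
import json


def redrive_policy_attributes(dlq_arn: str, max_receive_count: int = 3) -> dict:
    """Build the queue attribute that attaches a DLQ to a source queue."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    return {"RedrivePolicy": json.dumps(policy)}


def dlq_name_for(source_queue_name: str) -> str:
    """Pick a DLQ name of the same queue type (hypothetical '-dlq' convention)."""
    suffix = ".fifo"
    if source_queue_name.endswith(suffix):
        # A FIFO source needs a FIFO DLQ, so the name must still end in .fifo.
        return source_queue_name[: -len(suffix)] + "-dlq" + suffix
    return source_queue_name + "-dlq"
```

The returned dictionary has the shape expected by the `Attributes` parameter of SetQueueAttributes, so it could be passed to an SDK call unchanged.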

Worked example: A consumer crashes on a malformed record. Without a DLQ that one message (a "poison pill") is redelivered until its retention period expires, wasting consumer capacity and, on a FIFO queue, blocking the messages behind it in the same group. With maxReceiveCount of 3 and a DLQ, the bad record is sidelined after three tries and processing continues.

FIFO Internals: Group ID, Deduplication, Visibility

Message Group ID is the parallelism control. All messages sharing a group are processed strictly in order; messages in different groups process concurrently. Using a customer ID or device ID as the group ID gives per-entity ordering with cross-entity parallelism.

Goal	Group ID strategy	Result
Total global order	One shared group ID	Fully serial (slow)
Order per customer	Customer ID	Serial per customer, parallel across customers
Order per device	Device ID	Serial per device, parallel across devices
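
A minimal sketch of the "order per customer" row, assuming an account ID is used as the group ID. `MessageGroupId` and `MessageDeduplicationId` are real SendMessage parameters; the function names and sample values are hypothetical, and `lanes_by_group` only models the ordering guarantee rather than calling SQS.

```python
from collections import defaultdict


def fifo_send_params(queue_url, account_id, transaction_id, body):
    """Keyword arguments for a FIFO SendMessage call, grouped by account."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        "MessageGroupId": account_id,              # ordering and parallelism key
        "MessageDeduplicationId": transaction_id,  # idempotency key
    }


def lanes_by_group(messages):
    """Model of FIFO ordering: each group ID is its own strictly ordered lane."""
    lanes = defaultdict(list)
    for group_id, body in messages:  # messages arrive in send order
        lanes[group_id].append(body)
    return dict(lanes)
```

Each lane must be consumed in order, but separate lanes are independent, which is why a finer-grained group ID gives more parallelism.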

Deduplication prevents duplicate sends within a 5-minute window. Use content-based dedup (SQS hashes the body) when identical bodies mean duplicates, or supply an explicit MessageDeduplicationId when you control idempotency keys.
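
The deduplication window can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the service's implementation: the `DedupWindow` class is hypothetical, timestamps are plain seconds supplied by the caller, and content-based deduplication is represented by a SHA-256 hash of the body.

```python
import hashlib
from typing import Optional

DEDUP_WINDOW_SECONDS = 5 * 60  # the 5-minute window described above


class DedupWindow:
    """Toy model of FIFO send deduplication."""

    def __init__(self):
        self._accepted_at = {}  # deduplication ID -> time the message was accepted

    def accept(self, body: str, now: float, dedup_id: Optional[str] = None) -> bool:
        """Return True if the send is accepted, False if treated as a duplicate."""
        # Content-based dedup: derive the ID from the body when none is supplied.
        key = dedup_id or hashlib.sha256(body.encode("utf-8")).hexdigest()
        accepted_at = self._accepted_at.get(key)
        if accepted_at is not None and now - accepted_at < DEDUP_WINDOW_SECONDS:
            return False  # same ID inside the window: dropped as a duplicate
        self._accepted_at[key] = now
        return True
```

Note the trade-off the model makes visible: with content-based deduplication, two legitimately distinct messages with identical bodies inside the window are collapsed into one, which is when an explicit ID is the safer choice.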

Visibility timeout is the silent culprit behind duplicate processing on Standard queues: if processing takes longer than the timeout, the message reappears and a second consumer grabs it. Tune the timeout above your worst-case processing time, or call ChangeMessageVisibility to extend it mid-flight.
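
One way to extend the timeout mid-flight is a "heartbeat" before each unit of work. The sketch below assumes a boto3-style client passed in by the caller (`change_message_visibility` and `delete_message` are the boto3 method names); `process_with_heartbeat`, the `RecordingClient` stand-in, and the queue URL are hypothetical and exist only so the example runs without AWS access.

```python
class RecordingClient:
    """Stand-in for an SQS client that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def change_message_visibility(self, **kwargs):
        self.calls.append(("change_message_visibility", kwargs))

    def delete_message(self, **kwargs):
        self.calls.append(("delete_message", kwargs))


def process_with_heartbeat(sqs, queue_url, message, steps, extend_to_seconds=60):
    """Run each step, resetting the visibility timeout first, then delete."""
    receipt_handle = message["ReceiptHandle"]
    for step in steps:
        # Keep the message hidden for extend_to_seconds from this moment.
        sqs.change_message_visibility(
            QueueUrl=queue_url,
            ReceiptHandle=receipt_handle,
            VisibilityTimeout=extend_to_seconds,
        )
        step(message["Body"])
    # Delete only after every step succeeded, so a crash leads to redelivery.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt_handle)


client = RecordingClient()
seen = []
message = {"ReceiptHandle": "rh-1", "Body": "order-42"}
process_with_heartbeat(client, "https://example.invalid/queue", message, [seen.append, seen.append])
```

Even with heartbeats, Standard queues remain at-least-once, so the consumer should still be idempotent.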

Long Polling and Backpressure

Polling type	Behavior	Cost
Short polling	Returns immediately, often empty	Higher API charges
Long polling (WaitTimeSeconds 1–20)	Waits up to 20 s for a message	Lower, fewer empty calls
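
To see why the cost differs, the sketch below builds ReceiveMessage parameters for long polling and counts the receive calls one consumer makes per day against an empty queue. The one-second short-poll loop interval is an assumption chosen for illustration, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def long_poll_params(queue_url: str, wait_seconds: int = 20) -> dict:
    """Keyword arguments for a ReceiveMessage call that uses long polling."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("WaitTimeSeconds must be between 0 and 20")
    return {
        "QueueUrl": queue_url,
        "MaxNumberOfMessages": 10,  # fetch up to a full batch per call
        "WaitTimeSeconds": wait_seconds,
    }


def idle_receives_per_day(seconds_per_call: int) -> int:
    """Receive calls per day for one consumer polling an empty queue."""
    return SECONDS_PER_DAY // seconds_per_call
```

Under these assumptions an idle consumer drops from 86,400 calls per day (one call per second) to 4,320 (one call per 20 seconds), a 20-fold reduction per consumer.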

Backpressure decouples producer rate from consumer rate: producers push freely, SQS buffers the surge, and an Auto Scaling policy on the ApproximateNumberOfMessagesVisible metric adds consumers as the backlog grows and removes them as it drains, so a downstream database is never flooded.
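
The scaling decision can be sketched as a pure function of queue depth. The target backlog per consumer and the minimum and maximum bounds are hypothetical tuning values; in practice this logic would live in an Auto Scaling policy rather than application code.

```python
import math


def desired_consumers(visible_messages: int,
                      target_backlog_per_consumer: int,
                      min_consumers: int = 1,
                      max_consumers: int = 20) -> int:
    """Consumer count that keeps the backlog per consumer near the target."""
    needed = math.ceil(visible_messages / target_backlog_per_consumer)
    # The upper bound is what protects the database: however deep the queue
    # gets, write concurrency never exceeds max_consumers.
    return max(min_consumers, min(max_consumers, needed))
```

With a target of 100 messages per consumer, a backlog of 950 asks for 10 consumers, while a backlog of 10,000 is clamped to the maximum of 20 and the remainder simply waits in the queue.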

Delivery Delays, Retention, and Choosing SNS, SQS, or EventBridge

Four timing settings frequently appear in answer choices, and confusing them is a classic distractor.

Setting	Range	What it does
Visibility timeout	0 s to 12 hours (default 30 s)	Hides a received message from other consumers while it is processed
Message retention	1 minute to 14 days (default 4 days)	How long an unprocessed message survives in the queue
Delivery delay	0 to 15 minutes	Postpones first delivery of a new message
WaitTimeSeconds	0 to 20 seconds	Long-poll wait per receive call
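
The ranges in the table can be encoded as a small validator that produces queue attributes. `VisibilityTimeout`, `MessageRetentionPeriod`, `DelaySeconds`, and `ReceiveMessageWaitTimeSeconds` are the SQS queue attribute names (all expressed in seconds and passed as strings); the `queue_attributes` helper itself is a hypothetical sketch.

```python
def queue_attributes(visibility_timeout_s: int = 30,
                     retention_s: int = 4 * 24 * 3600,
                     delay_s: int = 0,
                     wait_time_s: int = 0) -> dict:
    """Validate the four timing settings and return them as queue attributes."""
    limits = {
        "VisibilityTimeout": (visibility_timeout_s, 0, 12 * 3600),
        "MessageRetentionPeriod": (retention_s, 60, 14 * 24 * 3600),
        "DelaySeconds": (delay_s, 0, 15 * 60),
        "ReceiveMessageWaitTimeSeconds": (wait_time_s, 0, 20),
    }
    attributes = {}
    for name, (value, low, high) in limits.items():
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high} seconds, got {value}")
        attributes[name] = str(value)  # SQS attribute values are strings
    return attributes
```

For example, the 14-day maximum retention works out to 14 × 24 × 3,600 = 1,209,600 seconds.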

A delay queue (a queue-level delivery delay) is the right answer for "hold every message 5 minutes before processing," whereas a delay on an individual message uses a message timer (the DelaySeconds parameter on SendMessage, available on Standard queues only). Do not confuse delay with visibility timeout: delay applies before first delivery; visibility timeout applies after a message has been received.
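
The difference is easiest to see on a timeline. The sketch below uses plain seconds and hypothetical function names; it assumes a Standard queue, where a per-message delay overrides the queue-level delay.

```python
from typing import Optional


def first_visible_at(sent_at: int, queue_delay_s: int,
                     message_delay_s: Optional[int] = None) -> int:
    """Earliest time a newly sent message can be delivered."""
    # A per-message timer, when present, takes precedence over the queue delay.
    delay = queue_delay_s if message_delay_s is None else message_delay_s
    return sent_at + delay


def hidden_until(received_at: int, visibility_timeout_s: int) -> int:
    """Time at which an undeleted, already received message becomes visible again."""
    return received_at + visibility_timeout_s
```

A message sent at t=0 to a queue with a 300-second delay is first deliverable at t=300; if it is received at t=310 with a 30-second visibility timeout and never deleted, it reappears at t=340.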

Service-selection trap: SQS, SNS, and EventBridge all "decouple," so read the verb. "Buffer and process at the consumer's pace, possibly with many retries" is SQS. "Notify multiple subscribers instantly" is SNS. "Route different event types to different targets based on content rules, including SaaS sources" is EventBridge. For a message larger than the 256 KB SQS maximum, use the SQS Extended Client, which stores the payload in S3 and passes a pointer, rather than splitting the message.

Worked example: An order pipeline must guarantee no order is lost even if the processor is down for an hour. Set the queue retention to 14 days and let Auto Scaling restart consumers; SQS durably retains the backlog and processing resumes with zero data loss once capacity returns.
