Maximize Event-Driven Architectures with AWS SQS and SNS: Scalable Messaging in Practice
Designing a distributed system that absorbs bursty workloads without message loss or unintended duplication is rarely straightforward—especially at scale. While AWS SQS (queue) and SNS (notification) are often deployed separately, integrating the two enables architectures that withstand transactional spikes and cascades of events with minimal coupling between producers and consumers.
Below: the canonical pattern for fan-out messaging and resilient ingestion—SNS topic publishes to multiple SQS queues, each serviced by a separate consumer.
┌────────────┐      publish      ┌─────────┐      push      ┌─────────┐
│ Producer A │ ────────────────▶ │   SNS   │ ─────────────▶ │ SQS Q1  │ ──▶ Worker 1
└────────────┘                   │  Topic  │                └─────────┘
                                 └────┬────┘      push      ┌─────────┐
                                      └───────────────────▶ │ SQS Q2  │ ──▶ Worker 2
                                                            └─────────┘
SQS handles temporal decoupling and backpressure; SNS handles fan-out and topic-based routing. No custom HTTP delivery or multiplexing logic is needed: each consumer simply polls its own queue.
Typical Scenario: E-commerce Order Eventing
Consider an order platform generating OrderPlaced events. Downstream, inventory and shipping services require these events, each with individual processing logic and reliability guarantees.
Instead of hand-rolling message distribution, SNS fans out notifications asynchronously to dedicated SQS queues per consumer:
- SNS Topic: order-events
- SQS Queues: inventory-events, shipping-events
- Both consumers process at their own rate; spikes are buffered independently.
Implementation Steps
1. Provision an SNS Topic
- Console: SNS → Create topic → Standard (order-events)
- Avoid FIFO unless you strictly require ordering and deduplication; standard topics suffice for most use cases.
2. Create SQS Queues
- Console: SQS → Create queue → Standard (inventory-events, shipping-events)
- FIFO queues require message group IDs; don’t start there unless absolutely necessary.
- Configure VisibilityTimeout carefully: too short, and messages are redelivered while a worker is still processing them; too long, and messages held by a crashed worker stay invisible until the timeout expires.
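The queue setup can also be scripted. The following is a minimal sketch, assuming a boto3 SQS client is passed in as `sqs`; the helper names and the `safety_factor` default are illustrative assumptions, not AWS recommendations. It derives the visibility timeout from the slowest expected processing time and clamps it to the SQS maximum of 12 hours.

```python
import math

SQS_MAX_VISIBILITY_TIMEOUT_S = 43_200  # SQS upper bound: 12 hours


def pick_visibility_timeout(max_processing_s: float, safety_factor: float = 2.0) -> int:
    """Return a timeout covering the slowest expected run plus headroom.

    `safety_factor` is a hypothetical tuning knob, not an AWS default.
    """
    timeout = math.ceil(max_processing_s * safety_factor)
    return max(1, min(timeout, SQS_MAX_VISIBILITY_TIMEOUT_S))


def create_consumer_queue(sqs, name: str, max_processing_s: float) -> str:
    """Create a standard queue and return its URL. `sqs` is a boto3 SQS client."""
    resp = sqs.create_queue(
        QueueName=name,
        # SQS expects attribute values as strings.
        Attributes={"VisibilityTimeout": str(pick_visibility_timeout(max_processing_s))},
    )
    return resp["QueueUrl"]
```

With a real client this would be called as create_consumer_queue(boto3.client('sqs'), 'inventory-events', 15), which requests a 30-second visibility timeout.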
3. Configure SQS Queue Access Policies
SNS must be authorized to publish to the target SQS. Attach a resource policy (example below uses a dedicated SNS topic ARN):
{
  "Version": "2012-10-17",
  "Id": "sns-to-sqs-policy",
  "Statement": [
    {
      "Sid": "OnlyAllowSNSSend",
      "Effect": "Allow",
      "Principal": { "Service": "sns.amazonaws.com" },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:us-east-1:111122223333:inventory-events",
      "Condition": {
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:sns:us-east-1:111122223333:order-events"
        }
      }
    }
  ]
}
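Omitting this step does not fail the publish call itself: SNS accepts the message, but delivery to the queue is denied and the message never arrives. If messages go missing, check the topic's NumberOfNotificationsFailed metric in CloudWatch or enable SNS delivery status logging.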
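Since every subscribed queue needs the same policy with different ARNs, it can help to generate it. A minimal sketch, assuming a boto3 SQS client is passed in as `sqs`; the function names are illustrative:

```python
import json


def build_sns_to_sqs_policy(queue_arn: str, topic_arn: str) -> dict:
    """Resource policy letting one SNS topic send messages to one SQS queue."""
    return {
        "Version": "2012-10-17",
        "Id": "sns-to-sqs-policy",
        "Statement": [
            {
                "Sid": "OnlyAllowSNSSend",
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


def attach_policy(sqs, queue_url: str, queue_arn: str, topic_arn: str) -> None:
    """Apply the policy to the queue. `sqs` is a boto3 SQS client."""
    policy = build_sns_to_sqs_policy(queue_arn, topic_arn)
    # The Policy attribute must be a JSON string, not a dict.
    sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)})
```

Keeping the policy builder separate from the API call makes it easy to unit-test the generated document without touching AWS.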
4. Subscribe SQS Queues to the SNS Topic
- SNS → order-events topic → Create subscription
- Protocol: SQS
- Endpoint: SQS queue ARN
This creates the “fan-out” wiring with minimal manual overhead.
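The same subscription can be created through the API. A minimal sketch, assuming a boto3 SNS client is passed in as `sns`; the helper names and the `raw_delivery` flag are illustrative:

```python
def subscribe_queue(sns, topic_arn: str, queue_arn: str, raw_delivery: bool = False) -> str:
    """Subscribe one SQS queue to a topic and return the subscription ARN."""
    resp = sns.subscribe(
        TopicArn=topic_arn,
        Protocol="sqs",
        Endpoint=queue_arn,
        # Subscription attribute values are strings.
        Attributes={"RawMessageDelivery": "true" if raw_delivery else "false"},
        ReturnSubscriptionArn=True,
    )
    return resp["SubscriptionArn"]


def fan_out(sns, topic_arn: str, queue_arns: list) -> dict:
    """Subscribe every queue in `queue_arns`; map queue ARN -> subscription ARN."""
    return {arn: subscribe_queue(sns, topic_arn, arn) for arn in queue_arns}
```

Calling fan_out with the inventory-events and shipping-events queue ARNs wires both consumers to order-events in one pass. The queue access policy from step 3 still has to be in place for deliveries to succeed.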
Practical Examples
Orders Publish Event via SNS (Python 3.11 + boto3 ≥1.23.0)
import json

import boto3

sns = boto3.client('sns')
topic_arn = 'arn:aws:sns:us-east-1:111122223333:order-events'

event = {
    'order_id': 'A123',
    'status': 'PLACED',
    'line_items': [{'sku': 'PEN', 'qty': 10}]
}

# The payload travels as a JSON string; eventType is sent as a message
# attribute so subscriptions can filter on it without parsing the body.
sns.publish(
    TopicArn=topic_arn,
    Message=json.dumps(event),
    MessageAttributes={
        'eventType': {'DataType': 'String', 'StringValue': 'OrderPlaced'}
    }
)
Plain JSON: no schema registry here. If you require schemas, validate payloads at the consumer level, or consider EventBridge, whose schema registry covers schema discovery and code bindings.
Consumer Reads and Processes SQS Messages—Note SNS Envelope Wrapping
import json

import boto3

sqs = boto3.client('sqs')
queue_url = 'https://sqs.us-east-1.amazonaws.com/111122223333/inventory-events'

# Long-poll for up to 10 seconds, fetching at most 5 messages per call
resp = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=5,
    WaitTimeSeconds=10
)

# 'Messages' is absent from the response when the queue is empty
messages = resp.get('Messages', [])
if not messages:
    print("No messages to process.")

for msg in messages:
    try:
        # Outer JSON is the SNS envelope; the producer's payload is a
        # JSON string under its 'Message' key
        sns_body = json.loads(msg['Body'])
        payload = json.loads(sns_body['Message'])
        # Business logic here
        print("Processing order:", payload.get('order_id'))
        # Delete only after successful processing
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg['ReceiptHandle'])
    except Exception as err:
        print("Processing failed:", err)
        # Not deleted: the message becomes visible again once
        # VisibilityTimeout expires and will be retried

# Sample error output:
# Processing failed: Expecting value: line 1 column 1 (char 0)
Beware: SNS-to-SQS delivery wraps payloads in a JSON envelope, and the actual message is a string under ["Message"], unless raw message delivery is enabled on the subscription. Gotcha for first-time implementers.
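A consumer that may see both wrapped and raw bodies can normalize them first. A minimal sketch; the function name `extract_payload` is illustrative, and the check is a heuristic that assumes producers never send a payload that itself has "Type": "Notification" plus a "Message" key:

```python
import json


def extract_payload(body: str) -> dict:
    """Return the producer's JSON payload from an SQS message body.

    Handles both the SNS envelope and raw message delivery.
    """
    parsed = json.loads(body)
    # SNS envelopes carry Type == "Notification" and the payload as a
    # JSON-encoded string under "Message".
    if (
        isinstance(parsed, dict)
        and parsed.get("Type") == "Notification"
        and "Message" in parsed
    ):
        return json.loads(parsed["Message"])
    # Raw delivery: the body already is the producer's payload.
    return parsed
```

In the consumer loop above, the two json.loads calls would collapse into a single extract_payload(msg['Body']) call.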
Advanced Features & Tips
Dead-Letter Queues (DLQ):
Always attach DLQs to your SQS queues. This ensures poisoned messages (exceeded max receive count) don’t block the queue. The trade-off: silent message drops unless monitored. Pull DLQ metrics or set up alarms—it's easy to neglect.
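A DLQ is attached through the source queue's RedrivePolicy attribute. A minimal sketch, assuming a boto3 SQS client is passed in as `sqs` and that the DLQ already exists; the helper names and the default of 5 receives are illustrative choices:

```python
import json


def build_redrive_attributes(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Build the queue attributes that route repeatedly failing messages to a DLQ."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    redrive = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    # RedrivePolicy is itself a JSON document passed as a string.
    return {"RedrivePolicy": json.dumps(redrive)}


def attach_dlq(sqs, queue_url: str, dlq_arn: str, max_receive_count: int = 5) -> None:
    """Attach the DLQ to the source queue. `sqs` is a boto3 SQS client."""
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=build_redrive_attributes(dlq_arn, max_receive_count),
    )


def dlq_depth(sqs, dlq_url: str) -> int:
    """Approximate number of messages waiting in the DLQ, for a simple health check."""
    resp = sqs.get_queue_attributes(
        QueueUrl=dlq_url, AttributeNames=["ApproximateNumberOfMessages"]
    )
    return int(resp["Attributes"]["ApproximateNumberOfMessages"])
```

A scheduled job that alerts when dlq_depth returns a nonzero value is one low-effort way to avoid the silent-drop problem; a CloudWatch alarm on the same metric is the managed alternative.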
SNS Filtering:
For large topics, filter subscriptions by message attributes. Example:
{
  "eventType": ["OrderPlaced", "OrderCancelled"]
}
This ensures each SQS queue receives only relevant event types. Noise reduction is significant for high-volume domains.
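A filter policy is set as a subscription attribute. The sketch below assumes a boto3 SNS client is passed in as `sns`; `matches_exact` is a hypothetical local helper that approximates only exact string matching (SNS supports more operators), intended for unit-testing routing expectations rather than reproducing the service's behavior.

```python
import json


def apply_filter_policy(sns, subscription_arn: str, policy: dict) -> None:
    """Attach a filter policy to an existing subscription."""
    sns.set_subscription_attributes(
        SubscriptionArn=subscription_arn,
        AttributeName="FilterPolicy",
        AttributeValue=json.dumps(policy),  # the policy is passed as a JSON string
    )


def matches_exact(policy: dict, message_attributes: dict) -> bool:
    """Local approximation: every policy key must be present with an allowed string value."""
    for key, allowed in policy.items():
        attr = message_attributes.get(key)
        if attr is None or attr.get("StringValue") not in allowed:
            return False
    return True
```

With the policy shown above, a message published with eventType set to OrderPlaced passes the local check, while one with any other eventType, or with the attribute missing, does not.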
FIFO Patterns:
Need guaranteed ordering? SNS FIFO topics (available since late 2020) paired with SQS FIFO queues support strict ordering within a message group, but configuring message group IDs and deduplication adds complexity. Confirm support for your SDK or IaC toolchain before attempting this path.
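To make the extra FIFO parameters concrete, here is a minimal sketch that builds the arguments for a FIFO publish. Grouping by `order_id` and hashing the body for deduplication are assumptions chosen for illustration; the right grouping key depends on which events must stay ordered relative to each other.

```python
import hashlib
import json


def build_fifo_publish_kwargs(topic_arn: str, event: dict) -> dict:
    """Build keyword arguments for sns.publish() against a FIFO topic."""
    if not topic_arn.endswith(".fifo"):
        raise ValueError("FIFO topic names must end in .fifo")
    # sort_keys makes the serialized body, and so the hash, stable across key order.
    body = json.dumps(event, sort_keys=True)
    return {
        "TopicArn": topic_arn,
        "Message": body,
        # Messages sharing a group ID are delivered in order relative to each other.
        "MessageGroupId": str(event["order_id"]),
        # Identical bodies within the deduplication window are treated as duplicates.
        "MessageDeduplicationId": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }
```

Usage would be sns.publish(**build_fifo_publish_kwargs(fifo_topic_arn, event)). Note that a content hash also suppresses legitimately repeated events with identical bodies, so an explicit event ID may be a better deduplication key.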
Latency and Throughput:
Observed: SNS-to-SQS delivery adds negligible delay (<200ms in us-east-1 under typical conditions). Under sustained load (tens of thousands of messages/min), standard queues with long polling remain stable; sporadic throttling, if any, will surface as ThrottlingException in logs. Typical pitfall: Lambda triggers on SQS start with 5 concurrent batches and scale up from there as the backlog grows. Tune accordingly, for example with reserved concurrency or the event source mapping's maximum concurrency setting.
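One way to tune Lambda's SQS polling is to cap concurrency on the event source mapping. A minimal sketch, assuming a boto3 Lambda client is passed in as `lambda_client` and that the mapping UUID is already known; the 2 to 1000 bounds reflect my understanding of the documented limits and should be checked against the current Lambda documentation.

```python
def cap_sqs_trigger_concurrency(lambda_client, mapping_uuid: str, max_concurrency: int) -> dict:
    """Limit how many concurrent invocations one SQS event source may drive."""
    # Assumed service bounds for MaximumConcurrency; verify before relying on them.
    if not 2 <= max_concurrency <= 1000:
        raise ValueError("max_concurrency must be between 2 and 1000")
    return lambda_client.update_event_source_mapping(
        UUID=mapping_uuid,
        ScalingConfig={"MaximumConcurrency": max_concurrency},
    )
```

Capping at the mapping level protects downstream dependencies such as databases without throttling the function's other triggers.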
Known Issues & Real-World Quirks
- Cross-account setups require extra configuration: both sides (SNS topic owner's and SQS queue owner's) policies must permit interactions. Missed this? Expect silent message nondelivery.
- Message size limit is 256KB. For larger payloads, use S3 pointer patterns—a classic workaround, albeit with added operational burden.
- SQS message retention defaults to 4 days and can be raised to a maximum of 14 days. Missed processing windows are rare, but can happen during maintenance; size the retention period to cover your expected worst-case consumer downtime.
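The S3 pointer workaround mentioned in the list above can be sketched as follows, assuming a boto3 S3 client is passed in as `s3`. The `s3_pointer` key, the `events/` prefix, and the function names are hypothetical conventions that producer and consumer would need to agree on.

```python
import json
import uuid

SNS_MAX_BYTES = 256 * 1024  # size limit cited in the text


def prepare_message(s3, bucket: str, event: dict, limit: int = SNS_MAX_BYTES) -> str:
    """Return the message body to publish, offloading oversized payloads to S3."""
    body = json.dumps(event)
    # Message attributes also count toward the limit, so leave headroom in practice.
    if len(body.encode("utf-8")) <= limit:
        return body
    key = f"events/{uuid.uuid4()}.json"
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return json.dumps({"s3_pointer": {"bucket": bucket, "key": key}})


def resolve_message(s3, message: str) -> dict:
    """Consumer side: follow the pointer if present, else return the inline payload."""
    parsed = json.loads(message)
    pointer = parsed.get("s3_pointer") if isinstance(parsed, dict) else None
    if pointer is None:
        return parsed
    obj = s3.get_object(Bucket=pointer["bucket"], Key=pointer["key"])
    return json.loads(obj["Body"].read())
```

The operational burden comes from the parts this sketch leaves out: consumers need read access to the bucket, and the objects need a lifecycle rule at least as long as the queue's retention period.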
Summary & Next Steps
Pairing SNS with SQS establishes a proven, scalable pattern for event-driven AWS architectures: high fan-out, loose coupling, and independent buffering per consumer are all achieved without exotic components. Standard topics and queues deliver at least once, so consumers should be idempotent; exactly-once processing requires the FIFO variants. This workflow is a foundation of reliable asynchronous processing in production systems I’ve supported (retail, logistics, analytics).
Terraform / CloudFormation?
Automate these links with resource blocks—wiring up SNS subscriptions to SQS via IaC avoids manual misconfiguration. Multi-region? SNS topics don’t natively span regions; use a cross-region replication strategy for global pub/sub.
Other teams implement with EventBridge or custom brokers, but SNS→SQS remains a low-ops default for most use cases.
Questions on specific IAM setups or IaC patterns? Drop them; there are edge cases not covered here, especially when integrating with legacy stacks or hybrid clouds.