3.3 AWS Database Services
Key Takeaways
- Amazon RDS is a managed relational database service supporting MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora.
- Amazon DynamoDB is a fully managed NoSQL key-value database with single-digit millisecond latency at any scale.
- Amazon Aurora is a cloud-native relational database compatible with MySQL and PostgreSQL; AWS cites up to 5x the throughput of standard MySQL.
- Amazon Redshift is a cloud data warehouse designed for analytical queries (OLAP) across petabytes of structured data.
- Amazon ElastiCache provides in-memory caching (Redis or Memcached) for sub-millisecond response times.
Quick Answer: AWS offers purpose-built databases for different needs: RDS/Aurora for relational data, DynamoDB for NoSQL key-value, Redshift for analytics/warehousing, ElastiCache for in-memory caching, and DocumentDB for document databases. Choose based on data model and access pattern.
Relational Databases
Amazon RDS (Relational Database Service)
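Amazon RDS is a managed service for setting up, operating, and scaling relational databases in the cloud. AWS handles the infrastructure and routine administration; you keep control of the data and how it is queried.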
Supported engines:
| Engine | Description |
|---|---|
| Amazon Aurora | Cloud-native, MySQL/PostgreSQL-compatible |
| MySQL | Open-source relational database |
| PostgreSQL | Advanced open-source relational database |
| MariaDB | Community-developed fork of MySQL |
| Oracle | Enterprise relational database |
| SQL Server | Microsoft relational database |
What RDS manages for you:
- Hardware provisioning and OS patching
- Database engine patching
- Automated backups and point-in-time recovery
- Multi-AZ deployment for high availability
- Read replicas for read scaling
- Monitoring and metrics via CloudWatch
What you manage:
- Database schema design
- Query optimization
- IAM and database user management
- Choosing the instance size and storage type
- Encryption configuration
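The "read replicas for read scaling" item above is easiest to picture from the application side: writes go to the primary endpoint, and reads are spread across replicas. The following is a minimal sketch of that idea, not an RDS API. The `EndpointRouter` class and the hostnames are hypothetical, and a real application would also have to allow for replicas lagging slightly behind the primary, because RDS read replicas replicate asynchronously.

```python
import itertools


class EndpointRouter:
    """Route writes to the primary endpoint and reads across replica endpoints."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Treat plain SELECT statements as reads; everything else goes to the primary.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical hostnames, used only for illustration.
router = EndpointRouter(
    primary="primary.example.internal",
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
```

Calling `router.endpoint_for(...)` with an `INSERT` statement returns the primary, while successive `SELECT` statements rotate through the replicas in round-robin order.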
Amazon Aurora
Amazon Aurora is AWS's cloud-native relational database, compatible with MySQL and PostgreSQL.
| Feature | Detail |
|---|---|
| Performance | Up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL (AWS figures) |
| Durability | 6 copies of data across 3 AZs |
| Auto-scaling storage | Grows automatically from 10 GB up to 128 TB |
| High Availability | Automated failover; up to 15 read replicas |
| Aurora Serverless | Auto-scales compute capacity based on demand |
On the Exam: If a question mentions "cloud-native relational database" or "MySQL/PostgreSQL compatible with better performance," the answer is Aurora.
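To make the "6 copies of data across 3 AZs" row concrete, the sketch below counts how many copies survive a failure and checks them against quorum sizes. The quorum values (4 of 6 for writes, 3 of 6 for reads) come from AWS's published description of Aurora's storage design rather than from this guide, so treat them as an assumption, and the function names are hypothetical.

```python
COPIES_PER_AZ = 2   # 6 copies spread evenly over 3 Availability Zones
AZ_COUNT = 3
WRITE_QUORUM = 4    # assumption: taken from AWS's Aurora design description
READ_QUORUM = 3     # assumption: taken from AWS's Aurora design description


def surviving_copies(failed_azs=0, extra_failed_copies=0):
    """Return how many data copies remain after the given failures."""
    total = COPIES_PER_AZ * AZ_COUNT
    lost = failed_azs * COPIES_PER_AZ + extra_failed_copies
    return max(total - lost, 0)


def can_write(copies):
    """Writes need a write quorum of healthy copies."""
    return copies >= WRITE_QUORUM


def can_read(copies):
    """Reads need a read quorum of healthy copies."""
    return copies >= READ_QUORUM
```

Under these assumptions, losing one whole AZ leaves 4 copies, which is still enough to write; losing an AZ plus one more copy leaves 3, which is enough to read but not to write.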
NoSQL Databases
Amazon DynamoDB
Amazon DynamoDB is a fully managed NoSQL database that delivers single-digit millisecond latency at any scale.
| Feature | Detail |
|---|---|
| Data Model | Key-value and document |
| Performance | Single-digit millisecond response at any scale |
| Scaling | Automatic scaling; handles 10+ trillion requests per day |
| Serverless | No servers to manage, patch, or maintain |
| Global Tables | Multi-Region, multi-active replication |
| DAX | DynamoDB Accelerator for microsecond reads (in-memory cache) |
| Pricing | Pay-per-request or provisioned capacity |
When to use DynamoDB:
- High-traffic web applications needing consistent low latency
- Mobile and gaming backends
- IoT data storage
- Session management
- Shopping carts
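The shopping-cart use case shows the key-value access pattern well: each item is addressed by a partition key (the user) plus a sort key (the product), and all lookups go through those keys. The class below is an in-memory stand-in written for illustration only. It is not the DynamoDB SDK, and the method names merely echo DynamoDB's PutItem, GetItem, and Query operations.

```python
class KeyValueTable:
    """In-memory stand-in for a table addressed by partition key and sort key."""

    def __init__(self):
        self._items = {}

    def put_item(self, partition_key, sort_key, attributes):
        # The (partition key, sort key) pair uniquely identifies an item.
        self._items[(partition_key, sort_key)] = dict(attributes)

    def get_item(self, partition_key, sort_key):
        # Direct lookup by full key; returns None when the item is absent.
        return self._items.get((partition_key, sort_key))

    def query(self, partition_key):
        # Return every item in one partition, ordered by sort key.
        matches = sorted(self._items.items(), key=lambda kv: kv[0])
        return [
            {"sort_key": sk, **attrs}
            for (pk, sk), attrs in matches
            if pk == partition_key
        ]


# Hypothetical cart data: one partition per user, one item per product.
carts = KeyValueTable()
carts.put_item("user#42", "sku#B", {"qty": 1})
carts.put_item("user#42", "sku#A", {"qty": 2})
carts.put_item("user#7", "sku#A", {"qty": 5})
```

Fetching a whole cart is a single `carts.query("user#42")`, which is the kind of narrow, key-driven access that suits a key-value store; ad hoc joins across users would not fit this model.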
On the Exam: DynamoDB = NoSQL, serverless, single-digit millisecond latency, key-value/document store. If a question describes a NoSQL or key-value database need, DynamoDB is usually the answer.
Analytical Databases
Amazon Redshift
Amazon Redshift is a cloud data warehouse designed for analytical queries (OLAP) across large datasets.
| Feature | Detail |
|---|---|
| Type | Columnar data warehouse |
| Scaling | Petabyte-scale |
| Performance | Up to 10x the query performance of traditional data warehouses (AWS figure) |
| Redshift Serverless | Analyze data without managing clusters |
| Redshift Spectrum | Query data directly in S3 without loading |
When to use Redshift:
- Business intelligence and reporting
- Complex analytical queries across large datasets
- Aggregation and summarization of historical data
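Why a columnar layout suits these workloads can be shown with a toy example: an aggregation over one or two fields only needs to read those columns, not every full row. This is a simplified sketch with made-up order data and hypothetical function names, and it says nothing about how Redshift stores data internally.

```python
# Made-up order records in row-oriented form (one dict per row).
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_by_region(cols):
    """Sum amounts per region, touching only the two columns involved."""
    totals = {}
    for region, amount in zip(cols["region"], cols["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


columns = to_columns(rows)
```

Here `total_by_region(columns)` never reads the `order_id` column at all, which is the basic reason column stores handle wide analytical scans efficiently.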
Caching
Amazon ElastiCache
Amazon ElastiCache provides in-memory caching to improve application performance.
| Engine | Use Case |
|---|---|
| Redis | Advanced data structures, replication, pub/sub, persistence |
| Memcached | Simple caching, multi-threaded, no persistence needed |
Use ElastiCache when:
- Database queries are repetitive and results can be cached
- Session data needs to be shared across application servers
- Sub-millisecond response times are required
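The first case above, repetitive queries whose results can be cached, is commonly handled with a cache-aside pattern: check the cache first, fall back to the database on a miss, then store the result with an expiry. The sketch below uses a plain dictionary in place of ElastiCache, and `CacheAside`, `load_from_db`, and the TTL value are all hypothetical names chosen for illustration.

```python
import time


class CacheAside:
    """Cache-aside lookup with a time-to-live, using a dict as the cache."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # function that runs the real query
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: skip the database
        value = self._load(key)        # cache miss or expired: query the database
        self._entries[key] = (now + self._ttl, value)
        return value
```

A caller wraps its query function once, for example `CacheAside(run_query, ttl_seconds=30.0)` where `run_query` is the application's own function, and then calls `get(key)` everywhere. The TTL bounds how stale a cached result can become.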
Other Database Services
| Service | Type | Use Case |
|---|---|---|
| Amazon DocumentDB | Document database (MongoDB-compatible) | Content management, catalogs, user profiles |
| Amazon Neptune | Graph database | Social networks, fraud detection, recommendation engines |
| Amazon Keyspaces | Wide-column (Cassandra-compatible) | Equipment maintenance, fleet management, route optimization |
| Amazon QLDB | Ledger database (immutable) | Financial transactions, supply chain, regulatory compliance |
| Amazon Timestream | Time-series database | IoT, DevOps monitoring, application metrics |
| Amazon MemoryDB for Redis | Redis-compatible durable in-memory database | Workloads that need in-memory speed with durable storage |
Database Selection Guide
| Requirement | Best Service |
|---|---|
| Relational data, complex queries | Amazon RDS or Aurora |
| High-speed key-value lookups | Amazon DynamoDB |
| Data warehousing and analytics | Amazon Redshift |
| In-memory caching | Amazon ElastiCache |
| Document storage (MongoDB workloads) | Amazon DocumentDB |
| Graph relationships | Amazon Neptune |
| Immutable, verifiable transaction log | Amazon QLDB |
| Time-series data (IoT sensors) | Amazon Timestream |
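The selection guide above is effectively a lookup table, and writing it as one can help with memorization. The mapping below restates the table; the short requirement keys and the `pick_service` function are hypothetical names introduced for this sketch.

```python
# Requirement keywords (hypothetical shorthand) mapped to the services in the guide.
SERVICE_BY_REQUIREMENT = {
    "relational": "Amazon RDS or Aurora",
    "key-value": "Amazon DynamoDB",
    "data-warehouse": "Amazon Redshift",
    "cache": "Amazon ElastiCache",
    "document": "Amazon DocumentDB",
    "graph": "Amazon Neptune",
    "ledger": "Amazon QLDB",
    "time-series": "Amazon Timestream",
}


def pick_service(requirement):
    """Return the service the guide recommends for a requirement keyword."""
    key = requirement.strip().lower()
    if key not in SERVICE_BY_REQUIREMENT:
        raise KeyError(f"no mapping for requirement: {requirement!r}")
    return SERVICE_BY_REQUIREMENT[key]
```

For example, `pick_service("graph")` returns `"Amazon Neptune"`. Real workloads often combine several services, such as a relational database fronted by a cache, so a single lookup is only a starting point.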
Which AWS database service is a fully managed NoSQL database that provides single-digit millisecond latency at any scale?
A company needs a cloud-native relational database that is compatible with MySQL and provides up to 5x better performance. Which service should they use?
Which AWS service is designed for running complex analytical queries across petabytes of structured data?
A company needs a graph database to manage highly connected datasets for a social networking application. Which service should they choose?