Choose the right database:
- balanced, read-heavy, or write-heavy workload?
- throughput: does it fluctuate, or is it predictable?
- how much data to store, and for how long? average object size?
- data durability
- latency requirement
- data model
- strong schema or flexible, SQL or NoSQL
- License costs
Database types supported by AWS:
- Relational Database
- RDS
- Postgres
- MySQL
- MariaDB
- Oracle
- Microsoft SQL Server
- Aurora (* AWS Proprietary database)
- NoSQL
- DynamoDB (~ JSON)
- ElastiCache (key-value pairs)
- Neptune (graphs) - no joins, no SQL
- Object Store (“sort of”)
- S3
- Glacier
- Data Warehouse
- Redshift
- Athena (query data in S3 using SQL, so “sort of”)
RDS Overview
RDS is a managed service:
- Automated provisioning, OS patching
- Continuous backups and restore to specific timestamp (Point in Time Restore)
- Monitoring dashboard
- Read replicas for improved read performance
- Multi-AZ setup for Disaster Recovery
- Maintenance windows for upgrades
- Scaling capability (vertical and horizontal)
- Storage backed by EBS (gp2 or io1)
BUT, you cannot SSH into your instances
There are two important features of RDS:
- Multi-AZ: this is for “Disaster Recovery”
- Read Replicas: this is for “Improving Performance”
OLTP vs OLAP
OLTP (Online Transaction Processing) is what RDS databases handle: many small, frequent transactions. OLAP (Online Analytical Processing) is for Data Warehousing on AWS, a different type of architecture both from a database perspective and at the infrastructure layer.
ElastiCache
This is a service for web apps; it makes it easy to deploy, operate, and scale an in-memory cache in the cloud.
It speeds up existing databases by caching the results of frequent, identical queries.
Redshift
This is for BI / Data warehousing -> OLAP usage
Relational Database
- RDS runs on a virtual machine, but you cannot SSH into it
- RDS is NOT serverless, BUT Aurora Serverless IS Serverless
RDS - Backups, Multi-AZ & Read Replicas
Backups
There are two types of Backups for RDS
- Automated Backups
- Database Snapshots
Automated Backups
It allows you to recover your database to any point in time within a "Retention Period", configurable from 1 to 35 days (7 by default).
Automated backups are enabled by default; the backup data is stored in S3, and you get free backup storage equal to the size of your RDS instance.
Automated backups:
- Daily full backup of the database (during the maintenance window)
- Transaction logs are backed-up by RDS every 5 minutes
- This gives the ability to restore to any point in time (from oldest back-up to 5 minutes ago)
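Point-in-Time Restore always produces a brand-new instance. As a minimal boto3 sketch (the instance identifiers are hypothetical):

```python
import boto3
from datetime import datetime, timezone

rds = boto3.client("rds")

# Restore "mydb" to a new instance as of a timestamp within the retention period.
rds.restore_db_instance_to_point_in_time(
    SourceDBInstanceIdentifier="mydb",
    TargetDBInstanceIdentifier="mydb-restored",
    RestoreTime=datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc),
    # or pass UseLatestRestorableTime=True to restore to ~5 minutes ago
)
```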
Database Snapshot
A Database Snapshot is triggered manually by the user and is retained for as long as you want, even after you delete the original RDS instance.
Multi-AZ
- It has only one DNS name; the app fails over to the standby automatically, with no manual intervention in apps
- This is an exact copy of your production database in another AZ
- Automatically synchronized when your prod database is written to
- Failover happens automatically in the event of:
  - planned database maintenance
  - DB instance failure
  - AZ failure
  - network failure
This is only for Disaster Recovery and increased availability, not for scaling.
Read Replicas
This allows you to have a read-only copy of your production database.
This is achieved by asynchronous replication from the primary RDS instance to the Read Replicas. It is a good fit for databases with heavy read workloads.
Read Replicas serve read-only queries (SELECT), not INSERT, UPDATE, or DELETE.
Be careful of the difference between Read Replicas and the backup methods above:
- Read Replicas are for scaling, for read performance improvements
- To use Read Replicas, you must have "Automated Backups" turned on
- You can have up to 5 Read Replicas of any database
- You can even have Read Replicas of Read Replicas
- Each Read Replica has its own endpoint
- Read Replicas can be promoted to their own standalone database, but that breaks the replication
Read Replicas - Network Cost
In AWS, there is a network cost when data goes from one AZ to another, so to reduce cost, you can have your Read Replicas in the same AZ
RDS Security
Encryptions
This is achieved using AWS KMS (Key Management Service); once encryption is on, the following are encrypted:
- the underlying data storage
- automated backups
- read replicas
- database snapshots
Whenever you restore either an Automated Backup or Database Snapshot, the restored version of the database will be a new RDS instance with a new DNS endpoint
At rest encryption and In-flight encryption
At rest encryption
We can encrypt the primary DB and Read Replicas with AWS KMS (AES-256 encryption); this has to be defined at launch time.
If the primary DB is not encrypted, the Read Replicas cannot be encrypted.
To encrypt an existing DB: unencrypted DB -> snapshot -> copy snapshot as encrypted -> create new DB from the encrypted snapshot
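A minimal boto3 sketch of that snapshot-copy path (identifiers and the KMS key alias are placeholders; in practice you wait for each snapshot to become available before the next step):

```python
import boto3

rds = boto3.client("rds")

# 1. Snapshot the unencrypted DB.
rds.create_db_snapshot(
    DBInstanceIdentifier="mydb",
    DBSnapshotIdentifier="mydb-snap",
)

# 2. Copy the snapshot, turning encryption on with a KMS key.
rds.copy_db_snapshot(
    SourceDBSnapshotIdentifier="mydb-snap",
    TargetDBSnapshotIdentifier="mydb-snap-encrypted",
    KmsKeyId="alias/aws/rds",
)

# 3. Restore a new, encrypted instance from the encrypted snapshot.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="mydb-encrypted",
    DBSnapshotIdentifier="mydb-snap-encrypted",
)
```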
Transparent Data Encryption (TDE) available for Oracle and SQL Server
In-flight encryption
This uses SSL certificates to encrypt data to RDS in flight; you have to provide SSL options with a trust certificate when connecting to the database.
Network & IAM
Network Security
- RDS databases are usually deployed within a private subnet, not a public one
- RDS security works by leveraging security groups, it controls which IP / security group can communicate with RDS
Access Management
- IAM policies help control who can manage AWS RDS through the RDS API, like “who can create a read replica? etc..”
- traditional username/password can be used to log into the database
- IAM-based authentication can be used to log into RDS MySQL & PostgreSQL
IAM Authentication
- works with MySQL & PostgreSQL
- no need for a password, just a token obtained through IAM & RDS API call
- the token has a lifetime of 15 minutes
the IAM Authentication has the following benefits:
- Network in/out must be encrypted using SSL
- IAM to centrally manage users instead of DB
- Can leverage IAM Roles and EC2 Instance profiles for easy integration
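A minimal sketch of IAM authentication against RDS MySQL with boto3 and PyMySQL; the endpoint, user name, and CA bundle path are placeholders:

```python
import boto3
import pymysql

HOST = "mydb.123456789012.us-east-1.rds.amazonaws.com"  # hypothetical endpoint
PORT = 3306
USER = "db_user"

rds = boto3.client("rds", region_name="us-east-1")

# The token stands in for a password and expires after 15 minutes.
token = rds.generate_db_auth_token(DBHostname=HOST, Port=PORT, DBUsername=USER)

# SSL is required when authenticating with IAM.
conn = pymysql.connect(
    host=HOST,
    port=PORT,
    user=USER,
    password=token,
    ssl={"ca": "rds-ca-bundle.pem"},  # RDS CA bundle downloaded beforehand
)
```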
Aurora
Aurora is the AWS proprietary relational database, compatible with both MySQL and PostgreSQL.
Aurora HA and Scalability
Aurora always maintains 2 copies of your data in each AZ, with a minimum of 3 AZ => which leads to 6 copies of your data
Among these 6 copies of your data across 3 AZ:
- 4 copies out of 6 needed for writes
- 3 copies out of 6 needed for reads
- self-healing with peer-to-peer replication
- storage is striped across 100s of volumes
Aurora has a primary-secondary architecture: one Aurora instance takes writes (the primary).
If a failover happens, automated failover to a new primary completes in less than 30 seconds.
You can have up to 15 Aurora Read Replicas serving reads, and the Read Replicas support Cross Region Replication.
Aurora Replicas
There are 2 types of Aurora Replicas
- Aurora Replicas
- MySQL Read Replicas
Automated failover is only available with Aurora Replicas
Backups for Aurora, also 2 types:
- Automated Backups
- Data Snapshots
To migrate data from MySQL to Aurora, you can do either of the following:
- take a database snapshot and restore it in Aurora
- create an Aurora Read Replica and then promote the Read Replica to its own standalone database
Aurora DB Cluster
A cluster exposes a Writer Endpoint (pointing to the primary) and a Reader Endpoint (connection-level load balancing across the Read Replicas); load balancing happens at connection time, not per query.
Aurora Serverless
- Automated database instantiation and autoscaling based on actual usage
- good for infrequent, intermittent, or unpredictable workloads
- no capacity planning needed
- pay per second, can be more cost-effective
Aurora Serverless creates Aurora instances on demand, and the client connects to Aurora through a Proxy Fleet. With auto-scaling on, more Aurora instances are created when the QPS is high, and removed when it drops.
Global Aurora
- Aurora Cross Region Read Replicas
- Useful for disaster recovery
- Simple to put in place
- Aurora Global Database (recommended)
- 1 Primary Region (read / write)
- Up to 5 secondary (read-only) regions, replication lag is less than 1 second
- Up to 16 Read Replicas per secondary region, helps for decreasing latency
- Promoting another region (for disaster recovery) has an RTO (Recovery Time Objective) of < 1 minute
ElastiCache
This is the AWS solution for a managed, web-service-based in-memory cache.
ElastiCache Overview
- The same way RDS is used to get managed relational databases, ElastiCache is used to get managed Redis or Memcached
- Helps reduce the load on databases for read-intensive workloads
- Helps make your application stateless
- Write Scaling using Sharding
- Read Scaling using Read Replicas
- Multi-AZ with failover Capability
- AWS takes care of OS maintenance/patching, optimizations, setup, configuration, monitoring, failure recovery, and backups
DB Cache
The application queries ElastiCache first; if the data is not there (cache miss), it fetches the data from RDS and stores it in ElastiCache, so later identical queries result in a cache hit.
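A minimal cache-aside sketch with the redis-py client; the cluster endpoint and the query_rds helper are hypothetical stand-ins for your ElastiCache cluster and RDS access layer:

```python
import json

import redis

from mydb import query_rds  # hypothetical helper that reads from RDS

cache = redis.Redis(host="my-cluster.xxxxxx.use1.cache.amazonaws.com", port=6379)

def get_user(user_id: str) -> dict:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                   # cache hit: skip the database
        return json.loads(cached)
    data = query_rds(user_id)                # cache miss: read from RDS
    cache.setex(key, 300, json.dumps(data))  # store with a 5-minute TTL
    return data
```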
User Session Store
- The user logs into any instance of the application
- The application writes the session data into ElastiCache
- The user hits another instance of the application
- That instance retrieves the session data from ElastiCache, and the user is already logged in
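A sketch of that session store with redis-py (hypothetical endpoint); the TTL makes sessions expire on their own:

```python
import json

import redis

cache = redis.Redis(host="my-sessions.xxxxxx.use1.cache.amazonaws.com", port=6379)

def save_session(session_id: str, data: dict, ttl_seconds: int = 1800) -> None:
    # Any app instance can write the session; it auto-expires after the TTL.
    cache.setex(f"session:{session_id}", ttl_seconds, json.dumps(data))

def load_session(session_id: str) -> dict | None:
    raw = cache.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```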
ElastiCache - Redis or Memcached
| Redis | Memcached |
|---|---|
| Multi-AZ with Auto-Failover | Multi-node for partitioning of data (sharding) |
| Read Replicas to scale reads and have high availability | Non-persistent |
| Data durability using AOF persistence | No backup and restore |
| Backup and restore feature | Multi-threaded architecture |
- If you need to scale horizontally, choose Memcached
- If you need Multi-AZ, Backups, and Restore, choose Redis
Right now, we have two methods of improving performance:
- ElastiCache -> a cache layer to speed up reads
- Read Replicas -> improved reads at the database layer
ElastiCache - Cache Security
All caches in ElastiCache:
- support SSL in-flight encryption
- Do not support IAM authentication
- IAM policies on ElastiCache are only used for AWS API-level security
Redis AUTH
- You can set a “password/token” when you create a Redis Cluster
- This is an extra level of security for your cache (on top of Security Group)
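For illustration, connecting with redis-py to a cluster that has Redis AUTH enabled (the endpoint and token are placeholders); note that in-flight encryption must be enabled on the cluster for AUTH to apply:

```python
import redis

r = redis.Redis(
    host="my-cluster.xxxxxx.use1.cache.amazonaws.com",
    port=6379,
    password="my-redis-auth-token",  # the AUTH token set at cluster creation
    ssl=True,                        # in-flight encryption
)
r.ping()
```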
Memcached
- Supports SASL-based authentication
ElastiCache for Solutions Architects
Patterns for ElastiCache
- Lazy Loading: all the read data is cached, data can become stale in the cache
- Write Through: adds or updates data in the cache when it is written to the DB (no stale data); a sketch follows this list
- Session Store: store temp session data in the cache (using TTL features)
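A minimal Write Through sketch, mirroring the cache-aside example earlier; write_rds and the endpoint are hypothetical:

```python
import json

import redis

from mydb import write_rds  # hypothetical helper that persists to the database

cache = redis.Redis(host="my-cluster.xxxxxx.use1.cache.amazonaws.com", port=6379)

def save_user(user_id: str, data: dict) -> None:
    # Write-through: update the DB, then immediately refresh the cache,
    # so reads never see stale data (at the cost of an extra cache write).
    write_rds(user_id, data)
    cache.set(f"user:{user_id}", json.dumps(data))
```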
DynamoDB
DynamoDB supports transactions now.
DynamoDB is the AWS solution for a serverless NoSQL database; it supports both document and key-value data models.
- data is stored on SSD storage
- highly available, with replication across 3 AZs (geographically distinct data centers)
- Eventually Consistent Reads (Default)
- Strongly Consistent Reads
Eventually Consistent Reads
Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data.
This gives the "Best Read Performance" and is the default setting.
Strongly Consistent Reads
A Strongly Consistent Read returns a result that reflects all writes that received a successful response before the read.
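For illustration, with boto3 the consistency mode is a per-read flag (table name and key are hypothetical):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Eventually consistent read: the default, and the cheapest.
dynamodb.get_item(
    TableName="users",
    Key={"user_id": {"S": "42"}},
)

# Strongly consistent read: reflects all prior successful writes,
# but consumes twice the read capacity.
dynamodb.get_item(
    TableName="users",
    Key={"user_id": {"S": "42"}},
    ConsistentRead=True,
)
```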
DynamoDB - Basics
- DynamoDB is made of tables
- Each table has a primary key (must be decided at creation time)
- Each table can have an infinite number of items (= rows)
- Each item has attributes (can be added over time - can be null)
- Maximum size of an item is 400 KB
DynamoDB - Provisioned Throughput
- Table must have provisioned read and write capacity units
- Read Capacity Units (RCU): throughput for reads
- 1 RCU: 1 strongly consistent read of 4 KB per second
- 1 RCU: 2 eventually consistent reads of 4 KB per second (worked example after this list)
- Write Capacity Units (WCU): throughput for writes
- 1 WCU: 1 write of 1 KB per second
- Option to set up auto-scaling of throughput to meet demand
- Throughput can be exceeded temporarily using “burst credit”
- If burst credit is empty, you’ll get a "ProvisionedThroughputExceededException"
- It’s then advised to retry with exponential back-off
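A small sketch of the capacity-unit arithmetic above, rounding item sizes up to the nearest 4 KB for reads and 1 KB for writes:

```python
import math

def rcu(reads_per_sec: int, item_kb: float, strongly_consistent: bool = True) -> int:
    # 1 RCU = one 4 KB strongly consistent read per second,
    # or two 4 KB eventually consistent reads per second.
    units = reads_per_sec * math.ceil(item_kb / 4)
    return units if strongly_consistent else math.ceil(units / 2)

def wcu(writes_per_sec: int, item_kb: float) -> int:
    # 1 WCU = one 1 KB write per second.
    return writes_per_sec * math.ceil(item_kb)

print(rcu(10, 8))         # 20: ten strongly consistent 8 KB reads/sec
print(rcu(10, 8, False))  # 10: eventually consistent halves the cost
print(wcu(6, 4.5))        # 30: 4.5 KB rounds up to 5 KB per write
```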
DynamoDB - DAX
DAX is the DynamoDB Accelerator; it provides a seamless cache for DynamoDB, with no application rewrite needed. Writes go through DAX and then to DynamoDB.
Multi-AZ is supported, 3 nodes minimum recommended for production
DynamoDB Streams
- Changes in DynamoDB will end up in a DynamoDB Stream
- This stream can be read by AWS Lambda, so we can do analytical jobs
- Could implement cross-region replication using Streams
- It has 24 hours of data retention
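As an illustration, the skeleton of a Lambda handler consuming stream records; the processing step is a placeholder:

```python
def handler(event, context):
    # Each invocation receives a batch of change records from the stream.
    for record in event["Records"]:
        if record["eventName"] in ("INSERT", "MODIFY"):
            new_image = record["dynamodb"].get("NewImage", {})
            print(new_image)  # e.g. forward to an analytics pipeline
```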
DynamoDB - Security
- VPC Endpoints available to access DynamoDB without internet
- Access fully controlled by IAM
- Encryption: KMS and SSL/TLS
DynamoDB - Other Features
Global Tables:
- Active-Active replication in many regions
- Must enable DynamoDB Streams
- It provides low-latency access in multiple regions and supports Disaster Recovery
Capacity Planning:
- Provisioned capacity: provision WCU & RCU, can enable auto-scaling
- On-demand capacity: unlimited WCU & RCU, no throttling, more expensive
- rare or unpredictable workloads -> on-demand capacity
- predictable workloads (which can be auto-scaled) -> provisioned capacity
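For illustration, the capacity mode is picked at table creation with boto3 (table name and key are hypothetical):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# On-demand table: no WCU/RCU to manage, billed per request.
dynamodb.create_table(
    TableName="events",
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",  # vs "PROVISIONED" with ProvisionedThroughput
)
```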
Other features include backup & restore, migration, and even a local DynamoDB you can set up for development.
Redshift
This is the AWS Data Warehousing service; it supports Massively Parallel Processing (MPP).
Backups for Redshift
- enabled by default with a 1-day retention period; the maximum is 35 days
- Redshift always attempts to maintain at least 3 copies of your data: the original and a replica on the compute nodes, plus backups in S3
But Redshift is only available in one AZ.
Neptune
Fully managed graph database, good for:
- highly connected / high-relationship datasets
- social networks
- knowledge graphs
Highly available across 3 AZs, with up to 15 read replicas, point-in-time recovery, and continuous backup to Amazon S3. Supports KMS encryption at rest + HTTPS in flight.