RDS to Redshift


Seamless Data Migration: Efficient RDS to Redshift Replication for Analytics

Operational databases handle rapid transactions, but they quickly bottleneck under analytical workloads—especially as source tables hit millions of rows and business leaders demand real-time insights. Dumping data from Amazon RDS straight into Redshift sounds simple, but the moment a manual process or batch ETL is involved, latency inflates and data quality suffers.

Consider a SaaS platform recording user actions in RDS (PostgreSQL 13.7). The CFO requests “latest signups by region, accurate to the hour, cross-referenced with NPS scores.” The join easily overloads RDS, and batch ETL pipelines often leave hours of staleness. The solution: continuous replication via Change Data Capture (CDC), preferably with a managed service to reduce operational drag.


RDS–Redshift Replication: Approaches Compared

1. Batch ETL (“Old School”)

  • Export RDS tables. Typical: pg_dump --table users
  • Push flat files (CSV/Parquet) to S3.
  • Run COPY command on Redshift.
  • Trade-offs:
    • IO and CPU spikes.
    • Usually 10–120 minute lag.
    • Doesn’t handle deletes well.
  • Log Example:
    ERROR: Load into table 'users' failed. String data, right truncation.
    
    Mapping schema changes mid-run? Fragile at best.
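The batch flow above can be sketched in a few lines; a minimal illustration, assuming a hypothetical staging bucket and IAM role (not from the original setup):

```python
# Minimal sketch of the batch ETL flow: dump a table from RDS, stage it
# to S3 out-of-band, then COPY into Redshift. Bucket, file paths, and the
# IAM role ARN are hypothetical placeholders.

def pg_dump_cmd(table: str, out_file: str) -> list[str]:
    """Command line for exporting one table from RDS PostgreSQL."""
    return ["pg_dump", "--table", table, "--data-only",
            "--file", out_file, "--format", "plain"]

def redshift_copy_sql(table: str, s3_uri: str, iam_role: str) -> str:
    """COPY statement Redshift runs to ingest the staged file."""
    return (f"COPY {table} FROM '{s3_uri}' "
            f"IAM_ROLE '{iam_role}' "
            "FORMAT AS CSV IGNOREHEADER 1;")

cmd = pg_dump_cmd("users", "/tmp/users.sql")
sql = redshift_copy_sql("users", "s3://example-staging/users.csv",
                        "arn:aws:iam::123456789012:role/RedshiftCopyRole")
```

Every run re-exports the full table, which is exactly where the IO/CPU spikes and multi-hour lag come from.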

2. Incremental ETL

  • “What changed since X?”—use updated_at or ID tracking.
  • Scheduled jobs extract new/changed rows.
  • More efficient, but:
    • Complex merge/upsert logic.
    • Backlog if scheduler misses a run (see Airflow incident reports).
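The watermark pattern can be sketched as follows; table and column names are illustrative, and the Redshift-side merge/upsert is left out:

```python
from datetime import datetime

# Sketch of "what changed since X": track a high-water mark (timestamp of
# the last successful run) and extract only newer rows. The watermark only
# advances after a successful load, so a skipped run self-heals on the
# next run -- it just pulls a larger batch.

def incremental_query(table: str, watermark: datetime) -> str:
    return (f"SELECT * FROM {table} "
            f"WHERE updated_at > '{watermark.isoformat()}' "
            "ORDER BY updated_at;")

last_run = datetime(2024, 1, 1, 12, 0)
q = incremental_query("users", last_run)
```

Note this pattern never sees hard deletes, which is why soft-delete flags or CDC are usually needed alongside it.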

3. Federated Queries

  • Redshift Spectrum/Federated Query directly connects to RDS.
  • No data moved; queries hit production DB live.
  • Major gotcha: Analytical joins stress live RDS clusters.
    “SELECT count(*) FROM orders JOIN users …” → Not popular with SREs.

Modern Approach: Change Data Capture (CDC) with AWS DMS

Replicate only inserts, updates, and deletes—eliminate table-wide scans and minimize lag. AWS Database Migration Service (DMS) supports this for RDS PostgreSQL/MySQL as a managed service, with granular mapping and error handling.

DMS-Based CDC in Practice

Versions Used:

  • RDS PostgreSQL ≥12
  • Redshift RA3 or DS2 nodes
  • DMS v3.4.7

1. Enable Logical Replication on Source

Parameter group changes in AWS Console:

-- edit DB parameter group
rds.logical_replication = 1

-- On RDS:
CREATE USER dms_user WITH REPLICATION PASSWORD 'strong_password_here';
GRANT rds_replication TO dms_user;

The parameter is static; it takes effect after a DB reboot (brief downtime).
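The parameter-group change can also be scripted; a minimal boto3 sketch, assuming a hypothetical custom parameter group named prod-pg13-params:

```python
# Hedged sketch: build the boto3 request that flips rds.logical_replication
# in a custom DB parameter group. The group name is hypothetical; the
# parameter is static, so "pending-reboot" is the required apply method.

def logical_replication_params(group_name: str) -> dict:
    return {
        "DBParameterGroupName": group_name,
        "Parameters": [{
            "ParameterName": "rds.logical_replication",
            "ParameterValue": "1",
            "ApplyMethod": "pending-reboot",  # static parameter: needs reboot
        }],
    }

kwargs = logical_replication_params("prod-pg13-params")
# boto3.client("rds").modify_db_parameter_group(**kwargs)
```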

2. Provision DMS Instance & Endpoints

  • Small workloads: dms.t3.medium (2 vCPU, 4GB).
  • Security Groups: Allow DMS’s ENIs to reach RDS+Redshift.
  • Set up a source endpoint (RDS) and a target endpoint (Redshift), using dms_user for the source and a Redshift admin user for the target.
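The two endpoint definitions map onto boto3's dms.create_endpoint() call; a sketch with hypothetical identifiers, hostnames, and database names:

```python
# Hedged sketch of the DMS endpoint definitions. Everything here except the
# field names (which follow boto3's create_endpoint parameters) is a
# hypothetical placeholder.

def dms_source_endpoint(server: str, password: str) -> dict:
    """Source endpoint: RDS PostgreSQL, using the replication user."""
    return {
        "EndpointIdentifier": "rds-pg-source",
        "EndpointType": "source",
        "EngineName": "postgres",
        "ServerName": server,
        "Port": 5432,
        "DatabaseName": "app",
        "Username": "dms_user",
        "Password": password,
    }

def dms_target_endpoint(server: str, password: str) -> dict:
    """Target endpoint: Redshift cluster."""
    return {
        "EndpointIdentifier": "redshift-target",
        "EndpointType": "target",
        "EngineName": "redshift",
        "ServerName": server,
        "Port": 5439,
        "DatabaseName": "analytics",
        "Username": "admin",
        "Password": password,
    }

src = dms_source_endpoint("rds-host.example.com", "replication-secret")
tgt = dms_target_endpoint("redshift-host.example.com", "admin-secret")
# boto3.client("dms").create_endpoint(**src)  # then the same with **tgt
```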

3. Configure & Launch Migration Task

  • Migration type: “Migrate existing data and replicate ongoing changes.”
  • Mapping rules: Explicit column mapping suggested (includeColumns, handle is_deleted or soft deletes up front).
  • CDC start point: “Now” or from a given LSN/SCN (advanced setups).

DMS UI tip:

  • De-select “Truncate table before load” for critical tables (prevents data loss).
  • Enable CloudWatch logging and check the task’s LastFailureMessage on errors.
  • Gotcha: DMS auto-maps numeric to Redshift FLOAT8; specify DECIMAL if precision critical.
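The explicit mapping and the DECIMAL override can both be expressed in the DMS table-mapping JSON; a sketch with illustrative schema, table, and column names:

```python
import json

# Hedged sketch of a DMS table-mapping document: a selection rule that
# includes all public tables, plus a transformation rule forcing a
# DECIMAL(12,2) target type for orders.amount instead of the FLOAT8
# auto-mapping. Names and precision are illustrative.

table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-public",
            "object-locator": {"schema-name": "public", "table-name": "%"},
            "rule-action": "include",
        },
        {
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "amount-as-decimal",
            "rule-target": "column",
            "object-locator": {
                "schema-name": "public",
                "table-name": "orders",
                "column-name": "amount",
            },
            "rule-action": "change-data-type",
            "data-type": {"type": "decimal", "precision": 12, "scale": 2},
        },
    ]
}

mappings_json = json.dumps(table_mappings)  # passed as TableMappings on the task
```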

4. Validate and Monitor

  • Use Redshift’s STL_LOAD_COMMITS and STL_LOAD_ERRORS system tables for load status.
  • DMS task dashboard: Watch replication lag (seconds to ~2 minutes typical).
  • Sample verification:
    -- Redshift check
    SELECT count(1) FROM users WHERE updated_at > (CURRENT_TIMESTAMP - INTERVAL '10 minutes');
    
  • Patch window delays can cause DMS tasks to stop, requiring manual intervention.
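A spot-check like the one above is easy to automate: run the same recent-rows count on both sides and allow a small tolerance for in-flight CDC lag. Connection handling is stubbed out; any DB-API cursor works:

```python
# Sketch of a source/target spot-check. The SQL mirrors the verification
# query above; the tolerance accounts for rows still in flight.

CHECK_SQL = ("SELECT count(1) FROM users "
             "WHERE updated_at > (CURRENT_TIMESTAMP - INTERVAL '10 minutes')")

def counts_match(rds_count: int, redshift_count: int, tolerance: int = 0) -> bool:
    """True when source and target counts agree within the lag tolerance."""
    return abs(rds_count - redshift_count) <= tolerance

# e.g. run CHECK_SQL against both databases, then:
ok = counts_match(10421, 10419, tolerance=5)
```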

Open-Source/Third-Party Alternatives

Debezium + Kafka Connect:

  • Streams CDC events into Kafka; use Redshift sink for ingestion.
  • Pros: Multi-cloud, schema evolution configurable.
  • Cons: Extra infra; sync drift risk if consumers lag.
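A Debezium source connector for RDS PostgreSQL is registered with the Kafka Connect REST API as a JSON config; a sketch with hypothetical host, credentials, and topic prefix (a separate Redshift sink connector then consumes the topics):

```python
# Hedged sketch of a Debezium PostgreSQL connector config. The property
# names follow the Debezium connector; all values are placeholders.

debezium_config = {
    "name": "rds-users-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",              # logical decoding plugin on RDS
        "database.hostname": "rds-host.example.com",
        "database.port": "5432",
        "database.user": "dms_user",
        "database.password": "replication-secret",
        "database.dbname": "app",
        "topic.prefix": "appdb",                # Kafka topic namespace
        "table.include.list": "public.users,public.orders",
    },
}
# POST this JSON to the Kafka Connect REST API (/connectors) to register it.
```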

Other CDC services:

  • Airbyte, Fivetran, Matillion: UI-driven, integrated with role-based access.
  • Costs are often nontrivial at scale; evaluate carefully before adopting them for regulated-data workloads.

Checklist: RDS to Redshift CDC Migration

  • Enable logical replication: set the RDS parameter; confirm via SHOW rds.logical_replication;
  • Create DMS user: CREATE USER dms_user ...; plus replication grants
  • Provision DMS: allocate a replication instance with the right subnet + security groups
  • Source/target endpoints: use a minimal-privilege user for RDS; a Redshift admin for the target
  • Map schemas deliberately: normalize/denormalize as needed; handle deletes/soft deletes
  • Run full+CDC migration task: confirm “ongoing replication” status without excessive lag
  • Validate with sample queries: run count/aggregate/spot-check queries on Redshift, compare with RDS

Practical Lessons & Caveats

  • DMS’s type-mapping is not always optimal—manual overrides needed for numeric/JSON/BLOB handling.
  • Large schema changes (ALTER TABLE) on source can cause task failure; automate DMS task restart via Lambda or similar.
  • For multi-TB tables, split migration by partitions (date-based), reduce lock contention.
  • Data-modeling: Redshift performs better with denormalized star schemas versus 3NF structures—reshape before ingesting.
  • Monitoring: Always set up CloudWatch alarms on replication lag and task failure, not just completion.
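The lag alarm from the last bullet maps onto a CloudWatch put_metric_alarm call against DMS's per-task latency metrics; a sketch with hypothetical task/instance identifiers and a 5-minute threshold:

```python
# Hedged sketch: CloudWatch alarm on DMS CDC target latency. DMS publishes
# CDCLatencySource/CDCLatencyTarget (seconds) in the AWS/DMS namespace.
# Identifiers and the threshold are illustrative.

def lag_alarm(task_id: str, instance_id: str, threshold_s: int = 300) -> dict:
    return {
        "AlarmName": f"dms-{task_id}-cdc-lag",
        "Namespace": "AWS/DMS",
        "MetricName": "CDCLatencyTarget",
        "Dimensions": [
            {"Name": "ReplicationTaskIdentifier", "Value": task_id},
            {"Name": "ReplicationInstanceIdentifier", "Value": instance_id},
        ],
        "Statistic": "Average",
        "Period": 60,                 # evaluate 1-minute averages
        "EvaluationPeriods": 5,       # 5 consecutive breaches before alarming
        "Threshold": threshold_s,
        "ComparisonOperator": "GreaterThanThreshold",
    }

alarm = lag_alarm("prod-users-task", "prod-dms-instance")
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```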

Non-Obvious Tip

Disable auto-commit on the loading client (or wrap multi-table loads in an explicit BEGIN ... COMMIT), then restore it post-load to avoid partial-data states across related tables. Found this reduces reconciliation incidents during audit cycles.
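The tip above boils down to all-or-nothing visibility; a minimal sketch, with illustrative statement text:

```python
# Sketch: wrap multi-table load statements in one explicit transaction so
# readers never observe a half-loaded state. Statements are illustrative.

def transactional_load(statements: list[str]) -> list[str]:
    """Bracket the load statements with BEGIN/COMMIT."""
    return ["BEGIN;"] + list(statements) + ["COMMIT;"]

batch = transactional_load([
    "COPY users FROM 's3://example-staging/users.csv' "
    "IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole' CSV;",
    "COPY orders FROM 's3://example-staging/orders.csv' "
    "IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole' CSV;",
])
# Execute the batch on one connection; a failure before COMMIT rolls back
# both tables together.
```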


Note: DMS does not natively preserve source-side triggers or user-defined functions. For these, use custom ETL or external orchestration if retention is required.


Efficient, near real-time RDS-to-Redshift replication makes analytics feasible without sacrificing database performance or developer time. The initial setup is moderately involved—especially with network/Security Group tuning and schema mapping—but ongoing maintenance is light compared to hand-rolled ETL pipelines.

There are edge cases: minor DMS glitches after RDS upgrades, lag spikes with bulk source-side updates, or staged loads when users require 100% up-to-the-minute reporting. Test with actual big-event volumes before production cutover—auto-throttling isn’t perfect.

For walkthroughs of specific RDS engines, DMS task JSON configurations, or Redshift table design for audit compliance, contact the author directly or refer to the AWS docs; details shift with each quarterly release.