Step-by-Step Guide: AWS RDS to Google Cloud SQL Migration
Neither compliance audits nor cost pressure justifies vendor lock-in. Replicating a production database from AWS RDS to Google Cloud SQL is feasible with careful planning and attention to edge cases, especially for organizations seeking to align with internal GCP-first roadmaps, integrate BigQuery/Vertex AI, or simply reduce OpEx.
Decision Table: Why Move Workloads?
| Rationale | Notes |
|---|---|
| GCP-native integration (security, IAM) | Service account-based auth, Private Service Connect |
| BigQuery pipelines, ML/AI workloads | Pub/Sub and batch ETL on GCP simplify data pipelines |
| Pricing volatility on AWS | Sustained use discounts on GCP may improve TCO |
| Geo-latency or compliance | Data residency, specific GCP regions |
Prerequisites
- Source DB: Confirm engine (MySQL/PostgreSQL/SQL Server), version (e.g., MySQL 8.0.32), and total data size.
- Target: Pre-provision the Google Cloud SQL instance with storage >= source and version parity or a superset (see the provisioning example after this list). Prefer SSD storage; note that not every source feature (e.g., certain extensions) is available on Cloud SQL.
- Networking: Temporary inbound firewall rules. If using VPC Peering or Interconnect, verify subnet overlap and route propagation.
- Migration Tools: `gcloud` CLI (v465+), SQL clients (`mysqldump`, `pg_dump`, `psql`). For zero-downtime migrations: Google Database Migration Service (DMS), with IAM roles configured (`roles/cloudsql.admin`, `roles/datamigration.admin`).
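If the target instance does not exist yet, it can be pre-provisioned from the CLI. The snippet below is a sketch only; the instance name, tier, region, and storage size are placeholder values to adapt to your workload:

```bash
# Sketch only: instance name, tier, region, and storage size are placeholders.
gcloud sql instances create target-mysql \
  --database-version=MYSQL_8_0 \
  --tier=db-custom-4-16384 \
  --region=us-central1 \
  --storage-type=SSD \
  --storage-size=200GB
```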
Migration Strategies
A. Google Database Migration Service (DMS) — Recommended for Minimal Downtime
- Supports MySQL 5.6–8.0, PostgreSQL 9.6+ (with caveats around extension support).
- Online mode: continuous logical replication; supports cutover testing.
B. Native Dump & Restore — Simpler, but Offline
- Use `mysqldump`/`mysql` (MySQL) or `pg_dump`/`pg_restore` (PostgreSQL).
- Suitable for sub-100 GB datasets, or where downtime is tolerable.
Gotcha: Cloud SQL for SQL Server exists, but migration tooling support for it is more limited than for MySQL/PostgreSQL; if you are moving SQL Server, expect additional constraints.
Source Prep
- Enable binary logging (MySQL: `binlog_format=ROW`) in the RDS parameter group. May require a reboot.
- Provision a replication user with at least the `REPLICATION SLAVE` privilege.
Example:
CREATE USER 'replicator'@'%' IDENTIFIED BY 'REDACTEDpw';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;
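Once the parameter group change has taken effect, a quick sanity check on the RDS source confirms the prerequisites (the user name matches the example above):

```sql
-- Sanity-check binary logging and the replication user on the RDS MySQL source.
SHOW VARIABLES LIKE 'log_bin';        -- expect ON
SHOW VARIABLES LIKE 'binlog_format';  -- expect ROW
SHOW GRANTS FOR 'replicator'@'%';     -- confirm REPLICATION SLAVE is present
```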
For PostgreSQL: set `rds.logical_replication=1` in the parameter group and grant the `rds_replication` role to the migration user.
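A minimal PostgreSQL equivalent of the MySQL example above, with placeholder credentials:

```sql
-- RDS PostgreSQL: create the migration user and grant the RDS-managed replication role.
CREATE USER replicator WITH LOGIN PASSWORD 'REDACTEDpw';
GRANT rds_replication TO replicator;
```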
Snapshot the RDS instance first—rollbacks can be messy.
Step 1: Establish Connectivity
- Confirm network reachability: open temporary ingress (TCP 3306/5432) from Cloud SQL instance or DMS migration job workers to RDS endpoint/IP. For production, avoid public IP if possible.
- Validate with `telnet <rds-endpoint> 3306`.
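A quick end-to-end check from a GCP-side host (for example, a bastion VM in the VPC that will carry the migration traffic) might look like the following; the endpoint and credentials are placeholders:

```bash
# Placeholder endpoint/credentials; run from a host on the GCP side of the network path.
RDS_HOST=<RDS-ENDPOINT>

# TCP reachability (MySQL shown; use 5432 for PostgreSQL).
nc -vz "$RDS_HOST" 3306

# Authenticated check with the replication user (prompts for the password).
mysql --host="$RDS_HOST" --user=replicator -p -e "SELECT 1;"
```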
Step 2: Migration Pipeline
Option 1: Online Migration via DMS
- Create a Migration Job from the GCP Console or `gcloud` (a CLI sketch follows this list):
  - Select Source: RDS endpoint, engine/version.
  - Select Destination: Cloud SQL instance.
  - Provide replication user credentials.
- Validation Phase
  - DMS performs connectivity and privilege checks; error example: `ERROR 1045 (28000): Access denied for user 'replicator'...`
  - Fix privilege issues before proceeding.
- Initial Sync + Change Replication
  - DMS copies base data, then streams changes (logical replication).
  - Monitor lag and throughput: GCP Console → DMS job details.
- Cutover
  - Put the application in read-only mode. Confirm zero replica lag.
  - Promote Cloud SQL; update the endpoint in the application config.
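For teams that prefer to script this, a rough `gcloud database-migration` outline is shown below. The resource names are placeholders and the exact flags vary between gcloud releases, so treat it as an assumption-laden sketch and verify each command with `--help` before use:

```bash
# Outline only: resource names are placeholders; verify flags for your gcloud version.
REGION=us-central1

# Source connection profile pointing at the RDS MySQL endpoint.
gcloud database-migration connection-profiles create mysql rds-source \
  --region="$REGION" \
  --host=<RDS-ENDPOINT> --port=3306 \
  --username=replicator --password='REDACTEDpw'

# Continuous (online) migration job from the source profile to a Cloud SQL destination profile.
gcloud database-migration migration-jobs create rds-to-cloudsql \
  --region="$REGION" \
  --type=CONTINUOUS \
  --source=rds-source \
  --destination=<CLOUDSQL-CONNECTION-PROFILE>

# Start the job, then poll its state while the initial copy and change replication run.
gcloud database-migration migration-jobs start rds-to-cloudsql --region="$REGION"
gcloud database-migration migration-jobs describe rds-to-cloudsql --region="$REGION"
```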
Known issue: Large tables (>1 TB) may hit DMS timeouts or performance bottlenecks. In such cases, use staged dump/restore for the large static tables and reserve DMS for the critical transactional ones; a parallel-export sketch follows below.
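One way to stage those large, mostly-static tables (assuming a MySQL source; the table names and credentials below are illustrative placeholders) is to dump them one file per table, in parallel, ahead of the DMS cutover:

```bash
# Illustrative: dump large, mostly-static tables one file per table, in parallel.
TABLES="orders_archive events_2022 audit_log"   # placeholder table names
for t in $TABLES; do
  mysqldump --single-transaction --set-gtid-purged=OFF \
    --host=<RDS-ENDPOINT> --user=admin --password='<PW>' \
    mydb "$t" | gzip > "$t.sql.gz" &
done
wait   # block until all background dumps have finished
```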
Option 2: Dump & Restore
MySQL Example:
mysqldump --set-gtid-purged=OFF \
--single-transaction \
--routines --events --triggers \
--host=<RDS-ENDPOINT> \
--user=admin --password='<PW>' \
mydb > mydb.sql
gsutil cp mydb.sql gs://<BUCKET>/mydb.sql
gcloud sql import sql $CLOUDSQL_INSTANCE gs://<BUCKET>/mydb.sql --database=mydb
PostgreSQL Example:
pg_dump --format=plain --no-owner --no-acl \
--host=<RDS-ENDPOINT> \
--username=postgres \
mydb > mydb.sql
gsutil cp mydb.sql gs://<BUCKET>/mydb.sql
gcloud sql import sql $CLOUDSQL_INSTANCE gs://<BUCKET>/mydb.sql --database=mydb
Note: Cloud SQL imports read from Cloud Storage and support a limited set of formats (plain SQL and CSV); the instance's service account needs read access to the bucket. Consult the docs.
Step 3: Data Consistency Check
Always verify row count and checksums:
SELECT COUNT(*) FROM important_table;
For very large datasets, check with sampling or hash aggregation (e.g., `SELECT MD5(STRING_AGG(col, ',')) ...`).
Validate all objects: foreign keys, triggers, stored routines. Occasionally, certain triggers or procedures may require manual fix-up due to engine differences or permissions.
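For a source-vs-target comparison, one lightweight approach (assuming a MySQL engine and the placeholder endpoints/credentials below) is to run identical counts against both sides and diff the output:

```bash
# Placeholder endpoints/credentials; compares exact per-table row counts on source vs. target.
TABLES="important_table orders customers"   # placeholder table names
SRC=<RDS-ENDPOINT>
DST=<CLOUDSQL-IP>

for side in SRC DST; do
  host=${!side}                             # indirect expansion: SRC -> $SRC, DST -> $DST
  for t in $TABLES; do
    printf '%s\t' "$t"
    mysql --host="$host" --user=admin -p'<PW>' -N -e "SELECT COUNT(*) FROM mydb.$t;"
  done > "counts_${side}.txt"
done
diff counts_SRC.txt counts_DST.txt && echo "Row counts match"
```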
Step 4: Application Cutover
- Freeze writes to RDS.
- Finalize sync (DMS: wait for target lag = 0; Dump/Restore: ensure no app writes since start).
- Update application configs (`DATABASE_URL`, dotenv files, Secret Manager refs); a Secret Manager example follows this list.
- Monitor logs for `ERROR 1045`, connection limit issues, or increased query latency.
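If the connection string lives in Secret Manager, the config change can be as small as adding a new secret version; the secret name and connection URL below are placeholders:

```bash
# Placeholder secret name and connection string; adds a new version pointing at Cloud SQL.
echo -n "mysql://app_user:<PW>@<CLOUDSQL-IP>:3306/mydb" | \
  gcloud secrets versions add prod-database-url --data-file=-
```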
Tips and Caveats
- Dry-run on Staging: Always simulate on a copy with prod-like size and topology.
- Throughput bottlenecks: For >500GB, use high-throughput network links (e.g., VPN/Direct Interconnect); consider parallel table exports.
- Extension support: Cloud SQL restricts some PostgreSQL extensions; test compatibility (e.g., `postgis` needs extra setup).
- TLS enforcement: Google Cloud SQL generally requires SSL connections; update application drivers accordingly.
- Timeouts/errors: Cloud SQL import jobs have hard limits (near 24 hours in some regions); monitor duration, chunk large dumps, and check operation status from the CLI (see below).
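Long-running imports can be watched from the CLI by listing operations on the target instance (the instance name is a placeholder):

```bash
# Placeholder instance name; shows recent import/export operations and their status.
gcloud sql operations list --instance=target-mysql --limit=10
```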
Conclusion
Vendor-neutral database portability is achievable with either DMS for online replication or classic dump/restore for simpler cases. Set up robust validations (row counts, schema diffs, custom health checks) before updating production configs. Don't skip the dry run. And, as always, be ready to roll back fast if post-migration latency or consistency issues emerge.
Reference Logs from the Field:
[2024-03-19T11:04:05Z] DMS_INITIAL_COPY_COMPLETE: Initial sync succeeded
[2024-03-19T12:13:52Z] DMS_REPLICATION_LAG_WARNING: Replica lag 15s on table 'orders'
[2024-03-19T12:20:51Z] DMS_CUTOVER_RECOMMENDED: Replica lag <1s, OK to cutover
For more, see GCP DMS documentation or ping your infrastructure team’s #cloud-sql-migration Slack channel.
Note: Some teams use native replication or dual-write proxy designs for zero-downtime cutovers. These introduce operational complexity outside typical migration scopes.