Step-by-Step Guide to Seamless Data and Application Migration from Google Cloud to AWS
Workload migrations between public clouds often surface at moments of growth, cost scrutiny, or technical crossroads. Moving from Google Cloud Platform (GCP) to Amazon Web Services (AWS) remains among the most demanding exercises for any ops or engineering team, exposing differences in IAM semantics, resource organization, and service models. Documentation rarely spells out details around database state, IAM mapping quirks, or subtle network policy mismatches—the aspects that blow up your “simple” cutover at 2 AM.
Below is a field-tested migration process engineered for minimal downtime, predictable rollback, and post-migration cost visibility; non-obvious lessons are compiled throughout.
Is AWS Worth the Switch from GCP? A Brief Justification
Cost calculators only paint half the picture. Consider:
- Reserved Instance and Savings Plan benefits: AWS’s cost models, especially for EC2 and RDS, skew in your favor if workloads are predictable.
- Depth of managed service catalog: Services like Aurora (with PostgreSQL 15 since 2023) or Step Functions make re-architecting feasible where GCP equivalents may lag.
- Networking edge cases: AWS offers more granular cross-region VPC peering and more Direct Connect locations (notably for compliance initiatives).
- Organizational inertia: If your existing ops automation and deployment pipelines already lean AWS, aligning platforms simplifies post-migration support.
1. Inventory and Classify GCP Assets (Reality Check #1)
First, run a comprehensive GCP asset dump, not just a list of Compute Engine instances. Miss a managed secret or Pub/Sub subscription and expect breakage.
Tooling:
- Use `gcloud asset list` or export via Cloud Asset Inventory to BigQuery, then aggregate with SQL.
- For multi-project orgs: script the export across all projects (a sketch follows below); don't trust someone's spreadsheet.
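A minimal per-project sketch, assuming an `asset-dump/` output directory and org-wide permissions to call the Cloud Asset API (both are placeholders, not from the original):

```bash
#!/usr/bin/env bash
# Dump every resource from every project the caller can see.
# Requires the Cloud Asset API enabled and a role such as roles/cloudasset.viewer.
set -euo pipefail
mkdir -p asset-dump

for project in $(gcloud projects list --format="value(projectId)"); do
  echo "Exporting assets for ${project}..."
  gcloud asset list \
    --project "${project}" \
    --content-type resource \
    --format json > "asset-dump/${project}.json"
done
```

The JSON dumps can then be loaded into BigQuery or grepped directly when building the lift assessment below.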
Table – Example “Lift” Assessment:
| Type | GCP Resource | AWS Target | Count |
|---|---|---|---|
| Compute | gce-vm-std-4 (n2-standard-4, Debian 11) | EC2 c5.xlarge (Amazon Linux 2023) | 12 |
| Kubernetes | GKE v1.26.2, Helm 3.9 | EKS v1.26.2, Helm 3.11 | 2 |
| Object Storage | Cloud Storage, multi-regional | S3, 2x replication | 18 |
| SQL | Cloud SQL MySQL 8.0 | RDS MySQL 8.0.32 | 3 |
| Messaging | Pub/Sub (Pull), 12 topics | SQS (12 queues), SNS | as-is |
Note: Scrutinize hidden dependencies—Cloud Build triggers, scheduler jobs, and cross-project IAM roles frequently go unrecorded.
2. Service/Feature Mapping (And the Surprises)
Standard mappings exist, but the differences you don't uncover up front will bite. For example, a Cloud Function gets a public HTTPS endpoint that is gated only by IAM, whereas a Lambda function isn't reachable over HTTP at all until you explicitly attach a function URL or API Gateway.
| GCP Service | AWS Analog | Major Difference / Gotcha |
|---|---|---|
| Compute Engine | EC2 | Boot disk imaging is different; metadata APIs vary |
| GKE | EKS | Network plugin differences: EKS defaults to the VPC CNI; GKE offers Calico-based network policy out of the box |
| App Engine (Standard) | Elastic Beanstalk / Lambda | No direct equivalent; rewrite/repackage required |
| Cloud Storage | S3 | Bucket names are globally unique in both; your GCS bucket name may already be taken in S3 |
| BigQuery | Redshift / Athena | Migration can mean schema transformation; costs spike if the data pipeline isn't compressed |
| Pub/Sub | SNS / SQS | Exactly-once semantics differ; out-of-order delivery is possible |
| Cloud SQL | RDS / Aurora | Proxy settings and SSL setup vary |
| Secret Manager | Secrets Manager | Secret rotation APIs differ |
Non-obvious tip on service account mapping: GCP's metadata server and Workload Identity don't port directly. On EKS, use the cluster's OIDC provider with IAM Roles for Service Accounts (IRSA) to give pods AWS credentials; a hedged sketch follows.
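A minimal sketch using `eksctl`, assuming a cluster named `my-cluster` and a pre-existing IAM policy ARN (both placeholders):

```bash
# Register the cluster's OIDC issuer with IAM (one-time per cluster)
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve

# Create a Kubernetes service account bound to an IAM role (IRSA)
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace default \
  --name app-sa \
  --attach-policy-arn arn:aws:iam::123456789012:policy/app-policy \
  --approve
```

Pods that reference `app-sa` then pick up temporary credentials via the AWS SDK's web identity flow, which is roughly the EKS counterpart of GKE Workload Identity.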
3. Data Migration: No Silver Bullet
Migrating Cloud Storage to S3
Most articles suggest `aws s3 sync`. That's flawed at scale and with versioned buckets.
- For up to ~2–3 TB, stage through a host with both CLIs installed:

  ```bash
  gsutil -m rsync -r gs://my-bucket /local-tmp
  aws s3 sync /local-tmp s3://my-bucket --storage-class STANDARD_IA
  ```
- For larger volumes or recurring deltas: use AWS DataSync, which can read directly from Cloud Storage through its object-storage location type (authenticated with GCS HMAC keys). DataSync preserves object metadata and is far faster for 10 TB+ transfers because it parallelizes within the agent. Schedule DataSync jobs for low-traffic windows to avoid egress throttling on Google's side. A hedged CLI sketch follows.
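A rough sketch of the location and task setup, assuming a DataSync agent is already activated; the bucket names, ARNs, and HMAC credentials shown are placeholders:

```bash
# Source: Google Cloud Storage, addressed as a generic object-storage location
aws datasync create-location-object-storage \
  --server-hostname storage.googleapis.com \
  --bucket-name my-gcs-bucket \
  --access-key "$GCS_HMAC_ACCESS_KEY" \
  --secret-key "$GCS_HMAC_SECRET" \
  --agent-arns arn:aws:datasync:us-east-1:123456789012:agent/agent-0example

# Destination: the S3 bucket, via a role DataSync is allowed to assume
aws datasync create-location-s3 \
  --s3-bucket-arn arn:aws:s3:::my-bucket \
  --s3-config BucketAccessRoleArn=arn:aws:iam::123456789012:role/datasync-s3-role

# Tie them together; run this task repeatedly for delta copies
aws datasync create-task \
  --source-location-arn <gcs-location-arn> \
  --destination-location-arn <s3-location-arn> \
  --name gcs-to-s3-delta
```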
Gotcha: GCP egress quotas apply per project; expect `Rate exceeded` errors at high concurrency, so throttle accordingly.
Migrating Databases – Example: Cloud SQL MySQL 8 to RDS
Two-stage recommended:
- One-time logical dump (cold):

  ```bash
  mysqldump --single-transaction --set-gtid-purged=OFF \
    -u root -p -h <cloudsql-ip> dbname > /tmp/db.sql
  mysql -h <rds-endpoint> -u <admin-user> -p dbname < /tmp/db.sql
  ```

  (A logical dump is loaded with the `mysql` client against the RDS endpoint; `aws rds restore-db-instance-from-s3` only accepts Percona XtraBackup physical backups, not SQL dumps.) Not practical for >500 GB or 24/7 workloads.
- Minimal-downtime path: AWS DMS (Database Migration Service)
  - Use DMS with CDC (change data capture); replication engine version 3.4.5 or later improves Cloud SQL connectivity.
  - Snapshot, bootstrap, then run live replication. Expect some replication lag at cutover, usually under 5 minutes with careful tuning. A hedged CLI sketch follows below.
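A rough outline of the DMS plumbing, assuming a replication instance already exists; every identifier, hostname, ARN, and credential below is a placeholder:

```bash
# Source endpoint: Cloud SQL MySQL (reachable over public IP or VPN)
aws dms create-endpoint \
  --endpoint-identifier cloudsql-src \
  --endpoint-type source \
  --engine-name mysql \
  --server-name 203.0.113.10 --port 3306 \
  --username dms_user --password 'REDACTED'

# Target endpoint: the new RDS MySQL instance
aws dms create-endpoint \
  --endpoint-identifier rds-tgt \
  --endpoint-type target \
  --engine-name mysql \
  --server-name mydb.abc123.us-east-1.rds.amazonaws.com --port 3306 \
  --username admin --password 'REDACTED'

# Full load plus ongoing change data capture (CDC)
aws dms create-replication-task \
  --replication-task-identifier cloudsql-to-rds \
  --source-endpoint-arn <source-endpoint-arn> \
  --target-endpoint-arn <target-endpoint-arn> \
  --replication-instance-arn <replication-instance-arn> \
  --migration-type full-load-and-cdc \
  --table-mappings file://table-mappings.json
```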
Failure scenario: If Cloud SQL has GTID-based replication enabled and foreign keys are not enforced, DMS chokes with `Error: Table definition mismatch`. Solution: clean up the schema in advance.
4. Rebuild Infrastructure on AWS – Templated, Not Manual
Porting YAML manifests from GKE to EKS? Pin EKS versions. Networking defaults are not the same (the EKS VPC CNI assigns pod IPs directly from your VPC subnets).
- Infrastructure as Code: Use Terraform 1.5+ or AWS CDK v2 for provisioning. Example: SSE-KMS on S3 is not the default (new buckets get SSE-S3); you must declare the KMS key and encryption configuration explicitly.
- Secrets: Migrate GCP Secret Manager entries to AWS Secrets Manager with a short script; both CLIs can read and write secret payloads (see the sketch after this list).
- Networking: Implement VPC/subnet layouts that replicate your previous GCP segmentation, but remember AWS NACLs are stateless while Security Groups are stateful, whereas GCP firewall rules are uniformly stateful.
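A minimal sketch of that secrets copy loop, assuming the caller can access secret versions on the GCP side and create secrets on the AWS side; secret names are carried over unchanged:

```bash
#!/usr/bin/env bash
# Copy the latest version of every GCP Secret Manager secret into AWS Secrets Manager.
set -euo pipefail

for name in $(gcloud secrets list --format="value(name)" | awk -F/ '{print $NF}'); do
  value=$(gcloud secrets versions access latest --secret="$name")
  aws secretsmanager create-secret \
    --name "$name" \
    --secret-string "$value"
  echo "Migrated secret: $name"
done
```

Binary payloads and secrets with multiple active versions need special handling; this loop only carries the latest text value.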
Known issue: on EKS with the VPC CNI, pod density is capped by per-instance-type ENI and IP address limits, and subnets in each AZ need enough free addresses; GCP never made you think about this, so watch pod-to-node scheduling on high-throughput clusters.
5. Rigorous Testing Before (and After) Cutover
Checklist:
- Data integrity: CRC and row counts on SQL; `aws s3api head-object` checksums on the storage migration.
- Staging mirrors: Deploy (ideally via IaC) and perform smoke tests using real traffic patterns, not just synthetic checks.
- Performance baselining: Compare latencies using CloudWatch metrics on EC2 vs. GCP's Cloud Monitoring (formerly Stackdriver); networked DB calls can drift by 10–20 ms.
- IAM validation: Use the policy simulator (`aws iam simulate-principal-policy`) to catch under-permissioned roles early (example below).
- Rollback design: If possible, operate in read-only mode post-migration until the cutover is proven stable.
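Two hedged spot checks, with the bucket, key, role ARN, and action list as placeholders:

```bash
# Compare a migrated object's size and ETag against the source manifest
aws s3api head-object --bucket my-bucket --key reports/2024/q1.parquet

# Dry-run the permissions a workload role actually needs before it goes live
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/app-role \
  --action-names s3:GetObject sqs:SendMessage secretsmanager:GetSecretValue
```

Note that S3 ETags for multipart uploads are not plain MD5 sums, so compare sizes or explicit checksums for large objects.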
6. Cutover Execution and Traffic Switch
- Final data sync: Automate latest-delta copy via DataSync or DMS CDC.
- DNS cutover: Use Route 53 weighted routing policies for staged traffic shifting (sketch below); drop the record TTL to ≤60s at least a week beforehand.
- Monitoring: Use both CloudWatch and (temporarily) Cloud Monitoring/Stackdriver for overlapping observability. Prepare dashboards in advance; delayed detection is a common cause of prolonged downtime.
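A hedged example of shifting 10% of traffic to the AWS endpoint; the zone ID, record name, and ALB hostname are placeholders:

```bash
# Shift 10% of api.example.com traffic to the AWS load balancer
cat > shift-10pct.json <<'EOF'
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "api.example.com",
      "Type": "CNAME",
      "SetIdentifier": "aws-target",
      "Weight": 10,
      "TTL": 60,
      "ResourceRecords": [{ "Value": "my-alb-1234.us-east-1.elb.amazonaws.com" }]
    }
  }]
}
EOF

aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789EXAMPLE \
  --change-batch file://shift-10pct.json
```

A matching weighted record with the same name but a different `SetIdentifier` keeps the remaining 90% pointed at the GCP endpoint until you raise the weight.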
First hour post-cutover: watch for `502 Bad Gateway` on load balancers and sudden spikes in S3 403s (often missing bucket policies, not lost files).
7. Aftermath & Optimization
- Cost Explorer: Immediately inspect for unexpectedly high S3 PUT or Data Transfer OUT charges (a CLI query follows below).
- Security posture: Enable GuardDuty. Review default VPC rules; the default security group and NACL allow broad traffic out of the box.
- Serverless adaptation: Now's the window to refactor legacy App Engine apps to Lambda or containerized microservices on Fargate.
- CI/CD impacts: If you used Cloud Build triggers, replicate them in CodePipeline or attach GitHub Actions runners to EC2.
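A hedged daily-cost query by service for the first week after cutover (the dates are placeholders):

```bash
# Daily unblended cost, broken down by service
aws ce get-cost-and-usage \
  --time-period Start=2024-06-01,End=2024-06-08 \
  --granularity DAILY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=SERVICE
```

Data Transfer OUT charges tend to surface under the individual service lines rather than a single "transfer" item, so scan every service that moved.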
Post-cutover tip: Set up daily diff reports for IAM and S3/EC2 usage—silent permission mismatches crop up in the first weeks.
Failures, Gaps, and Lessons
- Networking gotchas: GCP IAM for service networking doesn't map onto AWS's ENI permissions. Routing tables also differ: GCP's implied routes don't exist in AWS. Triple-check custom routes via `aws ec2 describe-route-tables` (see the sketch after this list).
- Egress billing: pre-migration estimates routinely understate egress; actuals may run 10–20% higher due to object versioning or overlooked data flows.
- Lift-and-shift myths: Directly lifting a stateful monolith rarely works. Expect at least moderate refactor—especially where persistent disks or regional storage semantics differ.
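A small spot check for the routing point above, with the VPC ID as a placeholder:

```bash
# List every route in every route table attached to one VPC
aws ec2 describe-route-tables \
  --filters Name=vpc-id,Values=vpc-0abc1234def567890 \
  --query 'RouteTables[].Routes[].[DestinationCidrBlock,GatewayId,NatGatewayId,State]' \
  --output table
```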
Closing Note
No migration is perfectly clean. Start with exhaustive discovery, enable double-write during cutover if the app design permits, and accept that some permissions or performance optimizations emerge only with usage.
Choose your downtime windows judiciously and communicate with stakeholders (infra, security, dev, and finance). Odds are, something unexpected will surface—plan for it.
Related reading: See also AWS’s GCP to AWS migration playbook (PDF), and for context, compare GCP’s AWS migration docs.
Got stuck on a weird quota or IAM rule during migration? Drop a story—those edge cases shape the real playbook.