Migrate from Google Cloud to AWS

#Cloud #AWS #Migration #GCP #CloudMigration #AWSMigration

Step-by-Step Guide to Seamless Data and Application Migration from Google Cloud to AWS

Workload migrations between public clouds often surface at moments of growth, cost scrutiny, or technical crossroads. Moving from Google Cloud Platform (GCP) to Amazon Web Services (AWS) remains among the most demanding exercises for any ops or engineering team, exposing differences in IAM semantics, resource organization, and service models. Documentation rarely spells out details around database state, IAM mapping quirks, or subtle network policy mismatches—the aspects that blow up your “simple” cutover at 2 AM.

Below is a field-tested migration process engineered for minimal downtime, predictable rollback, and post-migration cost visibility, with non-obvious lessons compiled throughout.


Is AWS Worth the Switch from GCP? A Brief Justification

Cost calculators only paint half the picture. Consider:

  • Reserved Instance and Savings Plan benefits: AWS’s cost models, especially for EC2 and RDS, skew in your favor if workloads are predictable.
  • Depth of managed service catalog: Services like Aurora (with PostgreSQL 15 since 2023) or Step Functions make re-architecting feasible where GCP equivalents may lag.
  • Networking edge cases: AWS offers more granular cross-region VPC peering and more Direct Connect locations (notably for compliance initiatives).
  • Organizational inertia: If your existing ops automation and deployment pipelines already lean AWS, aligning platforms simplifies post-migration support.

1. Inventory and Classify GCP Assets (Reality Check #1)

First, run a comprehensive GCP asset dump, not just a listing of compute instances. Miss a managed secret or a Pub/Sub subscription and expect breakage.

Tooling:

  • Use gcloud asset list or export via Cloud Asset Inventory to BigQuery, then aggregate with SQL.
  • For multi-project orgs: script the export across every project rather than trusting someone’s spreadsheet (a sketch follows this list).
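
A minimal sketch of that multi-project export, assuming gcloud is authenticated with organization-wide asset viewer permissions; the bucket name is a placeholder:

    # Dump resource metadata for every project to a GCS bucket for later review.
    for project in $(gcloud projects list --format="value(projectId)"); do
      gcloud asset export \
        --project="$project" \
        --content-type=resource \
        --output-path="gs://my-asset-dumps/${project}-resources.json"
    done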

Table – Example “Lift” Assessment:

| Type | GCP Resource | AWS Target | Count |
| --- | --- | --- | --- |
| Compute | gce-vm-std-4 (n2-standard-4, Debian 11) | EC2 c5.xlarge (Amazon Linux 2023) | 12 |
| Kubernetes | GKE v1.26.2, Helm 3.9 | EKS v1.26.2, Helm 3.11 | 2 |
| Object Storage | Cloud Storage, multi-regional | S3, 2x replication | 18 |
| SQL | Cloud SQL MySQL 8.0 | RDS MySQL 8.0.32 | 3 |
| Messaging | Pub/Sub (Pull), 12 topics | SQS (12 queues), SNS | as-is |

Note: Scrutinize hidden dependencies—Cloud Build triggers, scheduler jobs, and cross-project IAM roles frequently go unrecorded.


2. Service/Feature Mapping (And the Surprises)

Standard mappings exist, but the differences they gloss over can bite. For example, Cloud Functions expose public HTTPS endpoints unless invocation is restricted via IAM; a Lambda function has no public endpoint until you explicitly put API Gateway (or another front end) in front of it.

| GCP Service | AWS Analog | Major Difference / Gotcha |
| --- | --- | --- |
| Compute Engine | EC2 | Boot disk imaging is different; metadata APIs vary |
| GKE | EKS | Network plugin differences: EKS defaults to the VPC CNI, while GKE supports Calico out of the box |
| App Engine (Standard) | Elastic Beanstalk / Lambda | No direct equivalent; rewrite/repackage required |
| Cloud Storage | S3 | Multi-region buckets don’t map directly; on S3 you configure replication explicitly |
| BigQuery | Redshift / Athena | Migration can mean schema transformation; costs spike if the data pipe isn’t compressed |
| Pub/Sub | SNS/SQS | Exactly-once semantics differ; out-of-order delivery is possible |
| Cloud SQL | RDS/Aurora | Proxy settings and SSL setup vary |
| Secret Manager | Secrets Manager | Secret rotation APIs differ |

Non-obvious tip on service accounts: GCP’s metadata server and Workload Identity don’t port directly. For pods on EKS, use IAM Roles for Service Accounts (IRSA), which maps Kubernetes service accounts to IAM roles through the cluster’s OIDC provider (sketch below).
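
A hedged sketch of that wiring with eksctl; the cluster, namespace, service account, and policy names are placeholders:

    # One-time: register the cluster's OIDC issuer as an IAM identity provider.
    eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve
    # Create a Kubernetes service account bound to an IAM role via IRSA.
    eksctl create iamserviceaccount \
      --cluster my-cluster \
      --namespace default \
      --name app-sa \
      --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
      --approve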


3. Data Migration: No Silver Bullet

Migrating Cloud Storage to S3

Most articles suggest aws s3 sync. That’s flawed at scale and with versioned buckets.

  • For up to ~2–3 TB:

    # Stage locally, then push; needs scratch disk at least the size of the bucket.
    gsutil -m rsync -r gs://my-bucket /local-tmp
    aws s3 sync /local-tmp s3://my-bucket --storage-class STANDARD_IA
    
  • For larger volumes or deltas: use AWS DataSync, which can read Cloud Storage as an object-storage source (a sketch follows below).

    DataSync preserves object metadata and is far faster for 10TB+ flows because it parallelizes transfers across ENIs. Schedule DataSync jobs for periods of low traffic to avoid egress throttling on Google’s side.
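
A rough sketch of the DataSync wiring, assuming a deployed DataSync agent and GCS HMAC keys; all ARNs and names are placeholders:

    # Source: Cloud Storage exposed via its S3-compatible endpoint and HMAC keys.
    aws datasync create-location-object-storage \
      --server-hostname storage.googleapis.com \
      --bucket-name my-bucket \
      --access-key "$GCS_HMAC_ACCESS_KEY" \
      --secret-key "$GCS_HMAC_SECRET" \
      --agent-arns arn:aws:datasync:us-east-1:111111111111:agent/agent-0example

    # Destination: the target S3 bucket, accessed through a DataSync-assumable role.
    aws datasync create-location-s3 \
      --s3-bucket-arn arn:aws:s3:::my-bucket \
      --s3-config BucketAccessRoleArn=arn:aws:iam::111111111111:role/datasync-s3-role

    # Task: tie the two locations together; start executions in low-traffic windows.
    aws datasync create-task \
      --source-location-arn <source-location-arn> \
      --destination-location-arn <dest-location-arn>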

Gotcha: GCP applies per-project egress quotas; expect Rate exceeded errors under high concurrency and throttle accordingly.

Migrating Databases – Example: Cloud SQL MySQL 8 to RDS

Two-stage recommended:

  1. One-time logical dump (cold):

    mysqldump --single-transaction --set-gtid-purged=OFF -u root -p -h <cloudsql-ip> dbname > /tmp/db.sql
    # Load via the mysql client; `aws rds restore-db-instance-from-s3` expects
    # Percona XtraBackup files, not a logical dump, so it doesn't apply here.
    mysql -u admin -p -h <rds-endpoint> dbname < /tmp/db.sql
    

    For databases over ~500 GB or for 24/7 workloads, this approach is not practical.

  2. Minimal-downtime: AWS DMS (Database Migration Service)

    • Use DMS with CDC (change data capture); DMS 3.4.5 and later improved Cloud SQL connectivity.
    • Snapshot, bootstrap, then run live replication (task sketch below). Expect replication lag at cutover, usually under 5 minutes with careful tuning.
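
A minimal sketch of that task, assuming the DMS endpoints and replication instance already exist; the ARNs and table-mapping file are placeholders:

    # Full load first, then ongoing CDC until cutover. All ARNs are placeholders.
    aws dms create-replication-task \
      --replication-task-identifier cloudsql-to-rds \
      --source-endpoint-arn <cloudsql-source-endpoint-arn> \
      --target-endpoint-arn <rds-target-endpoint-arn> \
      --replication-instance-arn <replication-instance-arn> \
      --migration-type full-load-and-cdc \
      --table-mappings file://table-mappings.json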

Failure scenario: If Cloud SQL has GTID-based replication enabled and foreign keys are not enforced, DMS chokes with “Error: Table definition mismatch.” Solution: clean up the schema in advance.


4. Rebuild Infrastructure on AWS – Templated, Not Manual

Porting YAML manifests from GKE to EKS? Pin the EKS version to match the source cluster (example below), and remember that networking defaults are not the same (EKS uses VPC-native networking).
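
For example, a minimal eksctl sketch that pins the control-plane version to the source GKE version; the cluster name and region are placeholders:

    # Keep the EKS version in lockstep with the GKE cluster being ported.
    eksctl create cluster --name migrated-cluster --region us-east-1 --version 1.26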

  • Infrastructure as Code: Use Terraform 1.5+ or AWS CDK v2 for provisioning. Example: S3 buckets don’t get KMS encryption by default; you must declare server-side encryption in your templates.
  • Secrets: Migrate GCP Secret Manager entries to AWS Secrets Manager with a short script; both CLIs can read and write secret values (see the sketch after this list).
  • Networking: Implement VPC/subnet layouts that replicate your previous GCP segmentation, but remember that AWS splits filtering between stateless NACLs and stateful Security Groups, whereas GCP uses a single stateful VPC firewall model.
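
A rough sketch of that secrets copy, assuming plain string secret values and names that are valid on both sides:

    # Copy the latest version of every GCP secret into AWS Secrets Manager.
    for name in $(gcloud secrets list --format="value(name.basename())"); do
      value=$(gcloud secrets versions access latest --secret="$name")
      aws secretsmanager create-secret --name "$name" --secret-string "$value"
    done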

Known issue: with the VPC CNI, pod density on an EKS node is bounded by the ENIs and IP addresses its instance type supports, so watch pod-to-node scheduling on high-throughput clusters.


5. Rigorous Testing Before (and After) Cutover

Checklist:

  • Data integrity: CRC and row counts on SQL; aws s3api head-object checksums on storage migration.
  • Staging mirrors: Deploy (ideally via IaC) and perform smoke tests using real traffic patterns, not just synthetic checks.
  • Performance baselining: Compare latencies using EC2 CloudWatch metrics vs. GCP Stackdriver—networked DB calls can drift by 10–20ms.
  • IAM validation: Use policy simulators (aws iam simulate-principal-policy) to catch under-permissioned roles early (sketch after this list).
  • Rollback design: If possible, operate in read-only mode post-migration until cutover is proven stable.
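
As an example of the IAM check, a hedged sketch; the role ARN, actions, and resource are placeholders for whatever the migrated workload actually needs:

    # Verify the migrated app role can perform its post-cutover actions.
    aws iam simulate-principal-policy \
      --policy-source-arn arn:aws:iam::111111111111:role/app-role \
      --action-names s3:GetObject s3:PutObject \
      --resource-arns "arn:aws:s3:::my-bucket/*" \
      --query "EvaluationResults[].[EvalActionName,EvalDecision]" \
      --output table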

6. Cutover Execution and Traffic Switch

  • Final data sync: Automate latest-delta copy via DataSync or DMS CDC.
  • DNS cutover: Use Route53 weighted policies for staged traffic shifting (sketch below); set TTL to ≤60s a week beforehand.
  • Monitoring: Use both CloudWatch and (temporarily) Stackdriver for overlapping observability. Prepare dashboards in advance; delayed detection is a common cause of extended downtime.
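
A hedged sketch of one weighted change; the zone ID, record name, and target are placeholders, and it assumes the existing GCP-facing record keeps the remaining weight. With a change-batch.json like:

    {
      "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "CNAME",
          "TTL": 60,
          "SetIdentifier": "aws-target",
          "Weight": 10,
          "ResourceRecords": [{"Value": "my-alb-123.us-east-1.elb.amazonaws.com"}]
        }
      }]
    }

apply it with:

    # Shift ~10% of traffic to AWS; raise the weight in stages as confidence grows.
    aws route53 change-resource-record-sets \
      --hosted-zone-id Z0EXAMPLE \
      --change-batch file://change-batch.json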

First hour post-cutover: watch for 502 Bad Gateway on load balancers and sudden spikes in S3 403s (often missing bucket policies, not lost files).


7. Aftermath & Optimization

  • Cost Explorer: Immediately inspect for unexpectedly high S3 PUT or Data Transfer OUT charges.
  • Security posture: Enable GuardDuty. Review default VPC rules—AWS ships some open by default.
  • Serverless adaptation: Now’s the window to refactor legacy App Engine apps to Lambda or containerized microservices on Fargate.
  • CI/CD impacts: If you used Cloud Build triggers, replicate them in CodePipeline or attach GitHub Actions runners on EC2.

Post-cutover tip: Set up daily diff reports for IAM and S3/EC2 usage—silent permission mismatches crop up in the first weeks.
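
One rough way to do that for IAM, assuming a cron-driven host with the AWS CLI configured and GNU date; the report paths are placeholders:

    # Snapshot the account's IAM model daily and diff it against yesterday's dump.
    today=$(date +%F)
    aws iam get-account-authorization-details > "/var/reports/iam-${today}.json"
    diff "/var/reports/iam-$(date -d yesterday +%F).json" \
         "/var/reports/iam-${today}.json" || true  # a non-empty diff just means drift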


Failures, Gaps, and Lessons

  • Networking gotchas: GCP IAM for service networking doesn’t map to AWS’s ENI permissions. Routing tables also differ; GCP’s implied routes don’t exist in AWS. Triple-check custom routes via aws ec2 describe-route-tables (sketch after this list).
  • Egress billing: Pre-migration egress estimates frequently come in low; actuals may be 10–20% higher due to object versioning or hidden data flows.
  • Lift-and-shift myths: Directly lifting a stateful monolith rarely works. Expect at least moderate refactor—especially where persistent disks or regional storage semantics differ.
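
A quick sketch of that route-table check; the JMESPath projection simply flattens the fields worth eyeballing:

    # Dump every route so missing custom routes stand out after the move.
    aws ec2 describe-route-tables \
      --query "RouteTables[].{Table:RouteTableId,Routes:Routes[].[DestinationCidrBlock,GatewayId,State]}" \
      --output json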

Closing Note

No migration is perfectly clean. Start with exhaustive discovery, enable double-write during cutover if the app design permits, and accept that some permissions or performance optimizations emerge only with usage.

Choose your downtime windows judiciously and communicate with stakeholders (infra, security, dev, and finance). Odds are, something unexpected will surface—plan for it.

Related reading: See also AWS’s GCP to AWS migration playbook (PDF), and for context, compare GCP’s AWS migration docs.


Got stuck on a weird quota or IAM rule during migration? Drop a story—those edge cases shape the real playbook.