Learn To Cloud

#Cloud #Automation #IaC #Terraform #Infrastructure

Mastering Cloud Infrastructure Automation with Infrastructure as Code

Patching a broken cloud deployment at 2 AM typically means someone bypassed automation in favor of manual steps. Such incidents are avoidable—and often stem from the absence of Infrastructure as Code (IaC).


Eliminating Configuration Drift

Hand-configured AWS instances, untracked firewall rules, or ad-hoc storage buckets—these are recipes for inconsistency and difficult troubleshooting. In multi-account scenarios, mistakes compound. A mature team treats its infrastructure like application code: under version control, peer-reviewed, reproducible.

IaC frameworks such as Terraform, Pulumi, or AWS CloudFormation encode your infrastructure as source files that describe the desired end state (HCL for Terraform, JSON or YAML for CloudFormation, general-purpose languages for Pulumi). Every environment is derived from source, making "What’s running in production?" an answerable question. Critically, IaC lets you audit every change and revert or redeploy with full traceability.


Practical Start: Terraform for Repeatable Builds

The following example uses Terraform v1.6.0 or later, chosen for its stability and cross-cloud support.

1. Environment Preparation

  • Terraform installation: Download the binary from https://www.terraform.io/downloads.html (Linux/amd64 is the usual choice for CI runners).
  • Cloud credentials: Configure your provider CLI tools (aws configure, az login, or gcloud auth login). For AWS, confirm that your access keys are present in ~/.aws/credentials.

Note: For team use, avoid embedding credentials in code. Prefer IAM roles, service accounts, or OIDC integration.
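
One way to follow this advice is to have the AWS provider assume an IAM role rather than read long-lived keys. The block below is a minimal sketch of that variant, not a required setup; the role ARN and session name are hypothetical placeholders.

```hcl
# Variant provider block: Terraform assumes an IAM role instead of relying on
# static access keys. The ARN and session name below are hypothetical.
provider "aws" {
  region = "us-east-1"

  assume_role {
    role_arn     = "arn:aws:iam::123456789012:role/terraform-deploy" # hypothetical
    session_name = "terraform-ci"                                    # hypothetical
  }
}
```

The identity Terraform starts with (for example a CI runner's OIDC-issued credentials or an instance profile) still needs permission to call sts:AssumeRole on that role.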


2. Minimal, Self-Documenting IaC: AWS Example

Create a main.tf in a clean workspace:

terraform {
  required_version = ">= 1.6.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.34.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "bastion" {
  ami           = "ami-0c94855ba95c71c99"    # Amazon Linux 2, confirmed as of June 2024
  instance_type = "t2.micro"
  tags = {
    Name        = "bastion"
    Environment = "dev"
  }
}

Known issue: AMI IDs vary by region and can become obsolete. Resolve the AMI from a public SSM parameter if possible; otherwise, expect to update the hardcoded AMI ID periodically.
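
The SSM approach can be sketched as follows. This is a variant of the bastion resource above, assuming the public Amazon Linux 2 parameter path that AWS publishes; the data source name al2 is an illustrative choice.

```hcl
# Look up the current Amazon Linux 2 AMI from AWS's public SSM parameter
# instead of hardcoding a region-specific AMI ID.
data "aws_ssm_parameter" "al2" {
  name = "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
}

resource "aws_instance" "bastion" {
  # The data source marks its value as sensitive; an AMI ID is not a secret,
  # so nonsensitive() keeps it readable in plan output.
  ami           = nonsensitive(data.aws_ssm_parameter.al2.value)
  instance_type = "t2.micro"

  tags = {
    Name        = "bastion"
    Environment = "dev"
  }
}
```

The trade-off: when AWS publishes a newer image, the next plan may propose replacing the instance. If that is unwanted, a lifecycle block with ignore_changes = [ami] is one way to hold the instance on its original image.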


3. Lifecycle Control: Init, Plan, Apply

terraform init    # Downloads providers and modules, initializes the backend
terraform plan    # Dry-run: Shows create/modify/destroy actions
terraform apply   # Provisions actual resources, prompts for confirmation

Sample output (excerpt):

Plan: 1 to add, 0 to change, 0 to destroy.
...
aws_instance.bastion: Creating...
aws_instance.bastion: Creation complete after 35s [id=i-0123456789abcdef0]

Errors such as Error: No valid credential sources found for AWS Provider indicate missing or misconfigured credentials.
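
To make the result of an apply easier to consume, you could declare outputs alongside the resource. The sketch below assumes the aws_instance.bastion resource from main.tf; the output names are hypothetical.

```hcl
# Hypothetical outputs for the bastion defined in main.tf.
output "bastion_instance_id" {
  description = "ID of the bastion EC2 instance"
  value       = aws_instance.bastion.id
}

output "bastion_public_ip" {
  description = "Public IP of the bastion, if one was assigned"
  value       = aws_instance.bastion.public_ip
}
```

After a successful apply, terraform output bastion_instance_id prints the stored value without re-running a plan.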


4. Full Infra Lifecycle and Drift Management

  • To modify, update .tf files, re-run plan/apply.
  • To destroy everything created by this root module:
terraform destroy

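Note: If you refactor resources into modules or import pre-existing cloud resources, resource addresses in state can stop matching your configuration, and Terraform may then plan to destroy and recreate resources that merely moved. Sometimes, manual state file surgery (terraform state mv ...) becomes necessary.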
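
For many refactors, declarative moved blocks (Terraform 1.1+) and import blocks (Terraform 1.5+) can stand in for manual state commands. The sketch below is illustrative only: the addresses jump_host and legacy and the instance ID are hypothetical.

```hcl
# Record a rename so Terraform updates state instead of planning a
# destroy/create. Assumes the resource block has been renamed to
# aws_instance.jump_host in the configuration (hypothetical name).
moved {
  from = aws_instance.bastion
  to   = aws_instance.jump_host
}

# Adopt an instance that was created outside Terraform.
# The resource name and instance ID are placeholders.
import {
  to = aws_instance.legacy
  id = "i-0123456789abcdef0"
}

resource "aws_instance" "legacy" {
  # Arguments should match the real instance closely enough that the plan
  # shows no unwanted changes; fill these in from the existing resource.
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"
}
```

Both blocks are processed during a normal plan/apply, so the change is reviewed in a pull request like any other edit.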


Going Beyond Basics

Variable Injection

Parameterize to support multiple environments:

variable "instance_type" {
  description = "EC2 instance size"
  type        = string
  default     = "t2.micro"
}

resource "aws_instance" "bastion" {
  ami           = "ami-0c94855ba95c71c99"
  instance_type = var.instance_type
}

Use terraform apply -var="instance_type=t3.small" for overrides.
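
Overrides are easier to trust when the variable rejects unexpected values. The following is a sketch of the same variable with a validation rule; the list of allowed instance types is an assumption chosen for illustration.

```hcl
variable "instance_type" {
  description = "EC2 instance size"
  type        = string
  default     = "t2.micro"

  # Fail at plan time if an unapproved size is passed in.
  # The allowed list is an illustrative assumption.
  validation {
    condition     = contains(["t2.micro", "t3.micro", "t3.small"], var.instance_type)
    error_message = "instance_type must be one of t2.micro, t3.micro, or t3.small."
  }
}
```

For per-environment values, a file such as prod.tfvars (hypothetical name) containing instance_type = "t3.small" can be passed with terraform apply -var-file="prod.tfvars".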

Remote State

Store state remotely in S3 with DynamoDB locking to enable safe collaboration:

terraform {
  backend "s3" {
    bucket         = "iac-state-prod"
    key            = "bastion/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "iac-state-lock"
    encrypt        = true
  }
}

Gotcha: If state locking is missing or fails, concurrent applies can corrupt the state file and leave real infrastructure out of sync with it.
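
The bucket and lock table must exist before terraform init can use them. One common approach is a small bootstrap configuration, kept separate from the stacks that use the backend; the sketch below reuses the names from the backend block above, and the Terraform resource labels are illustrative.

```hcl
# Bootstrap configuration: creates the state bucket and lock table.
resource "aws_s3_bucket" "state" {
  bucket = "iac-state-prod"
}

# Versioning lets you recover an earlier state file after a bad write.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string partition key named exactly "LockID".
resource "aws_dynamodb_table" "state_lock" {
  name         = "iac-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```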

Modular Architecture

Segment recurring patterns—VPCs, security groups, IAM roles—into reusable modules (./modules/vpc, etc.). This reduces copy-paste drift and enables isolated testing.
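
As a rough illustration, a VPC module and its caller might look like the following. The two files are shown in one listing for brevity; the variable names, CIDR range, and output name are hypothetical.

```hcl
# ./modules/vpc/main.tf (hypothetical module contents)
variable "cidr_block" {
  description = "CIDR range for the VPC"
  type        = string
}

variable "name" {
  description = "Value for the VPC's Name tag"
  type        = string
}

resource "aws_vpc" "this" {
  cidr_block = var.cidr_block

  tags = {
    Name = var.name
  }
}

output "vpc_id" {
  description = "ID of the created VPC"
  value       = aws_vpc.this.id
}

# ./main.tf (root module calling the VPC module)
module "vpc" {
  source     = "./modules/vpc"
  cidr_block = "10.0.0.0/16"
  name       = "dev"
}
```

Other resources in the root module can then reference module.vpc.vpc_id instead of repeating the VPC definition.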


CI/CD Integration and Policy Enforcement

Automate IaC execution in CI pipelines (GitHub Actions, GitLab CI, Jenkins). Guardrails: run terraform fmt -check, terraform validate, then plan on pull requests. Apply via pipeline with protected credentials.

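For compliance, integrate Sentinel or Open Policy Agent (OPA) to enforce controls (e.g., block public S3 buckets). A simplified Rego rule for OPA:

package terraform.policy

import rego.v1

# Assumes a simplified input document with resource_type and public fields.
deny contains msg if {
  input.resource_type == "aws_s3_bucket"
  input.public == true
  msg := "S3 buckets cannot be public"
}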

Non-Obvious Tip

When scaling teams: avoid monolithic state files. Instead, use multiple root modules for logical separation (e.g., networking, compute, monitoring). This speeds up plan cycles and reduces accidental cross-environment impact.
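
Separate root modules still need to share a few values, such as subnet IDs. One option is the terraform_remote_state data source, sketched below under the assumption that a networking root module stores its state under the key shown and exposes an output named public_subnet_id; both names are hypothetical.

```hcl
# In the compute root module: read outputs published by the networking
# root module. The state key and output name are hypothetical.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "iac-state-prod"
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "bastion" {
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"
  subnet_id     = data.terraform_remote_state.network.outputs.public_subnet_id
}
```

This grants the compute stack read access to the entire networking state, so some teams prefer publishing shared values through a narrower channel such as SSM parameters.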


Closing Notes

IaC elevates infrastructure work to first-class engineering—observable, testable, and repeatable. Expect sharp edges: state file conflicts, provider bugs, and upstream cloud API quirks are inevitable. Always version-lock providers and audit your state storage.

Start small. Automate a single resource. Validate it. Then expand incrementally, applying patterns and learnings as you progress.

For more advanced coverage—modular patterns, multi-cloud architectures, and defense-in-depth—look out for forthcoming articles.


No more chasing ephemeral cloud changes. Shift your focus: let infrastructure code, not tribal knowledge, govern your platform.