Devops Topics To Learn


Mastering Infrastructure as Code (IaC) for Scalable and Reliable DevOps Pipelines

Manual setups do not scale. Consistency, auditability, and velocity in cloud operations demand Infrastructure as Code (IaC). For any production-grade environment, whether single cloud, multi-cloud, or on-prem, neglecting IaC is operational debt.


Why IaC Is Foundational in DevOps

Infrastructure state must be described in code: text files, checked in to Git, making drift and “works on my machine” problems traceable. Without it, reproducibility dissolves—think of a prod deployment at 3 AM that fails because someone changed a security group by hand last week.

Key advantages:

  • Reproducibility: Identical environments can be spun up on demand (dev, staging, prod).
  • Scalability: Versioned infrastructure handles growth and rollbacks predictably.
  • Reduced Configuration Drift: Drift is tracked. Unapproved changes become visible fast.
  • Faster Incident Response: Roll back infra alongside application code.
  • Collaboration: Infrastructure is reviewed and peer-audited, no tribal knowledge required.

Core Concepts: What Actually Matters

  • Declarative vs Imperative Tools

    • Declarative (HashiCorp Terraform ≥1.5.0, AWS CloudFormation, Pulumi) says what the infrastructure should look like.
    • Imperative or procedural (shell scripts, Ansible playbooks run task by task) spells out every step, so you maintain the order of operations yourself.
  • Idempotency

    • Crucial: terraform apply can be run multiple times; the end state is consistent. If your IaC isn't idempotent, expect subtle bugs.
  • State Management

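    • For tools like Terraform, state files (terraform.tfstate) map your configuration to the real resources it manages. Lose or corrupt state and Terraform no longer knows what it owns: an apply may try to recreate resources that already exist, and a “destroy” can leave orphans behind.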
    • Note: Use S3 with DynamoDB locking or Terraform Cloud to prevent concurrent state writes.
  • Modularity

    • Build modules: e.g., network, compute, IAM blocks. Rapidly assemble infra from vetted parts.

Example: Terraform to Provision a Private S3 Bucket

Terraform is a common standard on production teams. This example pins the AWS provider to the 5.22.x series (written June 2024):

main.tf

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.22.0"
    }
  }
  required_version = ">= 1.5.0"
}

provider "aws" {
  region  = "us-east-1"
}

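resource "aws_s3_bucket" "private_data" {
  bucket        = "devops-example-bucket-324985" # bucket names are globally unique; change this
  force_destroy = true                           # lets destroy delete a non-empty bucket; dev only

  tags = {
    Environment = "dev"
    Project     = "iac-demo"
  }
}

# The acl argument on aws_s3_bucket is deprecated in AWS provider v4 and later,
# so privacy is enforced by blocking public access explicitly instead.
resource "aws_s3_bucket_public_access_block" "private_data" {
  bucket                  = aws_s3_bucket.private_data.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}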

resource "aws_s3_bucket_versioning" "versioning" {
  bucket = aws_s3_bucket.private_data.id
  versioning_configuration {
    status = "Enabled"
  }
}

Initialize and plan:

terraform init
terraform plan -out plan.out

Apply the saved plan (Terraform does not prompt for confirmation when given a plan file, so review the plan output first):

terraform apply "plan.out"

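Destroy resources (-auto-approve skips the confirmation prompt, and force_destroy = true deletes the bucket's contents too, so keep this to throwaway environments):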

terraform destroy -auto-approve

Gotcha: If you have not set up credentials with aws configure, or your ~/.aws/credentials are stale, expect authentication errors:

│ Error: error configuring Terraform AWS Provider: no valid credential sources found for AWS Provider

State Management in Real Environments

Local state (terraform.tfstate) is a liability in team setups. Always configure remote state for shared work:

backend.tf

terraform {
  backend "s3" {
    bucket         = "iac-state-prod"
    key            = "terraform/devops-demo.tfstate"
    region         = "us-east-1"
    dynamodb_table = "iac-lock-table"
    encrypt        = true
  }
}

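Known issue: the lock lives in the DynamoDB table, not in S3. A run that starts while another holds the lock fails with a lock-acquisition error unless you pass -lock-timeout (for example, -lock-timeout=60s) to make it wait. Prefer short-lived runs, and use terraform force-unlock only after confirming the lock is stale.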
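
The backend block above assumes the lock table already exists. A minimal sketch of that table follows, assuming it is bootstrapped from a separate configuration before the first terraform init against the backend. The resource label terraform_lock and the on-demand billing mode are illustrative choices, not something the backend requires.

```hcl
# Lock table for the S3 backend. Create it from a separate bootstrap
# configuration (or by hand), because a backend cannot create its own table.
resource "aws_dynamodb_table" "terraform_lock" {
  name         = "iac-lock-table"  # must match dynamodb_table in backend.tf
  billing_mode = "PAY_PER_REQUEST" # assumption: on-demand billing, lock traffic is tiny
  hash_key     = "LockID"          # the S3 backend expects this exact key name

  attribute {
    name = "LockID"
    type = "S" # string partition key
  }
}
```

The state bucket itself (iac-state-prod here) needs the same bootstrapping, ideally with versioning enabled so an earlier state file can be recovered.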


CI/CD Integration: Automating IaC Deployments

Integrate plans and applies into your pipeline (e.g., GitHub Actions, Jenkins):

  • Run terraform fmt -check and terraform validate in PR checks, so unformatted or invalid code fails the build instead of being rewritten silently.
  • Use terraform plan for reviewable diffs.
  • Restrict terraform apply to trusted runners, gated via code review.

Example: GitHub Actions step

- name: Terraform Plan
  run: terraform plan -input=false -out=plan.out

Modularization and Encapsulation

Break infra into reusable modules. For instance, a VPC module parameterized by CIDR and public subnet count. Example file tree:

modules/
  vpc/
    main.tf
    variables.tf
    outputs.tf
environments/
  prod/
    main.tf
    backend.tf
  staging/
    main.tf
    backend.tf
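
As a rough sketch of how the tree above could fit together, the fragment below shows hypothetical contents for the VPC module and one environment that calls it. The variable names (cidr_block, public_subnet_count), the CIDR value, and the subnet sizing are assumptions made for illustration, and routing (internet gateway, route tables) is left out to keep it short.

```hcl
# modules/vpc/variables.tf (hypothetical inputs)
variable "cidr_block" {
  description = "CIDR range for the VPC"
  type        = string
}

variable "public_subnet_count" {
  description = "Number of public subnets to create"
  type        = number
  default     = 2
}

# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
}

resource "aws_subnet" "public" {
  count  = var.public_subnet_count
  vpc_id = aws_vpc.this.id

  # Carve one /24 per subnet out of a /16 VPC range:
  # index 0 -> 10.0.0.0/24, index 1 -> 10.0.1.0/24, and so on.
  cidr_block = cidrsubnet(var.cidr_block, 8, count.index)
}

# modules/vpc/outputs.tf
output "vpc_id" {
  value = aws_vpc.this.id
}

# environments/prod/main.tf
module "vpc" {
  source = "../../modules/vpc"

  cidr_block          = "10.0.0.0/16"
  public_subnet_count = 3
}
```

Staging would call the same module with its own values, so the two environments differ only in inputs, not in structure.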

Non-Obvious Best Practice

Automate Integration Tests Against Real Infrastructure

Tools like Terratest let you run a real terraform apply and then assert on the result, for example that a port is closed or that data does not leak. Run these in ephemeral test accounts, never in prod.
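
Terratest is a Go library. To stay in HCL, the sketch below swaps in a different tool for the same idea: the native terraform test command, which requires Terraform 1.6 or later. It assumes the S3 example from earlier in this article is the configuration under test, and the file name tests/bucket.tftest.hcl is hypothetical.

```hcl
# tests/bucket.tftest.hcl
# Run with: terraform test
# The default command for a run block is apply, so this creates real
# resources and destroys them when the test run finishes.
run "bucket_versioning_enabled" {
  command = apply

  assert {
    condition     = aws_s3_bucket_versioning.versioning.versioning_configuration[0].status == "Enabled"
    error_message = "Versioning must be enabled on the private data bucket."
  }
}
```

This only checks attributes Terraform itself can see. Probing behavior from outside, such as whether a port actually refuses connections, is where a tool like Terratest earns its place.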


IaC Tools: Beyond Terraform

Some scenarios demand alternatives:

Tool                Usage                            Strengths
AWS CloudFormation  AWS-native stacks                Deep AWS integration
Pulumi              Python/Go/Node infra as code     Familiar languages
Ansible             Configuration post-provisioning  Fine-grained changes
Chef/Puppet         Server configuration management  Legacy compatibility
Bicep/ARM           Azure-native                     Modern Azure syntax

Note: Evaluate cost to switch. Teams typically support one main provisioning tool and one config management layer.


Final Notes

IaC is foundational—teams running manual setup lag behind on velocity and reliability. Pick a tool, get sample code in version control, hook it to CI/CD, and enforce drift detection.

Not everything will “just work.” Expect provider bugs, CLI edge cases, and merge conflicts in .tfstate files if state ever lands in Git (one more reason to use remote state). Experience is troubleshooting drift on a Friday at 5 pm because someone hotfixed a setting in the console. Treat infra code with the same rigor as application code.

If managing secrets, skip plaintext variables. Use AWS Secrets Manager or Vault to inject at runtime. Never check credentials into Git.
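
One way this can look in Terraform is sketched below, assuming a secret already exists in AWS Secrets Manager. The secret name prod/db/password and the local name db_password are hypothetical.

```hcl
# Read the current secret value at plan/apply time instead of hardcoding it.
# Assumption: a secret named "prod/db/password" already exists.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password"
}

locals {
  # The provider marks secret_string as sensitive, so Terraform redacts
  # it in plan and apply output.
  db_password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
```

Values read this way are still written to the state file in plaintext, which is another reason to keep state in an encrypted, access-controlled backend.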

Most importantly: automate tests of your infra, and document trade-offs—future you will appreciate it.