🚀 Developer Cookbook - PHASE 3: Infrastructure and DevOps

Practical recipes for building, deploying, and operating modern infrastructure


📚 Table of Contents

  1. Containers and Orchestration
  2. Cloud Computing
  3. Infrastructure as Code (IaC)
  4. CI/CD and Automation

Containers and Orchestration

Recipe 3.1: Docker - Basic Containerization

What is Docker? A platform for packaging applications together with their dependencies into isolated, portable containers.

Key concepts: image (immutable template built from a Dockerfile), container (a running instance of an image), layer (a cached filesystem step of a build), and registry (where images are stored and distributed, e.g. Docker Hub or ECR).

Dockerfile - Python API:

# ===== MULTI-STAGE BUILD =====
# Stage 1: Builder (heavy build dependencies)
FROM python:3.11-slim AS builder

WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (layer caching)
COPY requirements.txt .

# Install dependencies into a virtualenv
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime (minimal final image)
FROM python:3.11-slim

# Create a non-root user (security best practice)
RUN useradd -m -u 1000 appuser

WORKDIR /app

# Copy the virtualenv from the builder
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Copy the application code
COPY --chown=appuser:appuser . .

# Switch to the non-root user
USER appuser

# Health check (must exit non-zero when the endpoint is unhealthy)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD python -c "import sys, requests; sys.exit(0 if requests.get('http://localhost:8000/health').ok else 1)"

# Expose the port
EXPOSE 8000

# Startup command
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "4", "app:app"]

Dockerfile - Node.js:

# ===== MULTI-STAGE BUILD =====
FROM node:18-alpine AS builder

WORKDIR /app

# Copy package files
COPY package*.json ./

# Install all dependencies (dev deps are needed for the build step)
RUN npm ci

# Copy source code
COPY . .

# Build (if you use TypeScript, webpack, etc.)
RUN npm run build

# Drop dev dependencies so only production modules ship to the runtime stage
RUN npm prune --omit=dev

# ===== RUNTIME =====
FROM node:18-alpine

# Install dumb-init (better signal handling)
RUN apk add --no-cache dumb-init

# Non-root user
RUN addgroup -g 1000 nodeuser && \
    adduser -D -u 1000 -G nodeuser nodeuser

WORKDIR /app

# Copy node_modules and the build output
COPY --from=builder --chown=nodeuser:nodeuser /app/node_modules ./node_modules
COPY --from=builder --chown=nodeuser:nodeuser /app/dist ./dist
COPY --chown=nodeuser:nodeuser package*.json ./

USER nodeuser

EXPOSE 3000

# Use dumb-init as PID 1
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["node", "dist/index.js"]

Essential Docker commands:

# ===== BUILD =====
# Basic build
docker build -t myapp:1.0 .

# Build with arguments
docker build --build-arg NODE_ENV=production -t myapp:prod .

# Build without cache
docker build --no-cache -t myapp:latest .

# Multi-platform build
docker buildx build --platform linux/amd64,linux/arm64 -t myapp:multiarch .

# ===== RUN =====
# Basic run
docker run -d -p 8000:8000 --name myapp myapp:1.0

# Run with environment variables
docker run -d \
  -e DATABASE_URL=postgresql://localhost/mydb \
  -e API_KEY=secret123 \
  --name myapp \
  myapp:1.0

# Run with volumes (persistence)
docker run -d \
  -v $(pwd)/data:/app/data \
  -v myapp-logs:/var/log \
  --name myapp \
  myapp:1.0

# Run with resource limits
docker run -d \
  --memory="512m" \
  --cpus="1.5" \
  --name myapp \
  myapp:1.0

# Interactive run (debugging)
docker run -it --rm myapp:1.0 /bin/bash

# ===== INSPECT =====
# View logs
docker logs -f myapp

# View processes
docker top myapp

# Live resource stats
docker stats myapp

# Inspect a container
docker inspect myapp

# Run a command inside a running container
docker exec -it myapp /bin/bash
docker exec myapp ls -la /app

# ===== CLEANUP =====
# Stop a container
docker stop myapp

# Remove a container
docker rm myapp

# Remove an image
docker rmi myapp:1.0

# Clean up everything (stopped containers, unused images)
docker system prune -a

# Show disk usage
docker system df

Docker Compose - Multi-container App:

# docker-compose.yml
version: '3.8'  # obsolete in Compose v2; kept for backward compatibility

services:
  # API Backend
  api:
    build:
      context: ./api
      dockerfile: Dockerfile
      args:
        - NODE_ENV=production
    container_name: myapp-api
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/myapp
      - REDIS_URL=redis://redis:6379
      - NODE_ENV=production
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    volumes:
      - ./api/logs:/app/logs
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  # PostgreSQL Database
  db:
    image: postgres:15-alpine
    container_name: myapp-db
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Redis Cache
  redis:
    image: redis:7-alpine
    container_name: myapp-redis
    command: redis-server --appendonly yes
    volumes:
      - redis-data:/data
    networks:
      - app-network
    restart: unless-stopped

  # Nginx Reverse Proxy
  nginx:
    image: nginx:alpine
    container_name: myapp-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro
    depends_on:
      - api
    networks:
      - app-network
    restart: unless-stopped

  # Worker (background jobs)
  worker:
    build:
      context: ./api
      dockerfile: Dockerfile
    container_name: myapp-worker
    command: ["npm", "run", "worker"]
    environment:
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:password@db:5432/myapp
    depends_on:
      - redis
      - db
    networks:
      - app-network
    restart: unless-stopped

volumes:
  postgres-data:
    driver: local
  redis-data:
    driver: local

networks:
  app-network:
    driver: bridge

Docker Compose commands:

# Start all services
docker-compose up -d

# Start specific services
docker-compose up -d api db

# View logs
docker-compose logs -f api

# Scale a service
docker-compose up -d --scale worker=3

# Rebuild and restart
docker-compose up -d --build

# Stop services
docker-compose stop

# Stop and remove containers
docker-compose down

# Stop and remove containers + volumes
docker-compose down -v

# List running services
docker-compose ps

# Validate and render the effective configuration
docker-compose config
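
A related pattern worth knowing: Compose automatically merges a docker-compose.override.yml file on top of docker-compose.yml, which keeps dev-only settings out of the base file. A minimal sketch (the mounted path and debug port are illustrative):

# docker-compose.override.yml (picked up automatically; usually git-ignored)
services:
  api:
    environment:
      - NODE_ENV=development
    volumes:
      - ./api/src:/app/src   # mount source for live reload
    ports:
      - "9229:9229"          # Node.js inspector port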

Image Optimization:

# ❌ BAD: Heavy image (~1.2 GB)
FROM ubuntu:latest
RUN apt-get update && apt-get install -y python3 python3-pip
COPY . /app
WORKDIR /app
RUN pip3 install -r requirements.txt
CMD ["python3", "app.py"]

# ✅ GOOD: Optimized image (~150 MB)
FROM python:3.11-slim
WORKDIR /app

# Install dependencies first (caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy code afterwards
COPY . .

# Non-root user
RUN useradd -m appuser
USER appuser

CMD ["python", "app.py"]

# ✅ BETTER: Multi-stage build (smaller still)
# Note: builder and runtime must share the same base image family;
# wheels compiled on Debian (slim/glibc) break on Alpine (musl).
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

FROM python:3.11-slim
WORKDIR /app
RUN useradd -m appuser
COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
COPY --chown=appuser:appuser . .
ENV PATH=/home/appuser/.local/bin:$PATH
USER appuser
CMD ["python", "app.py"]

Best Practices:

# 1. USE a .dockerignore
# .dockerignore
node_modules
npm-debug.log
.git
.env
*.md
.pytest_cache
__pycache__

# 2. LAYER CACHING - Copy dependency manifests first
COPY package*.json ./
RUN npm ci
COPY . .  # Code changes frequently, so it goes last

# 3. MINIMIZE LAYERS - Combine commands
RUN apt-get update && \
    apt-get install -y curl wget && \
    rm -rf /var/lib/apt/lists/*

# 4. USE ALPINE IMAGES where possible
FROM node:18-alpine  # a fraction of the size of the full node:18 image

# 5. DO NOT RUN AS ROOT
RUN adduser -D -u 1000 appuser
USER appuser

# 6. PIN VERSIONS
FROM node:18.16.0-alpine  # ✅
FROM node:latest          # ❌

# 7. HEALTH CHECKS
HEALTHCHECK --interval=30s CMD curl -f http://localhost/ || exit 1

# 8. METADATA
LABEL maintainer="dev@example.com"
LABEL version="1.0"
LABEL description="My application"

Recipe 3.2: Kubernetes - Container Orchestration

What is Kubernetes? An orchestration system that automates the deployment, scaling, and management of containerized applications.

Basic architecture: a control plane (API server, etcd, scheduler, controller manager) schedules workloads onto worker nodes (kubelet, kube-proxy, container runtime), which run Pods.

Deployment YAML:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-api
  namespace: production
  labels:
    app: myapp
    tier: backend
spec:
  # Number of replicas
  replicas: 3
  
  # Selector that identifies the Pods
  selector:
    matchLabels:
      app: myapp
      tier: backend
  
  # Deployment strategy
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # extra Pods allowed during an update
      maxUnavailable: 0  # never drop below the desired replica count
  
  # Pod template
  template:
    metadata:
      labels:
        app: myapp
        tier: backend
        version: v1.2.0
    spec:
      # Affinity: prefer spreading Pods across nodes (HA)
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - myapp
              topologyKey: kubernetes.io/hostname
      
      # Init containers (run before the main containers)
      initContainers:
      - name: wait-for-db
        image: busybox:1.28
        command: ['sh', '-c', 'until nc -z postgres 5432; do echo waiting for db; sleep 2; done']
      
      # Main containers
      containers:
      - name: api
        image: myregistry.io/myapp:1.2.0
        imagePullPolicy: IfNotPresent
        
        # Ports
        ports:
        - containerPort: 3000
          name: http
          protocol: TCP
        
        # Environment variables
        env:
        - name: NODE_ENV
          value: "production"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: myapp-secrets
              key: database-url
        - name: REDIS_HOST
          valueFrom:
            configMapKeyRef:
              name: myapp-config
              key: redis-host
        
        # Resource limits (CRITICAL)
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        
        # Probes (health checks)
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        
        readinessProbe:
          httpGet:
            path: /ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        
        # Volumes
        volumeMounts:
        - name: config
          mountPath: /app/config
          readOnly: true
        - name: logs
          mountPath: /app/logs
      
      # Volumes
      volumes:
      - name: config
        configMap:
          name: myapp-config
      - name: logs
        emptyDir: {}
      
      # Image pull secrets (private registries)
      imagePullSecrets:
      - name: myregistry-secret

Service (LoadBalancer):

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-api-service
  namespace: production
  labels:
    app: myapp
spec:
  type: LoadBalancer  # ClusterIP, NodePort, LoadBalancer
  selector:
    app: myapp
    tier: backend
  ports:
  - port: 80          # service port
    targetPort: 3000  # container port
    protocol: TCP
    name: http
  sessionAffinity: ClientIP  # Sticky sessions

ConfigMap (configuration):

# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: production
data:
  # Simple values
  redis-host: "redis-service.production.svc.cluster.local"
  log-level: "info"
  
  # Entire files
  app-config.json: |
    {
      "api": {
        "rateLimit": 100,
        "timeout": 30
      },
      "features": {
        "newUI": true
      }
    }

Secret (credentials):

# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
  namespace: production
type: Opaque
data:
  # Base64 encoded
  database-url: cG9zdGdyZXNxbDovL3VzZXI6cGFzc0BkYi9teWRi
  api-key: c2VjcmV0MTIz
stringData:
  # Plain text (auto-encoded)
  smtp-password: "mypassword123"
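
Both objects can also be created imperatively, which is handy for quick iteration (the literal values below are illustrative):

# ConfigMap from literals
kubectl create configmap myapp-config \
  --from-literal=redis-host=redis-service.production.svc.cluster.local \
  --from-literal=log-level=info \
  -n production

# Secret from literals (base64-encoded automatically)
kubectl create secret generic myapp-secrets \
  --from-literal=database-url=postgresql://user:pass@db/mydb \
  -n production

# Decode a stored secret value for inspection
kubectl get secret myapp-secrets -n production \
  -o jsonpath='{.data.database-url}' | base64 -d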

Horizontal Pod Autoscaler (HPA):

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
  # Scale on CPU
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  # Scale on memory
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  # Scale on a custom metric
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # wait 5 min before scaling down
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
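
Note that the CPU and memory targets rely on metrics-server being installed in the cluster, and a custom metric like http_requests_per_second additionally needs a metrics adapter (e.g., Prometheus Adapter). To observe the autoscaler in action:

kubectl get hpa myapp-api-hpa -n production --watch
kubectl describe hpa myapp-api-hpa -n production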

Ingress (HTTP/HTTPS routing):

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  # Nginx Ingress Controller (replaces the deprecated
  # kubernetes.io/ingress.class annotation)
  ingressClassName: nginx
  tls:
  - hosts:
    - api.myapp.com
    secretName: myapp-tls
  rules:
  - host: api.myapp.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: myapp-api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp-frontend-service
            port:
              number: 80

PersistentVolumeClaim (storage):

# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: production
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 100Gi
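
A standalone claim like this is consumed by referencing it from a Pod spec; a minimal sketch (the Pod name and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer
  namespace: production
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: postgres-pvc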

StatefulSet (databases):

# statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: postgres-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 50Gi
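
The serviceName above points at a headless Service, which must exist for the StatefulSet's stable per-Pod DNS names; a minimal definition:

apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
  namespace: production
spec:
  clusterIP: None  # headless: per-Pod DNS records, no virtual IP
  selector:
    app: postgres
  ports:
  - port: 5432
    name: postgres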

Essential kubectl commands:

# ===== APPLY / CREATE =====
# Apply a configuration
kubectl apply -f deployment.yaml

# Apply an entire directory
kubectl apply -f ./k8s/

# Create a namespace
kubectl create namespace production

# ===== GET / DESCRIBE =====
# List resources
kubectl get pods
kubectl get deployments
kubectl get services
kubectl get all

# In a specific namespace
kubectl get pods -n production

# With more detail
kubectl get pods -o wide

# Across all namespaces
kubectl get pods --all-namespaces

# Describe a resource
kubectl describe pod myapp-api-abc123
kubectl describe deployment myapp-api

# ===== LOGS =====
# View logs
kubectl logs myapp-api-abc123

# Follow logs
kubectl logs -f myapp-api-abc123

# Logs from a specific container
kubectl logs myapp-api-abc123 -c sidecar

# Previous logs (crashed container)
kubectl logs myapp-api-abc123 --previous

# ===== EXEC =====
# Run a command
kubectl exec myapp-api-abc123 -- ls -la

# Interactive shell
kubectl exec -it myapp-api-abc123 -- /bin/bash

# ===== SCALE =====
# Scale a deployment
kubectl scale deployment myapp-api --replicas=5

# Autoscale
kubectl autoscale deployment myapp-api --min=3 --max=10 --cpu-percent=80

# ===== UPDATE =====
# Update the image
kubectl set image deployment/myapp-api api=myapp:1.3.0

# Rollout status
kubectl rollout status deployment/myapp-api

# Rollout history
kubectl rollout history deployment/myapp-api

# Rollback
kubectl rollout undo deployment/myapp-api
kubectl rollout undo deployment/myapp-api --to-revision=2

# ===== DELETE =====
# Delete a pod
kubectl delete pod myapp-api-abc123

# Delete a deployment
kubectl delete deployment myapp-api

# Delete by file
kubectl delete -f deployment.yaml

# Force delete (stuck resources)
kubectl delete pod myapp-api-abc123 --force --grace-period=0

# ===== DEBUG =====
# Port forward (local debugging)
kubectl port-forward pod/myapp-api-abc123 8080:3000

# View events
kubectl get events --sort-by=.metadata.creationTimestamp

# Top pods (resource usage)
kubectl top pods
kubectl top nodes

# ===== CONTEXTS =====
# List contexts (clusters)
kubectl config get-contexts

# Switch context
kubectl config use-context production-cluster

# Set the default namespace
kubectl config set-context --current --namespace=production

Helm - Package Manager for K8s:

# Chart.yaml
apiVersion: v2
name: myapp
description: My application Helm chart
version: 1.0.0
appVersion: 1.2.0

# values.yaml
replicaCount: 3

image:
  repository: myregistry.io/myapp
  tag: "1.2.0"
  pullPolicy: IfNotPresent

service:
  type: LoadBalancer
  port: 80

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

ingress:
  enabled: true
  host: api.myapp.com
  tls:
    enabled: true

# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "myapp.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "myapp.selectorLabels" . | nindent 8 }}
    spec:
      containers:
      - name: {{ .Chart.Name }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        ports:
        - containerPort: 3000
        resources:
          {{- toYaml .Values.resources | nindent 10 }}
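
The include calls above assume a templates/_helpers.tpl file that defines the named templates; a minimal sketch (charts generated by helm create define richer versions):

{{/* templates/_helpers.tpl */}}
{{- define "myapp.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{- define "myapp.selectorLabels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end -}}

{{- define "myapp.labels" -}}
{{ include "myapp.selectorLabels" . }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end -}}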

Helm commands:

# Install a chart
helm install myapp ./myapp-chart

# Install with custom values
helm install myapp ./myapp-chart -f values-prod.yaml

# Upgrade
helm upgrade myapp ./myapp-chart

# Rollback
helm rollback myapp 1

# List releases
helm list

# Uninstall
helm uninstall myapp

# Render the chart locally
helm template myapp ./myapp-chart

# Package the chart
helm package myapp-chart

Cloud Computing

Recipe 3.3: AWS - Core Services

Typical AWS architecture:

┌─────────────────────────────────────────────────────────┐
│                    Route 53 (DNS)                        │
└────────────────────┬────────────────────────────────────┘

┌────────────────────▼────────────────────────────────────┐
│              CloudFront (CDN)                            │
└────────────────────┬────────────────────────────────────┘

┌────────────────────▼────────────────────────────────────┐
│         Application Load Balancer (ALB)                  │
└─────┬──────────────────────────────┬────────────────────┘
      │                              │
┌─────▼──────┐              ┌────────▼────────┐
│  ECS/EKS   │              │   ECS/EKS       │
│  (API)     │              │   (API)         │
│ AZ 1       │              │   AZ 2          │
└─────┬──────┘              └────────┬────────┘
      │                              │
      └──────────┬───────────────────┘

        ┌────────▼────────┐
        │   RDS (DB)      │
        │   Multi-AZ      │
        └────────┬────────┘

        ┌────────▼────────┐
        │  ElastiCache    │
        │   (Redis)       │
        └─────────────────┘

Terraform - Infrastructure on AWS:

# main.tf

# Provider configuration
terraform {
  required_version = ">= 1.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  
  # Remote state in S3
  backend "s3" {
    bucket         = "myapp-terraform-state"
    key            = "production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-lock"
  }
}

provider "aws" {
  region = var.aws_region
  
  default_tags {
    tags = {
      Environment = var.environment
      Project     = "myapp"
      ManagedBy   = "Terraform"
    }
  }
}

# Variables
variable "aws_region" {
  default = "us-east-1"
}

variable "environment" {
  default = "production"
}
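
The subnets below look up availability zones through a data source that must be declared as well (other resources referenced further down, such as the IAM roles, ACM certificate, ECR repository, and subnet groups, are assumed to be defined elsewhere in the configuration):

data "aws_availability_zones" "available" {
  state = "available"
}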

# VPC
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  
  tags = {
    Name = "myapp-vpc"
  }
}

# Subnets (Multi-AZ)
resource "aws_subnet" "public" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
  
  map_public_ip_on_launch = true
  
  tags = {
    Name = "myapp-public-${count.index + 1}"
  }
}

resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 10}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
  
  tags = {
    Name = "myapp-private-${count.index + 1}"
  }
}

# Internet Gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  
  tags = {
    Name = "myapp-igw"
  }
}

# NAT Gateway
resource "aws_eip" "nat" {
  count  = 2
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  count         = 2
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  
  tags = {
    Name = "myapp-nat-${count.index + 1}"
  }
}

# Security Group - ALB
resource "aws_security_group" "alb" {
  name        = "myapp-alb-sg"
  description = "Security group for ALB"
  vpc_id      = aws_vpc.main.id
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Security Group - ECS
resource "aws_security_group" "ecs" {
  name        = "myapp-ecs-sg"
  description = "Security group for ECS tasks"
  vpc_id      = aws_vpc.main.id
  
  ingress {
    from_port       = 3000
    to_port         = 3000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Application Load Balancer
resource "aws_lb" "main" {
  name               = "myapp-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id
  
  enable_deletion_protection = true
  enable_http2              = true
  
  access_logs {
    bucket  = aws_s3_bucket.alb_logs.id
    enabled = true
  }
}

resource "aws_lb_target_group" "api" {
  name        = "myapp-api-tg"
  port        = 3000
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"
  
  health_check {
    enabled             = true
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    timeout             = 5
    interval            = 30
    matcher             = "200"
  }
  
  deregistration_delay = 30
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = "443"
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS-1-2-2017-01"
  certificate_arn   = aws_acm_certificate.main.arn
  
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}

# ECS Cluster
resource "aws_ecs_cluster" "main" {
  name = "myapp-cluster"
  
  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# ECS Task Definition
resource "aws_ecs_task_definition" "api" {
  family                   = "myapp-api"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = "512"
  memory                   = "1024"
  execution_role_arn       = aws_iam_role.ecs_execution.arn
  task_role_arn            = aws_iam_role.ecs_task.arn
  
  container_definitions = jsonencode([
    {
      name  = "api"
      image = "${aws_ecr_repository.api.repository_url}:latest"
      
      portMappings = [
        {
          containerPort = 3000
          protocol      = "tcp"
        }
      ]
      
      environment = [
        {
          name  = "NODE_ENV"
          value = "production"
        },
        {
          name  = "DATABASE_URL"
          value = "postgresql://${aws_db_instance.main.endpoint}/myapp"
        }
      ]
      
      secrets = [
        {
          name      = "DB_PASSWORD"
          valueFrom = "${aws_secretsmanager_secret.db_password.arn}"
        }
      ]
      
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = "/ecs/myapp-api"
          "awslogs-region"        = var.aws_region
          "awslogs-stream-prefix" = "ecs"
        }
      }
      
      healthCheck = {
        command     = ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
        interval    = 30
        timeout     = 5
        retries     = 3
        startPeriod = 60
      }
    }
  ])
}

# ECS Service
resource "aws_ecs_service" "api" {
  name            = "myapp-api"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 3
  launch_type     = "FARGATE"
  
  network_configuration {
    subnets          = aws_subnet.private[*].id
    security_groups  = [aws_security_group.ecs.id]
    assign_public_ip = false
  }
  
  load_balancer {
    target_group_arn = aws_lb_target_group.api.arn
    container_name   = "api"
    container_port   = 3000
  }
  
  deployment_maximum_percent         = 200
  deployment_minimum_healthy_percent = 100
  
  depends_on = [aws_lb_listener.https]
}

# Auto Scaling
resource "aws_appautoscaling_target" "ecs" {
  max_capacity       = 10
  min_capacity       = 3
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.api.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "ecs_cpu" {
  name               = "myapp-api-cpu-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace
  
  target_tracking_scaling_policy_configuration {
    target_value       = 70.0
    scale_in_cooldown  = 300
    scale_out_cooldown = 60
    
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}

# RDS PostgreSQL
resource "aws_db_instance" "main" {
  identifier             = "myapp-db"
  engine                 = "postgres"
  engine_version         = "15.3"
  instance_class         = "db.t3.medium"
  allocated_storage      = 100
  storage_type           = "gp3"
  storage_encrypted      = true
  
  db_name  = "myapp"
  username = "postgres"
  password = random_password.db_password.result
  
  multi_az               = true
  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]
  
  backup_retention_period = 7
  backup_window          = "03:00-04:00"
  maintenance_window     = "sun:04:00-sun:05:00"
  
  enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]
  
  deletion_protection = true
  skip_final_snapshot = false
  final_snapshot_identifier = "myapp-db-final-snapshot"
}

# ElastiCache Redis
resource "aws_elasticache_cluster" "redis" {
  cluster_id           = "myapp-redis"
  engine               = "redis"
  engine_version       = "7.0"
  node_type            = "cache.t3.medium"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis7"
  subnet_group_name    = aws_elasticache_subnet_group.main.name
  security_group_ids   = [aws_security_group.redis.id]
  
  snapshot_retention_limit = 5
  snapshot_window         = "03:00-05:00"
}

# S3 Bucket
resource "aws_s3_bucket" "assets" {
  bucket = "myapp-assets-${var.environment}"
}

resource "aws_s3_bucket_versioning" "assets" {
  bucket = aws_s3_bucket.assets.id
  
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "assets" {
  bucket = aws_s3_bucket.assets.id
  
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# CloudFront Distribution
resource "aws_cloudfront_distribution" "main" {
  enabled             = true
  is_ipv6_enabled     = true
  price_class         = "PriceClass_100"
  
  origin {
    domain_name = aws_s3_bucket.assets.bucket_regional_domain_name
    origin_id   = "S3-${aws_s3_bucket.assets.id}"
    
    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.main.cloudfront_access_identity_path
    }
  }
  
  default_cache_behavior {
    allowed_methods        = ["GET", "HEAD", "OPTIONS"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "S3-${aws_s3_bucket.assets.id}"
    viewer_protocol_policy = "redirect-to-https"
    compress               = true
    
    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
    
    min_ttl     = 0
    default_ttl = 3600
    max_ttl     = 86400
  }
  
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
  
  viewer_certificate {
    cloudfront_default_certificate = true
  }
}

# Outputs
output "alb_dns_name" {
  value = aws_lb.main.dns_name
}

output "rds_endpoint" {
  value     = aws_db_instance.main.endpoint
  sensitive = true
}

Recipe 3.4: Serverless - AWS Lambda

What is Serverless? Running code without managing servers; you pay only for execution time.

Lambda Function (Python):

# lambda_function.py
import json
import boto3
from datetime import datetime

# AWS clients
s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
sns = boto3.client('sns')

def lambda_handler(event, context):
    """
    Handler principal de Lambda
    
    Args:
        event: Evento que triggerea la función
        context: Contexto de ejecución (request_id, etc.)
    
    Returns:
        Response con statusCode y body
    """
    
    print(f"Received event: {json.dumps(event)}")
    
    try:
        # Parse the body if invoked via API Gateway
        if 'body' in event:
            body = json.loads(event['body'])
        else:
            body = event
        
        # Process the request
        result = process_request(body)
        
        # Successful response
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({
                'message': 'Success',
                'data': result,
                'timestamp': datetime.utcnow().isoformat()
            })
        }
    
    except ValueError as e:
        return error_response(400, str(e))
    except Exception as e:
        print(f"Error: {str(e)}")
        return error_response(500, "Internal server error")

def process_request(data):
    """Lógica de negocio"""
    # Guardar en DynamoDB
    table = dynamodb.Table('myapp-data')
    table.put_item(Item={
        'id': data['id'],
        'timestamp': datetime.utcnow().isoformat(),
        'data': data
    })
    
    # Upload to S3
    s3.put_object(
        Bucket='myapp-bucket',
        Key=f"data/{data['id']}.json",
        Body=json.dumps(data),
        ContentType='application/json'
    )
    
    # Notify via SNS
    sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789:myapp-notifications',
        Subject='New data processed',
        Message=json.dumps(data)
    )
    
    return {'processed': True, 'id': data['id']}

def error_response(status_code, message):
    """Helper para responses de error"""
    return {
        'statusCode': status_code,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        },
        'body': json.dumps({
            'error': message,
            'timestamp': datetime.utcnow().isoformat()
        })
    }

Serverless Framework:

# serverless.yml
service: myapp-api

provider:
  name: aws
  runtime: python3.11
  stage: ${opt:stage, 'dev'}
  region: us-east-1
  memorySize: 512
  timeout: 30
  
  # Environment variables
  environment:
    STAGE: ${self:provider.stage}
    TABLE_NAME: ${self:custom.tableName}
  
  # IAM permissions
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - dynamodb:PutItem
            - dynamodb:GetItem
            - dynamodb:Query
          Resource: "arn:aws:dynamodb:${self:provider.region}:*:table/${self:custom.tableName}"
        - Effect: Allow
          Action:
            - s3:PutObject
            - s3:GetObject
          Resource: "arn:aws:s3:::${self:custom.bucketName}/*"
        - Effect: Allow
          Action:
            - sns:Publish
          Resource: "*"

custom:
  tableName: myapp-data-${self:provider.stage}
  bucketName: myapp-bucket-${self:provider.stage}

functions:
  # HTTP API
  api:
    handler: lambda_function.lambda_handler
    events:
      - http:
          path: /process
          method: POST
          cors: true
  
  # S3 Trigger
  processImage:
    handler: image_processor.handler
    events:
      - s3:
          bucket: ${self:custom.bucketName}
          event: s3:ObjectCreated:*
          rules:
            - prefix: uploads/
            - suffix: .jpg
  
  # Scheduled (cron)
  dailyReport:
    handler: reports.daily_handler
    events:
      - schedule:
          rate: cron(0 8 * * ? *)  # Every day at 8am UTC
          enabled: true
  
  # SQS Queue
  processQueue:
    handler: queue_processor.handler
    events:
      - sqs:
          arn:
            Fn::GetAtt:
              - ProcessQueue
              - Arn
          batchSize: 10
  
  # DynamoDB Stream
  streamProcessor:
    handler: stream_processor.handler
    events:
      - stream:
          type: dynamodb
          arn:
            Fn::GetAtt:
              - DataTable
              - StreamArn

resources:
  Resources:
    # DynamoDB Table
    DataTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:custom.tableName}
        BillingMode: PAY_PER_REQUEST
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        StreamSpecification:
          StreamViewType: NEW_AND_OLD_IMAGES
    
    # SQS Queue
    ProcessQueue:
      Type: AWS::SQS::Queue
      Properties:
        QueueName: myapp-process-queue
        VisibilityTimeout: 300
        MessageRetentionPeriod: 1209600  # 14 days
    
    # S3 Bucket
    AssetsBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: ${self:custom.bucketName}
        CorsConfiguration:
          CorsRules:
            - AllowedOrigins:
                - '*'
              AllowedMethods:
                - GET
                - PUT
                - POST
              AllowedHeaders:
                - '*'

plugins:
  - serverless-python-requirements
  - serverless-offline

package:
  # Serverless v3 uses patterns ('!' excludes) instead of the removed exclude key
  patterns:
    - '!node_modules/**'
    - '!venv/**'
    - '!.git/**'

Serverless commands:

# Deploy
serverless deploy --stage production

# Deploy a single function
serverless deploy function -f api

# Invoke locally
serverless invoke local -f api -d '{"body": "{\"id\": \"123\"}"}'

# Invoke remotely
serverless invoke -f api -d '{"body": "{\"id\": \"123\"}"}'

# Logs
serverless logs -f api -t

# Remove (tear everything down)
serverless remove

AWS SAM (Serverless Application Model):

# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: My Serverless App

Globals:
  Function:
    Timeout: 30
    Runtime: python3.11
    MemorySize: 512
    Environment:
      Variables:
        TABLE_NAME: !Ref DataTable

Resources:
  # API Gateway
  MyApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: Prod
      Cors:
        AllowOrigin: "'*'"
        AllowHeaders: "'*'"
      Auth:
        ApiKeyRequired: true
  
  # Lambda Function
  ProcessFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: lambda_function.lambda_handler
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref DataTable
      Events:
        ApiEvent:
          Type: Api
          Properties:
            RestApiId: !Ref MyApi
            Path: /process
            Method: POST
  
  # DynamoDB
  DataTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: myapp-data
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH

Outputs:
  ApiUrl:
    Description: API Gateway endpoint URL
    Value: !Sub 'https://${MyApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/'
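
The usual AWS SAM CLI workflow for a template like this (the stack name is whatever you choose during the first deploy):

# Build artifacts and resolve dependencies
sam build

# Invoke the function locally with a sample event
sam local invoke ProcessFunction -e event.json

# Run the API locally
sam local start-api

# Deploy (interactive the first time)
sam deploy --guided

# Tail the function's logs
sam logs -n ProcessFunction --stack-name myapp --tail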

Serverless Best Practices:

# 1. MINIMIZE COLD STARTS
import json
import os
import boto3

# ✅ Initialize outside the handler (reused across invocations)
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

def lambda_handler(event, context):
    # Keep the handler itself fast
    return table.get_item(Key={'id': event['id']})

# 2. USE LAYERS for shared dependencies
# Reduces the size of the deployment package

# 3. CONFIGURE RESERVED CONCURRENCY
# Avoids throttling and keeps costs under control

# 4. IMPLEMENT IDEMPOTENCY
def lambda_handler(event, context):
    request_id = event['requestId']
    
    # Check whether this request was already processed
    existing = table.get_item(Key={'id': request_id})
    if existing.get('Item'):
        return existing['Item']['result']
    
    # Process and persist
    result = process()
    table.put_item(Item={'id': request_id, 'result': result})
    return result

# 5. HANDLE TIMEOUTS
import signal

class TimeoutError(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutError("Function timeout")

def lambda_handler(event, context):
    # Raise 5 seconds before the Lambda limit
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(context.get_remaining_time_in_millis() // 1000 - 5)
    
    try:
        return process()
    except TimeoutError:
        # Graceful degradation
        return partial_result()

# 6. STRUCTURE LOGS for CloudWatch Insights
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    logger.info('Event received', extra={
        'event_type': event.get('type'),
        'user_id': event.get('userId'),
        'request_id': context.aws_request_id
    })

Infrastructure as Code

Recipe 3.5: Advanced Terraform

Reusable modules:

# modules/ecs-service/main.tf
variable "service_name" {
  type = string
}

variable "cluster_id" {
  type = string
}

variable "image" {
  type = string
}

variable "cpu" {
  type    = number
  default = 256
}

variable "memory" {
  type    = number
  default = 512
}

variable "desired_count" {
  type    = number
  default = 2
}

variable "environment_variables" {
  type    = map(string)
  default = {}
}
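
The service below also references subnet IDs, a security group, and the current region; at minimum the module needs these declarations too (the security group's rules are omitted here):

variable "vpc_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

data "aws_region" "current" {}

resource "aws_security_group" "service" {
  name   = "${var.service_name}-sg"
  vpc_id = var.vpc_id
}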

# Task Definition
resource "aws_ecs_task_definition" "this" {
  family                   = var.service_name
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = var.cpu
  memory                   = var.memory
  
  container_definitions = jsonencode([
    {
      name  = var.service_name
      image = var.image
      
      environment = [
        for key, value in var.environment_variables : {
          name  = key
          value = value
        }
      ]
      
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = "/ecs/${var.service_name}"
          "awslogs-region"        = data.aws_region.current.name
          "awslogs-stream-prefix" = "ecs"
        }
      }
    }
  ])
}

# Service
resource "aws_ecs_service" "this" {
  name            = var.service_name
  cluster         = var.cluster_id
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"
  
  network_configuration {
    subnets          = var.subnet_ids
    security_groups  = [aws_security_group.service.id]
    assign_public_ip = false
  }
}

output "service_name" {
  value = aws_ecs_service.this.name
}

Using the module:

# main.tf
module "api_service" {
  source = "./modules/ecs-service"
  
  service_name  = "api"
  cluster_id    = aws_ecs_cluster.main.id
  image         = "myapp:latest"
  cpu           = 512
  memory        = 1024
  desired_count = 3
  
  environment_variables = {
    NODE_ENV     = "production"
    DATABASE_URL = var.database_url
  }
}

module "worker_service" {
  source = "./modules/ecs-service"
  
  service_name  = "worker"
  cluster_id    = aws_ecs_cluster.main.id
  image         = "myapp-worker:latest"
  cpu           = 256
  memory        = 512
  desired_count = 2
  
  environment_variables = {
    REDIS_URL = var.redis_url
  }
}

Workspaces (multi-environment):

# main.tf
locals {
  environment = terraform.workspace
  
  # Per-environment configuration
  config = {
    dev = {
      instance_type = "t3.small"
      min_size      = 1
      max_size      = 2
    }
    staging = {
      instance_type = "t3.medium"
      min_size      = 2
      max_size      = 4
    }
    production = {
      instance_type = "t3.large"
      min_size      = 3
      max_size      = 10
    }
  }
  
  env_config = local.config[local.environment]
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = local.env_config.instance_type
  
  tags = {
    Name        = "app-${local.environment}"
    Environment = local.environment
  }
}

# Create a workspace
terraform workspace new production

# List workspaces
terraform workspace list

# Switch workspace
terraform workspace select production

# Apply in the current workspace
terraform apply

Data Sources:

# Find the most recent AMI
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical
  
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

# Look up an existing VPC
data "aws_vpc" "main" {
  tags = {
    Name = "main-vpc"
  }
}

# Availability Zones
data "aws_availability_zones" "available" {
  state = "available"
}

# Caller identity (account ID)
data "aws_caller_identity" "current" {}

# Region actual
data "aws_region" "current" {}

Conditionals and Loops:

# count to create multiple resources
resource "aws_subnet" "public" {
  count = 3
  
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
  
  tags = {
    Name = "public-${count.index + 1}"
  }
}

# for_each with a map
variable "users" {
  type = map(object({
    role = string
  }))
  
  default = {
    alice = { role = "admin" }
    bob   = { role = "developer" }
  }
}

resource "aws_iam_user" "users" {
  for_each = var.users
  
  name = each.key
  
  tags = {
    Role = each.value.role
  }
}

# Conditional resource
resource "aws_instance" "bastion" {
  count = var.create_bastion ? 1 : 0
  
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}

# Dynamic blocks
resource "aws_security_group" "web" {
  name = "web-sg"
  
  dynamic "ingress" {
    for_each = var.ingress_rules
    
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}
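
The dynamic block iterates over an ingress_rules variable, which would be declared along these lines (the default values are illustrative):

variable "ingress_rules" {
  type = list(object({
    port        = number
    cidr_blocks = list(string)
  }))
  
  default = [
    { port = 80,  cidr_blocks = ["0.0.0.0/0"] },
    { port = 443, cidr_blocks = ["0.0.0.0/0"] },
  ]
}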

Outputs and Dependencies:

# Outputs
output "alb_dns" {
  value       = aws_lb.main.dns_name
  description = "DNS name of the load balancer"
}

output "db_endpoint" {
  value     = aws_db_instance.main.endpoint
  sensitive = true
}

# Explicit dependency
resource "aws_instance" "app" {
  # ...
  
  depends_on = [
    aws_db_instance.main,
    aws_elasticache_cluster.redis
  ]
}

# Prevent destroy
resource "aws_db_instance" "main" {
  # ...
  
  lifecycle {
    prevent_destroy = true
  }
}

# Create before destroy
resource "aws_launch_template" "app" {
  # ...
  
  lifecycle {
    create_before_destroy = true
  }
}

Remote State:

# Backend configuration
terraform {
  backend "s3" {
    bucket         = "myapp-terraform-state"
    key            = "production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-lock"
  }
}

# Use outputs from another state
data "terraform_remote_state" "vpc" {
  backend = "s3"
  
  config = {
    bucket = "myapp-terraform-state"
    key    = "vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  subnet_id = data.terraform_remote_state.vpc.outputs.private_subnet_ids[0]
}

CI/CD and Automation

Recipe 3.6: GitHub Actions - Complete Pipelines

Complete CI/CD Pipeline:

# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  NODE_VERSION: '18.x'
  AWS_REGION: us-east-1
  ECR_REPOSITORY: myapp
  ECS_SERVICE: myapp-api
  ECS_CLUSTER: myapp-cluster

jobs:
  # ===== LINTING & CODE QUALITY =====
  lint:
    name: Lint Code
    runs-on: ubuntu-latest
    
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      
      - name: Install dependencies
        run: npm ci
      
      - name: Run ESLint
        run: npm run lint
      
      - name: Run Prettier check
        run: npm run format:check
      
      - name: TypeScript check
        run: npm run type-check

  # ===== UNIT TESTS =====
  test:
    name: Unit Tests
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      
      - name: Install dependencies
        run: npm ci
      
      - name: Run tests with coverage
        run: npm run test:coverage
      
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info
          fail_ci_if_error: true
      
      - name: SonarCloud Scan
        uses: SonarSource/sonarcloud-github-action@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}

  # ===== INTEGRATION TESTS =====
  integration-tests:
    name: Integration Tests
    runs-on: ubuntu-latest
    
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      
      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379
    
    steps:
      - uses: actions/checkout@v4
      
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      
      - name: Install dependencies
        run: npm ci
      
      - name: Run migrations
        run: npm run migrate
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test
      
      - name: Run integration tests
        run: npm run test:integration
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test
          REDIS_URL: redis://localhost:6379

  # ===== SECURITY SCAN =====
  security:
    name: Security Scan
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'
      
      - name: Upload Trivy results to GitHub Security
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results.sarif'
      
      - name: NPM Audit
        run: npm audit --audit-level=high

  # ===== BUILD & PUSH DOCKER IMAGE =====
  build:
    name: Build and Push Image
    runs-on: ubuntu-latest
    needs: [lint, test, security]
    if: github.ref == 'refs/heads/main'
    
    outputs:
      image: ${{ steps.build-image.outputs.image }}
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      
      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2
      
      - name: Build, tag, and push image
        id: build-image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build \
            --build-arg NODE_ENV=production \
            --cache-from $ECR_REGISTRY/$ECR_REPOSITORY:latest \
            -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG \
            -t $ECR_REGISTRY/$ECR_REPOSITORY:latest \
            .
          
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:latest
          
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT
      
      - name: Scan image for vulnerabilities
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ steps.build-image.outputs.image }}
          format: 'table'
          exit-code: '1'
          severity: 'CRITICAL,HIGH'

  # ===== DEPLOY TO STAGING =====
  deploy-staging:
    name: Deploy to Staging
    runs-on: ubuntu-latest
    needs: build
    environment:
      name: staging
      url: https://staging.myapp.com
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      
      - name: Download task definition
        run: |
          aws ecs describe-task-definition \
            --task-definition ${{ env.ECS_SERVICE }}-staging \
            --query taskDefinition > task-definition.json
      
      - name: Fill in new image ID
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: api
          image: ${{ needs.build.outputs.image }}
      
      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: ${{ env.ECS_SERVICE }}-staging
          cluster: ${{ env.ECS_CLUSTER }}-staging
          wait-for-service-stability: true
      
      - name: Run smoke tests
        run: |
          sleep 30
          curl -f https://staging.myapp.com/health || exit 1

  # ===== DEPLOY TO PRODUCTION =====
  deploy-production:
    name: Deploy to Production
    runs-on: ubuntu-latest
    needs: [build, deploy-staging]
    environment:
      name: production
      url: https://myapp.com
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      
      - name: Download task definition
        run: |
          aws ecs describe-task-definition \
            --task-definition ${{ env.ECS_SERVICE }} \
            --query taskDefinition > task-definition.json
      
      - name: Fill in new image ID
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: api
          image: ${{ needs.build.outputs.image }}
      
      - name: Deploy to ECS (Blue/Green)
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: ${{ env.ECS_SERVICE }}
          cluster: ${{ env.ECS_CLUSTER }}
          wait-for-service-stability: true
          codedeploy-appspec: appspec.yaml
          codedeploy-application: myapp
          codedeploy-deployment-group: myapp-production
      
      - name: Notify Slack
        uses: 8398a7/action-slack@v3
        if: always()
        with:
          status: ${{ job.status }}
          text: 'Production deployment ${{ job.status }}'
          webhook_url: ${{ secrets.SLACK_WEBHOOK }}

Deployment Strategies:

# appspec.yaml (CodeDeploy - Blue/Green)
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: <TASK_DEFINITION>
        LoadBalancerInfo:
          ContainerName: "api"
          ContainerPort: 3000
        PlatformVersion: "LATEST"

Hooks:
  - BeforeInstall: "LambdaFunctionToValidateBeforeInstall"
  - AfterInstall: "LambdaFunctionToValidateAfterInstall"
  - AfterAllowTestTraffic: "LambdaFunctionToValidateAfterTestTraffic"
  - BeforeAllowTraffic: "LambdaFunctionToValidateBeforeTrafficShift"
  - AfterAllowTraffic: "LambdaFunctionToValidateAfterTrafficShift"
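
The learning summary below also mentions canary releases; with CodeDeploy on ECS, the traffic-shifting pattern is selected via the deployment group's deployment_config_name. A Terraform sketch (the names are assumptions, and a complete ECS deployment group also needs ecs_service, load_balancer_info, and blue/green settings):

resource "aws_codedeploy_deployment_group" "api" {
  app_name              = "myapp"                      # assumed CodeDeploy application
  deployment_group_name = "myapp-production"
  service_role_arn      = aws_iam_role.codedeploy.arn  # assumed IAM role

  # Shift 10% of traffic first, wait 5 minutes, then shift the rest
  deployment_config_name = "CodeDeployDefault.ECSCanary10Percent5Minutes"

  deployment_style {
    deployment_type   = "BLUE_GREEN"
    deployment_option = "WITH_TRAFFIC_CONTROL"
  }
}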

Feature Flags:

# feature_flags.py
import hashlib
from typing import Any, Dict, Optional

class FeatureFlags:
    """Simple feature-flag system"""
    
    def __init__(self):
        # In production: use LaunchDarkly, Split.io, etc.
        self.flags = {
            'new_checkout_ui': {
                'enabled': True,
                'rollout_percentage': 50,  # 50% of users
                'rollout_users': ['beta-user-1', 'beta-user-2']
            },
            'async_processing': {
                'enabled': True,
                'rollout_percentage': 100
            },
            'experimental_algorithm': {
                'enabled': False
            }
        }
    
    def is_enabled(self, flag_name: str, user_id: Optional[str] = None, context: Optional[Dict[str, Any]] = None) -> bool:
        """Check whether a feature is enabled"""
        if flag_name not in self.flags:
            return False
        
        flag = self.flags[flag_name]
        
        # Flag disabled entirely
        if not flag.get('enabled', False):
            return False
        
        # Specific allow-listed users
        if user_id and user_id in flag.get('rollout_users', []):
            return True
        
        # Percentage-based rollout
        rollout_pct = flag.get('rollout_percentage', 100)
        if rollout_pct < 100 and user_id:
            # Consistent hash of the user_id
            hash_value = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
            return (hash_value % 100) < rollout_pct
        
        return rollout_pct == 100

# Usage
ff = FeatureFlags()

@app.route('/checkout')
def checkout():
    user_id = request.user.id
    
    if ff.is_enabled('new_checkout_ui', user_id):
        return render_template('checkout_v2.html')
    else:
        return render_template('checkout_v1.html')

Congratulations! 🎉

You have completed PHASE 3: Infrastructure and DevOps of the roadmap.

What you've learned:

✅ Docker - Containerization and multi-stage builds
✅ Docker Compose - Multi-container applications
✅ Kubernetes - Orchestration at scale
✅ Helm - Package manager for K8s
✅ AWS - Core services (ECS, RDS, S3, CloudFront)
✅ Serverless - Lambda, API Gateway, SAM
✅ Terraform - Infrastructure as code
✅ CI/CD - Complete GitHub Actions pipelines
✅ Deployment strategies - Blue/Green, Canary
✅ Feature flags - Gradual rollouts

Next steps:

PHASE 4: Security and Observability


Version: 1.0
Date: 2024
Author: Roadmap del Desarrollador del Futuro
License: Educational use