πŸ”βŒ˜K

Start typing to search docs.

Deployment Runbook

1.0.0

Operational checklist for releasing platform changes.

Infrastructure Deployment Runbook

This runbook provides step-by-step procedures for deploying App Factory infrastructure across different environments and cloud providers.

🎯 Overview

This runbook covers:

  • Pre-deployment checklist
  • Environment-specific deployments
  • Post-deployment verification
  • Rollback procedures
  • Common deployment issues

πŸ“‹ Pre-Deployment Checklist

Prerequisites Verification

#!/bin/bash
# pre-deployment-check.sh

echo "πŸ” Pre-Deployment Verification"
echo "=============================="

# Check required tools
echo "πŸ› οΈ Checking required tools..."
command -v pulumi >/dev/null 2>&1 || { echo "❌ Pulumi CLI not found"; exit 1; }
command -v node >/dev/null 2>&1 || { echo "❌ Node.js not found"; exit 1; }
command -v pnpm >/dev/null 2>&1 || { echo "❌ pnpm not found"; exit 1; }

# Check Node.js version
NODE_VERSION=$(node --version | cut -d'v' -f2 | cut -d'.' -f1)
if [ "$NODE_VERSION" -lt 22 ]; then
    echo "❌ Node.js 22+ required, found: $(node --version)"
    exit 1
fi

# Check cloud provider tools
if [ "$PROVIDER" = "aws" ]; then
    command -v aws >/dev/null 2>&1 || { echo "❌ AWS CLI not found"; exit 1; }
    aws sts get-caller-identity >/dev/null 2>&1 || { echo "❌ AWS credentials not configured"; exit 1; }
elif [ "$PROVIDER" = "gcp" ]; then
    command -v gcloud >/dev/null 2>&1 || { echo "❌ Google Cloud SDK not found"; exit 1; }
    gcloud auth list --filter=status:ACTIVE --format="value(account)" | head -1 >/dev/null || { echo "❌ GCP credentials not configured"; exit 1; }
fi

echo "βœ… All prerequisites verified"

Configuration Validation

#!/bin/bash
# validate-config.sh

echo "βš™οΈ Configuration Validation"
echo "=========================="

# Required environment variables
REQUIRED_VARS=("APP_NAME" "ENVIRONMENT" "PROVIDER")

for var in "${REQUIRED_VARS[@]}"; do
    if [ -z "${!var}" ]; then
        echo "❌ Required environment variable $var is not set"
        exit 1
    fi
done

# Validate app name format
if [[ ! "$APP_NAME" =~ ^[a-z][a-z0-9-]*[a-z0-9]$ ]]; then
    echo "❌ APP_NAME must be lowercase, start with letter, contain only letters, numbers, and hyphens"
    exit 1
fi

# Validate environment
if [[ ! "$ENVIRONMENT" =~ ^(development|staging|production)$ ]]; then
    echo "❌ ENVIRONMENT must be one of: development, staging, production"
    exit 1
fi

# Validate provider
if [[ ! "$PROVIDER" =~ ^(aws|gcp)$ ]]; then
    echo "❌ PROVIDER must be one of: aws, gcp"
    exit 1
fi

echo "βœ… Configuration validated"

πŸš€ Deployment Procedures

AWS Deployment

Development Environment

#!/bin/bash
# deploy-aws-dev.sh

set -e

export APP_NAME="focus-ai"
export ENVIRONMENT="development"
export PROVIDER="aws"
export AWS_REGION="us-east-1"

echo "πŸš€ Deploying AWS Development Environment"
echo "======================================="

# Run pre-deployment checks
./pre-deployment-check.sh
./validate-config.sh

# Navigate to infrastructure directory
cd infra/pulumi-aws

# Install dependencies
echo "πŸ“¦ Installing dependencies..."
pnpm install

# Build TypeScript
echo "πŸ”¨ Building TypeScript..."
pnpm build

# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "πŸ“š Setting up Pulumi stack: $STACK_NAME"

if pulumi stack ls | grep -q "$STACK_NAME"; then
    pulumi stack select "$STACK_NAME"
else
    pulumi stack init "$STACK_NAME"
fi

# Configure stack
echo "βš™οΈ Configuring stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set aws:region "$AWS_REGION"

# Preview changes
echo "πŸ‘€ Previewing changes..."
pulumi preview

# Confirm deployment
read -p "πŸ€” Proceed with deployment? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
    echo "❌ Deployment cancelled"
    exit 1
fi

# Deploy infrastructure
echo "πŸš€ Deploying infrastructure..."
pulumi up --yes

# Verify deployment
echo "βœ… Running post-deployment verification..."
../scripts/verify-deployment.sh

echo "πŸŽ‰ AWS Development deployment complete!"

Production Environment

#!/bin/bash
# deploy-aws-prod.sh

set -e

export APP_NAME="focus-ai"
export ENVIRONMENT="production"
export PROVIDER="aws"
export AWS_REGION="us-east-1"

echo "πŸš€ Deploying AWS Production Environment"
echo "======================================"

# Enhanced pre-deployment checks for production
./pre-deployment-check.sh
./validate-config.sh

# Additional production checks
echo "πŸ”’ Production-specific checks..."

# Verify backup strategy
if [ ! -f "backup-strategy.md" ]; then
    echo "❌ Backup strategy documentation required for production"
    exit 1
fi

# Verify monitoring setup
if [ ! -f "monitoring-config.yaml" ]; then
    echo "❌ Monitoring configuration required for production"
    exit 1
fi

# Navigate to infrastructure directory
cd infra/pulumi-aws

# Install dependencies
echo "πŸ“¦ Installing dependencies..."
pnpm install

# Build TypeScript
echo "πŸ”¨ Building TypeScript..."
pnpm build

# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "πŸ“š Setting up Pulumi stack: $STACK_NAME"

if pulumi stack ls | grep -q "$STACK_NAME"; then
    pulumi stack select "$STACK_NAME"
else
    pulumi stack init "$STACK_NAME"
fi

# Configure stack with production settings
echo "βš™οΈ Configuring production stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set aws:region "$AWS_REGION"

# Production-specific configurations
pulumi config set app:deletionProtection true
pulumi config set app:backupRetentionDays 30
pulumi config set app:multiAz true

# Preview changes with detailed diff
echo "πŸ‘€ Previewing changes..."
pulumi preview --diff

# Require explicit confirmation for production
echo "⚠️ PRODUCTION DEPLOYMENT WARNING ⚠️"
echo "This will deploy to production environment."
echo "Ensure you have:"
echo "- Reviewed all changes"
echo "- Notified stakeholders"
echo "- Scheduled maintenance window"
echo ""

read -p "Type 'DEPLOY TO PRODUCTION' to confirm: " confirmation
if [ "$confirmation" != "DEPLOY TO PRODUCTION" ]; then
    echo "❌ Production deployment cancelled"
    exit 1
fi

# Deploy infrastructure
echo "πŸš€ Deploying production infrastructure..."
pulumi up --yes

# Enhanced verification for production
echo "βœ… Running comprehensive post-deployment verification..."
../scripts/verify-production-deployment.sh

echo "πŸŽ‰ AWS Production deployment complete!"

GCP Deployment

Development Environment

#!/bin/bash
# deploy-gcp-dev.sh

set -e

export APP_NAME="focus-ai"
export ENVIRONMENT="development"
export PROVIDER="gcp"
export GCP_PROJECT="your-dev-project"
export GCP_REGION="us-central1"

echo "πŸš€ Deploying GCP Development Environment"
echo "======================================="

# Run pre-deployment checks
./pre-deployment-check.sh
./validate-config.sh

# Verify GCP project
if [ -z "$GCP_PROJECT" ]; then
    echo "❌ GCP_PROJECT environment variable is required"
    exit 1
fi

# Navigate to infrastructure directory
cd infra/pulumi-gcp

# Install dependencies
echo "πŸ“¦ Installing dependencies..."
pnpm install

# Build TypeScript
echo "πŸ”¨ Building TypeScript..."
pnpm build

# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "πŸ“š Setting up Pulumi stack: $STACK_NAME"

if pulumi stack ls | grep -q "$STACK_NAME"; then
    pulumi stack select "$STACK_NAME"
else
    pulumi stack init "$STACK_NAME"
fi

# Configure stack
echo "βš™οΈ Configuring stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set gcp:project "$GCP_PROJECT"
pulumi config set gcp:region "$GCP_REGION"

# Preview changes
echo "πŸ‘€ Previewing changes..."
pulumi preview

# Confirm deployment
read -p "πŸ€” Proceed with deployment? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
    echo "❌ Deployment cancelled"
    exit 1
fi

# Deploy infrastructure
echo "πŸš€ Deploying infrastructure..."
pulumi up --yes

# Verify deployment
echo "βœ… Running post-deployment verification..."
../scripts/verify-deployment.sh

echo "πŸŽ‰ GCP Development deployment complete!"

Production Environment

#!/bin/bash
# deploy-gcp-prod.sh

set -e

export APP_NAME="focus-ai"
export ENVIRONMENT="production"
export PROVIDER="gcp"
export GCP_PROJECT="your-prod-project"
export GCP_REGION="us-central1"

echo "πŸš€ Deploying GCP Production Environment"
echo "======================================"

# Enhanced pre-deployment checks for production
./pre-deployment-check.sh
./validate-config.sh

# Verify GCP project
if [ -z "$GCP_PROJECT" ]; then
    echo "❌ GCP_PROJECT environment variable is required"
    exit 1
fi

# Additional production checks
echo "πŸ”’ Production-specific checks..."

# Verify project is production project
if [[ ! "$GCP_PROJECT" =~ -prod$ ]]; then
    echo "⚠️ Warning: GCP project name doesn't end with '-prod'"
    read -p "Continue anyway? (y/N) " -n 1 -r
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi

# Navigate to infrastructure directory
cd infra/pulumi-gcp

# Install dependencies
echo "πŸ“¦ Installing dependencies..."
pnpm install

# Build TypeScript
echo "πŸ”¨ Building TypeScript..."
pnpm build

# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "πŸ“š Setting up Pulumi stack: $STACK_NAME"

if pulumi stack ls | grep -q "$STACK_NAME"; then
    pulumi stack select "$STACK_NAME"
else
    pulumi stack init "$STACK_NAME"
fi

# Configure stack with production settings
echo "βš™οΈ Configuring production stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set gcp:project "$GCP_PROJECT"
pulumi config set gcp:region "$GCP_REGION"

# Preview changes with detailed diff
echo "πŸ‘€ Previewing changes..."
pulumi preview --diff

# Require explicit confirmation for production
echo "⚠️ PRODUCTION DEPLOYMENT WARNING ⚠️"
echo "This will deploy to production environment."
echo "Project: $GCP_PROJECT"
echo "Region: $GCP_REGION"
echo ""

read -p "Type 'DEPLOY TO PRODUCTION' to confirm: " confirmation
if [ "$confirmation" != "DEPLOY TO PRODUCTION" ]; then
    echo "❌ Production deployment cancelled"
    exit 1
fi

# Deploy infrastructure
echo "πŸš€ Deploying production infrastructure..."
pulumi up --yes

# Enhanced verification for production
echo "βœ… Running comprehensive post-deployment verification..."
../scripts/verify-production-deployment.sh

echo "πŸŽ‰ GCP Production deployment complete!"

βœ… Post-Deployment Verification

Basic Verification Script

#!/bin/bash
# verify-deployment.sh

echo "βœ… Post-Deployment Verification"
echo "=============================="

# Check stack outputs
echo "πŸ“Š Checking stack outputs..."
OUTPUTS=$(pulumi stack output --json)

if [ $? -ne 0 ]; then
    echo "❌ Failed to get stack outputs"
    exit 1
fi

# Extract key outputs
DATABASE_URL=$(echo "$OUTPUTS" | jq -r '.databaseUrl // empty')
API_URL=$(echo "$OUTPUTS" | jq -r '.apiUrl // empty')
CDN_URL=$(echo "$OUTPUTS" | jq -r '.cdnUrl // empty')

# Verify database connectivity
if [ -n "$DATABASE_URL" ]; then
    echo "πŸ—„οΈ Testing database connectivity..."
    pg_isready -d "$DATABASE_URL" >/dev/null 2>&1
    if [ $? -eq 0 ]; then
        echo "βœ… Database is accessible"
    else
        echo "❌ Database connection failed"
        exit 1
    fi
fi

# Verify API endpoint
if [ -n "$API_URL" ]; then
    echo "πŸ”— Testing API endpoint..."
    HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$API_URL/health" || echo "000")
    if [ "$HTTP_STATUS" = "200" ]; then
        echo "βœ… API endpoint is healthy"
    else
        echo "❌ API endpoint returned status: $HTTP_STATUS"
        exit 1
    fi
fi

# Verify CDN
if [ -n "$CDN_URL" ]; then
    echo "🌐 Testing CDN..."
    HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$CDN_URL" || echo "000")
    if [[ "$HTTP_STATUS" =~ ^[23] ]]; then
        echo "βœ… CDN is accessible"
    else
        echo "❌ CDN returned status: $HTTP_STATUS"
        exit 1
    fi
fi

echo "πŸŽ‰ All verification checks passed!"

Production Verification Script

#!/bin/bash
# verify-production-deployment.sh

echo "βœ… Production Deployment Verification"
echo "===================================="

# Run basic verification
./verify-deployment.sh

# Additional production checks
echo "πŸ”’ Production-specific verification..."

# Check backup configuration
echo "πŸ’Ύ Verifying backup configuration..."
if [ "$PROVIDER" = "aws" ]; then
    # Check RDS backup retention
    DB_INSTANCE=$(pulumi stack output databaseInstanceId)
    BACKUP_RETENTION=$(aws rds describe-db-instances \
        --db-instance-identifier "$DB_INSTANCE" \
        --query 'DBInstances[0].BackupRetentionPeriod' \
        --output text)
    
    if [ "$BACKUP_RETENTION" -ge 7 ]; then
        echo "βœ… Backup retention: $BACKUP_RETENTION days"
    else
        echo "❌ Backup retention too short: $BACKUP_RETENTION days"
        exit 1
    fi
elif [ "$PROVIDER" = "gcp" ]; then
    # Check Cloud SQL backup configuration
    DB_INSTANCE=$(pulumi stack output databaseInstanceName)
    BACKUP_ENABLED=$(gcloud sql instances describe "$DB_INSTANCE" \
        --format="value(settings.backupConfiguration.enabled)")
    
    if [ "$BACKUP_ENABLED" = "True" ]; then
        echo "βœ… Automated backups enabled"
    else
        echo "❌ Automated backups not enabled"
        exit 1
    fi
fi

# Check monitoring
echo "πŸ“Š Verifying monitoring setup..."
# Add monitoring verification logic here

# Check security settings
echo "πŸ”’ Verifying security settings..."
# Add security verification logic here

# Performance baseline
echo "⚑ Running performance baseline..."
# Add performance testing logic here

echo "πŸŽ‰ Production verification complete!"

πŸ”„ Rollback Procedures

Automated Rollback

#!/bin/bash
# rollback-deployment.sh

set -e

STACK_NAME="$1"
TARGET_VERSION="$2"

if [ -z "$STACK_NAME" ] || [ -z "$TARGET_VERSION" ]; then
    echo "Usage: $0 <stack-name> <target-version>"
    echo "Example: $0 focus-ai-production v1.2.3"
    exit 1
fi

echo "πŸ”„ Rolling back deployment"
echo "========================="
echo "Stack: $STACK_NAME"
echo "Target Version: $TARGET_VERSION"
echo ""

# Confirm rollback
read -p "⚠️ Confirm rollback? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
    echo "❌ Rollback cancelled"
    exit 1
fi

# Select stack
pulumi stack select "$STACK_NAME"

# Get current state
echo "πŸ“Š Capturing current state..."
pulumi stack export > "rollback-from-$(date +%Y%m%d-%H%M%S).json"

# Rollback to previous version
echo "βͺ Rolling back to $TARGET_VERSION..."
git checkout "$TARGET_VERSION"

# Rebuild
pnpm install
pnpm build

# Deploy previous version
pulumi up --yes

# Verify rollback
echo "βœ… Verifying rollback..."
./verify-deployment.sh

echo "πŸŽ‰ Rollback complete!"

Manual Rollback Steps

  1. Identify Target Version:

    git log --oneline -10
    git tag -l | sort -V | tail -5
    
  2. Backup Current State:

    pulumi stack export > rollback-backup-$(date +%Y%m%d-%H%M%S).json
    
  3. Checkout Target Version:

    git checkout <target-version>
    
  4. Rebuild and Deploy:

    cd infra/pulumi-aws  # or pulumi-gcp
    pnpm install
    pnpm build
    pulumi up
    
  5. Verify Rollback:

    ./verify-deployment.sh
    

🚨 Common Issues & Solutions

Issue: Pulumi State Lock

Symptoms:

  • "error: the stack is currently locked"
  • Deployment hangs indefinitely

Solution:

# Check lock status
pulumi stack --show-name
pulumi whoami

# Force unlock (use with caution)
pulumi cancel
pulumi stack --show-name  # Verify unlock

# If still locked, contact team members

Issue: Resource Already Exists

Symptoms:

  • "resource already exists" errors
  • Conflicting resource names

Solution:

# Import existing resource
pulumi import <resource-type> <resource-name> <resource-id>

# Or use different naming
pulumi config set app:name <new-unique-name>

Issue: Insufficient Permissions

Symptoms:

  • "Access Denied" errors
  • "Forbidden" responses

Solution:

# AWS: Check IAM permissions
aws sts get-caller-identity
aws iam list-attached-user-policies --user-name <username>

# GCP: Check IAM permissions
gcloud auth list
gcloud projects get-iam-policy <project-id>

Issue: Network Connectivity

Symptoms:

  • Timeout errors
  • Connection refused

Solution:

# Check security groups/firewall rules
# AWS
aws ec2 describe-security-groups

# GCP
gcloud compute firewall-rules list

# Test connectivity
telnet <endpoint> <port>

πŸ“ž Emergency Contacts

  • Infrastructure Team Lead: [Contact Info]
  • DevOps Engineer: [Contact Info]
  • Cloud Provider Support:
    • AWS: 1-800-xxx-xxxx
    • GCP: 1-855-xxx-xxxx
  • On-Call Rotation: [PagerDuty/Slack Channel]

Last updated: $(date) Version: 1.0.0