Deployment Runbook
1.0.0Operational checklist for releasing platform changes.
On this page
- Infrastructure Deployment Runbook
- π― Overview
- π Pre-Deployment Checklist
- Prerequisites Verification
- Configuration Validation
- π Deployment Procedures
- AWS Deployment
- Development Environment
- Production Environment
- GCP Deployment
- Development Environment
- Production Environment
- β Post-Deployment Verification
- Basic Verification Script
- Production Verification Script
- π Rollback Procedures
- Automated Rollback
- Manual Rollback Steps
- π¨ Common Issues & Solutions
- Issue: Pulumi State Lock
- Issue: Resource Already Exists
- Issue: Insufficient Permissions
- Issue: Network Connectivity
- π Emergency Contacts
Infrastructure Deployment Runbook
This runbook provides step-by-step procedures for deploying App Factory infrastructure across different environments and cloud providers.
π― Overview
This runbook covers:
- Pre-deployment checklist
- Environment-specific deployments
- Post-deployment verification
- Rollback procedures
- Common deployment issues
π Pre-Deployment Checklist
Prerequisites Verification
#!/bin/bash
# pre-deployment-check.sh
echo "π Pre-Deployment Verification"
echo "=============================="
# Check required tools
echo "π οΈ Checking required tools..."
command -v pulumi >/dev/null 2>&1 || { echo "β Pulumi CLI not found"; exit 1; }
command -v node >/dev/null 2>&1 || { echo "β Node.js not found"; exit 1; }
command -v pnpm >/dev/null 2>&1 || { echo "β pnpm not found"; exit 1; }
# Check Node.js version
NODE_VERSION=$(node --version | cut -d'v' -f2 | cut -d'.' -f1)
if [ "$NODE_VERSION" -lt 22 ]; then
echo "β Node.js 22+ required, found: $(node --version)"
exit 1
fi
# Check cloud provider tools
if [ "$PROVIDER" = "aws" ]; then
command -v aws >/dev/null 2>&1 || { echo "β AWS CLI not found"; exit 1; }
aws sts get-caller-identity >/dev/null 2>&1 || { echo "β AWS credentials not configured"; exit 1; }
elif [ "$PROVIDER" = "gcp" ]; then
command -v gcloud >/dev/null 2>&1 || { echo "β Google Cloud SDK not found"; exit 1; }
gcloud auth list --filter=status:ACTIVE --format="value(account)" | head -1 >/dev/null || { echo "β GCP credentials not configured"; exit 1; }
fi
echo "β
All prerequisites verified"
Configuration Validation
#!/bin/bash
# validate-config.sh
echo "βοΈ Configuration Validation"
echo "=========================="
# Required environment variables
REQUIRED_VARS=("APP_NAME" "ENVIRONMENT" "PROVIDER")
for var in "${REQUIRED_VARS[@]}"; do
if [ -z "${!var}" ]; then
echo "β Required environment variable $var is not set"
exit 1
fi
done
# Validate app name format
if [[ ! "$APP_NAME" =~ ^[a-z][a-z0-9-]*[a-z0-9]$ ]]; then
echo "β APP_NAME must be lowercase, start with letter, contain only letters, numbers, and hyphens"
exit 1
fi
# Validate environment
if [[ ! "$ENVIRONMENT" =~ ^(development|staging|production)$ ]]; then
echo "β ENVIRONMENT must be one of: development, staging, production"
exit 1
fi
# Validate provider
if [[ ! "$PROVIDER" =~ ^(aws|gcp)$ ]]; then
echo "β PROVIDER must be one of: aws, gcp"
exit 1
fi
echo "β
Configuration validated"
π Deployment Procedures
AWS Deployment
Development Environment
#!/bin/bash
# deploy-aws-dev.sh
set -e
export APP_NAME="focus-ai"
export ENVIRONMENT="development"
export PROVIDER="aws"
export AWS_REGION="us-east-1"
echo "π Deploying AWS Development Environment"
echo "======================================="
# Run pre-deployment checks
./pre-deployment-check.sh
./validate-config.sh
# Navigate to infrastructure directory
cd infra/pulumi-aws
# Install dependencies
echo "π¦ Installing dependencies..."
pnpm install
# Build TypeScript
echo "π¨ Building TypeScript..."
pnpm build
# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "π Setting up Pulumi stack: $STACK_NAME"
if pulumi stack ls | grep -q "$STACK_NAME"; then
pulumi stack select "$STACK_NAME"
else
pulumi stack init "$STACK_NAME"
fi
# Configure stack
echo "βοΈ Configuring stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set aws:region "$AWS_REGION"
# Preview changes
echo "π Previewing changes..."
pulumi preview
# Confirm deployment
read -p "π€ Proceed with deployment? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "β Deployment cancelled"
exit 1
fi
# Deploy infrastructure
echo "π Deploying infrastructure..."
pulumi up --yes
# Verify deployment
echo "β
Running post-deployment verification..."
../scripts/verify-deployment.sh
echo "π AWS Development deployment complete!"
Production Environment
#!/bin/bash
# deploy-aws-prod.sh
set -e
export APP_NAME="focus-ai"
export ENVIRONMENT="production"
export PROVIDER="aws"
export AWS_REGION="us-east-1"
echo "π Deploying AWS Production Environment"
echo "======================================"
# Enhanced pre-deployment checks for production
./pre-deployment-check.sh
./validate-config.sh
# Additional production checks
echo "π Production-specific checks..."
# Verify backup strategy
if [ ! -f "backup-strategy.md" ]; then
echo "β Backup strategy documentation required for production"
exit 1
fi
# Verify monitoring setup
if [ ! -f "monitoring-config.yaml" ]; then
echo "β Monitoring configuration required for production"
exit 1
fi
# Navigate to infrastructure directory
cd infra/pulumi-aws
# Install dependencies
echo "π¦ Installing dependencies..."
pnpm install
# Build TypeScript
echo "π¨ Building TypeScript..."
pnpm build
# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "π Setting up Pulumi stack: $STACK_NAME"
if pulumi stack ls | grep -q "$STACK_NAME"; then
pulumi stack select "$STACK_NAME"
else
pulumi stack init "$STACK_NAME"
fi
# Configure stack with production settings
echo "βοΈ Configuring production stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set aws:region "$AWS_REGION"
# Production-specific configurations
pulumi config set app:deletionProtection true
pulumi config set app:backupRetentionDays 30
pulumi config set app:multiAz true
# Preview changes with detailed diff
echo "π Previewing changes..."
pulumi preview --diff
# Require explicit confirmation for production
echo "β οΈ PRODUCTION DEPLOYMENT WARNING β οΈ"
echo "This will deploy to production environment."
echo "Ensure you have:"
echo "- Reviewed all changes"
echo "- Notified stakeholders"
echo "- Scheduled maintenance window"
echo ""
read -p "Type 'DEPLOY TO PRODUCTION' to confirm: " confirmation
if [ "$confirmation" != "DEPLOY TO PRODUCTION" ]; then
echo "β Production deployment cancelled"
exit 1
fi
# Deploy infrastructure
echo "π Deploying production infrastructure..."
pulumi up --yes
# Enhanced verification for production
echo "β
Running comprehensive post-deployment verification..."
../scripts/verify-production-deployment.sh
echo "π AWS Production deployment complete!"
GCP Deployment
Development Environment
#!/bin/bash
# deploy-gcp-dev.sh
set -e
export APP_NAME="focus-ai"
export ENVIRONMENT="development"
export PROVIDER="gcp"
export GCP_PROJECT="your-dev-project"
export GCP_REGION="us-central1"
echo "π Deploying GCP Development Environment"
echo "======================================="
# Run pre-deployment checks
./pre-deployment-check.sh
./validate-config.sh
# Verify GCP project
if [ -z "$GCP_PROJECT" ]; then
echo "β GCP_PROJECT environment variable is required"
exit 1
fi
# Navigate to infrastructure directory
cd infra/pulumi-gcp
# Install dependencies
echo "π¦ Installing dependencies..."
pnpm install
# Build TypeScript
echo "π¨ Building TypeScript..."
pnpm build
# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "π Setting up Pulumi stack: $STACK_NAME"
if pulumi stack ls | grep -q "$STACK_NAME"; then
pulumi stack select "$STACK_NAME"
else
pulumi stack init "$STACK_NAME"
fi
# Configure stack
echo "βοΈ Configuring stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set gcp:project "$GCP_PROJECT"
pulumi config set gcp:region "$GCP_REGION"
# Preview changes
echo "π Previewing changes..."
pulumi preview
# Confirm deployment
read -p "π€ Proceed with deployment? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "β Deployment cancelled"
exit 1
fi
# Deploy infrastructure
echo "π Deploying infrastructure..."
pulumi up --yes
# Verify deployment
echo "β
Running post-deployment verification..."
../scripts/verify-deployment.sh
echo "π GCP Development deployment complete!"
Production Environment
#!/bin/bash
# deploy-gcp-prod.sh
set -e
export APP_NAME="focus-ai"
export ENVIRONMENT="production"
export PROVIDER="gcp"
export GCP_PROJECT="your-prod-project"
export GCP_REGION="us-central1"
echo "π Deploying GCP Production Environment"
echo "======================================"
# Enhanced pre-deployment checks for production
./pre-deployment-check.sh
./validate-config.sh
# Verify GCP project
if [ -z "$GCP_PROJECT" ]; then
echo "β GCP_PROJECT environment variable is required"
exit 1
fi
# Additional production checks
echo "π Production-specific checks..."
# Verify project is production project
if [[ ! "$GCP_PROJECT" =~ -prod$ ]]; then
echo "β οΈ Warning: GCP project name doesn't end with '-prod'"
read -p "Continue anyway? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
exit 1
fi
fi
# Navigate to infrastructure directory
cd infra/pulumi-gcp
# Install dependencies
echo "π¦ Installing dependencies..."
pnpm install
# Build TypeScript
echo "π¨ Building TypeScript..."
pnpm build
# Initialize or select stack
STACK_NAME="${APP_NAME}-${ENVIRONMENT}"
echo "π Setting up Pulumi stack: $STACK_NAME"
if pulumi stack ls | grep -q "$STACK_NAME"; then
pulumi stack select "$STACK_NAME"
else
pulumi stack init "$STACK_NAME"
fi
# Configure stack with production settings
echo "βοΈ Configuring production stack..."
pulumi config set app:name "$APP_NAME"
pulumi config set app:environment "$ENVIRONMENT"
pulumi config set gcp:project "$GCP_PROJECT"
pulumi config set gcp:region "$GCP_REGION"
# Preview changes with detailed diff
echo "π Previewing changes..."
pulumi preview --diff
# Require explicit confirmation for production
echo "β οΈ PRODUCTION DEPLOYMENT WARNING β οΈ"
echo "This will deploy to production environment."
echo "Project: $GCP_PROJECT"
echo "Region: $GCP_REGION"
echo ""
read -p "Type 'DEPLOY TO PRODUCTION' to confirm: " confirmation
if [ "$confirmation" != "DEPLOY TO PRODUCTION" ]; then
echo "β Production deployment cancelled"
exit 1
fi
# Deploy infrastructure
echo "π Deploying production infrastructure..."
pulumi up --yes
# Enhanced verification for production
echo "β
Running comprehensive post-deployment verification..."
../scripts/verify-production-deployment.sh
echo "π GCP Production deployment complete!"
β Post-Deployment Verification
Basic Verification Script
#!/bin/bash
# verify-deployment.sh
echo "β
Post-Deployment Verification"
echo "=============================="
# Check stack outputs
echo "π Checking stack outputs..."
OUTPUTS=$(pulumi stack output --json)
if [ $? -ne 0 ]; then
echo "β Failed to get stack outputs"
exit 1
fi
# Extract key outputs
DATABASE_URL=$(echo "$OUTPUTS" | jq -r '.databaseUrl // empty')
API_URL=$(echo "$OUTPUTS" | jq -r '.apiUrl // empty')
CDN_URL=$(echo "$OUTPUTS" | jq -r '.cdnUrl // empty')
# Verify database connectivity
if [ -n "$DATABASE_URL" ]; then
echo "ποΈ Testing database connectivity..."
pg_isready -d "$DATABASE_URL" >/dev/null 2>&1
if [ $? -eq 0 ]; then
echo "β
Database is accessible"
else
echo "β Database connection failed"
exit 1
fi
fi
# Verify API endpoint
if [ -n "$API_URL" ]; then
echo "π Testing API endpoint..."
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$API_URL/health" || echo "000")
if [ "$HTTP_STATUS" = "200" ]; then
echo "β
API endpoint is healthy"
else
echo "β API endpoint returned status: $HTTP_STATUS"
exit 1
fi
fi
# Verify CDN
if [ -n "$CDN_URL" ]; then
echo "π Testing CDN..."
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$CDN_URL" || echo "000")
if [[ "$HTTP_STATUS" =~ ^[23] ]]; then
echo "β
CDN is accessible"
else
echo "β CDN returned status: $HTTP_STATUS"
exit 1
fi
fi
echo "π All verification checks passed!"
Production Verification Script
#!/bin/bash
# verify-production-deployment.sh
echo "β
Production Deployment Verification"
echo "===================================="
# Run basic verification
./verify-deployment.sh
# Additional production checks
echo "π Production-specific verification..."
# Check backup configuration
echo "πΎ Verifying backup configuration..."
if [ "$PROVIDER" = "aws" ]; then
# Check RDS backup retention
DB_INSTANCE=$(pulumi stack output databaseInstanceId)
BACKUP_RETENTION=$(aws rds describe-db-instances \
--db-instance-identifier "$DB_INSTANCE" \
--query 'DBInstances[0].BackupRetentionPeriod' \
--output text)
if [ "$BACKUP_RETENTION" -ge 7 ]; then
echo "β
Backup retention: $BACKUP_RETENTION days"
else
echo "β Backup retention too short: $BACKUP_RETENTION days"
exit 1
fi
elif [ "$PROVIDER" = "gcp" ]; then
# Check Cloud SQL backup configuration
DB_INSTANCE=$(pulumi stack output databaseInstanceName)
BACKUP_ENABLED=$(gcloud sql instances describe "$DB_INSTANCE" \
--format="value(settings.backupConfiguration.enabled)")
if [ "$BACKUP_ENABLED" = "True" ]; then
echo "β
Automated backups enabled"
else
echo "β Automated backups not enabled"
exit 1
fi
fi
# Check monitoring
echo "π Verifying monitoring setup..."
# Add monitoring verification logic here
# Check security settings
echo "π Verifying security settings..."
# Add security verification logic here
# Performance baseline
echo "β‘ Running performance baseline..."
# Add performance testing logic here
echo "π Production verification complete!"
π Rollback Procedures
Automated Rollback
#!/bin/bash
# rollback-deployment.sh
set -e
STACK_NAME="$1"
TARGET_VERSION="$2"
if [ -z "$STACK_NAME" ] || [ -z "$TARGET_VERSION" ]; then
echo "Usage: $0 <stack-name> <target-version>"
echo "Example: $0 focus-ai-production v1.2.3"
exit 1
fi
echo "π Rolling back deployment"
echo "========================="
echo "Stack: $STACK_NAME"
echo "Target Version: $TARGET_VERSION"
echo ""
# Confirm rollback
read -p "β οΈ Confirm rollback? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "β Rollback cancelled"
exit 1
fi
# Select stack
pulumi stack select "$STACK_NAME"
# Get current state
echo "π Capturing current state..."
pulumi stack export > "rollback-from-$(date +%Y%m%d-%H%M%S).json"
# Rollback to previous version
echo "βͺ Rolling back to $TARGET_VERSION..."
git checkout "$TARGET_VERSION"
# Rebuild
pnpm install
pnpm build
# Deploy previous version
pulumi up --yes
# Verify rollback
echo "β
Verifying rollback..."
./verify-deployment.sh
echo "π Rollback complete!"
Manual Rollback Steps
-
Identify Target Version:
git log --oneline -10 git tag -l | sort -V | tail -5 -
Backup Current State:
pulumi stack export > rollback-backup-$(date +%Y%m%d-%H%M%S).json -
Checkout Target Version:
git checkout <target-version> -
Rebuild and Deploy:
cd infra/pulumi-aws # or pulumi-gcp pnpm install pnpm build pulumi up -
Verify Rollback:
./verify-deployment.sh
π¨ Common Issues & Solutions
Issue: Pulumi State Lock
Symptoms:
- "error: the stack is currently locked"
- Deployment hangs indefinitely
Solution:
# Check lock status
pulumi stack --show-name
pulumi whoami
# Force unlock (use with caution)
pulumi cancel
pulumi stack --show-name # Verify unlock
# If still locked, contact team members
Issue: Resource Already Exists
Symptoms:
- "resource already exists" errors
- Conflicting resource names
Solution:
# Import existing resource
pulumi import <resource-type> <resource-name> <resource-id>
# Or use different naming
pulumi config set app:name <new-unique-name>
Issue: Insufficient Permissions
Symptoms:
- "Access Denied" errors
- "Forbidden" responses
Solution:
# AWS: Check IAM permissions
aws sts get-caller-identity
aws iam list-attached-user-policies --user-name <username>
# GCP: Check IAM permissions
gcloud auth list
gcloud projects get-iam-policy <project-id>
Issue: Network Connectivity
Symptoms:
- Timeout errors
- Connection refused
Solution:
# Check security groups/firewall rules
# AWS
aws ec2 describe-security-groups
# GCP
gcloud compute firewall-rules list
# Test connectivity
telnet <endpoint> <port>
π Emergency Contacts
- Infrastructure Team Lead: [Contact Info]
- DevOps Engineer: [Contact Info]
- Cloud Provider Support:
- AWS: 1-800-xxx-xxxx
- GCP: 1-855-xxx-xxxx
- On-Call Rotation: [PagerDuty/Slack Channel]
Last updated: $(date) Version: 1.0.0