Environments

MenoTime operates three distinct environments, each with specific purposes, configurations, and deployment rules. This document details environment specifications, promotion workflows, and branch-to-environment mapping.

Environment Specifications

Development Environment

Identifier: menotime-dev

Purpose:
  • Rapid feature development and experimentation
  • Local testing before staging promotion
  • Ad-hoc debugging and troubleshooting
  • Feature-branch deployments

Compute Configuration:
  • ECS Cluster: menotime-dev-cluster
  • Task Definition: menotime-backend-dev:latest
  • Desired Tasks: 1 (manual scaling, no auto-scaling)
  • vCPU: 0.5
  • Memory: 1 GB
  • Container Image: Pulled from ECR with develop branch tag

Database Configuration:
  • Instance: menotime-dev (RDS PostgreSQL)
  • Class: db.m7g.large
  • Storage: 100 GB (gp3)
  • Multi-AZ: No (single AZ)
  • Backup Retention: 7 days (automated)
  • Enhanced Monitoring: Disabled
  • Performance Insights: Disabled
  • Publicly Accessible: No (private subnet only)

Networking:
  • ALB: Shared with staging (separate target group)
  • DNS: dev-api.menotime.ai (optional; may use Route 53 weighted routing)
  • WAF: Optional (not typically enabled for dev)
  • Security Group: menotime-dev-sg (permissive rules for testing)

Note: Resources follow the menotime-{env} naming pattern, where {env} is the environment name (dev, staging, or prod).
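
As an illustration of the naming pattern, the resource names used throughout this page can be derived from the environment name. This is a minimal sketch; the function name and dictionary keys are hypothetical, and only the name formats come from this document.

```python
VALID_ENVS = ("dev", "staging", "prod")


def resource_names(env: str) -> dict[str, str]:
    """Derive resource names from the menotime-{env} pattern (hypothetical helper)."""
    if env not in VALID_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    prefix = f"menotime-{env}"
    return {
        "identifier": prefix,
        "ecs_cluster": f"{prefix}-cluster",
        "ecs_service": f"{prefix}-service",
        "task_family": f"menotime-backend-{env}",
        "security_group": f"{prefix}-sg",
        "rds_instance": prefix,
        "database_name": f"menotime_{env}",
    }
```

For example, `resource_names("dev")["ecs_cluster"]` evaluates to `menotime-dev-cluster`.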

Scaling:
  • Auto-Scaling: Disabled (manual scaling via AWS Console or CLI)
  • Replicas: 1 (single task)
  • Manual Scale-up: When testing load scenarios (up to 2 tasks)

Environment Variables:

ENVIRONMENT=development
DATABASE_HOST=menotime-dev.xxxxx.us-west-1.rds.amazonaws.com
DATABASE_PORT=5432
DATABASE_NAME=menotime_dev
LOG_LEVEL=DEBUG
API_DEBUG=true
SENTRY_ENABLED=true
EMAIL_SANDBOX_MODE=true  # SES sandbox, limited sending
STRIPE_MODE=test

Retention & Cost:
  • Retention Policy: 30 days (dev data refreshed regularly)
  • Monthly Cost: ~$200 (lowest tier compute + minimal database usage)
  • Cleanup: Weekly cleanup of old test data and orphaned resources

Access:
  • Developers: Full access to console, logs, and database
  • CI/CD: Automated deployments from develop branch
  • Secrets: Separate Secrets Manager entries (non-production credentials)


Staging Environment

Identifier: menotime-staging

Purpose:
  • Pre-production validation before production release
  • Performance and load testing
  • Security testing and vulnerability scanning
  • Demo environment for stakeholders and clients
  • Integration testing with third-party services

Compute Configuration:
  • ECS Cluster: menotime-staging-cluster
  • Task Definition: menotime-backend-staging:latest
  • Desired Tasks: 2 (manual or minimal auto-scaling)
  • vCPU: 0.5
  • Memory: 1 GB
  • Container Image: Pulled from ECR with main branch tag (release candidates)

Database Configuration:
  • Instance: menotime-staging (RDS PostgreSQL)
  • Class: db.m7g.large
  • Storage: 500 GB (gp3)
  • Multi-AZ: No (single AZ; upgrade path available)
  • Backup Retention: 7 days (automated)
  • Enhanced Monitoring: Enabled
  • Performance Insights: Enabled (30-day retention)
  • Publicly Accessible: No (private subnet only)

Networking:
  • ALB: Shared ALB with dev (separate target group)
  • DNS: staging-api.menotime.ai or shared menotime.ai with path routing
  • WAF: Enabled (test WAF rules and patterns)
  • Security Group: menotime-staging-sg (mirrors production rules)

Note: Staging resources use the menotime-staging naming pattern.

Scaling:
  • Auto-Scaling: Minimal (2-3 tasks max) to control costs
  • CPU Threshold: 70% (trigger scale-up)
  • Memory Threshold: 80% (trigger scale-up)
  • Scale-down Cooldown: 300 seconds

Environment Variables:

ENVIRONMENT=staging
DATABASE_HOST=menotime-staging.xxxxx.us-west-1.rds.amazonaws.com
DATABASE_PORT=5432
DATABASE_NAME=menotime_staging
LOG_LEVEL=INFO
API_DEBUG=false
SENTRY_ENABLED=true
EMAIL_SANDBOX_MODE=false  # SES production, but sandbox account initially
STRIPE_MODE=test

Retention & Cost:
  • Retention Policy: 90 days (mirrors production for realistic testing)
  • Monthly Cost: ~$250 (shared ALB, larger database)
  • Data Refresh: Weekly refresh from production snapshot (anonymized)

Access:
  • Developers: Read-only console access, deploy via CI/CD
  • QA Team: Read-only access, ability to trigger test scenarios
  • Product: Demos and stakeholder testing
  • CI/CD: Automated deployments from main branch pull requests


Production Environment

Identifier: menotime-prod

Purpose:
  • Live patient data and real-world traffic
  • Critical healthcare delivery platform
  • HIPAA-compliant operations
  • Revenue and patient care dependent

Compute Configuration:
  • ECS Cluster: menotime-prod-cluster
  • Task Definition: menotime-backend-prod:latest
  • Desired Tasks: 2 (minimum for high availability)
  • vCPU: 1
  • Memory: 2 GB
  • Container Image: Pulled from ECR with semantic version tag (e.g., v1.2.3)

Configuration naming pattern: menotime-prod for production environment.

Database Configuration:
  • Instance: menotime-prod (RDS PostgreSQL)
  • Class: db.m7g.large
  • Storage: 1 TB (gp3, auto-expandable)
  • Multi-AZ: Recommended upgrade (currently Single-AZ for cost control)
  • Backup Retention: 7 days (automated), daily snapshots to S3
  • Enhanced Monitoring: Enabled
  • Performance Insights: Enabled (7-day retention)
  • Publicly Accessible: No (private subnet only)
  • Encryption: KMS at rest, SSL in transit
  • IAM Database Authentication: Enabled

Networking:
  • ALB: Dedicated production ALB
  • DNS: menotime.ai (primary domain via Route 53)
  • CloudFront: Enabled for static asset delivery and API caching
  • WAF: Enabled (production rule set)
  • Security Group: menotime-prod-sg (restrictive, principle of least privilege)

Scaling:
  • Auto-Scaling: Enabled (2-4 tasks)
  • CPU Threshold: 60% (trigger scale-up)
  • Memory Threshold: 75% (trigger scale-up)
  • Scale-down Threshold: 30% (with 600-second cooldown)
  • Scaling Policy: Target Tracking (preferred) or Step Scaling
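
To make the target-tracking idea concrete, the sketch below shows the proportional calculation such a policy approximates: scale the task count so that average CPU returns to the 60% target, clamped to the 2-4 task range. It is an illustration only. The function name is hypothetical, and the managed AWS policy applies its own cooldowns and more conservative scale-in behavior that are not modeled here.

```python
import math


def desired_task_count(
    current_tasks: int,
    cpu_percent: float,
    target_percent: float = 60.0,  # production CPU threshold from this page
    min_tasks: int = 2,
    max_tasks: int = 4,
) -> int:
    """Proportional estimate of the task count needed to bring CPU back to target."""
    if current_tasks < 1 or target_percent <= 0:
        raise ValueError("current_tasks must be >= 1 and target_percent > 0")
    proposed = math.ceil(current_tasks * cpu_percent / target_percent)
    # Clamp to the configured auto-scaling range.
    return max(min_tasks, min(max_tasks, proposed))
```

With two tasks averaging 90% CPU, the estimate is three tasks; at 30% CPU the result stays at the two-task floor.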

Environment Variables:

ENVIRONMENT=production
DATABASE_HOST=menotime-prod.xxxxx.us-west-1.rds.amazonaws.com
DATABASE_PORT=5432
DATABASE_NAME=menotime_prod
LOG_LEVEL=WARNING
API_DEBUG=false
SENTRY_ENABLED=true
SENTRY_SAMPLE_RATE=0.1  # Log 10% of errors to avoid alert fatigue
EMAIL_SANDBOX_MODE=false  # SES production account
STRIPE_MODE=live
SECURE_COOKIES=true
HSTS_ENABLED=true
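
A deployment could sanity-check these values before starting the service. The sketch below is one hypothetical way to do that: it compares a mapping of environment variables against the production values listed above and reports mismatches. The function and constant names are assumptions, not part of the MenoTime codebase.

```python
# Expected production values, taken from the variable list above.
PRODUCTION_EXPECTATIONS = {
    "ENVIRONMENT": "production",
    "API_DEBUG": "false",
    "STRIPE_MODE": "live",
    "EMAIL_SANDBOX_MODE": "false",
    "SECURE_COOKIES": "true",
    "HSTS_ENABLED": "true",
}


def production_config_errors(env: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no mismatch."""
    errors = []
    for key, expected in PRODUCTION_EXPECTATIONS.items():
        actual = env.get(key)
        if actual is None:
            errors.append(f"{key} is not set")
        elif actual.strip().lower() != expected:
            errors.append(f"{key}={actual!r}, expected {expected!r}")
    return errors
```

In a real service this would typically be called as `production_config_errors(dict(os.environ))` at startup, failing fast if the list is non-empty.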

Retention & Cost:
  • Data Retention: Indefinite (patient records)
  • Log Retention: CloudWatch (30 days), S3 Glacier (2+ years)
  • Monthly Cost: ~$616 (250 patients), scaling to ~$896 (1,000 patients)
  • Cost Drivers: RDS (~70%), ALB, NAT Gateway, data transfer
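
For rough budgeting between the two quoted data points, a straight-line interpolation gives about $0.37 per additional patient per month. The linear model is an assumption made here for illustration; real costs are more likely to move in steps as instances or tasks are added.

```python
def estimated_monthly_cost(patients: int) -> float:
    """Linear interpolation between the two cost points quoted on this page."""
    low_patients, low_cost = 250, 616.0
    high_patients, high_cost = 1000, 896.0
    per_patient = (high_cost - low_cost) / (high_patients - low_patients)
    return low_cost + per_patient * (patients - low_patients)
```

Under this assumption, 500 patients would cost roughly $709 per month.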

Access:
  • Developers: Read-only logs and metrics; no direct console access
  • On-Call Engineers: Full access during incidents (via SSM Session Manager)
  • Operations: Monitoring, alerting, backup management
  • CI/CD: Gated deployments requiring GitHub approvals and passing tests
  • Audit: All API calls logged to CloudTrail; access controlled by IAM

Deployments:
  • Frequency: Twice per week (controlled release schedule)
  • Strategy: Rolling deployment (1 task minimum always running)
  • Approval: Required merge to main + GitHub Actions approval
  • Rollback: Automated on health check failure; manual rollback available
  • Change Window: Business hours (PST) to monitor for issues
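
The "1 task minimum always running" guarantee maps onto the ECS deployment settings minimumHealthyPercent and maximumPercent. The sketch below computes the resulting task bounds for a rolling update. The 50% and 200% values are assumptions chosen to match a two-task service; this page does not state the configured percentages.

```python
import math


def rolling_bounds(
    desired_count: int,
    minimum_healthy_percent: int = 50,  # assumed value, not confirmed by this page
    maximum_percent: int = 200,         # assumed value, not confirmed by this page
) -> tuple[int, int]:
    """Return (min running tasks, max running tasks) during a rolling deployment.

    ECS rounds the lower bound up and the upper bound down.
    """
    min_running = math.ceil(desired_count * minimum_healthy_percent / 100)
    max_running = math.floor(desired_count * maximum_percent / 100)
    return min_running, max_running
```

With two desired tasks and these assumed percentages, at least one task stays running and at most four exist during the update.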


Environment Comparison Matrix

Aspect | Dev | Staging | Production
Compute (vCPU/RAM) | 0.5 / 1 GB | 0.5 / 1 GB | 1 / 2 GB
Desired Tasks | 1 | 2 | 2
Auto-Scaling | No | Minimal (2-3 max) | Yes (2-4)
Database | db.m7g.large | db.m7g.large | db.m7g.large
Multi-AZ | No | No | No (upgrade recommended)
Enhanced Monitoring | No | Yes | Yes
Performance Insights | No | Yes (30-day) | Yes (7-day)
WAF | Optional | Yes | Yes
CloudFront | No | No | Yes
Backup Retention | 7 days | 7 days | 7 days
Data Refresh | Manual | Weekly (anonymized) | N/A (live)
Cost/Month | ~$200 | ~$250 | ~$616-896
Access Level | Full | QA/Demo | Restricted
Deployment | Automated | Automated (PR) | Gated approval
Rollback | Fast | Fast | Automated on failure

Promotion Workflow

Development → Staging

Trigger: Pull request to main branch

Process:
  1. Developer creates PR from develop (or feature branch) to main
  2. CI/CD pipeline:
       • Runs unit tests, linting, security scans
       • Builds container image tagged with PR number and commit SHA
       • Pushes to ECR
       • Deploys to staging (automatic or manual trigger)
  3. QA validates in staging environment
  4. PR approval from code review team
  5. Merge to main (triggers automated tagging with main branch tag in ECR)

Testing Checklist:
  • Unit tests pass (>80% coverage)
  • Integration tests pass (database migrations, API endpoints)
  • Security scan passes (no critical vulnerabilities)
  • Performance test passes (95th-percentile response time <500 ms)
  • Manual QA sign-off (if applicable)
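
The performance criterion can be checked mechanically from a list of measured response times. The sketch below uses the nearest-rank percentile method, which is one of several common definitions; the function names are hypothetical and the load-testing tool that produces the samples is not specified here.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_latency_budget(samples_ms: list[float], budget_ms: float = 500.0) -> bool:
    """True when the 95th-percentile response time is under the budget."""
    return percentile(samples_ms, 95) < budget_ms
```

A run where 95 of 100 requests finish in 100 ms passes even if the remaining five are slow; a run with six slow requests out of 100 does not.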


Staging → Production

Trigger: Git tag matching semantic version (e.g., v1.2.3)

Process:
  1. Release Manager or CI/CD creates a git tag on the main branch:

       git tag v1.2.3
       git push origin v1.2.3

  2. CI/CD pipeline:
       • Runs full test suite (unit, integration, security, performance)
       • Builds container image tagged with semantic version
       • Creates GitHub release with changelog
       • Requires GitHub Actions approval for production deployment
  3. Deployment:
       • Rolling update to production ECS cluster
       • Health checks verify new tasks are healthy before removing old ones
       • CloudWatch alarms monitored for first 30 minutes
  4. Post-deployment:
       • Smoke tests run (critical API endpoints, patient data access)
       • Monitoring for anomalies (error rates, latency, database connections)
       • Rollback available if issues detected

Production Deployment Requirements:
  • All tests passing
  • Code review approval
  • GitHub Actions manual approval
  • Change window compliance (PST business hours preferred)
  • Runbook prepared for rollback
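
Because the production trigger depends on the tag format, a pipeline would typically validate the tag before doing anything else. The sketch below accepts only tags of the form v{MAJOR}.{MINOR}.{PATCH}. Rejecting leading zeros and pre-release suffixes is an assumption made for this example; the actual pipeline rules may differ.

```python
import re
from typing import Optional

# v1.2.3 style only: no leading zeros, no pre-release or build suffixes (assumed policy).
RELEASE_TAG = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_release_tag(tag: str) -> Optional[tuple[int, int, int]]:
    """Return (major, minor, patch) for a valid release tag, otherwise None."""
    match = RELEASE_TAG.fullmatch(tag)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch
```

Returning a tuple also makes version comparison straightforward, since tuples compare element by element.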


Branch-to-Environment Mapping

┌──────────────────────────────────────────────────────────┐
│                  GIT WORKFLOW                            │
└──────────────────────────────────────────────────────────┘

Feature Branch (feature/*)
    │
    └──> PR to develop
         │
         └──> Deploy to Dev (if enabled)
              │
              └──> Merge to develop
                   │
                   └──> Auto-deploy to Dev (latest)

develop Branch
    │
    └──> PR to main
         │
         └──> Build & Push (ECR: tag=PR-number, SHA)
              │
              └──> Deploy to Staging (manual or auto)
                   │
                   └──> QA/Testing
                        │
                        └──> PR Approved & Merged
                             │
                             └──> Merge to main
                                  │
                                  └──> ECR: tag=main
                                       │
                                       └──> Staging updated

main Branch
    │
    └──> Git tag: v1.2.3
         │
         └──> Build & Push (ECR: tag=v1.2.3)
              │
              └──> Create GitHub Release
                   │
                   └──> GitHub Actions Approval
                        │
                        └──> Deploy to Production
                             │
                             └──> Rolling update (2→4 tasks)
                                  │
                                  └──> Health checks
                                       │
                                       └──> Live!

Mapping Details:

Branch | Environment | Image Tag | Frequency | Approval
feature/* | Dev | feature-name | On-demand | None
develop | Dev | develop | Per commit | None
main (PR) | Staging | PR-{number} | Per PR | Code review
main (merged) | Staging | main | Per merge | Code review
main (tag) | Production | v{MAJOR}.{MINOR}.{PATCH} | Per release | GitHub Actions
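
The mapping can be expressed as a small lookup function, which is roughly what a CI pipeline has to decide on each event. This is a sketch under assumptions: the function signature, the `Deployment` type, and the replacement of "/" with "-" in feature names are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

RELEASE_TAG = re.compile(r"v\d+\.\d+\.\d+")


@dataclass(frozen=True)
class Deployment:
    environment: str
    image_tag: str


def resolve_deployment(ref: str, pr_into_main: Optional[int] = None) -> Optional[Deployment]:
    """Map a branch or tag name to its target environment and image tag.

    pr_into_main is the pull request number when the event is a PR targeting main.
    """
    if pr_into_main is not None:
        return Deployment("staging", f"PR-{pr_into_main}")
    if RELEASE_TAG.fullmatch(ref):
        return Deployment("production", ref)
    if ref == "main":
        return Deployment("staging", "main")
    if ref == "develop":
        return Deployment("dev", "develop")
    if ref.startswith("feature/") and len(ref) > len("feature/"):
        # Image tags cannot contain "/", so nested names are flattened (assumed convention).
        return Deployment("dev", ref[len("feature/"):].replace("/", "-"))
    return None  # Any other ref is not deployed.
```

For example, `resolve_deployment("develop", pr_into_main=123)` yields the staging environment with tag `PR-123`, while `resolve_deployment("v1.2.3")` targets production.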

Environment Variables & Secrets

Secret Rotation Policy

Environment | Frequency | Rotation Method
Development | Never | Manual if leaked
Staging | Every 90 days | Lambda + Secrets Manager
Production | Every 30 days | Lambda + Secrets Manager
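
An audit script could flag secrets that have gone past their rotation interval. The sketch below encodes the policy table; the function name and the idea of passing in the last rotation date are assumptions, since the managed rotation itself is handled by Lambda and Secrets Manager.

```python
from datetime import date, timedelta
from typing import Optional

# Rotation interval in days per environment; None means no scheduled rotation.
ROTATION_DAYS: dict[str, Optional[int]] = {
    "development": None,
    "staging": 90,
    "production": 30,
}


def rotation_due(environment: str, last_rotated: date, today: date) -> bool:
    """True when the secret has reached or passed its rotation interval."""
    interval = ROTATION_DAYS[environment]
    if interval is None:
        return False  # Development secrets are rotated manually, only if leaked.
    return today - last_rotated >= timedelta(days=interval)
```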

Database Credentials

Stored in: AWS Secrets Manager

Format:

{
  "username": "menotime_user",
  "password": "auto-generated-strong-password",
  "engine": "postgresql",
  "host": "menotime-{env}.xxxxx.us-west-1.rds.amazonaws.com",
  "port": 5432,
  "dbname": "menotime_{env}"
}

Access: The ECS task role retrieves the secret at runtime. Rotation is automatic on the schedule above: every 30 days in production and every 90 days in staging.
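
Once the secret string has been fetched, the application needs to turn it into a connection string. The sketch below does that for the JSON format shown above. It is an assumption-laden example: the function name is hypothetical, fetching the secret is left out, and `sslmode=require` is added on the assumption that connections should be encrypted in transit.

```python
import json
from urllib.parse import quote


def connection_url(secret_string: str) -> str:
    """Build a PostgreSQL connection URL from the secret's JSON payload."""
    secret = json.loads(secret_string)
    # Percent-encode credentials so characters such as "@" or "/" do not break the URL.
    user = quote(secret["username"], safe="")
    password = quote(secret["password"], safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{secret['host']}:{secret['port']}/{secret['dbname']}"
        "?sslmode=require"
    )
```

Building the URL at startup from a freshly retrieved secret, rather than caching it indefinitely, helps the service pick up rotated passwords.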


Environment Promotion Requirements

Before Promotion to Staging

  1. Code Quality:
       • All tests passing
       • Code review approval
       • Linting and style checks

  2. Security:
       • No hardcoded secrets
       • Security scan passing (no critical/high CVEs)
       • Dependency audit passing

  3. Documentation:
       • Changelog entry
       • Migration scripts (if database changes)
       • API documentation updated

Before Promotion to Production

  1. Validation:
       • 24+ hours in staging (minimum)
       • Passed QA sign-off
       • Performance test results reviewed
       • Database migration tested (if applicable)

  2. Operational:
       • Runbook prepared (if manual steps required)
       • Rollback plan documented
       • On-call engineer identified
       • Monitoring alerts verified

  3. Compliance:
       • HIPAA assessment completed (if PHI-touching changes)
       • Audit log entries reviewed
       • Security team sign-off (if applicable)
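
Part of this checklist can be enforced automatically before the approval step. The sketch below checks the 24-hour staging soak time and a handful of required sign-offs, returning the list of blockers. The check names and function signature are hypothetical, and the conditional items (marked "if applicable" above) are deliberately left to human judgment.

```python
from datetime import datetime, timedelta

MIN_SOAK = timedelta(hours=24)

# Unconditional items from the checklist above (names are illustrative).
REQUIRED_CHECKS = (
    "qa_sign_off",
    "performance_reviewed",
    "rollback_plan_documented",
    "on_call_identified",
    "alerts_verified",
)


def production_blockers(
    deployed_to_staging_at: datetime,
    now: datetime,
    checks: dict[str, bool],
) -> list[str]:
    """Return the reasons a release cannot yet be promoted; empty means clear."""
    blockers = []
    if now - deployed_to_staging_at < MIN_SOAK:
        blockers.append("less than 24 hours in staging")
    blockers.extend(
        f"missing: {name}" for name in REQUIRED_CHECKS if not checks.get(name, False)
    )
    return blockers
```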

Rollback Procedures

Staging Rollback

Trigger: Manual or automated (if health checks fail)

Process:
  1. Identify issue from CloudWatch logs and alarms
  2. Retrieve previous image tag from ECR (e.g., main vs PR-123)
  3. Update ECS service task definition to use previous image
  4. Monitor health checks (typically 2-3 minutes)
  5. Verify in staging environment

Command:

# Set PREVIOUS_REVISION to the numeric task definition revision to restore.
aws ecs update-service \
  --cluster menotime-staging-cluster \
  --service menotime-staging-service \
  --task-definition "menotime-backend-staging:${PREVIOUS_REVISION}" \
  --force-new-deployment

Production Rollback

Trigger: Automated on health check failure OR manual if critical issue

Process:
  1. Page on-call engineer
  2. Assess severity (patient impact, data loss risk)
  3. If rollback warranted:
       • Retrieve previous production version (e.g., v1.2.2)
       • Update ECS service to use previous version
       • Monitor for 30+ minutes
  4. Post-incident review within 24 hours

Command:

# Set PREVIOUS_REVISION to the numeric task definition revision to restore.
aws ecs update-service \
  --cluster menotime-prod-cluster \
  --service menotime-prod-service \
  --task-definition "menotime-backend-prod:${PREVIOUS_REVISION}" \
  --force-new-deployment
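
Task definition revisions are numeric, so the rollback target has to be resolved to a concrete family:revision value. The sketch below derives the immediately preceding revision from the current one. Treating "current minus one" as the rollback target is an assumption: that revision may have been deregistered or may itself be faulty, so the deployment history should be checked before relying on it.

```python
def previous_task_definition(current: str) -> str:
    """Given 'family:N' (or a full ARN ending in ':N'), return the same name with N-1."""
    family, separator, revision = current.rpartition(":")
    if not separator or not family or not revision.isdigit():
        raise ValueError(f"expected 'family:revision', got {current!r}")
    number = int(revision)
    if number <= 1:
        raise ValueError("no earlier revision exists")
    return f"{family}:{number - 1}"
```

For example, `previous_task_definition("menotime-backend-prod:42")` returns `menotime-backend-prod:41`.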

Monitoring by Environment

Development

  • Basic CloudWatch logs (no aggregation)
  • Manual review of errors
  • No alerting

Staging

  • CloudWatch dashboards for key metrics
  • Alarms for critical failures (database down, service unhealthy)
  • SNS → Slack for staging team

Production

  • Comprehensive CloudWatch dashboards
  • Real-time alarms for all critical metrics
  • SNS → PagerDuty for on-call escalation
  • GuardDuty findings reviewed weekly
  • Monthly cost reviews and optimization

Summary

The three-environment approach provides:
  • Safety: Staging validates changes before production
  • Agility: Dev enables rapid iteration
  • Compliance: Audit trails and controlled promotions
  • Cost Control: Smaller instances in non-production
  • Scalability: Production auto-scaling for patient volume

For operational runbooks, see Monitoring and ECS Fargate.