Troubleshooting

This guide provides solutions to common issues encountered during development and deployment of MenoTime.

Development Issues

Python Virtual Environment Problems

Issue: command not found: python3

# Solution: Install Python 3.11
brew install python@3.11  # macOS
sudo apt-get install python3.11 python3.11-venv  # Ubuntu/Debian (the -venv package is needed to create virtual environments)

Verify installation:

python3.11 --version
# Python 3.11.x

Issue: Virtual environment not activating

# Wrong
cd venv && bin/activate

# Correct
source venv/bin/activate  # macOS/Linux
venv\Scripts\activate  # Windows

# Verify it worked: the prompt shows a (venv) prefix and python resolves inside ./venv/bin
which python
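
The prompt prefix and `which python` both depend on the shell. As a shell-independent cross-check, here is a minimal sketch that asks the interpreter itself. The helper name `in_virtualenv` is illustrative and not part of MenoTime.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside venv:", in_virtualenv())
```

If this prints `inside venv: False` after activation, the shell is probably still resolving a different `python` earlier on `PATH`.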

Issue: Pip packages not installing

# Update pip first
pip install --upgrade pip setuptools wheel

# Check if you're in the right venv
which pip  # Should point to ./venv/bin/pip

# Try installing again
pip install -r requirements.txt -v  # -v for verbose output

PostgreSQL Connection Issues

Issue: psql: error: could not connect to server

# Check if Docker container is running
docker-compose ps

# If not running, start it
docker-compose up -d postgres

# Wait a moment, then verify
docker-compose logs postgres | tail -20
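
"Wait a moment" can be made explicit by polling the port. The sketch below uses only the standard library and assumes the container publishes PostgreSQL on localhost:5432; `wait_for_port` is a hypothetical helper name. An open port does not prove PostgreSQL has finished initializing, so the log check above is still worth doing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (assumed host and port):
# wait_for_port("localhost", 5432)
```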

Issue: FATAL: password authentication failed

# Check .env.local has correct credentials
grep POSTGRES .env.local

# Test with explicit credentials
psql -h localhost -U menotime -d menotime -W
# When prompted, enter password from .env.local

# If that fails, check container logs
docker-compose logs postgres

Issue: database "menotime" does not exist

# The database should be created automatically
# If not, create it manually

# First, exec into postgres container
docker-compose exec postgres psql -U postgres

# Inside psql:
CREATE DATABASE menotime;
CREATE USER menotime WITH PASSWORD 'menotime_dev';
ALTER ROLE menotime SET client_encoding TO 'utf8';
ALTER ROLE menotime SET timezone TO 'UTC';
GRANT ALL PRIVILEGES ON DATABASE menotime TO menotime;
\q

# Then test connection
psql -h localhost -U menotime -d menotime -c "SELECT 1;"

Issue: Port 5432 already in use

# Find what's using port 5432
lsof -i :5432

# Stop Docker container
docker-compose down

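# Or publish PostgreSQL on a different host port.
# In docker-compose.yml, under the postgres service:
  ports:
    - "5433:5432"  # Host port 5433 maps to container port 5432

# Then update .env.local to use the new host port
DATABASE_URL=postgresql://menotime:menotime_dev@localhost:5433/menotime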
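
After changing the port, it is easy to leave the two files out of sync. The following sketch shows one way to confirm which host, port, and database a URL actually points at, using only the standard library. `describe_database_url` is a hypothetical helper, and the password is left out of the result on purpose.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a PostgreSQL URL into the parts a client such as psql needs."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # PostgreSQL's default port when none is given
        "database": parts.path.lstrip("/"),
    }


info = describe_database_url(
    "postgresql://menotime:menotime_dev@localhost:5433/menotime"
)
print(info)
# {'user': 'menotime', 'host': 'localhost', 'port': 5433, 'database': 'menotime'}
```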

Database Migration Issues

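Issue: No such table: alembic_version

# Alembic's version table doesn't exist yet, so no migration has been recorded.

# On an empty database, run the migrations; this also creates alembic_version
alembic upgrade head

# Only if the schema already exists and matches the latest migration,
# record that revision without running any migrations
alembic stamp head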

Issue: Migration fails with "column already exists"

# This usually means a migration was partially applied

# Check current version
alembic current

# View detailed migration info
alembic history --verbose

# Check the database for the conflicting column
psql -h localhost -U menotime -d menotime -c "\d patients"

# You may need to manually fix the database or roll back
# See the Database Migrations guide for rollback procedures

Issue: sqlalchemy.exc.ProgrammingError: column "X" does not exist

# A model expects a column the database doesn't have
# This usually means migrations didn't run

# Check your migrations
alembic current
alembic history --indicate-current

# If migrations are behind, apply them
alembic upgrade head

# If already applied, verify in the database
psql -h localhost -U menotime -d menotime -c "\d table_name"

FastAPI Server Issues

Issue: Address already in use (port 8000)

# Find what's using port 8000
lsof -i :8000

# Stop the process (replace PID with the number from the lsof output)
kill PID

# If it does not exit, force it
kill -9 PID

# Or use a different port
uvicorn app.main:app --reload --port 8001

Issue: Server starts but endpoints return 500

# Check server logs for the error
# Look for Python traceback in the terminal

# Common causes:
# 1. Database not running
docker-compose ps

# 2. Migrations not applied
alembic current
alembic upgrade head

# 3. Missing dependencies
pip install -r requirements.txt

# 4. Import error in your code
# Check the traceback for the specific line

Issue: ModuleNotFoundError: No module named 'X'

# A dependency isn't installed

# Reinstall all dependencies
pip install -r requirements.txt

# Or install the specific module
pip install sqlalchemy

# Make sure you're in the virtual environment
which python  # Should be in ./venv/bin/

Issue: Health check endpoint returns 404

# The health endpoint is in app.main
# Make sure it's defined

# app/main.py should have:
@app.get("/health")
def health():
    return {"status": "healthy"}

# Test it directly
curl http://localhost:8000/health
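
The same check can be scripted so that a refused connection, a 404, and a healthy response are told apart. This is a standard-library sketch that assumes the `/health` route and JSON body shown above; `probe` and `diagnose_health` are illustrative names.

```python
import json
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> tuple[int, str]:
    """Return (HTTP status, body text) for a GET request.

    Status 0 means there was no HTTP response at all (server down, wrong port).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code, err.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError) as err:
        return 0, str(err)


def diagnose_health(status: int, body: str) -> str:
    """Map a probe result onto the likely cause."""
    if status == 0:
        return "no response: is uvicorn running on this host and port?"
    if status == 404:
        return "404: the /health route is not registered on the app"
    if status == 200:
        try:
            payload = json.loads(body)
        except ValueError:
            return "200 but body is not JSON"
        return f"ok: {payload}"
    return f"unexpected status {status}"


# Example call (assumed local server):
# print(diagnose_health(*probe("http://localhost:8000/health")))
```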

Staging and Production Deployment Issues

ECS Task Issues

Issue: Task fails to start (lastStatus: STOPPED)

# Get the stopped task ID
aws ecs list-tasks \
  --cluster staging \
  --desired-status STOPPED \
  --region us-west-1

# Get details
aws ecs describe-tasks \
  --cluster staging \
  --tasks arn:aws:ecs:us-west-1:ACCOUNT_ID:task/staging/TASK_ID \
  --region us-west-1 \
  --query 'tasks[0].{status: lastStatus, reason: stoppedReason, errors: containers[0].reason}'

Common reasons:

  - Image not found: check that the ECR image exists with aws ecr describe-images --repository-name menotime-api
  - Memory limit exceeded: increase task memory in the task definition
  - Port already in use: check other tasks on the service
  - Secret not found: verify the secret exists and the task role has permissions
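
When several tasks have stopped, reading the raw JSON gets tedious. The sketch below condenses the full `aws ecs describe-tasks --output json` output (run without `--query`) into one line per task and container. The `SAMPLE` payload is hand-written for illustration and is not real output, and `summarize_stopped_tasks` is a hypothetical helper. Exit code 137 is flagged because it means the process received SIGKILL, which is often, though not always, the out-of-memory killer.

```python
import json


def summarize_stopped_tasks(describe_tasks_json: str) -> list[str]:
    """Summarize why each task in describe-tasks output stopped."""
    lines = []
    for task in json.loads(describe_tasks_json).get("tasks", []):
        task_id = task.get("taskArn", "unknown").rsplit("/", 1)[-1]
        lines.append(f"{task_id}: {task.get('stoppedReason', 'no stoppedReason')}")
        for container in task.get("containers", []):
            detail = container.get("reason", "no reason given")
            exit_code = container.get("exitCode")
            hint = ""
            if exit_code == 137:
                # 137 = 128 + 9 (SIGKILL)
                hint = " (killed by SIGKILL; check memory limits)"
            lines.append(f"  {container.get('name')}: exit={exit_code} {detail}{hint}")
    return lines


# Hand-written sample for illustration only.
SAMPLE = json.dumps({
    "tasks": [{
        "taskArn": "arn:aws:ecs:us-west-1:ACCOUNT_ID:task/staging/TASK_ID",
        "lastStatus": "STOPPED",
        "stoppedReason": "Essential container in task exited",
        "containers": [{
            "name": "menotime-api",
            "exitCode": 137,
            "reason": "OutOfMemoryError: Container killed due to memory usage",
        }],
    }]
})

for line in summarize_stopped_tasks(SAMPLE):
    print(line)
```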

Issue: Task is running but API unreachable

# Check how many tasks the active deployment is running
aws ecs describe-services \
  --cluster staging \
  --services menotime-api \
  --region us-west-1 \
  --query 'services[0].deployments[0].runningCount'

# Check CloudWatch logs
aws logs tail /ecs/menotime-api-staging --follow --region us-west-1 | grep -i error

# Check load balancer health
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-west-1:ACCOUNT_ID:targetgroup/menotime-staging \
  --region us-west-1

If targets are unhealthy:

  - Check the log group /ecs/menotime-api-staging for startup errors
  - Verify the database is accessible from the VPC
  - Check that security groups allow traffic from the ALB

Issue: Task OOM (Out of Memory)

# Task definition doesn't have enough memory
aws ecs describe-task-definition \
  --task-definition menotime-api-staging \
  --region us-west-1 \
  --query 'taskDefinition.memory'

# Increase task memory
# Edit task definition JSON and increase memory field from 512 to 1024 (or higher)
aws ecs register-task-definition \
  --cli-input-json file://task-def-updated.json \
  --region us-west-1

# Update service to use new task definition
aws ecs update-service \
  --cluster staging \
  --service menotime-api \
  --task-definition menotime-api-staging:NEW_REVISION \
  --region us-west-1

Database Connection Issues

Issue: could not connect to server: Connection refused

The application can't reach the RDS database.

# Check RDS is running
aws rds describe-db-instances \
  --db-instance-identifier menotime-staging \
  --region us-west-1 \
  --query 'DBInstances[0].DBInstanceStatus'

# Inspect the RDS security group's inbound rules
aws ec2 describe-security-groups \
  --region us-west-1 \
  --filters Name=group-id,Values=sg-xxxxxx

# If no rule allows port 5432 from the ECS tasks' security group, add one
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxxxxx \
  --protocol tcp \
  --port 5432 \
  --source-group sg-ecs-tasks \
  --region us-west-1

Issue: FATAL: database "menotime" does not exist

# The database wasn't created or was deleted.

# RDS has no console or aws rds action that creates a database inside an
# existing instance, so create it with psql from a host that can reach the
# instance. Connect to the default "postgres" database, because "menotime"
# does not exist yet:

psql -h menotime-staging.xxxxx.us-west-1.rds.amazonaws.com \
  -U menotime_admin \
  -d postgres \
  -c "CREATE DATABASE menotime;"

Issue: too many connections

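PostgreSQL has reached its max_connections limit, usually because the application's connection pools add up to more connections than the server allows.

# Find the parameter group attached to the instance
aws rds describe-db-instances \
  --db-instance-identifier menotime-prod \
  --region us-west-1 \
  --query 'DBInstances[0].DBParameterGroups'

# Option 1: raise max_connections
# 1. Modify the parameter group (use a custom group; default groups cannot be edited)
# 2. Increase max_connections (RDS derives the default from instance memory)
# 3. Reboot the database (max_connections is a static parameter)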

# Or reduce pool size in application
# app/database.py:
# engine = create_engine(DATABASE_URL, pool_size=10, max_overflow=5)
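
To decide which side to change, it helps to estimate the most connections the application can open. Each process has its own SQLAlchemy engine, and each engine may open up to `pool_size + max_overflow` connections. The task and worker counts below are assumptions for illustration, not MenoTime's actual deployment values.

```python
def peak_connections(tasks: int, workers_per_task: int, pool_size: int, max_overflow: int) -> int:
    """Upper bound on connections the application can hold open at once."""
    per_process = pool_size + max_overflow
    return tasks * workers_per_task * per_process


# Assumed deployment: 4 ECS tasks, 2 worker processes each,
# with the pool settings shown above.
needed = peak_connections(tasks=4, workers_per_task=2, pool_size=10, max_overflow=5)
print(needed)  # 4 * 2 * (10 + 5) = 120
```

If the result exceeds the server's `max_connections` (minus a few reserved for administration), either shrink the pool or raise the limit.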

Docker Image Issues

Issue: ImageNotFoundException when deploying

# The Docker image isn't in ECR

# Check images in ECR
aws ecr describe-images \
  --repository-name menotime-api \
  --region us-west-1

# If missing, rebuild and push
docker build -t menotime-api:latest .
aws ecr get-login-password --region us-west-1 | docker login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.us-west-1.amazonaws.com
docker tag menotime-api:latest ACCOUNT_ID.dkr.ecr.us-west-1.amazonaws.com/menotime-api:latest
docker push ACCOUNT_ID.dkr.ecr.us-west-1.amazonaws.com/menotime-api:latest

Issue: failed to get digest during build

# Docker daemon isn't running

# Start Docker
# macOS
open /Applications/Docker.app

# Linux
sudo systemctl start docker

# Windows
# Open Docker Desktop application

Deployment Pipeline Issues

Issue: GitHub Actions workflow fails

# Check workflow logs
# Go to GitHub → Actions → Select failed workflow → View details

# Common failures:
# 1. Test failures
pytest  # Run locally first

# 2. Docker build failures
docker build .  # Build locally first

# 3. ECR login failures
aws ecr get-login-password --region us-west-1 | docker login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.us-west-1.amazonaws.com

# 4. ECS deployment failures
aws ecs describe-services \
  --cluster staging \
  --services menotime-api \
  --region us-west-1 \
  --query 'services[0].deployments'

API Response Issues

Issue: API returns 500 Internal Server Error

# Check the logs in CloudWatch
aws logs tail /ecs/menotime-api-staging --follow --region us-west-1

# Look for the error traceback to identify the problem

# Common causes:
# 1. Unhandled exception in route handler
# 2. Database query failed
# 3. Missing environment variable
# 4. Invalid data format

Issue: API returns 401 Unauthorized

# Check if authorization header is being sent
curl -v http://localhost:8000/api/v1/patients

# Should include:
# Authorization: Bearer YOUR_TOKEN

# If missing, the request needs a token:
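# -H sends the body as JSON; jq -r prints the token without surrounding quotes
TOKEN=$(curl -s -X POST http://localhost:8000/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"password"}' | jq -r '.token')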

curl -H "Authorization: Bearer $TOKEN" http://localhost:8000/api/v1/patients
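
For scripted checks, the same two steps can be written in Python. This sketch only builds the request and the header; it assumes the `/auth/login` path and the `{"token": ...}` response shape implied by the commands above. `build_login_request` and `bearer_headers` are hypothetical helper names.

```python
import json
import urllib.request


def build_login_request(base_url: str, email: str, password: str) -> urllib.request.Request:
    """Build a JSON POST to the login endpoint (path assumed from the curl example)."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bearer_headers(login_response_body: str) -> dict[str, str]:
    """Turn a login response shaped like {"token": "..."} into request headers."""
    # json.loads yields the bare string, so no JSON quotes leak into the header.
    token = json.loads(login_response_body)["token"]
    return {"Authorization": f"Bearer {token}"}


print(bearer_headers('{"token": "abc.def"}'))
# {'Authorization': 'Bearer abc.def'}
```

Passing the request to `urllib.request.urlopen` and the response body to `bearer_headers` completes the flow.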

Issue: API returns 404 Not Found

# Check the route exists
curl -v http://localhost:8000/api/v1/patients/1

# Check the endpoint is registered in app.main
grep -r "include_router" app/

# Make sure the path is correct
# GET /api/v1/patients/{patient_id}
# Not /api/v1/patient/{id}

Issue: API returns 422 Validation Error

# Request body doesn't match schema

# Check what validation error is returned
curl -X POST http://localhost:8000/api/v1/patients \
  -H "Content-Type: application/json" \
  -d '{"email": "invalid"}'

# Response will show:
# "detail": [
#   {
#     "loc": ["body", "email"],
#     "msg": "invalid email format",
#     "type": "value_error.email"
#   }
# ]

# Fix the request body to match the schema
# Check the OpenAPI docs at /docs
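
When a request has many fields, the `detail` list can be long. The sketch below flattens it to one line per failing field. It assumes the `loc`/`msg`/`type` shape shown above, and `format_validation_errors` is an illustrative name.

```python
import json


def format_validation_errors(response_body: str) -> list[str]:
    """Turn a 422 response body into one 'field.path: message [type]' line per error."""
    detail = json.loads(response_body).get("detail", [])
    if isinstance(detail, str):
        # Non-validation errors carry a plain string in "detail".
        return [detail]
    lines = []
    for err in detail:
        # loc is a path such as ["body", "email"]; list indexes appear as ints.
        path = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{path}: {err.get('msg', '')} [{err.get('type', '')}]")
    return lines


body = json.dumps({
    "detail": [
        {"loc": ["body", "email"], "msg": "invalid email format", "type": "value_error.email"}
    ]
})
print(format_validation_errors(body))
# ['body.email: invalid email format [value_error.email]']
```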

Monitoring and Observability

View Real-Time Logs

# Staging logs
aws logs tail /ecs/menotime-api-staging --follow --region us-west-1

# Production logs
aws logs tail /ecs/menotime-api-production --follow --region us-west-1

# Filter for errors only
aws logs tail /ecs/menotime-api-staging --follow --filter-pattern "ERROR" --region us-west-1

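# Search for an exact phrase (the inner double quotes make it a phrase match;
# without them the pattern matches events containing both words anywhere)
aws logs filter-log-events \
  --log-group-name /ecs/menotime-api-staging \
  --filter-pattern '"database connection"' \
  --region us-west-1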

Check Metrics in CloudWatch

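# Note: date -d is GNU syntax (Linux). On macOS, use
# $(date -u -v-1H +%Y-%m-%dT%H:%M:%S) for the start time instead.

# CPU utilization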
aws cloudwatch get-metric-statistics \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ServiceName,Value=menotime-api Name=ClusterName,Value=staging \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Average \
  --region us-west-1

# Memory utilization
aws cloudwatch get-metric-statistics \
  --namespace AWS/ECS \
  --metric-name MemoryUtilization \
  --dimensions Name=ServiceName,Value=menotime-api Name=ClusterName,Value=staging \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Average \
  --region us-west-1
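
Because the `date` flags differ between Linux and macOS, the time window can also be computed portably. This is a small sketch that produces the same timestamp format as the commands above. `metric_window` is a hypothetical helper, and the `now` parameter exists only to make the result reproducible.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def metric_window(hours: float = 1.0, now: Optional[datetime] = None) -> tuple[str, str]:
    """Return (start, end) UTC timestamps for --start-time and --end-time."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%S"
    return start.strftime(fmt), end.strftime(fmt)


start, end = metric_window(
    hours=1, now=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
)
print(start, end)  # 2024-01-01T11:00:00 2024-01-01T12:00:00
```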

Container Debugging with ECS Exec

# Get a running task
TASK_ID=$(aws ecs list-tasks \
  --cluster staging \
  --service-name menotime-api \
  --region us-west-1 \
  --query 'taskArns[0]' \
  --output text | awk -F'/' '{print $NF}')

# Exec into the container
aws ecs execute-command \
  --cluster staging \
  --task $TASK_ID \
  --container menotime-api \
  --interactive \
  --command "/bin/bash" \
  --region us-west-1

# Inside the container:
# Check environment variables
env | grep DATABASE

# Check logs
tail -f /var/log/app.log

# Test database connection
psql "$DATABASE_URL" -c "SELECT 1;"

# Check running processes
ps aux

# Exit
exit

Database Debugging

# Connect to staging database
psql -h menotime-staging.xxxxx.us-west-1.rds.amazonaws.com \
  -U menotime_admin \
  -d menotime

# Inside psql:
# List tables
\dt

# Check table structure
\d patients

# Count records
SELECT COUNT(*) FROM patients;

# View migrations applied
SELECT * FROM alembic_version;

# Exit
\q

Getting Help

If you still can't resolve the issue:

  1. Check the logs — CloudWatch, ECS, Docker
  2. Search existing issues — GitHub Issues repository
  3. Ask on team Slack — #engineering channel
  4. Create a detailed issue — Include error messages, logs, steps to reproduce
  5. Contact DevOps — For AWS infrastructure issues

When reporting an issue, include:

  - Error message (full text)
  - CloudWatch logs (if it is a deployment issue)
  - Steps to reproduce
  - What you've already tried
  - Environment (dev/staging/prod, your OS)