Engineering Team
The engineering function is led by the Founding Technical Lead, who owns the MenoTime platform end-to-end — architecture, backend, infrastructure, and security. As an early-stage startup, the Technical Lead wears many hats and works closely with the Founding ML Scientist on data pipelines and the Commercial Lead on provider-facing features.
Team Lead
- Founding Technical Lead: Owns all engineering, architecture decisions, infrastructure, security, and deployments. Reports directly to the Founder/CEO.
Responsibilities
The engineering function owns:
- Platform development: Building and shipping MenoTime features
- System architecture: High-level design and technical decisions
- Code quality: Testing, code review, best practices
- Infrastructure: AWS services (ECS, RDS, S3, etc.), deployments, DevOps
- Security & compliance: Implementing HIPAA controls, encryption, audit logging
- Incident response: Investigating and fixing production issues
- Observability: Monitoring, logging, alerting
The engineering function does NOT own:
- ML model design or data science (owned by Founding ML Scientist)
- Clinical/medical decisions (owned by Founding ML Scientist)
- Business development and clinic relationships (owned by Commercial Lead)
- Company strategy (owned by Founder/CEO)
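To make one of the security responsibilities above concrete: HIPAA audit logging comes down to emitting a structured, timestamped record for every access to patient data. A minimal sketch (field names and the `audit` logger name are illustrative, not our actual schema; the real control also needs tamper-resistant storage and retention):

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("audit")

def audit_record(actor: str, action: str, resource: str) -> dict:
    """Build and emit one structured audit-log entry (illustrative only)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,        # who performed the action, e.g. a clinician ID
        "action": action,      # e.g. "read", "update", "export"
        "resource": resource,  # e.g. "patient/123"
    }
    # Emit as JSON so the log aggregator (DataDog) can index fields.
    audit_logger.info(json.dumps(record))
    return record
```

Every endpoint that touches PHI would call a helper like this; the aggregated log then answers "who accessed what, when" during an audit.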
Tech Stack
Backend
- Language: Python 3.11+
- Framework: FastAPI (async web framework)
- ORM: SQLAlchemy
- Database: PostgreSQL (RDS managed)
- Task Queue: Celery + Redis (for async work)
- Cache: Redis
Frontend
- Framework: React 18+
- Language: TypeScript
- Styling: Tailwind CSS or styled-components
- State: React Query + Context API
- Build: Vite
Infrastructure
- Cloud: Amazon Web Services (us-west-1 region)
- Compute: ECS Fargate (containerized)
- Database: RDS PostgreSQL
  - 3 environments: dev, staging, production
  - Instance type: db.m7g.large
  - Multi-AZ (redundancy)
  - Automated backups
- Storage: S3 (file storage)
- Networking: VPC, security groups, ALB (load balancer)
- Email: AWS SES (for transactional emails)
- Monitoring: CloudWatch, DataDog
- Logs: CloudWatch Logs, aggregated in DataDog
Development Tools
- Version Control: Git, GitHub
- CI/CD: GitHub Actions
- Containerization: Docker, Docker Compose
- IaC: Terraform (infrastructure as code)
- Testing: pytest (Python), Jest (JavaScript)
- Code Quality: Black (formatting), mypy (type checking), ESLint (JS)
- Secrets: 1Password, AWS Secrets Manager
Key Integrations
- EHR Systems: HL7 FHIR API (planned)
- Analytics: Mixpanel (event tracking)
- Email: SendGrid/SES (transactional)
- Payment: Stripe (subscription management)
- Auth: OAuth 2.0, JWT, Okta SSO
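For context on the auth integration: a JWT is two base64url-encoded JSON segments plus an HMAC (for HS256) over them. The sketch below shows the signing and verification mechanics using only the standard library; in the actual service we would use a maintained library such as PyJWT, which also validates registered claims like `exp` and `aud` (the `sign_hs256_jwt`/`verify_hs256_jwt` names here are illustrative):

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    # JWTs use unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _b64url_decode(segment: str) -> bytes:
    # Restore the padding stripped during encoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def sign_hs256_jwt(payload: dict, secret: bytes) -> str:
    """Produce an HS256-signed JWT (illustration only)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(sig)}"

def verify_hs256_jwt(token: str, secret: bytes) -> dict:
    """Check the signature and return the payload, or raise ValueError."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # compare_digest avoids timing side channels on the signature check.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("invalid JWT signature")
    return json.loads(_b64url_decode(payload_b64))
```

The point of the sketch is the shape of the check, not the implementation: never trust a payload before the signature has been verified.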
Sprint Process
We run 2-week sprints with the following rhythm:
Sprint Planning (Monday, Week 1)
- Duration: 1-2 hours
- Who: Engineering team, led by the Technical Lead
- Agenda:
  - Review previous sprint: what we shipped, blockers
  - Prioritize tasks for the upcoming sprint
  - Break down large features into stories
  - Assign tasks based on expertise and balance
  - Estimate effort (story points)
Daily Standup (Mon-Fri, 10-10:10 AM PT)
- Duration: 5-10 minutes
- Format: What we did yesterday, what we're doing today, blockers
- Location: Slack async #engineering or video call
- Attendees: Engineering team (optional for remote folks in other time zones)
Mid-Sprint Check-in (Wednesday or Thursday)
- Duration: 15-30 minutes
- Goal: Track progress, identify blockers, adjust if needed
Sprint Review & Retrospective (Friday, Week 2)
- Duration: 1-1.5 hours
- Agenda:
  - Demonstrate completed work
  - Celebrate wins
  - Retrospective: what went well, what to improve
  - Update metrics and team dashboard
Code Review Standards
All code changes go through peer review before merging to main.
Review Requirements
- Assigned reviewer: At least one engineer must review (doesn't have to be the most senior)
- CI passing: All automated tests and checks must pass
- Coverage: New code should have unit test coverage (aim for 80%+)
- Documentation: Changes to public APIs should include docstrings
Review Checklist
Reviewers check for:
- Correctness: Does the code do what it claims to do?
- Maintainability: Is the code clear and well-structured?
- Security: Are there any potential security issues? (especially important for us)
- Performance: Will this scale? Any obvious inefficiencies?
- Testing: Are the tests adequate?
- Compliance: Does this follow HIPAA requirements?
- Style: Does it follow our coding standards?
Review Tone
- Be respectful and constructive
- Ask questions instead of making demands ("Why did you choose X?" vs. "You should use Y")
- Praise good code
- Review promptly (aim for 24 hours)
- Assume good intent
Preparing Your PR
- Merge main into your feature branch frequently to catch conflicts early
- Keep PRs small (under 400 lines if possible) for easier review
- Write clear PR descriptions explaining what and why
Code Standards
Python
```python
# Black for formatting (auto-run in pre-commit)
# Mypy for type checking
# Follow PEP 8

# Good:
def get_patient_data(patient_id: int) -> PatientSchema:
    """Fetch patient data by ID, applying privacy filters."""
    patient = db.session.query(Patient).filter(Patient.id == patient_id).first()
    if not patient:
        raise PatientNotFound(f"Patient {patient_id} not found")
    return PatientSchema.from_orm(patient)

# Bad:
def get_patient(id):
    p = db.session.query(Patient).filter(Patient.id==id).first()
    if p is None:
        return None
    return p
```
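To show what the 80%+ coverage bar looks like in practice, here is a pytest sketch for a lookup function in the style of `get_patient_data`. The dict-backed database and the simplified function body are hypothetical stand-ins for the real SQLAlchemy session:

```python
import pytest

class PatientNotFound(Exception):
    """Stand-in for the real not-found exception."""

def get_patient_data(db: dict, patient_id: int) -> dict:
    # Simplified stand-in for the SQLAlchemy-backed version.
    patient = db.get(patient_id)
    if not patient:
        raise PatientNotFound(f"Patient {patient_id} not found")
    return patient

def test_returns_patient_when_found():
    db = {1: {"id": 1, "name": "Test Patient"}}
    assert get_patient_data(db, 1)["name"] == "Test Patient"

def test_raises_when_patient_missing():
    # The error path counts toward coverage too.
    with pytest.raises(PatientNotFound):
        get_patient_data({}, 42)
```

Covering both the happy path and the error path is usually what gets a small function to full coverage.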
JavaScript/TypeScript
```typescript
// ESLint for linting
// Prettier for formatting
// TypeScript strict mode

// Good:
interface UserProfile {
  id: string;
  email: string;
  clinicId: string;
}

async function fetchUserProfile(userId: string): Promise<UserProfile> {
  const response = await api.get(`/users/${userId}`);
  return response.data;
}

// Bad:
function getUser(id) {
  return api.get('/users/' + id);
}
```
Deployment Process
Deployment Timeline
1. Develop: Push to a feature branch, open a PR
2. Review & Test: Peer review, automated tests pass
3. Merge: Approve the PR, merge to main
4. Build: GitHub Actions builds the Docker image and runs tests
5. Deploy to Staging: Automatically deployed to the staging environment
6. Manual Testing: QA tests in staging (24-48 hours)
7. Approval: Technical Lead approves the production deployment
8. Deploy to Prod: Manually triggered by the Technical Lead or a delegated engineer
9. Monitor: Watch logs, metrics, and customer reports
Deployment Best Practices
- Deploy during business hours (unless critical hotfix) so we can respond to issues
- Database migrations: Run before code deployment, test on staging first
- Feature flags: Use feature flags for risky rollouts, can roll back quickly
- Staging matches prod: Keep staging infrastructure identical to production
- Runbook available: Before deploying, ensure we have a rollback plan
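The feature-flag practice above can start as a simple environment-variable check, which makes "roll back quickly" a config change rather than a deploy. A minimal sketch (the `FEATURE_*` naming convention and env-var storage are assumptions; a config service or flag provider works the same way):

```python
import os

def flag_enabled(name: str, default: bool = False) -> bool:
    """Check a feature flag from the environment (illustrative sketch).

    Set FEATURE_<NAME>=1 to enable; unset or 0 disables it.
    """
    value = os.environ.get(f"FEATURE_{name.upper()}", "")
    if value == "":
        return default
    return value.lower() in ("1", "true", "yes", "on")

# Usage at a risky call site:
# if flag_enabled("NEW_BILLING"):
#     new_billing_flow()
# else:
#     legacy_billing_flow()
```

Because the old code path stays in place behind the flag, disabling a bad rollout does not require re-deploying the previous image.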
Rollback
If something breaks in production:
1. Notify the team in #incidents
2. Technical Lead makes the rollback decision
3. Roll back to the previous version (fast)
4. Investigate the root cause afterward
5. Fix the issue and re-deploy
On-Call Rotation
We maintain a 24/7 on-call rotation for production support.
On-Call Expectations
- Response time: 15 minutes (during business hours), immediate (for critical issues)
- Availability: Check Slack, be reachable
- Investigation: Diagnose the issue, gather context
- Communication: Post updates to #incidents
- Escalation: Page other engineers if needed
- Handoff: Brief next person if issue ongoing
On-Call Schedule
- Rotated weekly, Monday-Sunday
- The Technical Lead usually covers weekends/nights as fallback
- Scheduled in Asana/Google Calendar
- If you need to swap, arrange with another engineer and notify the Technical Lead
Critical Issues (P1)
Definition: Production is down or patient data is affected
Response:
- Page the on-call engineer immediately
- Start a war room (video call)
- Incident commander posts updates every 5 minutes
- All hands on deck to resolve
- Postmortem after resolution
Team Communication
Standups
- Daily standup in #engineering (async or video)
- Weekly engineering sync Thursday 2-3 PM PT
- Monthly architecture review (larger decisions)
Decision Making
- Architecture decisions: Discussion in #engineering, Technical Lead decides
- Feature priority: Product manager + Technical Lead + clinical input
- Tooling changes: Discussion with affected team members
Knowledge Sharing
- Pair programming encouraged, especially for onboarding
- Code reviews as learning tool
- Postmortems document lessons learned
- Wiki/docs updated regularly
Metrics & Goals
We track:
- Deployment frequency: How often we ship (target: 2-3x/week)
- Build/test pass rate: Should be >95%
- Production incidents: Track type and frequency
- Code coverage: Aim for 80%+ on new code
- Response time: On-call response within 15 min
Last updated: February 2025