Engineering Team
The engineering function is led by the Founding Technical Lead, who owns the MenoTime platform end-to-end — architecture, backend, infrastructure, and security. As an early-stage startup, the Technical Lead wears many hats and works closely with the Founding ML Scientist on data pipelines and the Commercial Lead on provider-facing features.
Team Lead
- Founding Technical Lead: Owns all engineering, architecture decisions, infrastructure, security, and deployments. Reports directly to the Founder/CEO.
Responsibilities
The engineering function owns:
- Platform development: Building and shipping MenoTime features
- System architecture: High-level design and technical decisions
- Code quality: Testing, code review, best practices
- Infrastructure: AWS services (ECS, RDS, S3, etc.), deployments, DevOps
- Security & compliance: Implementing HIPAA controls, encryption, audit logging
- Incident response: Investigating and fixing production issues
- Observability: Monitoring, logging, alerting
The engineering function does NOT own:
- ML model design or data science (owned by Founding ML Scientist)
- Clinical/medical decisions (owned by Founding ML Scientist)
- Business development and clinic relationships (owned by Commercial Lead)
- Company strategy (owned by Founder/CEO)
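To make one of the security responsibilities above concrete: HIPAA audit logging comes down to emitting a structured, timestamped record for every access to patient data. A minimal sketch (field names and the `audit` logger name are illustrative, not our actual schema; the real control also needs tamper-resistant storage and retention):

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("audit")

def audit_record(actor: str, action: str, resource: str) -> dict:
    """Build and emit one structured audit-log entry (illustrative only)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,        # who performed the action, e.g. a clinician ID
        "action": action,      # e.g. "read", "update", "export"
        "resource": resource,  # e.g. "patient/123"
    }
    # Emit as JSON so the log aggregator (DataDog) can index fields.
    audit_logger.info(json.dumps(record))
    return record
```

Every endpoint that touches PHI would call a helper like this; the aggregated log then answers "who accessed what, when" during an audit.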
Tech Stack
Backend
- Language: Python 3.11+
- Framework: FastAPI (async web framework)
- ORM: SQLAlchemy
- Database: PostgreSQL (RDS managed)
- Task Queue: Celery + Redis (for async work)
- Cache: Redis
Frontend
- Framework: React 18+
- Language: TypeScript
- Styling: Tailwind CSS or styled-components
- State: React Query + Context API
- Build: Vite
Infrastructure
- Cloud: Amazon Web Services (us-west-1 region)
- Compute: ECS Fargate (containerized)
- Database: RDS PostgreSQL
  - 3 environments: dev, staging, production
  - Instance type: db.m7g.large
  - Multi-AZ (redundancy)
  - Automated backups
- Storage: S3 (file storage)
- Networking: VPC, security groups, ALB (load balancer)
- Email: AWS SES (for transactional emails)
- Monitoring: CloudWatch, DataDog
- Logs: CloudWatch Logs, aggregated in DataDog
Development Tools
- Version Control: Git, GitHub
- CI/CD: GitHub Actions
- Containerization: Docker, Docker Compose
- IaC: Terraform (infrastructure as code)
- Testing: pytest (Python), Jest (JavaScript)
- Code Quality: Black (formatting), mypy (type checking), ESLint (JS)
- Secrets: 1Password, AWS Secrets Manager
Key Integrations
- EHR Systems: HL7 FHIR API (planned)
- Analytics: Mixpanel (event tracking)
- Email: SendGrid/SES (transactional)
- Payment: Stripe (subscription management)
- Auth: OAuth 2.0, JWT, Okta SSO
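For context on the auth integration: a JWT is two base64url-encoded JSON segments plus an HMAC (for HS256) over them. The sketch below shows the signing and verification mechanics using only the standard library; in the actual service we would use a maintained library such as PyJWT, which also validates registered claims like `exp` and `aud` (the `sign_hs256_jwt`/`verify_hs256_jwt` names here are illustrative):

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    # JWTs use unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _b64url_decode(segment: str) -> bytes:
    # Restore the padding stripped during encoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def sign_hs256_jwt(payload: dict, secret: bytes) -> str:
    """Produce an HS256-signed JWT (illustration only)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(sig)}"

def verify_hs256_jwt(token: str, secret: bytes) -> dict:
    """Check the signature and return the payload, or raise ValueError."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # compare_digest avoids timing side channels on the signature check.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("invalid JWT signature")
    return json.loads(_b64url_decode(payload_b64))
```

The point of the sketch is the shape of the check, not the implementation: never trust a payload before the signature has been verified.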
Sprint Process
We run 2-week sprints with the following rhythm:
Sprint Planning (Monday, Week 1)
- Duration: 1-2 hours
- Who: Engineering team, led by the Technical Lead
- Agenda:
  - Review previous sprint: what we shipped, blockers
  - Prioritize tasks for the upcoming sprint
  - Break down large features into stories
  - Assign tasks based on expertise and balance
  - Estimate effort (story points)
Daily Standup (Mon-Fri, 10-10:10 AM PT)
- Duration: 5-10 minutes
- Format: What we did yesterday, what we're doing today, blockers
- Location: Slack async #engineering or video call
- Attendees: Engineering team (optional for remote folks in other time zones)
Mid-Sprint Check-in (Wednesday or Thursday)
- Duration: 15-30 minutes
- Goal: Track progress, identify blockers, adjust if needed
Sprint Review & Retrospective (Friday, Week 2)
- Duration: 1-1.5 hours
- Agenda:
  - Demonstrate completed work
  - Celebrate wins
  - Retrospective: what went well, what to improve
  - Update metrics and team dashboard
Code Review Standards
All code changes go through peer review before merging to main.
Review Requirements
- Assigned reviewer: At least one engineer must review (doesn't have to be the most senior)
- CI passing: All automated tests and checks must pass
- Coverage: New code should have unit test coverage (aim for 80%+)
- Documentation: Changes to public APIs should include docstrings
Review Checklist
Reviewers check for:
- Correctness: Does the code do what it claims to do?
- Maintainability: Is the code clear and well-structured?
- Security: Are there any potential security issues? (especially important for us)
- Performance: Will this scale? Any obvious inefficiencies?
- Testing: Are the tests adequate?
- Compliance: Does this follow HIPAA requirements?
- Style: Does it follow our coding standards?
Review Tone
- Be respectful and constructive
- Ask questions instead of making demands ("Why did you choose X?" vs. "You should use Y")
- Praise good code
- Review promptly (aim for 24 hours)
- Assume good intent
Preparing Your PR
- Merge main into your feature branch frequently to catch conflicts early
- Keep PRs small (under 400 lines if possible) for easier review
- Write clear PR descriptions explaining what and why
Code Standards
Python
```python
# Black for formatting (auto-run in pre-commit)
# Mypy for type checking
# Follow PEP 8

# Good:
def get_patient_data(patient_id: int) -> PatientSchema:
    """Fetch patient data by ID, applying privacy filters."""
    patient = db.session.query(Patient).filter(Patient.id == patient_id).first()
    if not patient:
        raise PatientNotFound(f"Patient {patient_id} not found")
    return PatientSchema.from_orm(patient)

# Bad:
def get_patient(id):
    p = db.session.query(Patient).filter(Patient.id==id).first()
    if p is None:
        return None
    return p
```
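To show what the 80%+ coverage bar looks like in practice, here is a pytest sketch for a lookup function in the style of `get_patient_data`. The dict-backed database and the simplified function body are hypothetical stand-ins for the real SQLAlchemy session:

```python
import pytest

class PatientNotFound(Exception):
    """Stand-in for the real not-found exception."""

def get_patient_data(db: dict, patient_id: int) -> dict:
    # Simplified stand-in for the SQLAlchemy-backed version.
    patient = db.get(patient_id)
    if not patient:
        raise PatientNotFound(f"Patient {patient_id} not found")
    return patient

def test_returns_patient_when_found():
    db = {1: {"id": 1, "name": "Test Patient"}}
    assert get_patient_data(db, 1)["name"] == "Test Patient"

def test_raises_when_patient_missing():
    # The error path counts toward coverage too.
    with pytest.raises(PatientNotFound):
        get_patient_data({}, 42)
```

Covering both the happy path and the error path is usually what gets a small function to full coverage.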
JavaScript/TypeScript
```typescript
// ESLint for linting
// Prettier for formatting
// TypeScript strict mode

// Good:
interface UserProfile {
  id: string;
  email: string;
  clinicId: string;
}

async function fetchUserProfile(userId: string): Promise<UserProfile> {
  const response = await api.get(`/users/${userId}`);
  return response.data;
}

// Bad:
function getUser(id) {
  return api.get('/users/' + id);
}
```
Deployment Process
Deployment Timeline
1. Develop: Push to a feature branch, open a PR
2. Review & Test: Peer review, automated tests pass
3. Merge: Approve the PR, merge to main
4. Build: GitHub Actions builds the Docker image and runs tests
5. Deploy to Staging: Automatically deployed to the staging environment
6. Manual Testing: QA tests in staging (24-48 hours)
7. Approval: Technical Lead approves the production deployment
8. Deploy to Prod: Manually triggered by the Technical Lead or a delegated engineer
9. Monitor: Watch logs, metrics, and customer reports
Deployment Best Practices
- Deploy during business hours (unless critical hotfix) so we can respond to issues
- Database migrations: Run before code deployment, test on staging first
- Feature flags: Use feature flags for risky rollouts, can roll back quickly
- Staging matches prod: Keep staging infrastructure identical to production
- Runbook available: Before deploying, ensure we have a rollback plan
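The feature-flag practice above can start as a simple environment-variable check, which makes "roll back quickly" a config change rather than a deploy. A minimal sketch (the `FEATURE_*` naming convention and env-var storage are assumptions; a config service or flag provider works the same way):

```python
import os

def flag_enabled(name: str, default: bool = False) -> bool:
    """Check a feature flag from the environment (illustrative sketch).

    Set FEATURE_<NAME>=1 to enable; unset or 0 disables it.
    """
    value = os.environ.get(f"FEATURE_{name.upper()}", "")
    if value == "":
        return default
    return value.lower() in ("1", "true", "yes", "on")

# Usage at a risky call site:
# if flag_enabled("NEW_BILLING"):
#     new_billing_flow()
# else:
#     legacy_billing_flow()
```

Because the old code path stays in place behind the flag, disabling a bad rollout does not require re-deploying the previous image.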
Rollback
If something breaks in production:
1. Notify the team in #incidents
2. Technical Lead makes the rollback decision
3. Roll back to the previous version (fast)
4. Investigate the root cause afterward
5. Fix the issue and re-deploy
On-Call Rotation
We maintain a 24/7 on-call rotation for production support.
On-Call Expectations
- Response time: 15 minutes (during business hours), immediate (for critical issues)
- Availability: Check Slack, be reachable
- Investigation: Diagnose the issue, gather context
- Communication: Post updates to #incidents
- Escalation: Page other engineers if needed
- Handoff: Brief next person if issue ongoing
On-Call Schedule
- Rotated weekly, Monday-Sunday
- The Technical Lead usually covers weekends/nights as fallback
- Scheduled in Asana/Google Calendar
- If you need to swap, arrange with another engineer and notify the Technical Lead
Critical Issues (P1)
Definition: Production is down or patient data is affected
Response:
- Page the on-call engineer immediately
- Start a war room (video call)
- Incident commander posts updates every 5 minutes
- All hands on deck to resolve
- Postmortem after resolution
Team Communication
Standups
- Daily standup in #engineering (async or video)
- Weekly engineering sync Thursday 2-3 PM PT
- Monthly architecture review (larger decisions)
Decision Making
- Architecture decisions: Discussion in #engineering, Technical Lead decides
- Feature priority: Product manager + Technical Lead + clinical input
- Tooling changes: Discussion with affected team members
Knowledge Sharing
- Pair programming encouraged, especially for onboarding
- Code reviews as learning tool
- Postmortems document lessons learned
- Wiki/docs updated regularly
Metrics & Goals
We track:
- Deployment frequency: How often we ship (target: 2-3x/week)
- Build/test pass rate: Should be >95%
- Production incidents: Track type and frequency
- Code coverage: Aim for 80%+ on new code
- Response time: On-call response within 15 min
Last updated: February 2025