Projects
High-Quality AI Coding Workflow
Client
Multiple startups and small companies (5-50 engineers)
Challenge
- Engineering teams were making heavy use of Claude Code, but lacked systematic approaches to maintain code quality and prevent design drift
Solution
- Improved CLAUDE.md and internal documentation to provide agents with clear context about the system and how to interact with it
- Created prompt guidelines to ensure better code quality and test coverage
- Built tools to automate:
  - Code reviews for individual changes (a sketch of this automation appears below the list)
  - Code reviews over periods of time
    - Useful for answering questions like “How has the system changed during the last week/month/quarter?”
  - Design reviews for full system architecture
  - Resolving merge conflicts
  - Regenerating accurate usage documentation from code
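The change-review automation can be sketched roughly as follows. This is a minimal sketch, assuming the Claude Code CLI is installed as `claude` and accepts a non-interactive prompt via `claude -p` with additional input piped on stdin; the script, prompt wording, and review criteria are illustrative rather than the actual client tooling.

```python
# review_changes.py - illustrative sketch of automating a code review.
# Assumes the Claude Code CLI is installed as `claude` and accepts a
# non-interactive prompt via `claude -p`, reading extra input from stdin.
import subprocess

def review_changes(base: str = "origin/main") -> str:
    """Ask the agent to review everything committed since `base`."""
    diff = subprocess.run(
        ["git", "diff", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout

    prompt = (
        "Review the following diff using CLAUDE.md for project context. "
        "Flag design drift, missing tests, and unclear code."
    )
    review = subprocess.run(
        ["claude", "-p", prompt],
        input=diff, capture_output=True, text=True, check=True,
    )
    return review.stdout

if __name__ == "__main__":
    print(review_changes())
```

The period-over-period reviews followed the same shape, with the diff range widened to a week, month, or quarter.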
Impact
- Teams kept the productivity gains of AI-assisted coding while maintaining high code quality and a shared understanding of the system.
- Automated a “weekly code changes” Slack update to keep all team members informed about the changes made to the system.
- Reduced production incidents and debugging time related to poorly understood AI-generated code by ~10%.
ML Orchestration Strategy & Roadmap
Client
30-person AI/ML team at a mid-sized company.
Challenge
- Client was running 3 different orchestration systems (Dagster, Airflow, Argo)
- Engineers faced high friction moving jobs from ad-hoc development to production
- Dependency management was brittle
- GPU scheduling issues caused frequent job failures
- No standardized observability across batch jobs
Solution
- Led cross-functional working group to assess the entire ML orchestration landscape.
- Built comprehensive 1-2 year technical roadmap defining measurable goals (faster iteration, higher job success rates, better GPU utilization) and specific projects with clear timelines and success criteria.
- Prioritized initiatives across three key areas: developer experience, resource efficiency, and system reliability.
Impact
- Delivered actionable roadmap covering 10+ concrete projects from dependency isolation to automated model deployments.
- Provided client with clear technical direction and quarterly milestones, enabling informed resource allocation and reducing uncertainty around platform evolution.
(A few examples below from my full-time roles)
Personalization Runtime Analytics
Company
Twitter
Challenge
- Twitter served billions of personalized timelines per day, but engineers, data scientists, and product teams lacked visibility into how personalization actually worked in production.
- Teams couldn’t answer questions like “why did this user see this tweet?” or understand the impact of ML model changes on real timelines.
- Debugging personalization issues required manual investigation with limited data.
Solution
- Built an analytics platform that logged detailed information about every personalized timeline: candidate sourcing, ranking decisions, filtering logic, and multiple ML model scores per tweet (a sketch of such a log record appears below the list).
- Data was collected within the runtime personalization systems, staged in HDFS, and loaded into GCP BigQuery for internal consumption.
- Also built a Timeline debugger tool that annotated individual user timelines with metadata like ML model scores and features.
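For illustration, one logged timeline record had roughly the shape below; the field and class names are hypothetical, not the actual schema.

```python
# Hypothetical shape of one logged timeline record; names are illustrative only.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ScoredTweet:
    tweet_id: int
    candidate_source: str           # which sourcing path proposed the tweet
    model_scores: Dict[str, float]  # one score per ML model, keyed by model name
    filters_applied: List[str]      # filtering logic that touched this tweet
    final_rank: int                 # position in the served timeline

@dataclass
class TimelineLogRecord:
    user_id: int
    request_id: str
    served_at_ms: int
    tweets: List[ScoredTweet] = field(default_factory=list)
```

Querying records like this in BigQuery by user and request made it possible to answer “why did this user see this tweet?” without manual investigation.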
Impact
- Enabled dozens of engineers, data scientists, researchers, and product managers to understand production personalization behavior at scale.
- Teams could investigate specific “why am I seeing this?” concerns and validate ML model changes against real production data.
- Spread general understanding of how timeline personalization worked across the organization, reducing knowledge silos and improving debugging capabilities.
User Sampling Pipeline
Company
Netflix
Challenge
- Netflix’s AI/ML personalization pipelines needed diverse, representative training datasets to support global models serving ~100 million members
- Existing sampling approaches couldn’t provide the precise control needed for balanced country/tenure coverage or strategic over-sampling of high-value segments
Solution
- Built an intelligent user-sampling service that selected diverse, balanced datasets by country and tenure for training pipelines.
- System enabled configurable sampling strategies to ensure representative global coverage while allowing over-sampling of strategically important segments like new and free trial members (an illustrative sampling spec is sketched below the list).
- Service supported dozens of downstream ML training jobs with consistent, high-quality datasets.
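A minimal sketch of how a training job could declare its data requirements in code; the class, field names, buckets, and numbers are assumptions for illustration, not the production configuration.

```python
# Hypothetical sampling spec a training job could declare in code;
# the field names, buckets, and numbers are illustrative only.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class SamplingSpec:
    total_users: int
    country_quotas: Dict[str, float]   # fraction of the sample per country bucket
    tenure_quotas: Dict[str, float]    # fraction of the sample per tenure bucket
    oversample: Dict[str, float] = field(default_factory=dict)  # segment multipliers

spec = SamplingSpec(
    total_users=2_000_000,
    country_quotas={"US": 0.25, "BR": 0.15, "IN": 0.15, "other": 0.45},
    tenure_quotas={"<30d": 0.20, "30d-1y": 0.30, ">1y": 0.50},
    oversample={"new_member": 1.5, "free_trial": 2.0},
)
```

Expressing requirements this way is what allowed dozens of training jobs to share one sampling service while still over-sampling the segments they cared about.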
Impact
- Increased our training data volume by ~10%, landing on the minimum amount of training data that provided the maximum increase in model quality.
- Empowered machine learning engineers to dynamically express their training data requirements in code across dozens of models.
Personalized Video Re-Architecture
Company
Comcast
Challenge
- Comcast had a tightly coupled legacy content personalization architecture that embedded a ranking library inside our browse and search service.
- To enable personalizing all video content discovery, we needed a new system that allowed machine learning engineers to innovate on new models, features, etc. independent of the search service’s preferences and requirements.
Solution
- Led architecture and implementation of a new personalized video platform built around a standalone personalization service.
- Ran an architectural A/B test to evaluate latency, availability, and error rates across different configurations for caching, page sizes, etc.
- Implemented fallback mechanisms to gracefully degrade when personalization systems were unavailable, ensuring customers always received content even during failures (a minimal sketch of the pattern appears below the list).
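The graceful-degradation pattern can be sketched as follows; the client names, timeout value, and default catalog are assumptions for illustration, not the production implementation.

```python
# Illustrative graceful-degradation pattern; names and timeout are hypothetical.
import logging

PERSONALIZATION_TIMEOUT_S = 0.2  # assumed latency budget, not the real value

def get_video_rows(user_id, personalization_client, fallback_catalog):
    """Return personalized rows, degrading to an unpersonalized catalog on failure."""
    try:
        return personalization_client.rank_rows(
            user_id, timeout=PERSONALIZATION_TIMEOUT_S
        )
    except Exception as exc:  # timeout, service unavailable, malformed response, ...
        logging.warning("Personalization unavailable; serving fallback: %s", exc)
        # Popularity-based / curated rows so customers always see content.
        return fallback_catalog.default_rows()
```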
Impact
- Successfully launched new personalized video experience to all Comcast video customers.
- Customer video engagement increased by ~10%.