Projects
High-Quality AI Coding Workflow
Client
Multiple startups and small companies (5-50 engineers)
Challenge
- Engineering teams were making heavy use of Claude Code, but lacked systematic approaches to maintain code quality and prevent design drift
Solution
- Improved CLAUDE.md and internal documentation to provide agents with clear context about the system and how to interact with it
- Created prompt guidelines to ensure better code quality and test coverage
- Built tools to automate:
  - Code reviews for individual changes (a sketch of this automation appears below the list)
  - Code reviews over periods of time
    - Useful for answering questions like “How has the system changed during the last week/month/quarter?”
  - Design reviews for full system architecture
  - Resolving merge conflicts
  - Regenerating accurate usage documentation from code
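The change-review automation can be sketched roughly as follows. This is a minimal sketch, assuming the Claude Code CLI is installed as `claude` and accepts a non-interactive prompt via `claude -p` with additional input piped on stdin; the script, prompt wording, and review criteria are illustrative rather than the actual client tooling.

```python
# review_changes.py - illustrative sketch of automating a code review.
# Assumes the Claude Code CLI is installed as `claude` and accepts a
# non-interactive prompt via `claude -p`, reading extra input from stdin.
import subprocess

def review_changes(base: str = "origin/main") -> str:
    """Ask the agent to review everything committed since `base`."""
    diff = subprocess.run(
        ["git", "diff", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout

    prompt = (
        "Review the following diff using CLAUDE.md for project context. "
        "Flag design drift, missing tests, and unclear code."
    )
    review = subprocess.run(
        ["claude", "-p", prompt],
        input=diff, capture_output=True, text=True, check=True,
    )
    return review.stdout

if __name__ == "__main__":
    print(review_changes())
```

The period-over-period reviews followed the same shape, with the diff range widened to a week, month, or quarter.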
Impact
- Teams kept the productivity gains of AI-assisted coding while maintaining high code quality and a shared understanding of the system.
- Automated a “weekly code changes” Slack update to keep all team members informed about the changes made to the system.
- Reduced production incidents and debugging time related to poorly understood AI-generated code by ~10%.
ML Orchestration Strategy & Roadmap
Client
30-person AI/ML team at a mid-sized company.
Challenge
- Client was running 3 different orchestration systems (Dagster, Airflow, Argo)
- Engineers faced high friction moving jobs from ad-hoc development to production
- Dependency management was brittle
- GPU scheduling issues caused frequent job failures
- No standardized observability across batch jobs
Solution
- Led cross-functional working group to assess the entire ML orchestration landscape.
- Built comprehensive 1-2 year technical roadmap defining measurable goals (faster iteration, higher job success rates, better GPU utilization) and specific projects with clear timelines and success criteria.
- Prioritized initiatives across three key areas: developer experience, resource efficiency, and system reliability.
Impact
- Delivered actionable roadmap covering 10+ concrete projects from dependency isolation to automated model deployments.
- Provided client with clear technical direction and quarterly milestones, enabling informed resource allocation and reducing uncertainty around platform evolution.
(A few examples below from my full-time roles)
Personalization Runtime Analytics
Company
Twitter
Challenge
- Twitter served billions of personalized timelines per day, but engineers, data scientists, and product teams lacked visibility into how personalization actually worked in production.
- Teams couldn’t answer questions like “why did this user see this tweet?” or understand the impact of ML model changes on real timelines.
- Debugging personalization issues required manual investigation with limited data.
Solution
- Built an analytics platform that logged detailed information about every personalized timeline: candidate sourcing, ranking decisions, filtering logic, and multiple ML model scores per tweet (a sketch of such a log record appears below the list).
- Data was collected within the runtime personalization systems, staged in HDFS, and loaded into GCP BigQuery for internal consumption.
- Also built a Timeline debugger tool that annotated individual user timelines with metadata like ML model scores and features.
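For illustration, one logged timeline record had roughly the shape below; the field and class names are hypothetical, not the actual schema.

```python
# Hypothetical shape of one logged timeline record; names are illustrative only.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ScoredTweet:
    tweet_id: int
    candidate_source: str           # which sourcing path proposed the tweet
    model_scores: Dict[str, float]  # one score per ML model, keyed by model name
    filters_applied: List[str]      # filtering logic that touched this tweet
    final_rank: int                 # position in the served timeline

@dataclass
class TimelineLogRecord:
    user_id: int
    request_id: str
    served_at_ms: int
    tweets: List[ScoredTweet] = field(default_factory=list)
```

Querying records like this in BigQuery by user and request made it possible to answer “why did this user see this tweet?” without manual investigation.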
Impact
- Enabled dozens of engineers, data scientists, researchers, and product managers to understand production personalization behavior at scale.
- Teams could investigate specific “why am I seeing this?” concerns and validate ML model changes against real production data.
- Spread general understanding of how timeline personalization worked across the organization, reducing knowledge silos and improving debugging capabilities.
User Sampling Pipeline
Company
Netflix
Challenge
- Netflix’s AI/ML personalization pipelines needed diverse, representative training datasets to support global models serving ~100 million members
- Existing sampling approaches couldn’t provide the precise control needed for balanced country/tenure coverage or strategic over-sampling of high-value segments
Solution
- Built an intelligent user-sampling service that selected diverse, balanced datasets by country and tenure for training pipelines.
- System enabled configurable sampling strategies to ensure representative global coverage while allowing over-sampling of strategically important segments like new and free trial members (an illustrative sampling spec is sketched below the list).
- Service supported dozens of downstream ML training jobs with consistent, high-quality datasets.
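A minimal sketch of how a training job could declare its data requirements in code; the class, field names, buckets, and numbers are assumptions for illustration, not the production configuration.

```python
# Hypothetical sampling spec a training job could declare in code;
# the field names, buckets, and numbers are illustrative only.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class SamplingSpec:
    total_users: int
    country_quotas: Dict[str, float]   # fraction of the sample per country bucket
    tenure_quotas: Dict[str, float]    # fraction of the sample per tenure bucket
    oversample: Dict[str, float] = field(default_factory=dict)  # segment multipliers

spec = SamplingSpec(
    total_users=2_000_000,
    country_quotas={"US": 0.25, "BR": 0.15, "IN": 0.15, "other": 0.45},
    tenure_quotas={"<30d": 0.20, "30d-1y": 0.30, ">1y": 0.50},
    oversample={"new_member": 1.5, "free_trial": 2.0},
)
```

Expressing requirements this way is what allowed dozens of training jobs to share one sampling service while still over-sampling the segments they cared about.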
Impact
- Increased our training data volume by ~10%, landing on the minimum amount of training data that provided the maximum increase in model quality.
- Empowered machine learning engineers to dynamically express their training data requirements in code across dozens of models.
Personalized Video Re-Architecture
Company
Comcast
Challenge
- Comcast had a tightly coupled legacy content personalization architecture that embedded a ranking library inside our browse and search service.
- To enable personalizing all video content discovery, we needed a new system that allowed machine learning engineers to innovate on new models, features, etc. independent of the search service’s preferences and requirements.
Solution
- Led architecture and implementation of a new personalized video platform built around a standalone personalization service.
- Ran an architectural A/B test to evaluate latency, availability, and error rates across different configurations for caching, page sizes, etc.
- Implemented fallback mechanisms to gracefully degrade when personalization systems were unavailable, ensuring customers always received content even during failures (a minimal sketch of the pattern appears below the list).
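The graceful-degradation pattern can be sketched as follows; the client names, timeout value, and default catalog are assumptions for illustration, not the production implementation.

```python
# Illustrative graceful-degradation pattern; names and timeout are hypothetical.
import logging

PERSONALIZATION_TIMEOUT_S = 0.2  # assumed latency budget, not the real value

def get_video_rows(user_id, personalization_client, fallback_catalog):
    """Return personalized rows, degrading to an unpersonalized catalog on failure."""
    try:
        return personalization_client.rank_rows(
            user_id, timeout=PERSONALIZATION_TIMEOUT_S
        )
    except Exception as exc:  # timeout, service unavailable, malformed response, ...
        logging.warning("Personalization unavailable; serving fallback: %s", exc)
        # Popularity-based / curated rows so customers always see content.
        return fallback_catalog.default_rows()
```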
Impact
- Successfully launched new personalized video experience to all Comcast video customers.
- Customer video engagement increased by ~10%.