Zeta Alpha was where I built a framework to evaluate and improve recommender systems without risking real users.

ML Engineer · Python · Kubernetes · Collaborative Filtering · Multi-armed Bandits
2021

Highlights

  • Risk-free testing framework: Built an offline evaluation framework that let the business run experiments and compare algorithms without exposing real users to untested changes
  • Recommender systems: Designed and built multiple recommendation algorithms for a live academic paper platform
  • Data pipelines: Ingested real user signals and social media data to power the recommendation engine
  • Production impact: Achieved 80%+ accuracy, paving the way for confident, data-driven A/B testing

Zeta Alpha is a platform for surfacing and recommending academic papers suited to your taste, almost like a curated repository. What drew me to this project was the intersection of AI and industry. This was a live SaaS product with real users, running microservices on Kubernetes, seamlessly switching between ML models and running A/B tests. It was exactly where I wanted to be: building products with machine learning.

The problem

The company needed more data and better recommendations. My initial focus was data extraction (getting online signals like tweets and engagement metrics). From there, the question became: how do you design and evaluate recommender systems in an offline way, without risking your user base?
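The core idea behind offline evaluation is to replay logged engagement data and score a candidate recommender against what users actually did, with no live users involved. Here is a minimal sketch of that idea; the data, recommender, and function names are all hypothetical illustrations, not Zeta Alpha's code:

```python
# Logged interactions: user -> set of papers they actually engaged with.
# (Toy data for illustration only.)
engagement = {
    "u1": {"p1", "p3", "p7"},
    "u2": {"p2", "p3"},
    "u3": {"p5", "p7", "p8"},
}

def precision_at_k(recommend, k=3):
    """Average fraction of top-k recommendations each user engaged with."""
    scores = []
    for user, liked in engagement.items():
        recs = recommend(user)[:k]
        hits = sum(1 for paper in recs if paper in liked)
        scores.append(hits / k)
    return sum(scores) / len(scores)

def popular_recommender(user):
    """Trivial stand-in recommender: rank papers by popularity among
    other users, never peeking at the target user's own engagement."""
    counts = {}
    for other, liked in engagement.items():
        if other == user:
            continue
        for paper in liked:
            counts[paper] = counts.get(paper, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)

print(round(precision_at_k(popular_recommender), 2))  # → 0.33
```

Any recommender with the same interface can be dropped into `precision_at_k` and compared on identical logged data, which is what makes the evaluation risk-free.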

The approach

I combined multiple recommender systems into an ensemble. Each system took a different approach (collaborative filtering, signal-based, social media-based), and a counterfactual evaluation method let us estimate the optimal combination offline. I designed a setup where we could plug in any number of recommenders, evaluate them against real engagement data, and use a multi-armed bandit approach to measure which configurations performed best.
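The bandit part of that setup can be sketched as an epsilon-greedy loop that allocates traffic across recommender configurations and learns from observed engagement. This is a toy illustration with made-up engagement rates and arm names, not the production implementation:

```python
import random

random.seed(42)

# Hypothetical arms: each recommender configuration has an unknown "true"
# engagement rate that we only observe through sampled feedback.
true_rates = {"collaborative": 0.12, "signal_based": 0.08, "social": 0.05}

counts = {arm: 0 for arm in true_rates}
rewards = {arm: 0.0 for arm in true_rates}

def observed_rate(arm):
    return rewards[arm] / counts[arm] if counts[arm] else 0.0

def choose_arm(epsilon=0.1):
    if random.random() < epsilon:                 # explore a random config
        return random.choice(list(true_rates))
    return max(true_rates, key=observed_rate)     # exploit the best so far

for _ in range(5000):
    arm = choose_arm()
    # Simulated engagement signal (a click) drawn from the arm's true rate.
    reward = 1.0 if random.random() < true_rates[arm] else 0.0
    counts[arm] += 1
    rewards[arm] += reward

print({arm: counts[arm] for arm in true_rates})
print("best:", max(true_rates, key=observed_rate))
```

Because exploitation concentrates traffic on whichever configuration is earning the most engagement, the strongest arm ends up with the vast majority of pulls while the weaker ones still receive enough exploration to be measured.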

The results

These recommenders reached accuracies upwards of 80%, which allowed the company to start working in a more data-driven, evaluation-based way and paved the way for experimenting with A/B tests without the risk of losing users. For me, it was the chance to work with real signals and real data in a mature, well-architected system.