Projects

Self-directed work — architectures, prototypes, and explorations.

May 2026 · AI · Evals · Reddit Answers

Evaluating AI Systems

A field guide to measuring whether your LLM-powered product actually works — and how to make it better over time

A structured framework for AI evaluations — prompt banks, LLM-as-judge calibration, failure clustering, and the loop that turns diagnostics into product improvement. Drawn from building the eval system for Reddit Answers.

View project →

April 2026 · Vertical AI · Accounting · Architecture

The Close Agent

Architecture for an AI agent that earns trust to automate the monthly financial close — one task at a time

An agent that absorbs the mechanical 65% of closing the books — but only after earning trust through independent reconciliation, layered safety checks, and months of proven accuracy. The trust ramp is the moat.

View project →

April 2026 · AI · Architecture · Analytics

Mixpanel Analytics Agent

Architecture for an agentic analytics copilot — orchestration, context, business logic, and data definitions

A six-layer architecture for an analytics agent that doesn't just query the data layer: it carries product context, business logic, and an instrumentation flywheel that makes it smarter over time. Built around Mixpanel's structured event model.

View project →