PMZ
BALL
A machine learning platform that predicts college basketball outcomes, explains why, and lets you see the data the way the model sees it.
Every day during the college basketball season, seven scrapers pull data from ESPN, Barttorvik, Basketball Reference, NCAA stats, and The Odds API. That raw data flows through a feature engineering pipeline that computes 74 features per team, including luck-adjusted efficiency, strength of schedule, rolling form, and quality wins, before feeding them into an XGBoost model trained on three full seasons.
The model doesn't just predict winners. It outputs win probabilities, predicted spreads, and confidence tiers — and every prediction ships with a SHAP explanation of the top features driving the call.
The model
explains itself.
Most sports prediction apps are black boxes. PMZ Ball uses SHAP (SHapley Additive exPlanations) to decompose every single prediction into the features that drove it. Each game detail page shows a force plot of the top contributing factors — positive pushes toward a win, negative toward a loss.
SHAP values are computed at prediction time, stored as JSONB in Supabase, and rendered as interactive narratives. Users see why the model likes Duke −7.5, not just that it does.
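As a rough illustration of the last step, here is a minimal sketch of turning stored SHAP values into short narrative lines. It assumes the JSONB payload is a flat mapping of feature name to SHAP value; the function name `top_drivers`, the feature names, and the numbers are all hypothetical.

```python
import json
from typing import List


def top_drivers(shap_json: str, k: int = 3) -> List[str]:
    """Return one short sentence for each of the k largest |SHAP| values."""
    shap_values = json.loads(shap_json)  # assumed shape: {"feature_name": float}
    # Rank by magnitude so strong negative drivers are not hidden.
    ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[:k]:
        direction = "toward a win" if value > 0 else "toward a loss"
        lines.append(f"{name} pushes {direction} ({value:+.2f})")
    return lines


# Hypothetical payload; feature names and values are made up for illustration.
example = json.dumps({
    "adj_efficiency_margin": 0.42,
    "rolling_form_5g": -0.17,
    "strength_of_schedule": 0.08,
    "turnover_rate": -0.03,
})

for line in top_drivers(example, k=3):
    print(line)
```

Sorting by absolute value keeps the biggest pushes in either direction at the top, which matches the positive/negative framing of a force plot.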
Every team has a play style DNA.
K-means clustering on 11 statistical features assigns every D1 team an archetype. These archetypes feed back into the prediction model because how a team plays matters as much as how well. Matchup win rates between play styles become another predictive feature.
K-means on 11 features:
tempo, eFG%, TO rate, OR%,
FT rate, 3PA rate, ast%,
blk%, stl%, 2PT%, adj. eff.
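The clustering step can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: it uses three of the eleven features, made-up team values, z-score standardization, and a deterministic farthest-point initialization chosen only to keep the example reproducible.

```python
import math
from typing import List


def standardize(rows: List[List[float]]) -> List[List[float]]:
    """Z-score each column so features on different scales weigh equally."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(rows: List[List[float]], k: int, max_iter: int = 100) -> List[int]:
    """Lloyd's algorithm; returns one cluster label per row."""
    # Farthest-point initialization (an assumption made for determinism).
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(math.dist(r, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each team joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(r, centroids[j]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical teams: [tempo, eFG%, TO rate]. Values are illustrative only.
teams = [
    [74.0, 55.0, 15.0], [73.0, 54.0, 16.0], [75.0, 56.0, 15.5],  # up-tempo
    [62.0, 48.0, 20.0], [63.0, 47.0, 19.0], [61.0, 49.0, 21.0],  # slow-paced
]
archetypes = kmeans(standardize(teams), k=2)
print(archetypes)
```

Standardizing first matters because tempo is measured in possessions while the other features are percentages; without it, tempo would dominate the distance calculation.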
Five stages. One pipeline. Zero manual steps.
Barttorvik blocks standard Selenium. The workaround is a two-step approach: load the main page with undetected-chromedriver to pass the challenge, then issue an XMLHttpRequest from that same session, reusing its established cookies, to fetch the CSV data.
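The two-step fetch might look roughly like the sketch below. The function name `fetch_csv_rows`, the settle delay, and any URLs are assumptions; the only driver methods used are `get` and `execute_script`, which Selenium-compatible drivers provide.

```python
import csv
import io
import time
from typing import Any, List

# JavaScript executed inside the already-open page, so the request carries
# the cookies set when the browser passed the challenge. A synchronous XHR
# keeps the sketch short; arguments[0] is the URL passed in from Python.
FETCH_CSV_JS = """
var xhr = new XMLHttpRequest();
xhr.open('GET', arguments[0], false);
xhr.send(null);
return xhr.responseText;
"""


def fetch_csv_rows(driver: Any, landing_url: str, csv_url: str,
                   settle_seconds: float = 5.0) -> List[List[str]]:
    """Step 1: load the landing page. Step 2: fetch the CSV from that session."""
    driver.get(landing_url)            # lets the challenge run and set cookies
    time.sleep(settle_seconds)         # crude wait; the right delay is a guess
    text = driver.execute_script(FETCH_CSV_JS, csv_url)
    return list(csv.reader(io.StringIO(text)))


# Usage, assuming undetected-chromedriver is installed (URLs are placeholders):
#   import undetected_chromedriver as uc
#   driver = uc.Chrome()
#   rows = fetch_csv_rows(driver, "https://<site>/", "https://<site>/<file>.csv")
#   driver.quit()
```

Passing the driver in as a parameter keeps the function easy to exercise with a stub, which helps when the real site is rate-limited or unavailable.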
SHAP values are computed for every prediction at inference time and stored as JSONB. Game detail pages render narrative explanations from the top 3–5 features, which is unusual for sports apps that are typically black boxes.
68 teams.
67 games.
One bracket.
The bracket simulator projects a full NCAA tournament field using T-Rank ratings and S-curve seeding logic, then runs every game through the ML model under neutral-site conditions.
Each matchup gets a win probability, predicted spread, and upset probability. The simulator outputs round-by-round advancement odds for all 68 teams — so you can see that your 12-seed has a 38% chance to reach the Sweet 16 before you lock it in.
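Round-by-round advancement odds can be computed exactly from pairwise win probabilities. The sketch below assumes a power-of-two bracket listed in bracket order (for example, the 64 teams left after the play-in games) and uses a simple ratings-ratio function as a stand-in for the ML model's neutral-site probability; `advancement_odds`, `demo_win_prob`, and the ratings are hypothetical.

```python
from typing import Callable, Dict, List


def advancement_odds(teams: List[str],
                     win_prob: Callable[[str, str], float]) -> Dict[str, List[float]]:
    """For each team, the probability of winning round 1, round 2, and so on."""
    n = len(teams)
    if n < 2 or n & (n - 1):
        raise ValueError("bracket size must be a power of two")
    alive = [1.0] * n                    # chance each team reaches the current round
    odds: Dict[str, List[float]] = {t: [] for t in teams}
    block = 1                            # size of each sub-bracket feeding this round
    while block < n:
        nxt = []
        for i, team in enumerate(teams):
            # Possible opponents sit in the sibling sub-bracket.
            start = ((i // block) ^ 1) * block
            beat_field = sum(alive[j] * win_prob(team, teams[j])
                             for j in range(start, start + block))
            nxt.append(alive[i] * beat_field)
        for team, p in zip(teams, nxt):
            odds[team].append(p)
        alive = nxt
        block *= 2
    return odds


# Stand-in for the model: rating share. Ratings are made up.
DEMO_RATINGS = {"A": 3.0, "B": 1.0, "C": 2.0, "D": 2.0}


def demo_win_prob(a: str, b: str) -> float:
    return DEMO_RATINGS[a] / (DEMO_RATINGS[a] + DEMO_RATINGS[b])


odds = advancement_odds(["A", "B", "C", "D"], demo_win_prob)
print({team: [round(p, 3) for p in ps] for team, ps in odds.items()})
```

In this four-team demo, A wins its first game with probability 0.75 and the final with probability 0.45. A Monte Carlo simulation would approximate the same numbers; the exact recursion avoids sampling noise.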
Built with Claude Code
+ three custom skills encoding domain expertise
Sign conventions for spreads, cover vs. ATS terminology, juice calculations, parlay logic
KenPom/Barttorvik metrics, efficiency margin, four factors, conference adjustment
S-curve seeding, play-in game handling, upset probability definitions, bracket region balance
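To give a flavor of the betting conventions the first skill covers, here is a small sketch of a juice calculation using the standard American-odds definitions. The function names are hypothetical, and proportional normalization is only one common way to strip the bookmaker's margin; the source does not say which method the skill encodes.

```python
from typing import Tuple


def implied_prob(american_odds: int) -> float:
    """Break-even win probability implied by American odds."""
    if american_odds < 0:
        # Favorite: risk |odds| to win 100.
        return -american_odds / (-american_odds + 100)
    # Underdog: risk 100 to win odds.
    return 100 / (american_odds + 100)


def remove_vig(odds_a: int, odds_b: int) -> Tuple[float, float]:
    """Normalize both sides so the probabilities sum to 1 (proportional method)."""
    pa, pb = implied_prob(odds_a), implied_prob(odds_b)
    total = pa + pb  # exceeds 1.0 by the bookmaker's margin (the juice)
    return pa / total, pb / total


# A standard -110 / -110 spread market: each side implies about 52.4%,
# and removing the juice brings both back to an even 50%.
print(implied_prob(-110), remove_vig(-110, -110))
```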
362 teams · 4,000+ games · 74 features · 74.5% accuracy
7 scrapers · 16 tables · 3 custom Claude Code skills
Every prediction explained.