Many people have developed models for predicting the outcomes of college basketball games. For those who have made their picks publicly available, ThePredictionTracker does a great service by tracking the live performance of each model over the course of the season. Unfortunately, it's difficult to compare models directly using the summary page on the Tracker. For one thing, each model has predicted a different subset of games. In many cases this is accidental (schedules get modified and web scrapers don't pick up the changes), but some models don't start making picks at all until weeks or months into the season. Further, there are a few misprinted lines in the Tracker data. For example, the Tracker shows an opening line of -22 and a closing line of +64.5 for UCLA vs. Presbyterian on 11/19/2018 (the line actually closed at -23 or -23.5 depending on the book).

Throughout the 2018-19 season, I'll try to update this page with further analysis of the Tracker data. For what follows, I chose a subset of models that have made picks since the very beginning of the season, and I threw out any game for which at least one of those models did not make a pick. In a (somewhat lazy) attempt to address misprinted lines, I also filtered out any game whose opening and closing lines differed by more than 5 points (genuine line moves of that size are very rare).
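As a rough sketch of the two filters described above (not the actual analysis code), the selection might look something like the following. The field names `line_open` and `line`, the model keys, and the sample rows are all made up for illustration. Dropping games with a missing line is an extra assumption of this sketch.

```python
def filter_games(games, models, max_line_move=5.0):
    """Keep games that every chosen model predicted and whose line moved plausibly."""
    kept = []
    for game in games:
        # Drop the game if any chosen model is missing a prediction.
        if any(game.get(m) is None for m in models):
            continue
        # Assumption: a game with a missing opening or closing line is also dropped.
        opening, closing = game.get("line_open"), game.get("line")
        if opening is None or closing is None:
            continue
        # Treat a large open-to-close move as a probable misprint.
        if abs(closing - opening) > max_line_move:
            continue
        kept.append(game)
    return kept


# Hypothetical rows, not real Tracker data.
games = [
    {"id": 1, "line_open": 4.0, "line": 5.5, "model_a": 6.1, "model_b": 3.9},
    {"id": 2, "line_open": 22.0, "line": -64.5, "model_a": 21.0, "model_b": 24.0},
    {"id": 3, "line_open": -2.0, "line": -1.0, "model_a": None, "model_b": -0.5},
]
kept = filter_games(games, ["model_a", "model_b"])
kept_ids = [g["id"] for g in kept]
```

Here game 2 is rejected for its implausible line move and game 3 because one model made no pick.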

Results shown are as of 2019-01-16 for a set of 2515 games:
Mean Squared Error (MSE) Model
126.898 Line
128.493 Opening Line
129.243 Erik Forseth
129.811 TeamRankings
131.892 Dokter Entropy
134.787 ESPN BPI
135.556 Sagarin Rating
135.939 Sagarin Predictor
136.536 Sagarin Golden Mean
137.107 Kenneth Massey
140.995 DRatings.com
142.265 Sonny Moore
149.051 StatFox
150.893 ComPughter Ratings
152.853 Sagarin Recent

Accuracy Model
0.766 Line
0.766 Opening Line
0.764 Erik Forseth
0.760 TeamRankings
0.760 Sagarin Predictor
0.758 ESPN BPI
0.757 Sagarin Rating
0.756 Sagarin Golden Mean
0.755 Kenneth Massey
0.754 Dokter Entropy
0.751 DRatings.com
0.749 Sonny Moore
0.742 ComPughter Ratings
0.741 Sagarin Recent
0.738 StatFox
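
For reference, here is a minimal sketch of how the two metrics above can be computed. It assumes that every prediction (and the line) is expressed as a predicted home margin of victory, and that a predicted margin of exactly zero counts as a miss. Neither convention is confirmed by the Tracker, and the numbers are invented.

```python
def mean_squared_error(predicted, actual):
    """Average squared difference between predicted and actual margins."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def accuracy(predicted, actual):
    """Fraction of games in which the predicted winner actually won."""
    # The winner is picked correctly when both margins have the same sign.
    correct = sum(1 for p, a in zip(predicted, actual) if p * a > 0)
    return correct / len(actual)


# Hypothetical home margins for four games.
actual = [7, -3, 12, -1]
predicted = [5.5, 2.0, 9.0, -4.0]

mse = mean_squared_error(predicted, actual)
acc = accuracy(predicted, actual)
```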


Clearly the line is the best statistical predictor of the outcome: it has the lowest MSE and is tied for the highest accuracy. Nevertheless, we can ask how each model would have done against the spread, shown below:
Fraction Correct Against the Spread Model
0.515 ESPN BPI
0.512 Sagarin Golden Mean
0.507 Sagarin Rating
0.503 Sagarin Predictor
0.502 Sonny Moore
0.500 Erik Forseth
0.500 Dokter Entropy
0.499 TeamRankings
0.499 DRatings.com
0.491 Kenneth Massey
0.490 Opening Line
0.487 StatFox
0.484 Sagarin Recent
0.475 ComPughter Ratings
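
One plausible way to score picks against the spread is sketched below. The conventions are assumptions of this sketch, not necessarily those behind the table: predictions and the line are both home margins, a model "picks" the home side when its prediction exceeds the line, and both pushes and games where the model agrees exactly with the line are skipped.

```python
def ats_record(predicted, line, actual):
    """Return (wins, losses) for a model's picks against the spread."""
    wins = losses = 0
    for p, l, a in zip(predicted, line, actual):
        # Skip games with no pick (model equals line) and pushes (result equals line).
        if p == l or a == l:
            continue
        # The pick wins when the model and the result fall on the same side of the line.
        if (p > l) == (a > l):
            wins += 1
        else:
            losses += 1
    return wins, losses


# Hypothetical home margins for five games.
predicted = [8.0, 2.0, 9.0, -4.0, 3.0]
line = [6.0, -1.0, 9.0, -2.5, 1.0]
actual = [7, -3, 12, -5, 1]

wins, losses = ats_record(predicted, line, actual)
ats_fraction = wins / (wins + losses)
```

In this toy data the third game has no pick and the fifth is a push, so only three games count.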


Although no individual model predicts the margin of victory as well as the line, we might ask whether any linear combination of models can do so. Let's regress the observed margins of victory onto the predictions of each model, but constrain the regression to have nonnegative coefficients. Subject to the nonnegativity constraint, this gives the optimal mixture of predictors in hindsight. We find:
Coefficient Model
0.453 Erik Forseth
0.254 Dokter Entropy
0.235 ESPN BPI
0.042 Sonny Moore
0.033 Sagarin Predictor
0.000 Kenneth Massey
0.000 Sagarin Rating
0.000 TeamRankings
0.000 StatFox
0.000 Sagarin Recent
0.000 Sagarin Golden Mean
0.000 DRatings.com
0.000 ComPughter Ratings

The MSE of this hypothetical predictor would be 127.818, better than the opening line (128.493) though still short of the line (126.898). Not bad! However, this figure is optimistic: besides being backward-looking, the mixture was fit and scored on the same full dataset.
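
To illustrate both the constrained regression and the in-sample caveat, here is a small sketch using `scipy.optimize.nnls` on synthetic data. Everything about the data is invented, the regression has no intercept (an assumption), and the fit/holdout split merely stands in for fitting on earlier games and scoring on later ones.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# Synthetic stand-in data: each "model" sees a shared signal plus its own noise.
n_games, n_models = 400, 4
signal = rng.normal(0.0, 10.0, n_games)
noise_sd = np.array([3.0, 5.0, 8.0, 12.0])
X = signal[:, None] + rng.normal(0.0, 1.0, (n_games, n_models)) * noise_sd
y = signal + rng.normal(0.0, 11.0, n_games)  # observed margins of victory

# Fit on the first 300 games and hold out the last 100.
X_fit, y_fit = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

# Nonnegative least squares: minimize ||X_fit @ coef - y_fit|| subject to coef >= 0.
coef, _ = nnls(X_fit, y_fit)

mse_in_sample = np.mean((X_fit @ coef - y_fit) ** 2)
mse_held_out = np.mean((X_test @ coef - y_test) ** 2)

# In-sample MSE of each model on its own, for comparison with the mixture.
single_model_mse = np.mean((X_fit - y_fit[:, None]) ** 2, axis=0)
```

By construction the in-sample MSE of the mixture can be no worse than that of any single model, since putting all of the weight on one model is itself a feasible solution. The held-out MSE is the fairer number to compare against the line.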

Out of curiosity, what if we included the line itself in the above regression? Can any of our models add value when combined with the line?
Coefficient Model
0.689 Line
0.140 Erik Forseth
0.104 ESPN BPI
0.087 Dokter Entropy
0.003 Sonny Moore
0.000 Opening Line
0.000 Kenneth Massey
0.000 Sagarin Rating
0.000 TeamRankings
0.000 StatFox
0.000 Sagarin Recent
0.000 Sagarin Predictor
0.000 Sagarin Golden Mean
0.000 DRatings.com
0.000 ComPughter Ratings

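Interestingly, a few of the models do seem to capture something the line does not: three of them receive non-negligible weight alongside the line. The hypothetical MSE of this mixture would be 126.564, compared with 126.898 for the line alone, though this too is an in-sample figure.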

At some point I will make my analysis available as a Jupyter notebook.