NFL In Game Win Probability

I did the NBA in game win probability, but I really wanted to do one for the NFL, so here we are. This is all based on the great explanations at Pro Football Reference, with some more intermediate graphics and steps that helped me understand it better.

In Game Probability

The first thing I wanted to do was recreate the win probability model laid out by Pro Football Reference, which treats the final score margin as a normal random variable centered on the current point differential. The sigma of the curve decreases linearly as the game progresses, so the curve gets more pointy until the very end of the game, where there is only a single outcome.
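To make the idea concrete, here is a minimal Python sketch of that model. It is not the code behind this page. The starting sigma of 13.45 points, the 3600-second game clock, and the choice to count a final margin between -0.5 and 0.5 as a tie are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

FULL_GAME_SIGMA = 13.45   # assumed sigma at kickoff, in points (placeholder value)
GAME_SECONDS = 60 * 60    # assumed regulation length, ignoring overtime


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal random variable with mean mu and std dev sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def outcome_probabilities(lead, seconds_left):
    """Return (loss, tie, win) for the team that currently leads by `lead` points."""
    if seconds_left <= 0:
        # The game is over, so only a single outcome remains.
        if lead > 0:
            return (0.0, 0.0, 1.0)
        if lead < 0:
            return (1.0, 0.0, 0.0)
        return (0.0, 1.0, 0.0)

    # Sigma shrinks linearly with the time remaining, as described above.
    sigma = FULL_GAME_SIGMA * seconds_left / GAME_SECONDS

    # Assumption: a final margin within half a point of zero counts as a tie.
    loss = normal_cdf(-0.5, lead, sigma)
    win = 1.0 - normal_cdf(0.5, lead, sigma)
    tie = 1.0 - loss - win
    return (loss, tie, win)


if __name__ == "__main__":
    for seconds_left in (3600, 1800, 60):
        loss, tie, win = outcome_probabilities(3, seconds_left)
        print(f"{seconds_left:>4}s left, up 3: loss={loss:.3f} tie={tie:.3f} win={win:.3f}")
```

With the same lead, the win probability climbs toward 1 as the clock runs down, because the shrinking sigma leaves less room for the margin to flip.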

[Interactive graph. Fields: Lead, Loss %, Tie %, Win %]

Pretty fun, right? The data for this comes from nflscrapR author Max Horowitz’s Kaggle page. Without his work, I don’t believe there would be any good free data on the NFL.

Including Expected Points

The previous win probabilities don’t take into account the current state of play on the field. If you are up by two, but the opposing team is at your 30 yard line with 10 seconds left, they’ll probably kick a field goal and go home. Right now the model doesn’t account for this, but it can if we include the expected number of points for the current drive.

Using the NFL play by play data and some help from my spouse to understand how GLMs work, we derived this formula for expected points given the down, the yards to a first down, and the yards to the goal:

expected points = 3.5263
    + (yards to goal * -0.0445)
    + (yards to first down * -0.0471)
    + (is first down ? 2.0722 : 0)
    + (is second down ? 1.6599 : 0)
    + (is third down ? 1.0870 : 0)
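
As a quick check on the formula, here is a small Python sketch that evaluates it directly. The coefficients are the ones listed above; the function name and the lookup table are hypothetical. Fourth down has no term of its own, so it acts as the baseline and adds 0.

```python
def expected_points(down, yards_to_first, yards_to_goal):
    """Expected points for the team with the ball, using the coefficients above."""
    # Fourth down is the baseline level, so it contributes nothing extra.
    down_terms = {1: 2.0722, 2: 1.6599, 3: 1.0870, 4: 0.0}
    return (
        3.5263
        + yards_to_goal * -0.0445
        + yards_to_first * -0.0471
        + down_terms[down]
    )


if __name__ == "__main__":
    # 1st and 10 at your own 25 yard line (75 yards to the goal): about 1.79
    print(round(expected_points(1, 10, 75), 2))
    # 4th and 10 at your own 1 yard line (99 yards to the goal): about -1.35
    print(round(expected_points(4, 10, 99), 2))
```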

You can see how this was derived by looking at the Jupyter notebook file, where the data is processed and a basic GLM is fit. Now this can be incorporated as a mu offset in the original graph.

[Interactive graph. Fields: Possessor, Lead, Down, Yards To First, Yards To Goal, Loss %, Tie %, Win %, Expected Points]

I hope I got all my pluses and minuses straight. Generally, what you are looking at is the win probability for the home team and the expected number of points for the home team, so if they are in a bad situation, like 4th down at their own 1 yard line, it will be negative. Similarly, it will generally be negative if the opposing team has the ball and is near the goal line.
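
The sign handling can be sketched in Python like this. It is only one way the mu offset might be wired in, not the code behind this page: the kickoff sigma of 13.45, the half-point tie band, and every function and parameter name here are assumptions for illustration. The expected points coefficients are the ones from the formula earlier in the post.

```python
import math

FULL_GAME_SIGMA = 13.45   # assumed sigma at kickoff, in points (placeholder value)
GAME_SECONDS = 60 * 60    # assumed regulation length, ignoring overtime


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal random variable with mean mu and std dev sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def expected_points(down, yards_to_first, yards_to_goal):
    """Expected points for the team with the ball (fourth down is the baseline)."""
    down_terms = {1: 2.0722, 2: 1.6599, 3: 1.0870, 4: 0.0}
    return (3.5263 + yards_to_goal * -0.0445
            + yards_to_first * -0.0471 + down_terms[down])


def home_outlook(home_lead, seconds_left, home_has_ball,
                 down, yards_to_first, yards_to_goal):
    """Return (loss, tie, win, home_expected_points) from the home team's view."""
    ep = expected_points(down, yards_to_first, yards_to_goal)
    # Flip the sign when the visitors have the ball: their good field
    # position counts against the home team.
    home_ep = ep if home_has_ball else -ep

    # The expected points shift the center (mu) of the curve.
    mu = home_lead + home_ep
    # Assumption: clamp the clock at one second so sigma never reaches zero.
    sigma = FULL_GAME_SIGMA * max(seconds_left, 1) / GAME_SECONDS

    loss = normal_cdf(-0.5, mu, sigma)
    win = 1.0 - normal_cdf(0.5, mu, sigma)
    return (loss, 1.0 - loss - win, win, home_ep)


if __name__ == "__main__":
    # Home team up by 2, visitors have 1st and 10 at the home 30, 10 seconds left.
    loss, tie, win, home_ep = home_outlook(2, 10, False, 1, 10, 30)
    print(f"home expected points={home_ep:.2f} win={win:.3f} loss={loss:.3f}")
```

In the example from earlier in the post (up by two, opponent at your 30 with 10 seconds left), the offset moves the center of the curve below zero, so the home team goes from a near-certain win to a near-certain loss.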

Additional Reading

The image for this post can be found here.