11/05/2015

The Big Questions

I might as well just call this "quantifying hockey". What are we trying to learn?

  • What Is "Value"? Ultimately it's about things that either cause or prevent goals being scored.
  • What stochastic models can teach us about the game proper.
  • How do we communicate this in a clear way to practitioners?

The Big Questions (Statistical)

  • How do we value a "win"? In our context, we convert from "goals" (rule of thumb: 6 goals equals 2 standings points) – so how do we value a goal?
  • Do we want to measure "true ability", or correct for statistical anomalies in "realized performance"?
  • Why should we compare performance to "replacement" and not "average"?
  • A touchstone: OpenWAR for Baseball (Jensen, Matthews, Baumer)
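The goals-to-points rule of thumb above can be written out as a one-line conversion. This is only a sketch of the arithmetic; the helper name `gar_to_points` is hypothetical and not part of the original analysis.

```r
# Rule of thumb from the slide: 6 goals are worth 2 standings points.
goals_per_point <- 6 / 2   # 3 goals per standings point

# Hypothetical helper: convert goals above replacement into standings points.
gar_to_points <- function(gar) gar / goals_per_point

gar_to_points(c(6, 45))   # 6 goals -> 2 points; 45 goals -> 15 points
```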

The Basics

The Toolbox

The three most basic assumptions I'm going to make here are about structure:

  • The time to the next occurrence of an event follows a semi-Markov process, but Poisson/Cox processes get us most of the way there in a pinch.
  • Each event may also have degrees of success or failure, which we can model separately with binomial models.
  • We can swap out pieces later on to improve the realism, accuracy and precision of the model.
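To make the Poisson assumption concrete, here is a minimal simulation sketch: events arrive with independent exponential waiting times, and the resulting counts per game have mean and variance close to each other. The rate (3 events per 60 minutes) and the function name are assumptions chosen only for illustration.

```r
set.seed(1)
# Assumed values for illustration only: a constant rate of 3 events per 60 minutes.
rate_per_min <- 3 / 60
game_minutes <- 60

simulate_event_times <- function(rate, horizon) {
  times <- numeric(0)
  t <- rexp(1, rate)            # exponential waiting time to the first event
  while (t < horizon) {
    times <- c(times, t)
    t <- t + rexp(1, rate)      # memoryless: the next gap is drawn independently
  }
  times
}

counts <- replicate(5000, length(simulate_event_times(rate_per_min, game_minutes)))
mean(counts)   # close to rate * horizon = 3
var(counts)    # close to the mean, as expected for a Poisson count
```

Swapping `rexp()` for a state-dependent waiting-time distribution is one way the semi-Markov refinement could be slotted in later.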

Prior Work

Way back to 2004-05 (Lockout II): Let's manually get some tracking data. Thomas 2006, JQAS

How to make it (semi)-Markovian: added in composite states reflective of real game outcomes.

Thomas 2007, JQAS: Goal-scoring times are well approximated by a Poisson process, but we didn't have enough game-level data to go beyond simple corrections and uncover individual impacts.

Well, now we have enough data, more or less! Thomas et al 2013, AOAS

Premise: Both teams (home and away) have a goal scoring process; both compete with each other.

Simplest two-team model:

\[ \log \lambda_H = \mu_H + \omega_C + \delta_D \] \[ \log \lambda_A = \mu_A + \omega_D + \delta_C \]

\(\omega\) reflects the offensive skill of each team (higher is better); \(\delta\) reflects defensive skill (higher is worse). Here \(C\) is the home team and \(D\) is the away team.
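As a rough sketch of how the two-team model could be fit, the simulation below generates goals from known team effects and recovers them with a Poisson GLM. The team labels, effect sizes and baseline rates are all assumptions; each game contributes two rows, one per scoring process.

```r
set.seed(2)
teams <- c("T1", "T2", "T3", "T4")
omega <- c(T1 = 0.3, T2 = 0.1, T3 = -0.1, T4 = -0.3)   # assumed offensive skill
delta <- c(T1 = -0.2, T2 = 0.0, T3 = 0.1, T4 = 0.1)    # assumed defensive skill (higher concedes more)
mu_H <- log(3.0); mu_A <- log(2.7)                     # assumed baseline log rates

games <- expand.grid(home = teams, away = teams, rep = 1:200, stringsAsFactors = FALSE)
games <- games[games$home != games$away, ]

# One row per scoring process: the home attack and the away attack of each game.
long <- rbind(
  data.frame(offense = games$home, defense = games$away, is_home = 1, stringsAsFactors = FALSE),
  data.frame(offense = games$away, defense = games$home, is_home = 0, stringsAsFactors = FALSE)
)
log_rate <- ifelse(long$is_home == 1, mu_H, mu_A) + omega[long$offense] + delta[long$defense]
long$goals <- rpois(nrow(long), exp(log_rate))

fit <- glm(goals ~ is_home + factor(offense) + factor(defense), family = poisson, data = long)
coef(fit)["is_home"]             # estimates mu_H - mu_A
coef(fit)["factor(offense)T4"]   # estimates omega_T4 - omega_T1
```

With treatment contrasts, every team effect is reported relative to the reference team, so only differences in \(\omega\) and \(\delta\) are identified.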

The Full Model

Hand-coded fully Bayesian MCMC solution for this model:

\[ \log \lambda_H = \mu_H + \sum_k (\omega_k I(k\ on\ ice\ H) + \delta_k I(k\ on\ ice\ A))\] \[ \log \lambda_A = \mu_A + \sum_k (\omega_k I(k\ on\ ice\ A) + \delta_k I(k\ on\ ice\ H))\]

Elastic-net style Laplace-Gaussian shrinkage for parameters grouped on position (centers, wingers, defenders, goaltenders).

Pros: Great statistical properties. Interesting findings on each position's relative contributions. Can do semi-Markov structure if we need it.

Cons: Takes way too long to run in full. No one in industry hockey will get the details of what MCMC is. Needs years of data to overcome collinearity with goaltenders.

Integrating Data Sources

So we forge forward. First, let's take the existing data sets from the previous analysis and augment them. Primary: nhl.com

  • Secondary additions (x,y) data: espn.com, sportsnet.ca

All Together Now

Integrated data:

##     season gcode refdate event period seconds etype        a1        a2
## 7 20152016 20001    5027     7      1      51  SHOT deshada86 weiseda88
## 9 20152016 20001    5027     9      1      65  SHOT deshada86 weiseda88
##          a3        a4        a5        a6        h1        h2        h3
## 7 fleisto84 petryje87 emelial86 xxxxxxxNA kadrina90 boyesbr82 vanrija89
## 9 fleisto84 petryje87 emelial86 xxxxxxxNA kadrina90 boyesbr82 vanrija89
##          h4        h5        h6 ev.team ev.player.1 ev.player.2
## 7 harrisc93 gardija90 xxxxxxxNA     TOR   boyesbr82   xxxxxxxNA
## 9 harrisc93 gardija90 xxxxxxxNA     TOR   kadrina90   xxxxxxxNA
##   ev.player.3 distance  type homezone xcoord ycoord awayteam hometeam
## 7   xxxxxxxNA       35 Wrist      Off    -55      6      MTL      TOR
## 9   xxxxxxxNA       11  Snap      Off    -79      0      MTL      TOR
##   home.score away.score event.length    away.G    home.G home.skaters
## 7          0          0          5.5 priceca87 bernijo88            6
## 9          0          0          7.0 priceca87 bernijo88            6
##   away.skaters adjusted.distance shot.feature import.ies loc.section
## 7            6         30.976837       rushn8          0           0
## 9            6          9.958361                       0           0
##   new.loc.section newxc newyc score.diff.cat subdistance
## 7               6    59    -5              3           7
## 9              12    80     0              3           3

Events are recorded as the time from the previous event. We have event information on many contributing predictors:

  • Teams on the ice (home and away)
  • Players on the ice (separated into skaters and goaltenders)
  • Where on the ice the event occurred (by three zones)
  • x,y coordinates of shots (of all types)
  • The score and time during the game

The (x,y) coordinates are manually recorded; rink-specific location bias is corrected with a variant of a method from Schuckers 2011.

Each of these processes leads to goals:

  • Faceoffs: Head-to-head one on one contest (Binomial)
  • Penalties: Events that lead to man advantages (Poisson)
  • Shot Rates: Put shots on net (Poisson)
  • Shot Conversion: How does a shot become a goal? (Binomial)

(Listed in relative order of importance, as it happens)

General approach: We can still use differential shrinkage, but:

  • Use glmnet() for speed and cv.glmnet() for optimal shrinkage in each group.
  • Speed advantages for now are extreme
  • Uncertainty calculations can come later
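One way differential shrinkage might be expressed with glmnet is through the `penalty.factor` argument, which scales the penalty per column. The sketch below is an assumption about how the groups could be weighted, not the talk's actual settings; the data are simulated, and the block only runs the fit if the glmnet package is installed.

```r
set.seed(3)
n <- 2000
# Hypothetical design: 3 "skater" columns and 2 "goalie" columns of on-ice indicators.
x <- matrix(rbinom(n * 5, 1, 0.3), n, 5,
            dimnames = list(NULL, c("sk1", "sk2", "sk3", "g1", "g2")))
toi  <- runif(n, 0.5, 2)                          # assumed exposure (minutes)
beta <- c(0.4, 0, -0.3, 0.2, 0)                   # assumed true effects
y <- rpois(n, toi * exp(-1 + drop(x %*% beta)))

# Differential shrinkage: relative penalty weights by group (values are assumptions).
pen <- c(sk1 = 1, sk2 = 1, sk3 = 1, g1 = 2, g2 = 2)

if (requireNamespace("glmnet", quietly = TRUE)) {
  cvfit <- glmnet::cv.glmnet(x, y, family = "poisson", offset = log(toi),
                             alpha = 0.5, penalty.factor = pen)
  est <- as.matrix(coef(cvfit, s = "lambda.min"))   # intercept plus 5 shrunken effects
  print(round(est, 3))
}
```

Here `alpha = 0.5` mixes the Laplace (lasso) and Gaussian (ridge) penalties; cross-validation picks the overall penalty strength while `pen` fixes the relative strength between groups.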

But First: What Do We Mean By "Replacement"?

A concept in development: A player that could be immediately acquired from the open market (standard definition).

Two purposes: 1) Comparative – how would a player in a similar position do? 2) Statistical – what should we shrink towards if not "average"?

Our proposed definition is the "Poor Man's Replacement": replace the indicator for a player who has a limited number of "trials" with a generic player identifier for all such players.
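The pooling step can be sketched in a few lines of base R. The function name, the player IDs and the default threshold of 50 trials are illustrative assumptions.

```r
# Replace the ID of any player with fewer than `min_trials` events by a generic label.
pool_replacement <- function(player, min_trials = 50, label = "REPLACEMENT") {
  trials <- table(player)
  low <- names(trials)[trials < min_trials]
  ifelse(player %in% low, label, player)
}

# Hypothetical per-event player IDs.
player <- c(rep("veteran01", 120), rep("rookie01", 12), rep("rookie02", 30))
pooled <- pool_replacement(player)
table(pooled)   # veteran01 keeps 120 events; the two rookies pool into 42 replacement events
```

The pooled label then enters the model as a single factor level, so its coefficient is the estimated replacement-level effect.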

Step 1: Faceoffs

Initial inspiration: the model is Bradley-Terry-like. Take the outcome to be a faceoff win for the home team:

\[ Y_i \sim Bin(1, p_i) \]

\[ logit(p_i) = \mu_i + \alpha_{home[i]} - \alpha_{away[i]} \]

What modifications?

  1. PMR 1: pool all players who are centers with fewer than 50 faceoffs taken in a season
  2. PMR 2: pool all players who are not centers with fewer than 50 faceoffs taken in a season
  3. Shrinkage: use glmnet() and cross-validation to regularize estimates (very little needed)
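A Bradley-Terry-style fit can be sketched with a plain logistic regression whose design matrix has +1 for the home player and -1 for the away player. The skills, the constant home advantage and the four player labels below are assumptions for illustration; no shrinkage or replacement pooling is applied here.

```r
set.seed(4)
players <- c("c1", "c2", "c3", "c4")
alpha <- c(c1 = 0.4, c2 = 0.1, c3 = -0.1, c4 = -0.4)   # assumed faceoff skill
mu <- 0.05                                             # assumed home advantage

n <- 8000
home <- sample(players, n, replace = TRUE)
away <- sample(players, n, replace = TRUE)
keep <- home != away                                   # a player cannot face himself
home <- home[keep]; away <- away[keep]

p <- plogis(mu + alpha[home] - alpha[away])
y <- rbinom(length(p), 1, p)                           # 1 = home player wins the draw

# Design matrix: +1 for the home player, -1 for the away player; c1 is the reference.
X <- sapply(players[-1], function(k) (home == k) - (away == k))
fit <- glm(y ~ X, family = binomial)
round(coef(fit), 2)   # intercept near mu; slopes near alpha_k - alpha_c1
```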

In both cases, we have detected a distinct, negative effect for a replacement level player.

Estimated effect against replacement:

\[ \alpha_k - \alpha_{repl} \]

Estimated performance: difference between realized wins and modelled wins with replacement player.

\[ F_{net}(k) = \sum_i \left( Y_i \, I(home[i]=k) + (1-Y_i) \, I(away[i]=k) \right) - \sum_i E(Y_i \mid replacement_k) \]

Conversion: 0.013 goals per net faceoff win.
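As a worked example of the conversion, reading the 0.013 figure as goals per net faceoff win, the arithmetic for a single player-season looks like this. All input numbers are made up for illustration, and a single replacement win probability is used in place of the per-faceoff expectations in the formula.

```r
# Assumed inputs for one player-season (illustrative numbers only).
faceoffs_taken   <- 1500
wins_realized    <- 840
p_replacement    <- 0.45     # assumed modelled win probability with a replacement player
goals_per_fo_win <- 0.013    # conversion quoted on the slide

f_net  <- wins_realized - faceoffs_taken * p_replacement   # net wins above replacement: 165
fo_gar <- f_net * goals_per_fo_win                         # 165 * 0.013 = 2.145 goals
fo_gar
```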

Top 5 performances of last 10 years:

##               Name   season FO.GAR
## 1  Rod Brind'Amour 20052006   5.15
## 2 Patrice Bergeron 20132014   3.55
## 3  Rod Brind'Amour 20062007   3.45
## 4      Chris Drury 20052006   3.32
## 5 Antoine Vermette 20132014   3.27

Bottom 5 performances of last 10 years:

##              Name   season FO.GAR
## 1      Eric Staal 20092010  -0.74
## 2    Mike Ribeiro 20142015  -0.76
## 3 Markus Granlund 20142015  -0.78
## 4 Andrew Cogliano 20082009  -0.93
## 5     Kevin Hayes 20142015  -1.05

Step 2: Penalties

Full Poisson model for each individual, taken or drawn:

\[ Y_{k} \sim Po(\lambda_k T_k) \]

Execution: your favorite GLM method with shrinkage built in.

Poor man's replacement: one for each combination of taking/drawing and offense/defense (four total)
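The role of the exposure term can be sketched with an intercept-only Poisson GLM, reading \(T_k\) as time on ice (an assumption, since the slide does not define it). The simulated rates and exposures are illustrative; a real fit would add player effects and shrinkage.

```r
set.seed(5)
n_players <- 200
toi_hours <- runif(n_players, 2, 25)                 # assumed exposure T_k, hours on ice
lambda    <- rgamma(n_players, shape = 4, rate = 8)  # assumed true penalties per hour
penalties <- rpois(n_players, lambda * toi_hours)

# Intercept-only fit: the league-wide rate, with log(T_k) as the exposure offset.
fit <- glm(penalties ~ 1, offset = log(toi_hours), family = poisson)
league_rate <- exp(coef(fit)[["(Intercept)"]])

league_rate                        # matches total penalties / total exposure
sum(penalties) / sum(toi_hours)
raw_rate <- penalties / toi_hours  # unshrunken per-player rates, noisy at low exposure
```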

Conversion: 0.17 goals/net penalty (based on success rate of power plays)

Top 5 performances of last 10 years:

##               Name   season diff.GAR
## 1     Dustin Brown 20082009 9.110639
## 2     Dustin Brown 20092010 9.013138
## 3    Brad Richards 20052006 8.643763
## 4 Patrice Bergeron 20052006 7.759772
## 5     Dustin Brown 20072008 7.648206

Bottom 5 performances of last 10 years:

##           Name   season  diff.GAR
## 1    Ben Eager 20062007 -4.266480
## 2 Brendan Witt 20052006 -4.363690
## 3   Chris Neil 20052006 -4.719525
## 4 Jarkko Ruutu 20052006 -5.250302
## 5   Sean Avery 20052006 -6.876339

Step 3: Shot Rates

Reminder: the 2013 Poisson model for goals started like this:

\[ \log \lambda_H = \mu_H + \omega_C + \delta_D \] \[ \log \lambda_A = \mu_A + \omega_D + \delta_C \]

\(\mu_H\) and \(\mu_A\) initially include:

  • score of the game
  • in which zone the event happened
  • if the home team won the faceoff

Step 3: Shot Rates

The original paper drew a general criticism: goals are too infrequent to measure ability well. Splitting into shots on goal and goals from shots lets us decouple the goaltender from the shooting process.

Except that the recording of shots on goal is biased by building.

Shot Rate Pre-Processing

Assume for now that the only biases are undercount and overcount, without a "homer" bias towards a particular team. Then we can preprocess our offset for every interval with a model of the form:

# Poisson rate of events per interval; log(TOI) is the exposure offset
glm(TotalEventsFor ~ factor(ScoreDifference) +
                     factor(FaceoffZoneWin) +
                     factor(TeamFor) + factor(TeamAgainst) +
                     factor(HomeRink),
    offset = log(TOI), family = poisson)

Drop the team-for and team-against effects, then fit the subsequent player-level model with the remaining terms as the offset.
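A minimal sketch of this two-stage idea on simulated data: fit the first-stage model, then rebuild the linear predictor without the team terms and carry it forward as an offset. The data frame, the rink biases and the reduced set of covariates are assumptions; the offset name `ScoreRinkOffset` is borrowed from the Step 4 model.

```r
set.seed(6)
n <- 3000
d <- data.frame(
  ScoreDifference = sample(c("-1", "0", "1"), n, replace = TRUE),
  TeamFor         = sample(c("T1", "T2", "T3"), n, replace = TRUE),
  HomeRink        = sample(c("R1", "R2", "R3"), n, replace = TRUE),
  TOI             = runif(n, 20, 90),          # interval length in seconds
  stringsAsFactors = FALSE
)
rink_bias <- c(R1 = 0, R2 = 0.2, R3 = -0.2)    # assumed over/undercount by rink
d$TotalEventsFor <- rpois(n, d$TOI * exp(-4 + rink_bias[d$HomeRink]))

stage1 <- glm(TotalEventsFor ~ factor(ScoreDifference) + factor(TeamFor) + factor(HomeRink),
              offset = log(TOI), family = poisson, data = d)

# Keep every fitted term except the team effect, plus the exposure, as the new offset.
X <- model.matrix(stage1)
b <- coef(stage1)
keep <- !grepl("TeamFor", names(b))
d$ScoreRinkOffset <- log(d$TOI) + drop(X[, keep, drop = FALSE] %*% b[keep])
```

The player-level model would then be fit with `offset = ScoreRinkOffset`, so score and rink effects are held fixed while player effects are estimated.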

Shot Rate Splits

Decision time: we know that different shots have different probabilities of success.

  • Could fit one Poisson model for all shot attempts, one super-binomial model for scoring on shots
  • Could fit multiple models for shot attempts based on discretization, one model each for scoring on shots

"Danger Zones"

Earlier adoptions of "shot quality" arguments weren't taken as seriously by many in the online community or by broadcasters. Our group's innovation:

  • Start with corrected shot locations as in the previous plot and bin by location.
  • If a shot is classified as a "rebound" or "rush", upgrade it to the next bin.
  • Discount shots that are blocked by opposing players, because the recorded (x,y) is the location of the block rather than of the shot.

We now have low, medium and high danger shot attempts; we assign a Poisson process to each of them and fit accordingly. We also fit three separate models, for 5v5, 5v4 and 4v4 play.
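The "upgrade to the next bin" rule can be sketched as below. The function name is hypothetical, and the location bin is taken as given because the slides do not state the actual bin boundaries.

```r
# Bump a shot one danger bin up if it is a rebound or a rush shot, capped at "high".
classify_danger <- function(loc_bin, rebound = FALSE, rush = FALSE) {
  bins <- c("low", "medium", "high")
  idx <- match(loc_bin, bins)
  upgrade <- rebound | rush
  idx <- pmin(idx + as.integer(upgrade), length(bins))
  factor(bins[idx], levels = bins)
}

classify_danger(c("low", "medium", "high", "low"),
                rebound = c(TRUE, FALSE, TRUE, FALSE),
                rush    = c(FALSE, TRUE, FALSE, FALSE))
# low + rebound -> medium, medium + rush -> high, high stays high, plain low stays low
```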

Replacement

Total GAR for Shot Rates

##                 Name   season  GAR.Off
## 1      Pavel Datsyuk 20072008 22.05358
## 2       Joe Pavelski 20102011 21.73583
## 3      Alex Ovechkin 20072008 21.43421
## 4       Joe Thornton 20142015 20.67054
## 5       John Tavares 20142015 20.51159
## 6       John Tavares 20112012 20.24696
## 7      Alex Ovechkin 20082009 20.17863
## 8         Eric Staal 20082009 20.13700
## 9       Ryan Getzlaf 20082009 20.05410
## 10      Joe Thornton 20132014 19.94340
## 11      Joe Thornton 20072008 19.93936
## 12 Henrik Zetterberg 20072008 19.45948
## 13      Jaromir Jagr 20062007 19.29356
## 14        Ryan Smyth 20082009 19.25245
## 15   Andrew Brunette 20052006 19.22553

Total GAR for Shot Rates

##                   Name   season   GAR.Def
## 1         Ethan Moreau 20052006 10.321202
## 2          Zdeno Chara 20092010 10.166794
## 3          Zdeno Chara 20072008  9.830160
## 4        Andrew Greene 20132014  9.340990
## 5       Douglas Murray 20082009  9.313444
## 6         Fedor Tyutin 20102011  9.262805
## 7        Alex Ovechkin 20052006  9.248476
## 8     Henrik Tallinder 20102011  9.233074
## 9          Marc Methot 20082009  9.189262
## 10 Marc-Edouard Vlasic 20132014  9.179472
## 11       Michael Sauer 20102011  9.107512
## 12       Mark Giordano 20092010  9.063137
## 13      Brian Campbell 20092010  9.021557
## 14      Anton Stralman 20132014  8.951044
## 15        Daniel Sedin 20132014  8.915217

Step 4: Shot Success

Now for the data we have: restrict the events to goals (successes) and shots on goal (failures), separating into each of the three danger zones.

Initial model of the form:

# Logistic model for goal vs. save on each shot on goal
glm(Goal ~ factor(ScoreDifference) +
           factor(TeamFor) + factor(TeamAgainst) +
           factor(HomeRink),
    family = "binomial")

Stratifying by danger zone matters because shooting skill for the same player certainly varies with distance.

Secondary model of the form (but with shrinkage):

# Player-level model; the score and rink effects enter as a fixed offset
glm(Goal ~ factor(Shooter) + factor(Goaltender) +
           factor(OffensivePlayers) + factor(DefensivePlayers),
    offset = ScoreRinkOffset, family = "binomial")

Step 4: Shot Success

What stratifying on danger does to replacement level:

Outcomes for Shooters

Top 5 performances of last 10 years:

##             Name   season Shooting.GAR
## 1 Steven Stamkos 20112012     19.83906
## 2     Brad Boyes 20072008     19.09088
## 3  Alex Ovechkin 20072008     18.04440
## 4  Alex Ovechkin 20132014     17.64218
## 5 Ilya Kovalchuk 20072008     17.55738

Bottom 5 performances of last 10 years:

##                 Name   season Shooting.GAR
## 1       Trent Hunter 20072008    -10.10934
## 2        Scott Gomez 20082009    -10.34245
## 3       Jordan Staal 20072008    -10.64093
## 4 Patrick O'Sullivan 20082009    -11.55000
## 5        Jason Blake 20072008    -16.16243

Outcomes for Goaltenders

Top 5 performances of last 10 years:

##               Name   season Goalie.GAR
## 1       Tim Thomas 20102011   45.37928
## 2       Tim Thomas 20082009   45.04945
## 3       Mike Smith 20112012   41.68637
## 4 Miikka Kiprusoff 20052006   40.26555
## 5      Ryan Miller 20092010   38.38305

Bottom 5 performances of last 10 years:

##             Name   season Goalie.GAR
## 1    Steve Mason 20092010  -25.25123
## 2  Brian Elliott 20102011  -25.97452
## 3 Evgeni Nabokov 20052006  -26.38622
## 4    Steve Mason 20112012  -30.18400
## 5   Ben Scrivens 20142015  -33.77706

All Together Now

Top 15 performances of last 10 years:

##                Name   season      GAR
## 1     Alex Ovechkin 20072008 45.41411
## 2        Tim Thomas 20102011 45.37928
## 3        Tim Thomas 20082009 45.04945
## 4        Mike Smith 20112012 41.67098
## 5      Joe Pavelski 20132014 41.30521
## 6  Miikka Kiprusoff 20052006 40.26555
## 7       Ryan Miller 20092010 38.38305
## 8      Joe Thornton 20072008 37.75541
## 9   Semyon Varlamov 20132014 37.34121
## 10     Tomas Vokoun 20052006 36.90117
## 11    Sidney Crosby 20092010 36.78199
## 12    Pavel Datsyuk 20072008 36.32574
## 13   Roberto Luongo 20062007 35.98890
## 14    Jarome Iginla 20072008 35.37907
## 15    Alex Ovechkin 20052006 34.92775

Bottom 15 performances of last 10 years:

##                Name   season       GAR
## 1       Peter Budaj 20082009 -18.95456
## 2      Manny Legace 20082009 -19.03700
## 3      Chris Osgood 20082009 -19.34375
## 4      Vesa Toskala 20082009 -19.52904
## 5   Jussi Markkanen 20052006 -20.73060
## 6  Miikka Kiprusoff 20122013 -21.07604
## 7       Olaf Kolzig 20072008 -21.12308
## 8       Steve Mason 20102011 -21.67203
## 9      Dan Cloutier 20062007 -22.65102
## 10  Johan Holmqvist 20072008 -23.28397
## 11      Steve Mason 20092010 -25.25123
## 12    Brian Elliott 20102011 -25.97452
## 13   Evgeni Nabokov 20052006 -26.38622
## 14      Steve Mason 20112012 -30.18400
## 15     Ben Scrivens 20142015 -33.77706

What Are We Missing?

What we know we haven't included yet but could from this data:

  • Model for "on-goal success" (blocked shot, missed shot, shot on goal, goal) -> (shot on goal, goal)
  • Impact of defenders and playmakers in the "realized" outcome, rather than just the shrunken model parameters

What we'd love to include that people are trying to collect:

  • Zone entry/exit data
  • Shot set-ups / "pass assists"

What's Next With Public?

The results presented here are post-processed results: how actual achievement compared to what would have been expected with a replacement player in their place, converted to goals.

The model-derived results – \(\omega_i - \omega_{rep}\) – are nearly identical for high-information situations like faceoffs and penalties (almost no shrinkage) but way different for low-information situations (shots on goal). How to reconcile?

What's Next?

  • Main problem: Replacement-level experience is currently determined seasonally. (It's also the only magic number in here.)
  • Serious correction: For each of these statistics, update it dynamically with each time period/game (including "accrual of non-replacement status", or a probationary period for a player before we remove them from the replacement group).
  • Why not go full Bayes with a time-dependent model? Probably could build a proper Kalman-type filter that would do the job.
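To give a flavour of what a Kalman-type filter would do here, the sketch below runs a scalar local-level filter over noisy per-game observations of a slowly drifting ability. It is a toy Gaussian version under assumed variances, not the time-dependent model proposed above; the function name and all parameter values are illustrative.

```r
# Scalar local-level filter: state_t = state_{t-1} + noise, obs_t = state_t + noise.
kalman_local_level <- function(y, obs_var, state_var, m0 = 0, p0 = 1) {
  m <- m0; p <- p0
  out <- numeric(length(y))
  for (t in seq_along(y)) {
    p <- p + state_var               # predict: uncertainty grows between games
    k <- p / (p + obs_var)           # Kalman gain
    m <- m + k * (y[t] - m)          # update towards the new observation
    p <- (1 - k) * p                 # posterior variance after the update
    out[t] <- m
  }
  out
}

set.seed(7)
true_skill <- cumsum(rnorm(200, sd = 0.05))          # assumed slowly drifting ability
obs <- true_skill + rnorm(200, sd = 1)               # noisy per-game performance
filtered <- kalman_local_level(obs, obs_var = 1, state_var = 0.05^2)
c(raw = mean((obs - true_skill)^2), filtered = mean((filtered - true_skill)^2))
```

A large prior variance `p0` for a new player behaves like a probationary period: early estimates move quickly, then settle as information accrues.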
