Improving Defensive Statistics and Quantizing Power Forwards
February 22, 2014
Defense in hockey is literally under-“rated”. A player’s defensive ability can be defined as his ability to perform acts that prevent goals against. Positioning himself in passing lanes and forcing a dump-in can qualify as a defensive act, just as intercepting a pass can. Technically, even being in possession of the puck is a defensive act. Because defense takes so many forms, it can be difficult to quantify a player’s defensive ability. Not only are there very few defensive metrics available, but those that exist also tend to paint the wrong picture. In this series of articles, we’ll adjust a few existing statistics and explore a new approach to measuring defensive ability.
It’s been noted that blocked shots and hits are roughly inversely proportional to possession. Both require the hitting or blocking team to be without possession of the puck. Not surprisingly, this means that to hit and block often, one needs to be without the puck often, as well. It’s pretty hard to score when you don’t have the puck. To make hits and blocked shots useful, we must remove the effects of possession and ice time. Removing the effect of ice time is simple: we use hits per 60 minutes of ice time. Removing the effect of possession, though, will take a little more work.
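As a minimal sketch of the ice-time adjustment, the rate conversion could look like this in Python. The function name `per_60` and the sample numbers are made up for illustration and are not taken from the dataset.

```python
def per_60(count, toi_minutes):
    """Scale a raw event count (hits, blocked shots) to a rate per 60 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("ice time must be positive")
    return count * 60.0 / toi_minutes


# Hypothetical player: 96 hits in 480 minutes of ice time.
print(per_60(96, 480))  # 12.0
```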
Expected Hits Difference
Since hits and blocked shots are inversely proportional to possession (meaning that as a player has possession of the puck more often, he will have fewer hits/blocked shots), we can predict, with moderate accuracy, the number of hits/blocked shots a player will have based on his possession of the puck.
Using the road game data from 2011-12 for players who had more than one hit, I plotted each player’s hits/60 minutes against his LAEGAP. As expected, players hit less as their possession, as measured by LAEGAP, increases. Interestingly, forwards seem to follow this trend more strictly than defensemen do, with a correlation coefficient of -0.60 (with -1 being perfect inverse correlation, 1 being perfect positive correlation, and 0 being no correlation at all) versus defensemen’s -0.037. One possible explanation is the difference in the typical roles of forwards and defensemen in the defensive zone. Forwards are usually chasing the puck carrier, allowing them to finish checks without compromising the team, whereas defensemen are expected to stay near the posts and move only when necessary.
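For readers who want to reproduce this kind of number, here is a rough sketch of the Pearson correlation coefficient. The sample values are hypothetical, not the 2011-12 data, and `pearson_r` is just an illustrative name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("each sample needs some variation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (LAEGAP, hits/60) pairs: hits fall as possession rises.
laegap = [0.38, 0.41, 0.44, 0.47, 0.50]
hits_per_60 = [6.1, 4.0, 4.4, 2.2, 1.5]
print(round(pearson_r(laegap, hits_per_60), 2))  # negative, close to -1
```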
Because of this difference, we will create two linear models (best-fit lines) to predict a player’s hits/60 from his LAEGAP: one for forwards and one for defensemen.
forwards model: y = -30.7309x + 16.6036
defensemen model: y = -16.1764x + 10.4158
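A best-fit line of this form can be obtained with ordinary least squares. The sketch below assumes that method, since the fitting routine isn't specified here, and uses hypothetical points rather than the real dataset; `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (LAEGAP, hits/60) points lying exactly on y = -30x + 16.
laegap = [0.40, 0.45, 0.50]
hits_per_60 = [4.0, 2.5, 1.0]
slope, intercept = fit_line(laegap, hits_per_60)
print(round(slope, 4), round(intercept, 4))  # approximately -30 and 16
```

In practice one would fit forwards and defensemen separately, giving two (slope, intercept) pairs like the ones listed above.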
By taking the residuals of each model (the hits/60 difference between a data point and the best-fit line at that data point’s LAEGAP value), we can measure how many more or fewer hits a player had, per 60 minutes of ice time, than he is expected to have based on his LAEGAP. We will call this “expected hits difference” (EHD). The full dataset from the 2011-12 season is available here.
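To make the residual concrete, here is a small sketch that plugs the two published models into code. The coefficients come from the models above; the function names, the `"forward"`/`"defenseman"` keys, and the example player are assumptions for illustration.

```python
# (slope, intercept) of the hits/60 models given in the article.
HIT_MODELS = {
    "forward": (-30.7309, 16.6036),
    "defenseman": (-16.1764, 10.4158),
}


def expected_hits_per_60(laegap, role):
    """Hits/60 predicted by the best-fit line for the given role."""
    slope, intercept = HIT_MODELS[role]
    return slope * laegap + intercept


def ehd(hits_per_60, laegap, role):
    """Expected hits difference: actual hits/60 minus the model's prediction."""
    return hits_per_60 - expected_hits_per_60(laegap, role)


# Hypothetical forward with a LAEGAP of 0.45 who throws 8 hits per 60 minutes.
print(round(expected_hits_per_60(0.45, "forward"), 3))  # about 2.775
print(round(ehd(8.0, 0.45, "forward"), 3))              # about 5.225
```

A positive EHD means the player hits more than his possession numbers would predict; a negative EHD means he hits less.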
Expected Blocked Shots Difference
Applying the same concept to blocked shots, we can measure the difference between a player’s number of blocked shots and his expected number, based on his LAEGAP. The following graph is plotted with road game data from the 2011-12 season, for players with more than one blocked shot on the road.
It is difficult to visualize this data because of several outliers that are expanding the range of the plots, so we will limit the y-axis range to [0,15] to get a better visualization.
The difference in correlation between forwards and defensemen still exists: -0.52 for forwards versus -0.28 for defensemen. Again, we will create two linear models to predict blocked shots per 60 minutes of ice time from LAEGAP.
forwards model: y = -10.9794x + 5.9599
defensemen model: y = -6.3235x + 6.9798
We will call the residuals of these models “Expected Blocked Shots Difference” (EBSD).
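The same residual calculation applies here. The sketch below uses the blocked shot coefficients listed above; the names and the example LAEGAP value are hypothetical.

```python
# (slope, intercept) of the blocked shots/60 models given in the article.
BLOCK_MODELS = {
    "forward": (-10.9794, 5.9599),
    "defenseman": (-6.3235, 6.9798),
}


def expected_blocks_per_60(laegap, role):
    """Blocked shots/60 predicted by the best-fit line for the given role."""
    slope, intercept = BLOCK_MODELS[role]
    return slope * laegap + intercept


def ebsd(blocks_per_60, laegap, role):
    """Expected blocked shots difference: actual rate minus the model's prediction."""
    return blocks_per_60 - expected_blocks_per_60(laegap, role)


# At the same hypothetical LAEGAP of 0.50, the two models predict very different rates.
print(round(expected_blocks_per_60(0.50, "forward"), 3))     # about 0.470
print(round(expected_blocks_per_60(0.50, "defenseman"), 3))  # about 3.818
```

This illustrates why the two positions need separate models: at equal possession, the defensemen line expects far more blocked shots than the forwards line.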
Data and Analysis
Keep in mind that the data is from the most recent full, completed season, 2011-12.
Just sorting by EHD doesn’t reveal anything surprising: atop the list of players with the highest EHD are names like Matt Martin, Chris Neil, Jordan Nolan, and Zac Rinaldo. Differentiating already proven possession players by their tendency to hit will make this statistic more useful. Ideally, we would have a compound stat that takes both a player’s LAEGAP and EHD into consideration and finds the players with the best balance of the two, but for the sake of simplicity, we’ll just query the list and limit the results to players whose EHD and LAEGAP are both in the 80th percentile or higher for their respective positions (forwards and defensemen).
|Player||Position||Team||Hits||LAEGAP||EHD|
|CHRIS NEIL||R||Ottawa Senators||96||0.412145||10.6745|
|BRANDON DUBINSKY||C||New York Rangers||94||0.42617||6.16958|
|BRIAN BOYLE||C||New York Rangers||108||0.411146||6.16341|
|CHRIS KUNITZ||L||Pittsburgh Penguins||86||0.495086||5.62568|
|NICK FOLIGNO||L||Ottawa Senators||94||0.41737||5.51262|
|DAVID CLARKSON||R||New Jersey Devils||92||0.440565||5.29224|
|DUSTIN BROWN||L||Los Angeles Kings||105||0.45883||5.15173|
|JOHN MITCHELL||C||New York Rangers||42||0.44702||5.00791|
|MILAN LUCIC||L||Boston Bruins||102||0.416881||4.77936|
|EVANDER KANE||L||Winnipeg Jets||78||0.434772||4.56427|
|DAVID BACKES||R||St. Louis Blues||101||0.442919||4.48841|
|SCOTT HARTNELL||L||Philadelphia Flyers||85||0.44966||4.36205|
|JAMES NEAL||L||Pittsburgh Penguins||72||0.477339||3.594|
|GABRIEL LANDESKOG||L||Colorado Avalanche||85||0.432186||3.41959|
|PAVEL DATSYUK||C||Detroit Red Wings||49||0.50087||3.17185|
|ERIK COLE||L||Montreal Canadiens||77||0.438388||3.06093|
|VLADIMIR SOBOTKA||C||St. Louis Blues||69||0.404513||3.02599|
|WAYNE SIMMONDS||R||Philadelphia Flyers||70||0.424179||2.90114|
|ALEXEI PONIKAROVSKY||L||New Jersey Devils||22||0.420428||2.89406|
|MAXIME TALBOT||C||Philadelphia Flyers||78||0.403388||2.87413|
|ANDREI KOSTITSYN||L||Nashville Predators||18||0.414156||2.7125|
There are 21 forwards that fit this description. On that list are many familiar, “recognized” power forwards: Dubinsky, Clarkson, Lucic, Brown, Landeskog, etc. More interesting, though, are the unfamiliar names. I don’t think anyone would consider Datsyuk a player with a strong tendency to finish his checks, or a prototypical power forward, but given how little time his team spends without the puck when he is on the ice (the only time he is allowed to hit), Datsyuk is actually quite a physical player. Other notable examples in this mold include John Mitchell, Alexei Ponikarovsky, and Andrei Kostitsyn. Next, we will look at the same table for defensemen. Since there isn’t really a consensus list of physical defensemen who earned their reputation by hitting rather than fighting, I’ll let you draw your own conclusions from the following list.
|Player||Position||Team||Hits||LAEGAP||EHD|
|CORY SARICH||D||Calgary Flames||92||0.424682||8.55089|
|MATT GREENE||D||Los Angeles Kings||112||0.418304||6.37771|
|ANTON VOLCHENKOV||D||New Jersey Devils||107||0.420815||6.20573|
|BROOKS ORPIK||D||Pittsburgh Penguins||126||0.433071||5.72495|
|ROMAN POLAK||D||St. Louis Blues||96||0.416533||4.47969|
|ANDREJ MESZAROS||D||Philadelphia Flyers||71||0.431908||2.85333|
|ZDENO CHARA||D||Boston Bruins||85||0.486419||2.69378|
|SHANE O'BRIEN||D||Colorado Avalanche||75||0.419597||2.59454|
|SHEA WEBER||D||Nashville Predators||94||0.458669||2.36478|
|AARON JOHNSON||D||Columbus Blue Jackets||51||0.415857||2.35233|
|RYAN WILSON||D||Colorado Avalanche||49||0.438385||2.32682|
|JONATHAN ERICSSON||D||Detroit Red Wings||58||0.415474||2.30766|
|MATT NISKANEN||D||Pittsburgh Penguins||59||0.456629||2.05726|
|KRIS RUSSELL||D||St. Louis Blues||31||0.416823||2.05232|
|DAN GIRARDI||D||New York Rangers||105||0.406121||1.74624|
|JAY HARRISON||D||Carolina Hurricanes||66||0.414172||1.72893|
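The query behind the two lists above can be sketched roughly as follows. This is not the original query: the percentile definition (share of the position group strictly below a player), the field names, and the sample rows are all assumptions for illustration.

```python
def percentile_rank(value, population):
    """Percent of the population strictly below `value` (one of several common definitions)."""
    below = sum(1 for v in population if v < value)
    return 100.0 * below / len(population)


def physical_possession_players(players, cutoff=80.0):
    """Names of players at or above `cutoff` in both LAEGAP and EHD within their position group."""
    kept = []
    for group in ("F", "D"):  # forwards and defensemen are ranked separately
        members = [p for p in players if p["group"] == group]
        laegaps = [p["laegap"] for p in members]
        ehds = [p["ehd"] for p in members]
        for p in members:
            if (percentile_rank(p["laegap"], laegaps) >= cutoff
                    and percentile_rank(p["ehd"], ehds) >= cutoff):
                kept.append(p["name"])
    return kept


# Hypothetical rows: (name, group, LAEGAP, EHD).
rows = [
    ("Forward A", "F", 0.50, 6.0),
    ("Forward B", "F", 0.45, 5.0),
    ("Forward C", "F", 0.42, 1.0),
    ("Forward D", "F", 0.40, -2.0),
    ("Forward E", "F", 0.38, -3.0),
    ("Defenseman A", "D", 0.48, 1.0),
    ("Defenseman B", "D", 0.44, 4.0),
    ("Defenseman C", "D", 0.42, 0.5),
    ("Defenseman D", "D", 0.40, -1.0),
    ("Defenseman E", "D", 0.39, -2.0),
]
players = [dict(zip(("name", "group", "laegap", "ehd"), r)) for r in rows]
print(physical_possession_players(players))  # ['Forward A']
```

Neither hypothetical defenseman qualifies because the one with the best LAEGAP is not the one with the best EHD, which is the point of requiring both conditions at once.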
Although EHD might not help us pick out hidden gems in the league, it certainly adds another tool to the evaluation toolbox, one that can help determine a player’s physicality in context with other metrics. For example, it can help us put a better quantitative definition on “power forwards”, instead of relying on something silly like PIMs. EBSD seems to give us less definitive data, but perhaps reveals more important information than EHD; I’ll explain later. The list for EBSD below is compiled with the same conditions as the EHD lists: 80th percentile or higher in both LAEGAP and EBSD, within the players’ respective positions.
|Player||Position||Team||Blocked shots||LAEGAP||EBSD|
|PATRICE BERGERON||C||Boston Bruins||40||0.510792||2.91931|
|DREW MILLER||L||Detroit Red Wings||34||0.418606||2.74863|
|ADAM HENRIQUE||C||New Jersey Devils||40||0.451944||2.42602|
|JOE PAVELSKI||C||San Jose Sharks||46||0.467415||2.38389|
|MATT DUCHENE||C||Colorado Avalanche||26||0.399101||2.20472|
|JOHN MITCHELL||C||New York Rangers||16||0.44702||1.9478|
|RYAN GETZLAF||C||Anaheim Ducks||45||0.425897||1.8383|
|DAVID BACKES||R||St. Louis Blues||39||0.442919||1.79168|
|LOGAN COUTURE||C||San Jose Sharks||38||0.435496||1.75633|
|RJ UMBERGER||C||Columbus Blue Jackets||37||0.40786||1.60035|
|ANDREW DESJARDINS||C||San Jose Sharks||19||0.405102||1.41408|
|BRIAN BOYLE||C||New York Rangers||30||0.411146||1.36872|
|PAVEL DATSYUK||C||Detroit Red Wings||20||0.50087||1.32844|
|MATT READ||R||Philadelphia Flyers||32||0.402183||1.31851|
|JUSTIN ABDELKADER||L||Detroit Red Wings||24||0.395673||1.28329|
|DAVID KREJCI||C||Boston Bruins||32||0.406511||1.1649|
|GABRIEL LANDESKOG||L||Colorado Avalanche||30||0.432186||1.16467|
|EVGENI MALKIN||C||Pittsburgh Penguins||24||0.483255||1.1564|
|ANZE KOPITAR||C||Los Angeles Kings||25||0.484624||1.06346|
|ALEXEI PONIKAROVSKY||L||New Jersey Devils||8||0.420428||1.04798|
|VINNY PROSPAL||L||Columbus Blue Jackets||28||0.424766||1.01385|
|BLAKE WHEELER||R||Winnipeg Jets||24||0.459031||1.00473|
|RYAN O'REILLY||C||Colorado Avalanche||29||0.436466||1.00244|
|JUSTIN WILLIAMS||R||Los Angeles Kings||21||0.46947||0.948008|
|MATT CULLEN||C||Minnesota Wild||29||0.403631||0.932481|
|CHRIS KELLY||C||Boston Bruins||25||0.405398||0.892151|
|PATRIC HORNQVIST||R||Nashville Predators||16||0.476801||0.888986|
|JAMIE BENN||L||Dallas Stars||21||0.448854||0.886591|
|RYAN KESLER||C||Vancouver Canucks||31||0.399312||0.881319|
|PAUL STASTNY||C||Colorado Avalanche||20||0.460145||0.782397|
|CHRIS KUNITZ||L||Pittsburgh Penguins||16||0.495086||0.780934|
|VIKTOR STALBERG||L||Chicago Blackhawks||16||0.449663||0.72158|
|MARCUS KRUGER||C||Chicago Blackhawks||23||0.396877||0.700386|
|T.J. OSHIE||C||St. Louis Blues||26||0.425279||0.692627|
|MAXIME TALBOT||C||Philadelphia Flyers||24||0.403388||0.647905|
|JOHAN FRANZEN||C||Detroit Red Wings||18||0.456562||0.614548|
|ZACH PARISE||L||New Jersey Devils||26||0.436707||0.611747|
|BRAD MARCHAND||C||Boston Bruins||13||0.490595||0.600589|
|SCOTT HARTNELL||L||Philadelphia Flyers||19||0.44966||0.574705|
|Player||Position||Team||Blocked shots||LAEGAP||EBSD|
|NIKLAS HJALMARSSON||D||Chicago Blackhawks||81||0.434114||3.3488|
|ROMAN HAMRLIK||D||Washington Capitals||79||0.408728||2.66502|
|ANTON VOLCHENKOV||D||New Jersey Devils||69||0.420815||2.01005|
|MATT TAORMINA||D||New Jersey Devils||22||0.442522||2.00647|
|JAY HARRISON||D||Carolina Hurricanes||74||0.414172||1.74413|
|ROMAN POLAK||D||St. Louis Blues||70||0.416533||1.60231|
|JOHNNY BOYCHUK||D||Boston Bruins||76||0.476137||1.56313|
|NIKLAS KRONWALL||D||Detroit Red Wings||92||0.423585||1.51907|
|ZBYNEK MICHALEK||D||Pittsburgh Penguins||58||0.42391||1.46907|
|SIMON DESPRES||D||Pittsburgh Penguins||11||0.407129||1.46744|
|BARRET JACKMAN||D||St. Louis Blues||79||0.416804||1.32526|
|AARON JOHNSON||D||Columbus Blue Jackets||47||0.415857||1.21713|
As you may have noticed, EBSD has a much smaller range across these players than EHD does. The highest EBSD of any player is only ~3, whereas EHD for top hitters goes up to ~6, discounting isolated cases. This is due to the much smaller sample size of blocked shots compared to hits. I originally decided to include blocked shots because of their similarity to hits, in that blocked shots are also inversely correlated with possession, but it would appear that a metric looking at the percentage of on-ice shots against that a player blocks would better distinguish those who are good at getting in shooting lanes, assuming such an ability exists.
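One way such a metric might be computed is sketched below. The exact definition (which shot attempts count in the denominator, for instance) is an open question, so the function name, inputs, and numbers here are purely hypothetical.

```python
def block_percentage(blocked_by_player, on_ice_shots_against):
    """Percent of shots against, while the player is on the ice, that he blocks himself."""
    if on_ice_shots_against <= 0:
        raise ValueError("need at least one shot against")
    if not 0 <= blocked_by_player <= on_ice_shots_against:
        raise ValueError("blocks must be between 0 and the shots against")
    return 100.0 * blocked_by_player / on_ice_shots_against


# Hypothetical player: 40 blocks out of 500 shots against while on the ice.
print(block_percentage(40, 500))  # 8.0
```

Because the denominator already reflects how often the opponent is shooting, a rate like this would not need a separate possession adjustment.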
This same concept can be applied to takeaways and giveaways, which we will look at in the next article.
One of the assumptions of regression analysis is that the dependent variable is normally distributed. The distribution of the number of hits per player is positively skewed, but because the kurtosis and skew of the hits and blocked shots distributions for forwards and defensemen are under ten and three, respectively, the normality assumption is not violated (Kline, 2011).
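A check along these lines could be sketched as follows. It uses population moments and excess kurtosis (normal distribution = 0), which is an assumption about how the indices were computed; the thresholds of three and ten are the ones cited above, and the sample values are made up.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (a normal distribution scores 0 on both)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("values must not all be identical")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def within_limits(values, max_skew=3.0, max_kurtosis=10.0):
    """True if |skew| and |kurtosis| stay under the cutoffs cited from Kline (2011)."""
    skew, kurtosis = skew_and_excess_kurtosis(values)
    return abs(skew) < max_skew and abs(kurtosis) < max_kurtosis


# Hypothetical, positively skewed hits/60 sample.
sample = [1, 1, 1, 2, 2, 3, 10]
skew, kurtosis = skew_and_excess_kurtosis(sample)
print(skew > 0, within_limits(sample))  # True True
```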
For those interested in NHST results:
|position||dependent variable||p-value||R squared|
The null hypotheses of all of the regressions can be rejected due to the low p-values, but it is also important to note the small effect size of the defensemen regressions, especially for blocked shots.
Some of you might have noticed that I used “more than 1 hit or blocked shot” as the criterion for filtering out outliers that skew the regression, instead of “one or more hits or blocked shots”. This is because, for some reason, the regression that includes players with one or more hits/blocked shots differs somewhat significantly from the one that includes only players with two or more, as you can see from the graph below.
Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York: Guilford Press.