Improving Defensive Statistics and Quantizing Power Forwards

February 22, 2014 7 minute read

Defense in hockey is literally under-“rated”. A player’s defensive ability can be defined as his ability to perform acts that prevent goals against. Staying in passing lanes and forcing a dump-in can qualify as a defensive act just as intercepting a pass can. Technically, even being in possession of the puck is a defensive act. Because of this unique nature, it can be difficult to quantify a player’s defensive ability. Not only are there very few defensive metrics available, they also tend to paint the wrong picture. In this series of articles, we’ll adjust a few existing statistics and explore a new approach to measuring defensive ability.

It’s been noted that blocked shots and hits are roughly inversely proportional to possession. Both require the hitting or blocking team to be without possession of the puck. Not surprisingly, this means that to hit and block often, one needs to be without the puck often as well. It’s pretty hard to score when you don’t have the puck. To make hits and blocked shots useful, we must remove the effects of possession and ice time. Removing the effect of ice time is simple: we use hits per 60 minutes of ice time (hits/60). Removing the effect of possession will take a little more work.
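
As a minimal sketch of the ice-time adjustment, the rate is just the raw count scaled to 60 minutes. The function name and the example player are illustrative assumptions and are not taken from the dataset.

```python
def per_60(count: int, toi_minutes: float) -> float:
    """Scale a raw event count (hits, blocked shots) to a rate per 60 minutes."""
    if toi_minutes <= 0:
        raise ValueError("ice time must be positive")
    return count * 60.0 / toi_minutes


# Hypothetical player: 96 hits in 800 minutes of ice time.
print(round(per_60(96, 800.0), 2))  # 7.2
```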


Expected Hits Difference

Since hits and blocked shots are inversely related to possession (the more often a player has possession of the puck, the fewer hits and blocked shots he will have), we can predict, with moderate accuracy, the number of hits or blocked shots a player will have from his possession of the puck.

[Figure: hits/60 plotted against LAEGAP for forwards and defensemen (ehd)]

Using road-game data from 2011-12 for players who had more than one hit, I plotted each player’s hits/60 against his LAEGAP. As expected, players hit less as their possession, as measured by LAEGAP, increases. Interestingly, forwards seem to follow this trend more strictly than defensemen do, with a correlation coefficient of -0.60 (with -1 being perfect inverse correlation, 1 being perfect positive correlation, and 0 being no correlation at all) versus the defensemen’s -0.37. One possible explanation is the difference in the typical roles of forwards and defensemen in the defensive zone. Forwards are usually chasing the puck carrier, allowing them to finish checks without compromising the team, whereas defensemen are expected to stay near the posts and move only when necessary.


Because of this difference, we will create two linear models (best fit lines) to predict a player’s hits/60 from his laegap, one for forwards, and one for defensemen.

forwards model: y = -30.7309x + 16.6036
defensemen model: y = -16.1764x + 10.4158
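
Best fit lines like these come from an ordinary least-squares fit run separately for each position group. The sketch below shows the idea on made-up rows; `fit_line`, the row layout, and the sample values are assumptions for illustration, so the fitted coefficients will not match the ones above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up (position group, LAEGAP, hits/60) rows, for illustration only.
rows = [
    ("F", 0.40, 6.0), ("F", 0.45, 4.0), ("F", 0.50, 2.0),
    ("D", 0.40, 5.0), ("D", 0.45, 4.0), ("D", 0.50, 3.0),
]

# One model per position group, as in the article.
models = {}
for group in ("F", "D"):
    xs = [laegap for g, laegap, _ in rows if g == group]
    ys = [rate for g, _, rate in rows if g == group]
    models[group] = fit_line(xs, ys)

for group, (slope, intercept) in models.items():
    print(f"{group}: y = {slope:.2f}x + {intercept:.2f}")
```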

By taking the residuals of each model (the difference in hits/60 between a data point and the best fit line at that point’s LAEGAP value), we can measure how many more or fewer hits a player had, per 60 minutes of ice time, than he would be expected to have based on his LAEGAP. We will call this “expected hits difference” (EHD). The full dataset from the 2011-12 season is available here.
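
To make the residual concrete, here is a small sketch that plugs the two published models into a function. The coefficients are the ones given above; the function names, the "F"/"D" group keys, and the example player are assumptions for illustration.

```python
# (slope, intercept) of the hits/60 models quoted in the article.
HIT_MODELS = {
    "F": (-30.7309, 16.6036),  # forwards
    "D": (-16.1764, 10.4158),  # defensemen
}


def expected_hits_per_60(group: str, laegap: float) -> float:
    """Hits/60 predicted by the best fit line for this position group."""
    slope, intercept = HIT_MODELS[group]
    return slope * laegap + intercept


def ehd(group: str, hits_per_60: float, laegap: float) -> float:
    """Expected hits difference: observed hits/60 minus predicted hits/60."""
    return hits_per_60 - expected_hits_per_60(group, laegap)


# Hypothetical forward with a LAEGAP of 0.45 who throws 5.0 hits/60.
print(round(ehd("F", 5.0, 0.45), 2))  # 2.23
```

A positive EHD means the player hits more than his possession numbers alone would suggest, and a negative EHD means he hits less.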

Expected Blocked Shots Difference

Applying the same concept to blocked shots, we can measure the difference between a player’s number of blocked shots and his expected number, based on his LAEGAP. The following graph is plotted with road-game data from the 2011-12 season, for players with more than one road blocked shot.

[Figure: blocked shots/60 plotted against LAEGAP (ebsd)]

It is difficult to visualize this data because of several outliers that are expanding the range of the plots, so we will limit the y-axis range to [0,15] to get a better visualization.

[Figure: the same plot with the y-axis limited to [0,15] (ebsd_scaled)]

The difference in correlation between forwards and defensemen still exists: -0.52 for forwards versus -0.28 for defensemen. Again, we will create two linear models to predict blocked shots per 60 minutes of ice time from LAEGAP.

forwards model: y = -10.9794x + 5.9599
defensemen model: y = -6.3235x + 6.9798

We will call the residuals of these models “Expected Blocked Shots Difference” (EBSD).

Data and Analysis

Keep in mind that the data is from the most recent full, completed season, 2011-12.

Just sorting by EHD doesn’t reveal anything surprising: among the players with the highest EHD are names like Matt Martin, Chris Neil, Jordan Nolan, and Zac Rinaldo. The statistic becomes more useful when it differentiates already proven possession players by their tendency to hit. Ideally, we would have a compound stat that takes both a player’s LAEGAP and EHD into consideration and finds the players with the best balance of both, but for the sake of simplicity, we’ll just query the list and limit the results to players with EHD and LAEGAP at or above the 80th percentile for their respective positions (forwards and defensemen).

name                 position  team                   hits  laegap    ehd
CHRIS NEIL           R         Ottawa Senators        96    0.412145  10.6745
BRANDON DUBINSKY     C         New York Rangers       94    0.42617   6.16958
BRIAN BOYLE          C         New York Rangers       108   0.411146  6.16341
CHRIS KUNITZ         L         Pittsburgh Penguins    86    0.495086  5.62568
NICK FOLIGNO         L         Ottawa Senators        94    0.41737   5.51262
DAVID CLARKSON       R         New Jersey Devils      92    0.440565  5.29224
DUSTIN BROWN         L         Los Angeles Kings      105   0.45883   5.15173
JOHN MITCHELL        C         New York Rangers       42    0.44702   5.00791
MILAN LUCIC          L         Boston Bruins          102   0.416881  4.77936
EVANDER KANE         L         Winnipeg Jets          78    0.434772  4.56427
DAVID BACKES         R         St. Louis Blues        101   0.442919  4.48841
SCOTT HARTNELL       L         Philadelphia Flyers    85    0.44966   4.36205
JAMES NEAL           L         Pittsburgh Penguins    72    0.477339  3.594
GABRIEL LANDESKOG    L         Colorado Avalanche     85    0.432186  3.41959
PAVEL DATSYUK        C         Detroit Red Wings      49    0.50087   3.17185
ERIK COLE            L         Montreal Canadiens     77    0.438388  3.06093
VLADIMIR SOBOTKA     C         St. Louis Blues        69    0.404513  3.02599
WAYNE SIMMONDS       R         Philadelphia Flyers    70    0.424179  2.90114
ALEXEI PONIKAROVSKY  L         New Jersey Devils      22    0.420428  2.89406
MAXIME TALBOT        C         Philadelphia Flyers    78    0.403388  2.87413
ANDREI KOSTITSYN     L         Nashville Predators    18    0.414156  2.7125

There are 21 forwards that fit this description. On that list are many familiar, “recognized” power forwards: Dubinsky, Clarkson, Lucic, Brown, Landeskog, etc. More interesting, though, are the unfamiliar names. I don’t think anyone would consider Datsyuk a player with a strong tendency to finish his checks, or a prototypical power forward, but given how little of the time his team is without the puck when he is on the ice (the only time during which he is able to hit), Datsyuk is actually quite a physical player. Other notable examples in this mold include John Mitchell, Alexei Ponikarovsky, and Andrei Kostitsyn. Next, we will look at the same table for defensemen. Since there isn’t really a consensus list of physical defensemen who earned their reputation by hitting rather than fighting, I’ll allow you to draw your own conclusions from the following list.


name                 position  team                   hits  laegap    ehd
CORY SARICH          D         Calgary Flames         92    0.424682  8.55089
MATT GREENE          D         Los Angeles Kings      112   0.418304  6.37771
ANTON VOLCHENKOV     D         New Jersey Devils      107   0.420815  6.20573
BROOKS ORPIK         D         Pittsburgh Penguins    126   0.433071  5.72495
ROMAN POLAK          D         St. Louis Blues        96    0.416533  4.47969
ANDREJ MESZAROS      D         Philadelphia Flyers    71    0.431908  2.85333
ZDENO CHARA          D         Boston Bruins          85    0.486419  2.69378
SHANE O'BRIEN        D         Colorado Avalanche     75    0.419597  2.59454
SHEA WEBER           D         Nashville Predators    94    0.458669  2.36478
AARON JOHNSON        D         Columbus Blue Jackets  51    0.415857  2.35233
RYAN WILSON          D         Colorado Avalanche     49    0.438385  2.32682
JONATHAN ERICSSON    D         Detroit Red Wings      58    0.415474  2.30766
MATT NISKANEN        D         Pittsburgh Penguins    59    0.456629  2.05726
KRIS RUSSELL         D         St. Louis Blues        31    0.416823  2.05232
DAN GIRARDI          D         New York Rangers       105   0.406121  1.74624
JAY HARRISON         D         Carolina Hurricanes    66    0.414172  1.72893
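
The percentile filter behind the two lists above can be sketched roughly as follows. The exact percentile definition used for the original query is not stated, so this assumes "percent of players in the same position group at or below this value"; the function names, the record layout, and the five sample players are all hypothetical.

```python
def position_group(position):
    """Collapse C/L/R into forwards ("F"); keep defensemen ("D") separate."""
    return "D" if position == "D" else "F"


def percentile_rank(value, population):
    """Percent of values in the population that are <= value (an assumed definition)."""
    return 100.0 * sum(1 for v in population if v <= value) / len(population)


def top_in_both(players, stat, cutoff=80.0):
    """Players at or above the cutoff percentile in both LAEGAP and `stat`,
    ranked within their own position group, sorted by `stat` descending."""
    groups = {}
    for p in players:
        groups.setdefault(position_group(p["position"]), []).append(p)
    selected = []
    for members in groups.values():
        laegaps = [p["laegap"] for p in members]
        stats = [p[stat] for p in members]
        for p in members:
            if (percentile_rank(p["laegap"], laegaps) >= cutoff
                    and percentile_rank(p[stat], stats) >= cutoff):
                selected.append(p)
    return sorted(selected, key=lambda p: p[stat], reverse=True)


# Made-up forwards, for illustration only.
players = [
    {"name": "A", "position": "C", "laegap": 0.30, "ehd": 5.0},
    {"name": "B", "position": "L", "laegap": 0.35, "ehd": -1.0},
    {"name": "C", "position": "R", "laegap": 0.40, "ehd": 0.0},
    {"name": "D", "position": "C", "laegap": 0.45, "ehd": 1.0},
    {"name": "E", "position": "L", "laegap": 0.50, "ehd": 4.0},
]
print([p["name"] for p in top_in_both(players, "ehd")])  # ['E']
```

Player A has the highest EHD in this toy set but is dropped for poor possession, which is the point of filtering on both columns. Passing "ebsd" as `stat` would apply the same filter to blocked shots.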

Although EHD might not help us pick out hidden gems in the league, it certainly adds another tool to the evaluation toolbox, one that could help determine a player’s physicality in context with other metrics. For example, it can help us put a better quantitative definition on “power forwards”, instead of relying on something silly like PIMs. EBSD seems to give us less definitive data, but perhaps reveals more important information than EHD; I’ll explain later. The list for EBSD below is compiled with the same conditions as the EHD lists: 80th percentile or higher in LAEGAP and EBSD, within the players’ respective positions.

name                 position  team                   blocked_shots  laegap    ebsd
PATRICE BERGERON     C         Boston Bruins          40             0.510792  2.91931
DREW MILLER          L         Detroit Red Wings      34             0.418606  2.74863
ADAM HENRIQUE        C         New Jersey Devils      40             0.451944  2.42602
JOE PAVELSKI         C         San Jose Sharks        46             0.467415  2.38389
MATT DUCHENE         C         Colorado Avalanche     26             0.399101  2.20472
JOHN MITCHELL        C         New York Rangers       16             0.44702   1.9478
RYAN GETZLAF         C         Anaheim Ducks          45             0.425897  1.8383
DAVID BACKES         R         St. Louis Blues        39             0.442919  1.79168
LOGAN COUTURE        C         San Jose Sharks        38             0.435496  1.75633
RJ UMBERGER          C         Columbus Blue Jackets  37             0.40786   1.60035
ANDREW DESJARDINS    C         San Jose Sharks        19             0.405102  1.41408
BRIAN BOYLE          C         New York Rangers       30             0.411146  1.36872
PAVEL DATSYUK        C         Detroit Red Wings      20             0.50087   1.32844
MATT READ            R         Philadelphia Flyers    32             0.402183  1.31851
JUSTIN ABDELKADER    L         Detroit Red Wings      24             0.395673  1.28329
DAVID KREJCI         C         Boston Bruins          32             0.406511  1.1649
GABRIEL LANDESKOG    L         Colorado Avalanche     30             0.432186  1.16467
EVGENI MALKIN        C         Pittsburgh Penguins    24             0.483255  1.1564
ANZE KOPITAR         C         Los Angeles Kings      25             0.484624  1.06346
ALEXEI PONIKAROVSKY  L         New Jersey Devils      8              0.420428  1.04798
VINNY PROSPAL        L         Columbus Blue Jackets  28             0.424766  1.01385
BLAKE WHEELER        R         Winnipeg Jets          24             0.459031  1.00473
RYAN O'REILLY        C         Colorado Avalanche     29             0.436466  1.00244
JUSTIN WILLIAMS      R         Los Angeles Kings      21             0.46947   0.948008
MATT CULLEN          C         Minnesota Wild         29             0.403631  0.932481
CHRIS KELLY          C         Boston Bruins          25             0.405398  0.892151
PATRIC HORNQVIST     R         Nashville Predators    16             0.476801  0.888986
JAMIE BENN           L         Dallas Stars           21             0.448854  0.886591
RYAN KESLER          C         Vancouver Canucks      31             0.399312  0.881319
PAUL STASTNY         C         Colorado Avalanche     20             0.460145  0.782397
CHRIS KUNITZ         L         Pittsburgh Penguins    16             0.495086  0.780934
VIKTOR STALBERG      L         Chicago Blackhawks     16             0.449663  0.72158
MARCUS KRUGER        C         Chicago Blackhawks     23             0.396877  0.700386
T.J. OSHIE           C         St. Louis Blues        26             0.425279  0.692627
MAXIME TALBOT        C         Philadelphia Flyers    24             0.403388  0.647905
JOHAN FRANZEN        C         Detroit Red Wings      18             0.456562  0.614548
ZACH PARISE          L         New Jersey Devils      26             0.436707  0.611747
BRAD MARCHAND        C         Boston Bruins          13             0.490595  0.600589
SCOTT HARTNELL       L         Philadelphia Flyers    19             0.44966   0.574705

name                 position  team                   blocked_shots  laegap    ebsd
NIKLAS HJALMARSSON   D         Chicago Blackhawks     81             0.434114  3.3488
ROMAN HAMRLIK        D         Washington Capitals    79             0.408728  2.66502
ANTON VOLCHENKOV     D         New Jersey Devils      69             0.420815  2.01005
MATT TAORMINA        D         New Jersey Devils      22             0.442522  2.00647
JAY HARRISON         D         Carolina Hurricanes    74             0.414172  1.74413
ROMAN POLAK          D         St. Louis Blues        70             0.416533  1.60231
JOHNNY BOYCHUK       D         Boston Bruins          76             0.476137  1.56313
NIKLAS KRONWALL      D         Detroit Red Wings      92             0.423585  1.51907
ZBYNEK MICHALEK      D         Pittsburgh Penguins    58             0.42391   1.46907
SIMON DESPRES        D         Pittsburgh Penguins    11             0.407129  1.46744
BARRET JACKMAN       D         St. Louis Blues        79             0.416804  1.32526
AARON JOHNSON        D         Columbus Blue Jackets  47             0.415857  1.21713

As you may have noticed, EBSD across these players has a much smaller range than EHD. The highest EBSD of any player only goes up to ~3, whereas EHD for top hitters goes up to ~6, discounting isolated cases. This is due to the much smaller sample size of blocked shots compared to hits. I originally decided to include blocked shots because of their similarity to hits, in that blocked shots are also inversely correlated with possession, but it would appear that a metric that looked at the percentage of on-ice shots against that a player blocks would better distinguish those who are good at getting in shooting lanes, assuming such an ability exists.
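
A rough sketch of that alternative metric is below. The article does not define it further, so the denominator here (all opposing shot attempts while the player is on the ice), the function name, and the sample numbers are assumptions.

```python
def block_percentage(blocked_shots: int, on_ice_attempts_against: int) -> float:
    """Share of on-ice shot attempts against that the player blocked, in percent."""
    if on_ice_attempts_against <= 0:
        raise ValueError("need at least one shot attempt against")
    if not 0 <= blocked_shots <= on_ice_attempts_against:
        raise ValueError("blocked shots must be between 0 and attempts against")
    return 100.0 * blocked_shots / on_ice_attempts_against


# Hypothetical player: 40 blocks out of 500 on-ice attempts against.
print(block_percentage(40, 500))  # 8.0
```

Because the denominator already counts how often the opponent is shooting, this rate would not need a separate possession adjustment the way blocked shots/60 does.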

This same concept can be applied to takeaways and giveaways, which we will look at in the next article.

Technical Details

position    distribution of  skew  kurtosis
forwards    hits             1.62  6.78
defensemen  hits             1.08  3.56
forwards    blocked shots    1.09  3.83
defensemen  blocked shots    0.46  2.39

One of the assumptions of regression analysis is that the dependent variable is normally distributed. The distribution of the number of hits per player is positively skewed, but since the kurtosis and skew of the hits and blocked shots distributions for forwards and defensemen are under ten and three, respectively, the normality assumption is not violated (Kline, 2011).
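
The check described above can be sketched as follows. The limits (skew under three, kurtosis under ten) are the ones quoted in the text; which skew and kurtosis formulas were used for the table is not stated, so the simple moment-based versions here are an assumption, as are the function names.

```python
def central_moment(values, k):
    """k-th central moment of a sample (population form, dividing by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return central_moment(values, 3) / m2 ** 1.5


def kurtosis(values):
    """Moment-based (non-excess) kurtosis: m4 / m2^2."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return central_moment(values, 4) / m2 ** 2


def within_limits(skew, kurt, max_skew=3.0, max_kurt=10.0):
    """True if both indices fall under the limits quoted in the text."""
    return abs(skew) < max_skew and abs(kurt) < max_kurt


# (skew, kurtosis) pairs as reported in the table above.
reported = {
    ("forwards", "hits"): (1.62, 6.78),
    ("defensemen", "hits"): (1.08, 3.56),
    ("forwards", "blocked shots"): (1.09, 3.83),
    ("defensemen", "blocked shots"): (0.46, 2.39),
}
for key, (skew, kurt) in reported.items():
    print(key, within_limits(skew, kurt))
```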

For those interested in NHST results:

position    dependent variable  p-value   R squared
forwards    hits                <2.2e-16  0.32
defensemen  hits                1.1e-10   0.14
forwards    blocked shots       <2.2e-16  0.25
defensemen  blocked shots       9.7e-5    0.053

The null hypotheses of all of the regressions can be rejected due to the low p-values, but it is also important to note the small effect size of the defensemen regressions, especially for blocked shots.

Some of you might have noticed that I used “more than one hit or blocked shot” as the criterion for filtering out outliers that skew the regression, instead of “one or more hits or blocked shots”. This is because, for some reason, including players with one or more hits or blocked shots gives somewhat significantly different results from including only players with two or more, as you can see from the graph below.

[Figure: correlation coefficients under the two inclusion cutoffs (corcofs)]
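
A comparison like this can be reproduced by recomputing the correlation coefficient under each minimum-count cutoff. The sketch below only shows the mechanics: the function names, the tuple layout, and the four sample players are assumptions, and the made-up data was chosen so that a single low-count player visibly moves the result. It is not an explanation of what happened in the real dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def r_by_min_count(players, min_counts):
    """Correlation of LAEGAP vs. rate/60, keeping players with count >= cutoff."""
    result = {}
    for cutoff in min_counts:
        kept = [p for p in players if p[0] >= cutoff]
        result[cutoff] = pearson_r([p[1] for p in kept], [p[2] for p in kept])
    return result


# Made-up (raw count, LAEGAP, rate per 60) tuples, for illustration only.
players = [
    (1, 0.50, 30.0),  # one event in very little ice time gives a huge rate
    (5, 0.40, 6.0),
    (8, 0.45, 4.0),
    (12, 0.50, 2.0),
]
for cutoff, r in r_by_min_count(players, [1, 2]).items():
    print(f"count >= {cutoff}: r = {r:.2f}")
```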

References

Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York: Guilford Press.