Leftovers
An unscientific exercise in parsing hockey statistics. As you know I sometimes do, I've been fiddling with stats tonight. All of the work of a hockey team--understood as a team rather than a business enterprise--goes into winning games, which is to say it goes into scoring and preventing goals. Broadly speaking, teams have better records as they do better at scoring and preventing goals. In fact, there is a pretty solid linear relationship between goal differential (goals scored minus goals allowed) and points in the standings. Real solid. If you make a scatter diagram of the two things, the points line up nicely and you can see that goal differential can "explain" (or perhaps more properly, "account for") most of the variance in standings-points. More than 95% of it, in fact, for the whole 2002-03 season, if you go by the traditional R² measure. This suggests that adding a certain number of goals to a team will yield a highly predictable number of points in the standings--about 2.8 goals per point, as it happens (and the figure is almost exactly the same so far this year). If you ignore the extra points from overtime losses, which I'm not recommending except for the purpose of grasping the idea here, your favourite team should be X points above .500, where X is goal differential divided by 2.8. And most are. Detroit right now is +49; +49 divided by 2.8 is 17.5; so they should have 55 + 17.5 = 72.5 points after the 55 games they've played so far. They actually have 70. My Oilers are -5, so they should have about 52 points through their 54 games. True total: 51.
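For anyone who wants to play along at home, here is a minimal sketch of that arithmetic. The function names are my own invention, the 2.8 goals-per-point figure is just the rough one quoted above, and overtime-loss points are ignored exactly as in the Detroit and Edmonton examples, so treat it as an illustration rather than a fitted model.

```python
GOALS_PER_POINT = 2.8  # rough figure quoted in the text, not a fitted constant


def expected_points(games_played, goal_diff, goals_per_point=GOALS_PER_POINT):
    """Points a team 'should' have, ignoring overtime-loss points.

    A .500 team earns one point per game, so the baseline is games_played.
    """
    return games_played + goal_diff / goals_per_point


def points_gap(actual_points, games_played, goal_diff):
    """Actual minus expected points (negative means fewer than predicted)."""
    return actual_points - expected_points(games_played, goal_diff)


# Numbers from the text: Detroit is +49 after 55 games with 70 points;
# Edmonton is -5 after 54 games with 51 points.
for name, gp, gd, pts in [("Detroit", 55, 49, 70), ("Edmonton", 54, -5, 51)]:
    print(f"{name}: expected {expected_points(gp, gd):.1f}, "
          f"actual {pts}, gap {points_gap(pts, gp, gd):+.1f}")
```

Because this version leaves out the overtime-loss adjustment, its gaps will not match any figure that has that adjustment baked in.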
But some teams diverge pretty wildly from the levels of real success predicted by their ability to score and prevent goals. The most dramatic case in But of course they're not. Having played 53 games they should have about 53 + (61/2.8) points, or 75. If you account for overtime losses, it should be more like 78. In fact they only have 66 points, which leaves them behind division-mates Toronto (69 points, goal differential +20) and Boston (67, +10).
No team in the NHL has remotely as much negative "residue" as Ottawa does in this sense. They are about 12 points lower in the standings than their GF/GA totals would predict. Why has this happened to Ottawa? Easy answer for anyone who's studied baseball sabermetrics: it's because, in the real world,
I haven't checked to make sure that residue is connected to one-goal performance up and down the league. I know Pittsburgh and Nashville have the most positive "residue" in the league (7.0 and 7.4 points respectively), and they do much better in close games than in others, Pittsburgh being 9-11--which is pretty good for a team that only has 11 wins of any kind!--and Nashville being 15-8. There is really nothing else the residue

POSITIVE
Nashville +7.4
Pittsburgh +7.0
Boston +5.6
Toronto +5.1
Dallas +4.9

NEGATIVE
Ottawa -11.5
Detroit -5.3
Chicago -4.4
Edmonton -4.0
Minnesota -3.8

The important question to answer is this: what is the right name for this residue? What actual factor makes teams perform worse or better in close games than "mere" goal-scoring and goal-preventing would allow for?
The temptation for a student of baseball statistics is to assign the residue to I can tell you that, between '02-'03 and this year, it hasn't. A scatter diagram of the residue in both years is a random cloud of bugs, without apparent meaningful correlation. Of the 14 teams that had positive residue last year, only 8 have it now. Of the 16 negatives, only 9 still have it. Better than chance--but only a teeny tiny toony bit better.
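To put a rough number on "a teeny bit better than chance", here is a small sketch using only the counts quoted above. The coin-flip comparison is my own assumption (each team treated as an independent 50/50 shot at keeping its sign), and the helper names are made up for illustration.

```python
from math import comb


def sign_persistence(kept_positive, n_positive, kept_negative, n_negative):
    """Fraction of teams whose residue kept the same sign year over year."""
    return (kept_positive + kept_negative) / (n_positive + n_negative)


def prob_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more repeats by luck."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts from the text: 8 of 14 positives and 9 of 16 negatives kept their sign.
share = sign_persistence(8, 14, 9, 16)
tail = prob_at_least(8 + 9, 14 + 16)
print(f"{share:.0%} kept their sign; "
      f"P(17 or more of 30 by coin flip) = {tail:.2f}")
```

Under that assumption, 17 repeats out of 30 is the kind of result pure coin-flipping produces fairly often, which fits the "random cloud" reading of the scatter diagram.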
The residue
My guess--and given that I'm an amateur statistician, it can be only a guess--is that the residue really is a product of luck. If so, outliers like Nashville and Ottawa should be
But maybe it's What we have here is a potential framework for the equivalent of what is called "Pythagorean standings" in baseball. At the end of a season, you could identify the teams that had especially bad luck and pick them to improve. With what exact confidence could you do this?--I don't know. Looking at last year, the "unluckiest" five teams were San Jose, Nashville, Buffalo, Dallas, and Vancouver. Not all bad teams, certainly!--just the unluckiest by this measure. How many have better records so far this year? Four--Vancouver's record is slightly better--and Dallas's decline is no surprise on other grounds (and is more profound than the standings show, if you believe all this tommyrot).
Last year, the luckiest five teams were Atlanta, Florida, Tampa Bay, Edmonton, and Anaheim. How many stand worse now? Only three: Florida and Tampa have improved. Atlanta's only slightly worse, despite being a very special case. So maybe the predictive value of this "luck" isn't so great on its own. Bill James used the baseball equivalent as one weighting factor among a whole set of pre-season indicators, and team age could certainly be another factor you could use. Then again, maybe age
The G Spot
I guess the best way to introduce this unnerving discovery is to retrace the steps that led to it. It starts with me trying to decrypt the meaning of last year's NHL playoffs--but before we can start to discuss it, before you can share the same weird concepts I think with, I have to show you how I treat hockey statistics. The secret to winning hockey games is no secret at all: you have to score more goals than the other guy. Over a large number of games, the teams that perform better in the standings outscore their opponents by more. There's a strong, simple linear relationship, which I've discussed before: at the current offensive levels, every 2.8 goals you add gains you about a point in the standings over time. So goal differential--your goals scored, less your goals allowed--is an important stat. On its own (and to put it roughly), goal differential determines about 90% of your place in the standings. Gradually I've come to think of teams in terms of their goal differentials almost as much as I think of their actual standings points. Detroit right now is a +61 team. My Edmonton Oilers are +5, and not surprisingly they're around .500. Calgary is a little better (+9) both in the standings and in goal differential. Pittsburgh is an abyssal -120; they're actually much worse than the standings show them to be.
If you understand that, you shouldn't have any trouble with the next step, which is realizing that you can
But you can break offence and defence down

How many goals did Dallas's defence prevent, compared to the league average? +26
What's the average save percentage of the league? .909
What's the save percentage of Dallas's goalies? .906
So Dallas's goalies have given up 3 extra goals for every thousand shots... how many shots have they faced this season? About 1,540 (the SOGA numbers I pull off ESPN only go to three significant digits, but that's fine)
In other words, the Dallas goalies have given up how many more goals than league-average goaltenders would? About five. They're a -5 on their side of the goal-prevention accounting.
And since Dallas overall is +26 at goal prevention, their defencemen must be? +31. Which is just about what you get if you do the calculation the other way, assigning the defencemen credit for shots on goal prevented above league average (about 330) and count that as being worth the same number of goals a league-average goalie would let in (league-average save percentage is .909, so the average goaltender lets in .091 goals for every shot--on 330 shots,
This is an elaborate but, I think, undeniably effective way of distributing credit for goal-prevention between the defence and the goaltenders. And you can do the same thing for the offence, though the meaning of it would be less clear... maybe. A team like Anaheim generates a large number of shots (29.7 per game, about two above average) but can't put the puck in the net (shot percentage of .076, markedly less than the .091 figure I cited a couple paragraphs back). There are teams like Atlanta that don't create many shots (25.9/g) but convert on a huge fraction of them (.104). What does it mean? In Atlanta's case, I'm inclined to attribute it to Ilya Kovalchuk and some other wingers having great years as snipers. And I notice that over in Anaheim, Petr Sykora is throwing a ton of shots at the net to no great effect. As far as a general interpretation of the figures for "accuracy" and "shot creation" goes, I'm at a loss, but that's all right--we're going back to goaltending.
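Since we're heading back to goaltending, here is the Dallas goal-prevention split from above as a minimal sketch. The function names are hypothetical, the inputs are the rounded figures quoted in the text, and "skaters" simply means whatever part of goal prevention is not charged to the goalies.

```python
def goalie_goals_saved(team_sv_pct, league_sv_pct, shots_against):
    """Goals saved by the goalies relative to league-average goaltending."""
    return (team_sv_pct - league_sv_pct) * shots_against


def skater_goals_saved(total_goals_prevented, goalie_saved):
    """Goal prevention not accounted for by the goalies goes to the skaters."""
    return total_goals_prevented - goalie_saved


def shots_prevented_value(shots_prevented, league_sv_pct):
    """Cross-check: value prevented shots at the league-average goals-per-shot rate."""
    return shots_prevented * (1 - league_sv_pct)


# Dallas figures from the text (all approximate).
goalies = goalie_goals_saved(0.906, 0.909, 1540)   # about -5
skaters = skater_goals_saved(26, round(goalies))   # about +31
check = shots_prevented_value(330, 0.909)          # about 30
print(round(goalies), skaters, round(check))
```

The two routes to the skaters' number (subtracting the goalies' share, or valuing prevented shots directly) land within a goal of each other, which is all the precision three-digit shot totals can support.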
Goaltending is important in the playoffs. Yeah, yeah... we We've found a way to make a concise statement about the number of goals saved above average (or below average) by a team's goaltenders over the course of a season, or any convenient length of time. Now I'm going to show you a version of the chart that got me thinking. It's the teams that made the playoffs last year, sorted according to their "extra goals prevented by goaltenders" figure for the preceding 2002-03 season as a whole. Nothing else.
Minnesota +42
Anaheim +30
Philly +28
Dallas +28
Colorado +23
New Jersey +18
Detroit +18
Toronto +18
Ottawa +9
Washington +9
Tampa Bay +7
Vancouver -2
NY Isles -14
Edmonton -19
Boston -19
St. Louis -33
Unless I'm completely nuts, you probably noticed, like I did, that the two
In fact, if you go back and check, you'll see that the team with the "better" goaltending by this measure won 10 of the 15 postseason series. But the truth is more remarkable than that: the five "upsets" involved teams that were behind their opponent in this category by 2, 10, 9, 12, and 12 goals.
So I checked the previous year's playoffs, and it had happened again, though the change in the curve was less dramatic. Keep in mind this is a small sample, and it would be a lot of work to make it bigger. For the two years combined, teams with any advantage won 19/29 (66%); with a five-goal advantage it was 17/26 (65%), a slight dip, and at 10 it was 14/21 (67%)--but at 15 it was 12/15 (80%), at 20 it was 10/12 (83%), and at 25 it was 9/10 (90%). The data seem to be pointing to some sort of sinusoidal relationship.
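The percentages quoted for the two years combined can be tabulated with a few lines of code. This is only a sketch: the counts are the ones given in the text, the names are mine, and it fits no curve of any kind.

```python
# (minimum goaltending edge in goals, series won, series played), from the text.
# The first row is "any advantage", i.e. an edge greater than zero.
SERIES_BY_EDGE = [
    (0, 19, 29),
    (5, 17, 26),
    (10, 14, 21),
    (15, 12, 15),
    (20, 10, 12),
    (25, 9, 10),
]


def win_rates(rows):
    """Return (edge, win fraction) pairs, one per threshold."""
    return [(edge, won / played) for edge, won, played in rows]


for edge, rate in win_rates(SERIES_BY_EDGE):
    label = "any edge" if edge == 0 else f"edge of {edge}+ goals"
    print(f"{label:>18}: {rate:.0%}")
```

Note that the rows are nested: every series in the 25-goal row is also counted in the rows above it, so the samples overlap rather than being independent bins.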
Is this happening because better teams normally have "better goaltending" by this measure? There's
Moreover, you don't see this "sinusoidal" shape when you compare disparities in "non-goaltending goal differential" to the chance of winning a playoff series. The effect of superiority in respects other than goaltending, amazingly,
[UPDATE, 10:10 pm: This paragraph is slightly newer than the rest and has replaced some mystified head-scratching.] So why would teams be more likely to lose playoff series as their edge in non-goaltending categories gets greater? Are the data telling us that being a better team in non-goaltending respects is at an active So, if you've absorbed all that, and it's my fault if you haven't, you're probably wondering how the 2003-04 teams shape up in the apparently-insanely-important "extra goals prevented by goaltenders" category. This is the list.
San Jose +33
Florida +32
Minnesota +25
Montreal +20
Boston +20
Colorado +18
New Jersey +18
Anaheim +11
Vancouver +10
Detroit +6
Calgary +6
Philly +5
Ottawa +1
Columbus +1
Tampa Bay -1
Nashville -5
Dallas -5
Islanders -7
Carolina -7
Washington -7
Toronto -9
Buffalo -9
St. Louis -9
LA Kings -10
Edmonton -10
Chicago -16
Phoenix -16
NY Rangers -20
Atlanta -22
Pittsburgh -49
What conclusions do
And a sad postscript: with the changes in the rules planned for next year, a concomitant shattering of statistical norms is likely, and so this research is likely to be of little use beyond June, if it's useful at all.