Wednesday, November 12, 2008

Football, Luck, and Noise

I received a surprising amount of pushback via email regarding my last post about Texas Tech and the Hot Hand Theory. At first I was confused, but then I realized that many readers do not share a rather fundamental assumption I hold about football: an incredible amount of the game is determined by "luck." Now, when I say luck, I do not mean fluke events, or the ol' bounce a da ball, or things like that. What I mean is that almost any and every outcome in football is not set in stone, but rather, there is some probability that the outcome will be X, another probability that the outcome will be Y, and maybe even a chance that it will be Z.


Theological questions aside, I really think this is a rule of life and not just football. But the point is that at no moment in a football game, be it the success of a play or even a determination of what the other side is actually doing, do you have fixed answers. Instead, you have probabilities, and even then your probabilities are merely estimates of the actual probabilities. So when I talk about "coolly flipping coins," I mean that everything is probabilistic. When Michael Jordan went to the free-throw line, no matter what any sports writer tells you, he was never destined to make the shot, or destined to make the game-winner. Tiger Woods is never destined to hit the putt, and Tom Brady and Peyton Manning were never destined to win the Super Bowl or hit any particular pass.

Instead, it was merely "highly likely" that each was going to do those things, because each is very good at what they do. But at no point is anything determinate.

Indeed, one of the criticisms of my post was that the probabilities of offensive success increase dramatically because you gain more information as time goes on. But that argument doesn't hold water. If even Michael Jordan could only max out his free-throw percentage to a point, then there is no way to max out offensive production in football when at every turn you have a human (or group of them) on the other side making choices that shift your probabilities. That is far too nebulous a cloud to assume certitude.

And any playcaller will tell you the same thing. As Norm Chow says, you are never quite sure what coverage they are in, but instead you take pieces of the field or pieces of the defensive front and attack those, and therein lies success. Mike Leach does not even require his guys to memorize coverages in the sense of "Hey they are in Cover 4!" Instead, they group them into things they can recognize and they probe areas. But at every stage, things are probabilistic. I've even discussed the notion that a purely random approach to offensive and defensive calls might even be optimal.

When I made the point about the hot hand theory, part of it was about how you cannot always extrapolate how good an offense is versus a defense just because they scored on a drive, or even if they scored a lot in a half or game, because the standard deviation is too high. Some people argued that things would even out over the course of a game; I think that is sort-of true, but I still think the variance is higher than they account for. But that's an empirical question we can solve later.

But another (amazing) site, Advanced NFL Stats, made the point about the difficulty of extrapolating skill levels from even successful outcomes:


Consider a very simple example game. Assume both [Pittsburgh] and [Cleveland] each get 12 1st downs in a game against each other. PIT's 1st downs come as 6 separate bunches of 2 consecutive 1st downs followed by a punt. CLE's 1st downs come as 2 bunches of 6 consecutive 1st downs resulting in 2 TDs. CLE's remaining drives are all 3-and-outs followed by a solid punt. Each team performed equally well, but the random "bunching" of successful events gave CLE a 14-0 shutout.

The bunching effect doesn't have to be that extreme to make the difference in a game, but it illustrates my point. Natural and normal phenomena can conspire to overcome the difference between skill, talent, ability, strategy, and everything else that makes one team "better" than another.
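
To make the bunching point concrete, here is a minimal Python sketch (mine, not Advanced NFL Stats'). It assumes, as in the quoted example, that six consecutive first downs produce a touchdown worth 7 points. The function names, the 70% first-down probability, and the ten drives per team are hypothetical choices for illustration only, not measured values.

```python
import random

FIRST_DOWNS_FOR_TD = 6  # assumption from the quoted example: 6 straight first downs = TD
POINTS_PER_TD = 7       # assumption: every TD is worth 7 points


def points_from_drives(first_downs_per_drive):
    """Score a game given the number of consecutive first downs on each drive."""
    return sum(POINTS_PER_TD for n in first_downs_per_drive
               if n >= FIRST_DOWNS_FOR_TD)


# The quoted example: both teams gain 12 first downs in total.
pit = [2] * 6            # six drives of 2 first downs, each ending in a punt
cle = [6, 6] + [0] * 4   # two 6-first-down drives plus four 3-and-outs
print(sum(pit), sum(cle))                                # 12 12
print(points_from_drives(pit), points_from_drives(cle))  # 0 14


def simulate_drive(p, rng):
    """Gain first downs with probability p each until a failure or a TD."""
    n = 0
    while n < FIRST_DOWNS_FOR_TD and rng.random() < p:
        n += 1
    return n


def simulate_game(p, drives, rng):
    """Points scored by one team over a fixed number of drives."""
    return points_from_drives(simulate_drive(p, rng) for _ in range(drives))


def share_of_games_not_tied(p=0.7, drives=10, games=10_000, seed=0):
    """Fraction of games between two IDENTICAL teams that still produce a winner."""
    rng = random.Random(seed)
    decided = 0
    for _ in range(games):
        if simulate_game(p, drives, rng) != simulate_game(p, drives, rng):
            decided += 1
    return decided / games


print(share_of_games_not_tied())
```

Both simulated teams have exactly the same skill (the same p), so every game that does not end tied was decided purely by how the first downs happened to bunch.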


And adding support for my argument about the high degree of variance, Advanced NFL Stats went on to try to nail down exactly how much in the way of outcomes can be attributed to skill versus luck in the NFL. You can read the details of the explanation there, and NFL teams obviously are closer in relative skill levels than most college teams, but the results are nevertheless striking:


...By comparing the two distributions, we can calculate that of the 160 season outcomes, only 78 of them differ from what we'd expect from a pure luck distribution. That's only 48%, which would suggest that in 52% of NFL games, luck is the deciding factor!
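
For a feel of what a "pure luck distribution" looks like, here is a rough sketch of that half of the comparison only. It is not the Advanced NFL Stats calculation, and it includes no observed data. It assumes every game is a fair coin flip over a 16-game season, and it scales the result to the 160 season outcomes mentioned in the quote; the function names are my own.

```python
from math import comb

GAMES = 16             # NFL regular-season length at the time of this post
SEASON_OUTCOMES = 160  # figure taken from the quote above


def pure_luck_pmf(wins, games=GAMES, p=0.5):
    """Binomial probability of exactly `wins` wins if every game is a coin flip."""
    return comb(games, wins) * p**wins * (1 - p)**(games - wins)


def chance_at_least(wins, games=GAMES, p=0.5):
    """Probability that a coin-flip team wins at least `wins` games."""
    return sum(pure_luck_pmf(w, games, p) for w in range(wins, games + 1))


# Expected number of teams at each record if nothing but luck were involved.
expected = {w: SEASON_OUTCOMES * pure_luck_pmf(w) for w in range(GAMES + 1)}
for w, n in expected.items():
    print(f"{w:2d}-{GAMES - w:<2d} {n:6.1f}")

print(f"P(12-4 or better by luck alone) = {chance_at_least(12):.4f}")
```

Under these assumptions about 31 of the 160 seasons would land at exactly 8-8, and a pure coin-flip team would still go 12-4 or better a bit under 4% of the time. The real analysis compares a curve like this against the actual spread of records.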

There might yet be more to it than these calculations, but the point is that variance is high in outcomes in football games. This is not to say that skill is unimportant, but the lesson is instead that you cannot merely look to actual statistics and actual outcomes to determine who is the best. Football games are tests of ranges of probabilities put up against one another:

Will all eleven players execute their assignments; will the quarterback make the right reads; will the coaches accurately assess the opponent's schemes; will the sun shine in the receiver's eye; will the ball become sweaty where the ballcarrier holds it; will there be an injury on the play; and if these factors randomly cut 50/50, will they work in our favor enough times in a row to get us in field goal or touchdown range.
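
To put a number on "enough times in a row," here is a tiny sketch. It assumes the factors are independent and each breaks our way with the same probability, which is a simplification of my own; the function name is hypothetical.

```python
def chance_all_go_right(n_factors, p=0.5):
    """Probability that n independent factors, each with chance p, all go our way."""
    return p ** n_factors


for n in (1, 2, 4, 6):
    print(n, chance_all_go_right(n))
```

With six independent 50/50 factors, everything breaks your way less than 2% of the time.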

In other words, lots of football fans, players, and even coaches suffer from a Fooled by Randomness problem when they analyze the game. Football is more quantum mechanics than it is Newtonian physics (though with a splash of game theory). Yet the belief in absolute determinism is natural: we intuitively want results to be indicative of objective truths, and it is much less complex to analyze easy-to-observe statistics and outcomes than it is to try to estimate the underlying probabilities. But football doesn't always give us large enough sample sizes to believe that results are as instructive as we'd like. So, if we want real answers, we have to admit that there's lots of luck around.

(And if you're a fan of the Michigan Wolverines, this gives you an (incredibly weak) excuse: "It's all the result of bad luck!")

8 comments:

  1. I appreciate your analysis regarding luck and the "hot hand"... but I have a question relative to the hot hand:

    What about getting a defense "on their heels"? Catch a defense on a blitz with a screen, and they may not blitz as much... thus increasing the probability of completing a pass later in the game or even on that drive. Doesn't this play a significant factor in getting an edge offensively?

  2. I had a discussion with a friend about the variance of football games versus baseball games. Winning 100 games in baseball (out of 162) is a very good year, but winning 10 games (out of 16) in the NFL is just above average. 14-2 would be compared to winning 100 games in baseball, which, to me, shows variance in football isn't nearly as great as variance in baseball, and I think this shows in normal observations.

    But doesn't that also show "luck" really doesn't play that big a role in determining football games?

  3. beernutts:

    "Luck," defined as I did above, plays a significantly bigger factor in football than in baseball. The primary reason is that you can largely model a baseball game as a series of one-on-one matchups: pitchers versus batters. Yes, fielding can sometimes be important, but it only ends up being determinative in a very small class of plays.

    Conversely, to model a football play, you have to model twenty-two different players and their varying, confusing, and unpredictable interactions. This increases the complexity -- and thus the luckiness -- of football games exponentially. This is one reason why the whole "Moneyball" thing is so big in baseball; you can actually do it. Compare this with the kinds of stuff going on at Football Outsiders: pedestrian, uninspired, and ultimately unhelpful to actual football practitioners.

    Anyway, the other reason that football is not less complex is that the example you use cuts exactly the other way. If 10 wins is normal and 14 is exceptional, then how thin is the margin between great and average? Only four games. Compare that to baseball where, as you say, the sheer volume of games is way higher. It's the converse of the law of large numbers: small sample sizes lead to increased unpredictability and hence luck regarding outcomes.

  4. I think that an important variable to consider when comparing baseball to football in this context is the starting rotation. I am not a statistician, but I think it would be more appropriate to compare the win/loss ratios of individual pitchers to the win/loss ratios of individual football teams.

    For instance, let us look at this year's NL Cy Young winner, Tim Lincecum, and the other starting pitchers on his team:

    Tim Lincecum: 18-5, 33 starts.
    Matt Cain: 8-14, 34 starts.
    Barry Zito: 10-17, 32 starts.
    Jonathan Sanchez: 9-12, 29 starts.
    Kevin Correia: 3-8, 19 starts.

    If viewed in this manner, Lincecum won ~78% of his decisions and ~55% of his starts. These percentages would translate to approximately a 12-4 record on purely W/L, and 8-8 including all starting appearances.

    Considering the Giants had an overall record of 72-90 (~44%), which would translate to 6-10/7-9 in the NFL, we see how valuable Lincecum was to the Giants, and that essentially they were a different "team" when he was on the mound.

    My point is that comparing sports may take some creativity, and probably is meaningless. Nevertheless, I do believe that football is more "complex" than baseball in that football requires the orchestration of 11 individuals simultaneously under strict time requirements. Baseball is much more situational (balls vs. strikes, runners on base, number of outs, etc.), yet there is also more time to analyze those situations; however, hitting a baseball is a helluva lot harder than hitting a ball carrier.

  5. Chris, I'm happy at least 1 person out there understood what I was talking about.

    I agree with all the main points in your post. 1 quibble with your comment above, however. Or maybe just a clarification.

    The sports of baseball and football (or any sport) can be compared in their luck fairly easily. You just need to compare the variance of a purely random outcome with the observed variance in team performance.

    In baseball, the better team wins at most perhaps 60% of the time. But in football, the better team wins upwards of 75% of the time. So in any one game, baseball is far more determined by luck than football.

    But there are 10 times more baseball games than football games in a season. 16 games is an incredibly small sample. So over the course of a season, football is indeed the 'luckier' sport.

  6. I can't imagine you're not familiar with their work (though they're not on your blogroll), but the Football Outsiders do a lot of stuff with "luck" in football games. They've learned that fumble recovery is essentially a coin flip (though forcing fumbles isn't), and they stress the probabilistic nature of kicking field goals. They often talk about predictive and non-predictive performance (e.g., the New York Giants' fumble recovery luck isn't predictive; their phenomenally consistent success rushing the ball is).

    Absolutely love your site, by the way (first time, long time).

  7. Your argument here is completely incoherent. Nobody that I know of believes that a hot streak is the result of Fate. A free-throw shooter and a flipped coin are completely different animals, and in comparing them YOU are the one who is thinking deterministically (and reductively to boot). You want to claim that statistical permutations are the real culprit behind hot streaks--meaning that Texas Tech will score a TD 40% of the time in the same way that a coin will land heads 50% of the time, which only makes sense if TT was bound to score 40% of the time over a given period--but then accuse a coach who sticks with the QB who is performing best of determinism!

    Physician, heal thyself.

  8. Very interesting post,

    I agree with the fundamental assumptions of probabilities ruling Football, or any sports for that matter, but I disagree with your conclusions in some ways:

    Overall, I think you make good points if you assume all things being equal.

    Not only that, I think that complete randomness would work only if there were an unlimited number of possible plays in both the offensive and defensive playbooks, which would support the all things being equal assumption and further lead to the assumption that a football player at the professional level has an equally unlimited (or limited) potential for mentally (or even physically) functioning in the game as any other player (or coach).

    2 points:

    1) The skill levels, size and mental/physical capacities of teams, coaches, and players are highly varied, as are approaches to the game and methods for conveying information. Those variables serve to increase or decrease probabilities perhaps to the advantage of one team or another.

    2) There is a certain amount of preparation that goes into a game plan that serves to increase the probability of success by the offense. For example, Michael Jordan will produce a certain probability of scoring by taking shots that he knows from experiential repetition are probable to fall. There are defenders, yes, but that seems to average out in the product of a season's scoring.

    I agree that these in no way point to a certainty of outcomes, but they point to the idea that Football cannot be construed as "random" or a game of "luck". It is a game of probabilities, like any other, but no more random because of any characteristic of complexity.

    I agree that no one can assert that Football is deterministic in any way, but there is some determinism affecting the probability of success. Repetition, coaching ability, and mental/physical capacity have a large effect on the efficacy of a team to execute. "Luck," as you have defined it in terms of probabilities, takes the driver's seat. But psychology and simple physics power those probabilities.
