Let’s get this out of the way. Based on Nate Silver’s model, Brazil had a 1-in-4000 chance of losing by six goals. By that same model, Brazil had a 65% chance of victory, even without Neymar and Thiago Silva. And there is no statistical model I would trust more than something Nate Silver created (John Hollinger being a not-too-close second).
Even if you accept that statistical models can’t really tell the difference at the extremes, that a 1-in-4000 chance is not that distinguishable from 1-in-400, this beatdown was still an all-time outlier: one of the most unexpected scorelines in history.
The problem is this. These statistical models treat players as interchangeable parts worth a set number of goals. They assume that it’s possible to simply pop someone else in and recapture some of that production. In the lead-up to the semifinal between Brazil and Germany, Silver posits that since Neymar is worth 0.4 goals per 90 minutes, you can plug in a substitute like Jo and get back 0.21 goals. Sidebar: shame that Jo didn’t play a single minute in that game, huh?
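To make that assumption concrete, here is a rough sketch of the arithmetic. The two per-90 figures are the ones quoted above; the function names, the teammate rates, and the 10% drop-off are all made up by me for illustration, not anything taken from Silver’s model.

```python
# Per-90 goal values quoted above; everything else here is illustrative.
NEYMAR_PER_90 = 0.40
JO_PER_90 = 0.21


def naive_swap(star_per_90, sub_per_90):
    """Change in expected goals per 90 if players are interchangeable
    parts: only the swapped player's production changes."""
    return sub_per_90 - star_per_90


def swap_with_spillover(star_per_90, sub_per_90, teammate_rates, dropoff):
    """Same swap, but every teammate's rate also shrinks by `dropoff`,
    a hypothetical fraction standing in for lost chemistry."""
    teammate_loss = sum(rate * dropoff for rate in teammate_rates)
    return (sub_per_90 - star_per_90) - teammate_loss


naive = naive_swap(NEYMAR_PER_90, JO_PER_90)

# Hypothetical teammates worth 0.3 and 0.2 goals per 90, each 10% worse
# without the star on the field.
adjusted = swap_with_spillover(NEYMAR_PER_90, JO_PER_90, [0.3, 0.2], 0.10)

print(f"naive swap:     {naive:+.2f} goals per 90")
print(f"with spillover: {adjusted:+.2f} goals per 90")
```

The naive version only ever sees the gap between the two players. The second function is the argument of this post in miniature: the true cost is that gap plus whatever the remaining players lose, and that second term is the one nobody has a good number for.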
Plus-minus models don’t capture total impact.
The closest thing we have to a perfect model is expected possession value (EPV) in basketball. EPV tries to account for the impact of every single movement by every player on the court. EPV, though, is still in its infancy and has a lot of refining left to be done. There’s no way, then, that a plus-minus model can realize that Neymar is transcendent enough of a player to hide Fred’s deficiencies. Without Neymar, you don’t just need to replace his impact; a mediocre player like Fred will regress too. There’s no way a plus-minus model can see how Thiago Silva allows David Luiz to play higher up and play to his best attacking tendencies. Without Silva, Luiz cannot anchor the defense by himself, and his best tendency, his ability to maraud forward as a pseudo-midfielder, turns into his worst.
It should be simple enough to realize that you cannot just plug players in and hope to approximate their holistic impact. Soccer doesn’t work that way. Sports don’t work that way. Consider that in basketball, plus-minus ratings can easily be skewed to the point where Patty Mills has a higher rating than Tony Parker. And this is Real Plus-Minus, which is supposed to be a more sophisticated plus-minus system.
Wake up, sheeple: Patty Mills isn’t better than Tony Parker. There just happens to be a greater disparity in quality between Mills and other substitute guards than between Parker and other starting guards in the NBA. Nor does he “contribute more to his team” than Tony Parker when he’s on the court, as I’ve heard argued as well. It’s not an equal swap.
And for that matter, what does it say that Ricky Rubio is above Damian Lillard, and how is Pablo freaking Prigioni above Jrue Holiday?
How do we quantify chemistry?
FIFA 14 tries to do this cute thing in Ultimate Team mode where your players play better if they have good “chemistry” with the players around them. I put that in quotes because their version of chemistry amounts to nothing more than checking whether the players play for the same club, play in the same league, or are from the same country.
Nevertheless, that highlights an important point: it’s really hard to quantify team chemistry, which is about the single most important intangible there is in sports (it’s an “intangible” for a reason). Brazil without Neymar and Silva looked demoralized, disorganized, ragtag, or whatever other adjective you want to use. In the NFL, do you expect any decent receiver to step into the Green Bay offense and make the same back-shoulder catches that Jordy Nelson makes? No, because it takes weeks and months of practice to get that sort of rhythm and timing with Aaron Rodgers. It’s called chemistry.
It may be more useful for these purposes to see how different players perform when sharing the court or field with another player. Let’s stay with the topic of the effects of an elite player on others. In the NBA playoffs this past season, Tim Duncan, per 48 minutes, scored 24 points and had 13.5 rebounds. Patty Mills, when sharing the floor with Duncan, had a +/- of 31, which is way higher than his 10.1 rating when sharing the floor with Tiago Splitter. Splitter, per 48 minutes, scored 16 points and had 13 rebounds in the playoffs. The box-score lines look close enough, yet a more nuanced look at the +/- splits shows it is too simplistic to say that you can pop Splitter in for Duncan and approximate the production. If I had access to comparable soccer stat splits, I would run this same comparison. The point, though, is the same.
When looking to see how a team will perform without a certain star, it is more important to analyze the performance splits of individual players with and without that star. That is the closest thing we have to measuring chemistry. Not only do you need to replace the production of that one player, you also need to account for the potential drop-off in play of the surrounding players. The day we have a true measure of chemistry, we will have a truly perfect stat.
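For what it’s worth, here is a minimal sketch of how a with/without split like that could be computed, assuming you have stint-level data (who was on the floor, for how long, and the scoring margin over that stretch). The `Stint` structure, the `per48_split` function, the player labels, and every number below are invented for illustration; none of it is real playoff data.

```python
from dataclasses import dataclass


@dataclass
class Stint:
    players: frozenset  # labels of the players on the floor for this stretch
    minutes: float      # length of the stretch
    margin: int         # team points minus opponent points in the stretch


def per48_split(stints, player, teammate):
    """Return (with_rate, without_rate): `player`'s plus-minus per 48
    minutes when sharing the floor with `teammate` and when not."""
    totals = {True: [0.0, 0], False: [0.0, 0]}  # [minutes, margin]
    for stint in stints:
        if player not in stint.players:
            continue  # only stretches where `player` is on the floor count
        bucket = totals[teammate in stint.players]
        bucket[0] += stint.minutes
        bucket[1] += stint.margin

    def rate(minutes, margin):
        # None when the pair never (or always) shared the floor.
        return 48 * margin / minutes if minutes else None

    return rate(*totals[True]), rate(*totals[False])


# Made-up stints, purely to show the shape of the calculation.
stints = [
    Stint(frozenset({"guard", "star_big"}), 12.0, 6),
    Stint(frozenset({"guard", "backup_big"}), 12.0, 2),
    Stint(frozenset({"guard", "star_big", "backup_big"}), 6.0, 1),
    Stint(frozenset({"other_guard", "star_big"}), 18.0, 4),
]

with_star, without_star = per48_split(stints, "guard", "star_big")
print(with_star, without_star)
```

The gap between the two numbers is the part a straight production swap misses. With real data you would want enough minutes in each bucket before reading anything into it, since small samples make these splits jump around.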
We couldn’t predict Brazil’s beatdown because we didn’t quantify the effects of Neymar and Silva on surrounding players. We myopically tried to replace Neymar and Silva’s production in terms of plus-minus. Oh, that and Germany doesn’t care about home-field advantage, Nate Silver.
But that’s the beauty of sports. There is no one single number to measure everything. And we’re better off for it.
2 thoughts on “The Flaw in the Models, or How’d Brazil get beat down that badly?”
Dude, what do you think plus minus is -> “Not only do you need to replace the production of that one player, but also you need to replace the potential drop-off in play of the surrounding players.” Plus minus is a measure of when the player is on the court how well does his team do.
I’m aware of what plus-minus is. You’re right, it’s a measure of how much the team outscores its opponents by when a player is on the court. The whole point here, though, is that the player isn’t on the court. How does the team play without him? How do the starters play with his replacement? How does the replacement play against other starter-quality players? There are a lot of contextual questions to be sorted out, and they can’t be answered by a glance at a box score +/-.