Update: See alternative results with ELO rankings below.
I recently got this idea while considering World Cup tickets for Brazil this year. The draw has been released, so everyone knows which teams will be playing in each match of the first stage (groups). But if I bought a ticket for a semi-final in São Paulo, who could be playing in that game?
One way is to look at the possible combinations that can make it to that stage. But the list of possibilities is quite large, and difficult to digest. You could hypothetically guess which way matches will go based on a good knowledge of the teams, their strategies and histories against each other. However, this method will always have some sort of subjectivity and bias involved (I would want my favourite team to progress).
An objective, simplistic, and perhaps quite inaccurate way to look at it is through the FIFA World Rankings. For example, for each match the ranking points of the teams are compared and the “better” team progresses. In this way, you can follow through to the Final and predict the winner.
The problem with this technique is that the team with more points would always win its match, so the final result would always be the same. I wrote a quick simulation in Mathematica where, for each match, there is a random deviation of up to plus or minus 20% in the points of both teams. This enables a team with fewer ranking points to win (on a good day for them). I also gave Brazil a 20% boost in points for home advantage.
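To make the match rule concrete, here is a minimal Python sketch of it (the original simulation was written in Mathematica and is not reproduced here). It assumes the deviation is drawn uniformly and applied as a multiplier, and that the home boost is a simple extra multiplier; the post does not specify either detail. The function names and the ranking points are invented for illustration.

```python
import random

DEVIATION = 0.20   # maximum random deviation described in the post
HOME_BOOST = 0.20  # home-advantage boost described in the post


def perturbed_points(points, rng, deviation=DEVIATION, boost=0.0):
    """Apply the home boost (if any), then a uniform random factor.

    Assumption: the factor is uniform in [1 - deviation, 1 + deviation].
    """
    return points * (1.0 + boost) * rng.uniform(1.0 - deviation, 1.0 + deviation)


def play_match(team_a, team_b, points, rng, host=None, deviation=DEVIATION):
    """Return the team with more perturbed points on the day."""
    score_a = perturbed_points(
        points[team_a], rng, deviation, HOME_BOOST if team_a == host else 0.0
    )
    score_b = perturbed_points(
        points[team_b], rng, deviation, HOME_BOOST if team_b == host else 0.0
    )
    return team_a if score_a >= score_b else team_b


# Placeholder ranking points, invented for illustration only.
points = {"Favourite": 1200.0, "Underdog": 1000.0}

rng = random.Random(0)
trials = 10_000
upsets = sum(
    play_match("Favourite", "Underdog", points, rng) == "Underdog"
    for _ in range(trials)
)
print(f"Underdog won {upsets / trials:.1%} of simulated matches")
```

With no deviation the favourite always wins, which is exactly the problem described above; the random factor is what lets the underdog through some of the time.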
The group-stage table standings in each simulation were determined through the FIFA rankings plus/minus a random deviation of up to 20%. The second-stage match results were then determined based on the simulated group standings, and the matches were progressed through to the final. Each run of the simulation gave different results, meaning that the ultimate winners were generally different and took different paths each time. I then ran the simulation 100,000 times to find out which paths occurred more frequently (for fun, because that’s what nerds do). From this, the likelihood of a team reaching a particular stage or winning the World Cup can be determined.
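The whole procedure can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the original Mathematica code: it uses two invented groups of four instead of the real draw, made-up ranking points, a uniform random factor, and a crossed bracket (winner of one group against the runner-up of the other). All function names are hypothetical.

```python
import random
from collections import Counter

DEVIATION = 0.20  # maximum random deviation described in the post


def noisy(points, rng, deviation=DEVIATION):
    """Ranking points scaled by a uniform factor in [1 - deviation, 1 + deviation]."""
    return points * rng.uniform(1.0 - deviation, 1.0 + deviation)


def rank_group(group, points, rng):
    """Order a group by perturbed ranking points, best first."""
    return sorted(group, key=lambda team: noisy(points[team], rng), reverse=True)


def play_match(team_a, team_b, points, rng):
    """The team with more perturbed points on the day progresses."""
    return team_a if noisy(points[team_a], rng) >= noisy(points[team_b], rng) else team_b


def simulate_tournament(groups, points, rng):
    """One toy tournament: two groups, top two advance, crossed semi-finals, final."""
    (a1, a2), (b1, b2) = (rank_group(group, points, rng)[:2] for group in groups)
    finalist_1 = play_match(a1, b2, points, rng)
    finalist_2 = play_match(b1, a2, points, rng)
    return play_match(finalist_1, finalist_2, points, rng)


def win_frequencies(groups, points, runs, seed=0):
    """Repeat the tournament and return each winner's share of the runs."""
    rng = random.Random(seed)
    wins = Counter(simulate_tournament(groups, points, rng) for _ in range(runs))
    return {team: count / runs for team, count in wins.most_common()}


# Invented teams and points, for illustration only.
points = {"A": 1500, "B": 1300, "C": 1100, "D": 900,
          "E": 1400, "F": 1200, "G": 1000, "H": 800}
groups = [["A", "B", "C", "D"], ["E", "F", "G", "H"]]

for team, share in win_frequencies(groups, points, runs=10_000).items():
    print(f"{team}: {share:.1%}")
```

The same tallying idea extends to any stage: count how often a team appears in the semi-finals or the final rather than only counting the eventual winner.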
Here is a summary of those 100,000 virtual World Cups.
Most of the major teams frequently got into the quarter-finals. Brazil is surprisingly worse off than many teams, and this is likely a reflection of the tough match-ups against the Netherlands and Chile (who could beat Brazil on a good, +20% day). However, they still have a very high chance of making it to this stage.
There is an interesting separation in the semi-final stage between the top 4 teams and the rest of the crowd. Spain seems to be strong, due to its very high FIFA World Ranking points. The extra 20% push for Brazil’s home advantage and their path to the semis seem to help here as they are better off than Argentina, despite a lower ranking.
It is clear, if the simulation model is remotely believable, that Spain are definite favourites to be in the final. Brazil and Germany have an almost equal chance, but the rest seem to be trailing off.
The most commonly occurring final matches in the simulations were:
- Spain vs. Brazil (33%)
- Spain vs. Germany (30%)
- Spain vs. Colombia (6%)
The most probable final without Spain was Argentina vs. Germany at 4.3% probability. Argentina and Germany had an almost equal chance of taking out 3rd place at 27% and 24%, followed by Brazil at 16% and Switzerland at 4%.
The plot of the World Cup winners looks very similar, although much more skewed towards Spain due to a higher chance of winning any of the match-ups. Interestingly, changing the random fluctuation from plus/minus 20% to plus/minus 100% did not change the results much, with Spain still favourites to win at 15%, followed by Germany (10%), Brazil (9%), Argentina (9%), Colombia (7%) and Portugal (7%). This did level the playing field, however, with much smaller differences in probable outcomes.
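To see why a wider fluctuation levels the field, the Python sketch below estimates how often a weaker team beats a stronger one in a single match as the deviation grows. It assumes a uniform random multiplier, which the post does not specify, and the two ranking-point values are invented for illustration.

```python
import random


def upset_probability(strong, weak, deviation, trials=100_000, seed=0):
    """Estimate how often the weaker team's perturbed points exceed the stronger team's."""
    rng = random.Random(seed)
    upsets = 0
    for _ in range(trials):
        strong_day = strong * rng.uniform(1.0 - deviation, 1.0 + deviation)
        weak_day = weak * rng.uniform(1.0 - deviation, 1.0 + deviation)
        upsets += weak_day > strong_day
    return upsets / trials


# Invented ranking points, for illustration only.
for deviation in (0.2, 0.5, 1.0):
    share = upset_probability(1200.0, 1000.0, deviation)
    print(f"deviation ±{deviation:.0%}: upset in {share:.1%} of matches")
```

Even at plus/minus 100% the stronger team keeps an edge in each match, which is consistent with the ordering of favourites staying the same while the gaps between them shrink.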
See the FIFA World Cup simulation summary for possible matches in the second stage.
Of course, this quick and dirty simulation is very likely to be inaccurate, so please don’t take these results too seriously. I am definitely not betting on anything, other than a fantastic World Cup full of surprises and amazing football.
edit: Results of another simulation place Brazil as favourites (maybe a larger boost for home advantage?).
Interestingly, Brazil vs. Spain is still the most likely match; however, this time Brazil has a huge advantage, as it tops the ELO rankings and also receives the +20% home advantage. The top 4 are still Brazil, Spain, Germany and Argentina.