Reflecting on 2017, I decided to return to my most popular blog topic (at least by the number of emails I get). Last time, I built a crude statistical model to predict the result of football matches. I even presented a webinar on the subject here (it’s free to sign up). During the presentation, I described a coefficient in the model that accounts for the fact that the home team tends to score more goals than the away team. This is called the home advantage (or home field advantage) and can probably be explained by a combination of psychological (e.g. familiarity with surroundings) and physical factors (e.g. travel). It occurs in various sports, including American football, baseball, basketball and soccer. Sticking to soccer/football, I mentioned in my talk how it would be interesting to see how this effect varies around the world. In which countries do the home teams enjoy the greatest advantage?
We’re going to use the same statistical model as last time, so there won’t be any new statistical features developed in this post. Instead, it will focus on retrieving the appropriate goals data for even the most obscure leagues in the world (yes, even the Irish Premier Division) and then interactively visualising the results with D3. The full code can be found in the accompanying Jupyter notebook.
Calculating Home Field Advantage
The first consideration should probably be how to calculate home advantage. The traditional approach is to look at team matchups and check whether teams achieved better, equal or worse results at home than away. For example, let’s imagine Chelsea beat Arsenal 2-0 at home and drew 1-1 away. That would be recorded as a better home result (+2 goals versus 0). This process is repeated for every opponent and so you can actually construct a trinomial distribution and test whether there was a statistically significant home field effect. This works for balanced leagues, where teams play each other an equal number of times home and away. While this holds for Europe’s most famous leagues (e.g. EPL, La Liga), there are various leagues where teams play each other three times (e.g. Ireland, Montenegro, Tajikistan aka The Big Leagues) or even just once (e.g. Argentina, Libya and to a lesser extent MLS (balanced for teams within the same conference)). There are also issues with postponements and abandonments rendering some leagues slightly unbalanced (e.g. Sri Lanka). For those reasons, we’ll opt for a different (though not necessarily better) approach.
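To make the matchup comparison a little more concrete, here is a minimal sketch in Python. It assumes the results arrive as (home team, away team, home goals, away goals) tuples with one fixture in each direction, and it swaps the trinomial test for a simpler one-sided sign test that ignores the "equal" pairings. The function names are made up for illustration.

```python
from math import comb


def matchup_outcomes(results):
    """Classify each pairing by comparing its home and away legs.

    results: iterable of (home_team, away_team, home_goals, away_goals).
    Assumes each pairing has one fixture in each direction (a balanced league).
    """
    home_margin = {}
    for home, away, home_goals, away_goals in results:
        home_margin[(home, away)] = home_goals - away_goals

    counts = {"better": 0, "equal": 0, "worse": 0}
    for (home, away), margin in home_margin.items():
        reverse = home_margin.get((away, home))
        if reverse is None or home > away:
            continue  # skip unmatched fixtures and count each pairing once
        # Sum of the two home-side margins: a positive total means each team
        # did better at home than away against this opponent.
        total = margin + reverse
        if total > 0:
            counts["better"] += 1
        elif total < 0:
            counts["worse"] += 1
        else:
            counts["equal"] += 1
    return counts


def sign_test_p_value(better, worse):
    """One-sided sign test: probability of at least `better` home-favouring
    pairings out of better + worse if home and away were equally likely."""
    n = better + worse
    if n == 0:
        return 1.0
    return sum(comb(n, k) for k in range(better, n + 1)) / 2 ** n


# The Chelsea v Arsenal example from the text
results = [
    ("Chelsea", "Arsenal", 2, 0),
    ("Arsenal", "Chelsea", 1, 1),
]
counts = matchup_outcomes(results)
p_value = sign_test_p_value(counts["better"], counts["worse"])
print(counts, p_value)
```

With a single pairing the test is obviously meaningless; the point is only to show the bookkeeping. It also shows why unbalanced leagues are awkward here: a pairing with no return fixture is simply dropped.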
In the previous post, we built a model for the EPL 2016/17 season, using the number of goals scored in the past to predict future results. Looking at the model coefficients again, you see the home coefficient has a value of approximately 0.3. By taking the exponent of this value ($e^{0.3} \approx 1.35$), it tells us that the home team is expected to score roughly 1.35 times as many goals as the away team. In case you don’t recall, the model accounts for team strength/weakness by including coefficients for each team (e.g. 0.07890 and -0.96194 for Chelsea and Sunderland, respectively).
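Written out, the step from coefficient to goal ratio looks like this. The symbols are shorthand introduced here for illustration rather than anything printed by the model: $\lambda$ is the expected number of goals, $\beta_{\text{home}}$ is the home coefficient, and $\alpha$ and $\delta$ stand for the team and opponent coefficients in the formula goals ~ home + team + opponent.

$$
\log \lambda = \beta_0 + \beta_{\text{home}} \cdot \text{home} + \alpha_{\text{team}} + \delta_{\text{opponent}}
$$

$$
\frac{\lambda_{\text{home}=1}}{\lambda_{\text{home}=0}} = e^{\beta_{\text{home}}} \approx e^{0.3} \approx 1.35
$$

Holding the team and opponent fixed, switching the home indicator from 0 to 1 multiplies the expected goals by $e^{\beta_{\text{home}}}$, which is where the 1.35 comes from.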
Let’s see how this value compares with the lower divisions in England over the past 10 years. We’ll pull the data from football-data.co.uk, which can be loaded in directly using the url link for each csv file. First, we’ll design a function that will take a dataframe of match results as an input and return the home field advantage (plus confidence interval limits) for that league.
# importing the tools required for the Poisson regression model
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn
def get_home_team_advantage(goals_df, pval=0.05):
    # extract relevant columns
    model_goals_df = goals_df[['HomeTeam','AwayTeam','FTHG','FTAG']]
    # rename goal columns
    model_goals_df = model_goals_df.rename(columns={'FTHG': 'HomeGoals', 'FTAG': 'AwayGoals'})
    # reformat dataframe for the model
    goal_model_data = pd.concat([model_goals_df[['HomeTeam','AwayTeam','HomeGoals']].assign(home=1).rename(
            columns={'HomeTeam':'team', 'AwayTeam':'opponent','HomeGoals':'goals'}),
        model_goals_df[['AwayTeam','HomeTeam','AwayGoals']].assign(home=0).rename(
            columns={'AwayTeam':'team', 'HomeTeam':'opponent','AwayGoals':'goals'})])
    # build poisson model
    poisson_model = smf.glm(formula="goals ~ home + team + opponent", data=goal_model_data,
                            family=sm.families.Poisson()).fit()
    # output model parameters
    poisson_model.summary()
    return np.concatenate((np.array([poisson_model.params['home']]),
                           poisson_model.conf_int(alpha=pval).values[-1]))
I’ve essentially combined various parts of the previous post into one convenient function. If it looks a little strange, then I suggest you consult the original post. Okay, we’re ready to start calculating some home advantage scores.
# home field advantage for EPL 2016/17 season
get_home_team_advantage(pd.read_csv("http://www.football-data.co.uk/mmz4281/1617/E0.csv"))
array([ 0.2838454, 0.16246 , 0.4052308])
It’s as easy as that. Feed a url from football-data.co.uk into the function and it’ll quickly tell you the statistical advantage enjoyed by home teams in that league. Note that the latter two values represent the left and right limits of the 95% confidence interval around the estimate. The first value in the array is actually just the log of the ratio of total home goals to total away goals.
temp_goals_df = pd.read_csv("http://www.football-data.co.uk/mmz4281/1617/E0.csv")
[np.exp(get_home_team_advantage(temp_goals_df)[0]),
np.sum(temp_goals_df['FTHG'])/float(np.sum(temp_goals_df['FTAG']))]
[1.3282275711159723, 1.3282275711159737]
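Since the function reports everything on the log scale, it can help to exponentiate the interval limits as well as the point estimate. The snippet below is a small sketch that does this for the EPL 2016/17 array printed earlier. The helper name `to_goal_ratio` is hypothetical, and the three numbers are simply copied from that output.

```python
from math import exp


def to_goal_ratio(log_values):
    """Map log-scale home advantage values to multiplicative goal ratios."""
    return [exp(value) for value in log_values]


# [estimate, left limit, right limit] copied from the EPL 2016/17 output
epl_1617 = [0.2838454, 0.16246, 0.4052308]
estimate, lower, upper = to_goal_ratio(epl_1617)
print(f"home/away goal ratio: {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For this season that works out to a ratio of roughly 1.33, with an interval of about 1.18 to 1.50.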
The goals ratio calculation is obviously much simpler and definitely more intuitive. But it doesn’t allow me to reference my previous post as much (link link link) and it fails to provide any uncertainty around the headline figure. Let’s plot the home advantage figure for the top 5 divisions of the English league pyramid since 2005. You can remove those hugely informative confidence interval bars by switching the toggle.
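Collecting those figures means reading one csv per division per season. One possible way to build the list of files is sketched below, assuming the URL pattern seen in the EPL example (a four-digit season code followed by a division code). Only `E0` appears in this post; the codes `E1`, `E2`, `E3` and `EC` are an assumption about how football-data.co.uk labels the four tiers below the Premier League, so check them against the site before relying on them.

```python
BASE_URL = "http://www.football-data.co.uk/mmz4281/{season}/{division}.csv"

# Assumed division codes for the top five English tiers (only E0 is confirmed by the post)
ENGLISH_DIVISIONS = ("E0", "E1", "E2", "E3", "EC")


def season_code(start_year):
    """Turn the year a season starts into its code, e.g. 2016 -> '1617'."""
    return f"{start_year % 100:02d}{(start_year + 1) % 100:02d}"


def build_urls(first_start_year, last_start_year, divisions=ENGLISH_DIVISIONS):
    """Return {(start_year, division): url} for every season and division."""
    return {
        (year, division): BASE_URL.format(season=season_code(year), division=division)
        for year in range(first_start_year, last_start_year + 1)
        for division in divisions
    }


urls = build_urls(2005, 2016)
print(urls[(2016, "E0")])
```

Each URL could then be passed to pd.read_csv and on to get_home_team_advantage, exactly as in the EPL example.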
It’s probably more apparent without those hugely informative confidence interval bars, but it seems that the home advantage score decreases slightly as you move down the pyramid (analysis by Sky Sports produced something similar). This might make sense for two reasons. Firstly, bigger teams generally have larger stadiums and more supporters, which could strengthen the home field advantage. Secondly, as you go down the leagues, I suspect the quality gap between teams narrows. Taking it to an extreme, when I used to play Sunday league football, it didn’t really matter where we played… we still lost. In that sense, one must be careful comparing the home advantage between leagues, as it will be affected by the relative team strengths within those leagues. For example, a league with a very dominant team (or teams) will record a lower home advantage score, as that dominant team will score goals home and away with little difference (Man Utd would probably beat Cork City 6-0 at Old Trafford and Turners Cross!).
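The dominant-team effect is easy to see with made-up numbers. The sketch below uses the goals-ratio shortcut from earlier rather than the full model, and every figure in it is hypothetical: a league whose ordinary teams score 135 home goals for every 100 away goals, plus one runaway leader who adds the same 60 goals at home and away.

```python
from math import log


def home_advantage_from_totals(home_goals, away_goals):
    """Log of total home goals over total away goals (the shortcut score)."""
    return log(home_goals / away_goals)


# Hypothetical totals for the ordinary teams in the league
ordinary_home, ordinary_away = 135, 100
# Hypothetical dominant team that scores just as freely away as at home
dominant_home = dominant_away = 60

without_dominant = home_advantage_from_totals(ordinary_home, ordinary_away)
with_dominant = home_advantage_from_totals(
    ordinary_home + dominant_home, ordinary_away + dominant_away
)
print(round(without_dominant, 2), round(with_dominant, 2))
```

Under these invented totals the league-wide score drops from about 0.30 to about 0.20, even though the ordinary teams behave exactly the same in both cases.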
Having warned about the dangers of comparing different leagues with this approach, let’s now compare the top five leagues in Europe over the same time period as before.
Honestly, there’s not much going on there. With the possible exception of the Spanish La Liga since 2010, the home field advantage enjoyed by the teams in each league is broadly similar (and that’s before we bring in the idea of confidence intervals and hypothesis testing).
Home Advantage Around the World
To find more interesting contrasts, we must venture to crappier and more corrupt leagues. My hunch is that home advantage would be negligible in countries where the overall quality (team, infrastructure, etc.) is very low. And by low, I mean leagues worse than the Irish Premier Division (yes, they exist). Unfortunately, the historical results for such leagues are not available on football-data.co.uk. Instead, we’ll scrape the data off betexplorer. I’m extremely impressed by the breadth of this site. You can even retrieve past results for the French overseas department of Réunion. Fun fact: Dimitri Payet spent the 2004 season at AS Excelsior of the Réunion Premier League.
We’ll use Scrapy to pull the appropriate information off the website. If you’ve never used Scrapy before, then you should check out this post. I won’t spend too long on this part, but you can find the full code here.
You don’t actually need to run your own spider, as I’ve shared the output to my GitHub account. We can import the json file in directly using pandas.
all_league_goals = pd.read_json(
"https://raw.githubusercontent.com/dashee87/blogScripts/master/files/all_league_goals.json")
# reorder the columns to make it a bit more logical
all_league_goals = all_league_goals[['country', 'league', 'date', 'HomeTeam',
'AwayTeam', 'FTHG', 'FTAG', 'awarded']]
all_league_goals.head()
| | country | league | date | HomeTeam | AwayTeam | FTHG | FTAG | awarded |
---|---|---|---|---|---|---|---|---|
0 | Albania | Super League 2016/2017 | 2017-05-27 | Korabi Peshkopi | Flamurtari | 0 | 3 | False |
1 | Albania | Super League 2016/2017 | 2017-05-27 | Laci | Teuta | 2 | 1 | False |
2 | Albania | Super League 2016/2017 | 2017-05-27 | Luftetari Gjirokastra | Kukesi | 1 | 0 | False |
3 | Albania | Super League 2016/2017 | 2017-05-27 | Skenderbeu | Partizani | 2 | 2 | False |
4 | Albania | Super League 2016/2017 | 2017-05-27 | Vllaznia | KF Tirana | 0 | 0 | False |
Hopefully, that’s all relatively clear. You’ll notice that it’s very similar to the format used by football-data, which means that we can feed this dataframe into the get_home_team_advantage function. Sometimes, matches are awarded due to one team fielding an ineligible player or crowd trouble. We should probably exclude such matches from the home field advantage calculations.
# little bit of data cleansing to remove fixtures that were abandoned/awarded/postponed
all_league_goals = all_league_goals[~all_league_goals['awarded']]
all_league_goals = all_league_goals[all_league_goals['FTAG']!='POSTP.']
all_league_goals = all_league_goals[all_league_goals['FTAG']!='CAN.']
all_league_goals[['FTAG', 'FTHG']] = all_league_goals[['FTAG', 'FTHG']].astype(int)
We’re ready to put it all together. I’ll omit the code (though it can be found here), but we’ll loop through each country and league combination (just in case you decide to include multiple leagues from the same country) and calculate the home advantage score, plus its confidence limits as well as some other information for each league (number of teams, average number of goals in each match). I’ve converted the pandas output to a datatables table that you can interactively filter and sort.
| | country | league | # games | # teams | avg_goals | home_adv score | left_tail | right_tail |
---|---|---|---|---|---|---|---|---|
1 | Nigeria | Premier League 2017 | 379 | 20 | 2.011 | 1.195 | 1.027 | 1.363 |
2 | Haiti | Championnat National 2017 | 237 | 16 | 1.717 | 0.741 | 0.533 | 0.949 |
3 | Algeria | Ligue 1 2016/2017 | 238 | 16 | 2.092 | 0.698 | 0.512 | 0.884 |
4 | Ghana | Premier League 2017 | 238 | 16 | 2.202 | 0.676 | 0.494 | 0.857 |
5 | Bolivia | Liga de Futbol Prof 2016/2017 | 132 | 12 | 3.432 | 0.624 | 0.431 | 0.818 |
6 | Guatemala | Liga Nacional 2016/2017 | 264 | 12 | 2.155 | 0.620 | 0.448 | 0.792 |
7 | Benin | Championnat National 2017 | 162 | 19 | 1.778 | 0.571 | 0.330 | 0.811 |
8 | USA | MLS 2017 | 374 | 22 | 2.968 | 0.538 | 0.416 | 0.660 |
9 | Peru | Primera Division 2017 | 238 | 16 | 2.681 | 0.520 | 0.359 | 0.680 |
10 | Indonesia | Liga 1 2017 | 304 | 18 | 2.888 | 0.515 | 0.378 | 0.651 |
11 | Togo | Championnat National 2016/2017 | 181 | 14 | 1.934 | 0.510 | 0.293 | 0.726 |
12 | Uzbekistan | Professional Football League 2017 | 233 | 16 | 2.571 | 0.503 | 0.338 | 0.668 |
13 | Mozambique | Mocambola 2017 | 240 | 16 | 1.867 | 0.501 | 0.310 | 0.692 |
14 | Angola | Girabola 2017 | 239 | 16 | 2.151 | 0.499 | 0.321 | 0.678 |
15 | Greece | Super League 2016/2017 | 240 | 16 | 2.317 | 0.499 | 0.328 | 0.671 |
16 | Tunisia | Ligue Professionnelle 1 2016/2017 | 112 | 16 | 2.098 | 0.495 | 0.231 | 0.759 |
17 | Albania | Super League 2016/2017 | 180 | 10 | 1.889 | 0.488 | 0.269 | 0.707 |
18 | Sudan | Premier League 2017 | 306 | 18 | 2.261 | 0.486 | 0.332 | 0.639 |
19 | Tanzania | Ligi Kuu Bara 2016/2017 | 239 | 16 | 1.971 | 0.480 | 0.294 | 0.665 |
20 | Colombia | Liga Aguila 2017 | 400 | 20 | 2.145 | 0.465 | 0.328 | 0.603 |
21 | Ecuador | Serie A 2017 | 263 | 12 | 2.605 | 0.454 | 0.300 | 0.608 |
22 | Honduras | Liga Nacional 2016/2017 | 180 | 10 | 2.828 | 0.452 | 0.273 | 0.630 |
23 | Ethiopia | Premier League 2016/2017 | 239 | 16 | 1.837 | 0.433 | 0.241 | 0.625 |
24 | Morocco | Botola Pro 2016/2017 | 240 | 16 | 2.229 | 0.405 | 0.232 | 0.578 |
25 | India | I-League 2017 | 90 | 10 | 2.500 | 0.405 | 0.139 | 0.672 |
26 | Montenegro | Prva Crnogorska Liga 2016/2017 | 197 | 12 | 2.020 | 0.398 | 0.196 | 0.599 |
27 | Croatia | 1. HNL 2016/2017 | 180 | 10 | 2.417 | 0.396 | 0.204 | 0.588 |
28 | Zimbabwe | Premier Soccer League 2017 | 305 | 18 | 2.023 | 0.389 | 0.228 | 0.549 |
29 | Kosovo | Superliga 2016/2017 | 196 | 12 | 2.383 | 0.386 | 0.200 | 0.571 |
30 | Sierra Leone | Premier League 2014 | 90 | 14 | 1.833 | 0.376 | 0.060 | 0.691 |
31 | France | Ligue 1 2016/2017 | 379 | 20 | 2.615 | 0.375 | 0.248 | 0.502 |
32 | Malawi | Super League 2017 | 239 | 16 | 2.331 | 0.372 | 0.203 | 0.541 |
33 | Costa Rica | Primera Division 2016/2017 | 264 | 12 | 2.689 | 0.370 | 0.221 | 0.520 |
34 | Norway | Eliteserien 2017 | 240 | 16 | 2.842 | 0.368 | 0.215 | 0.520 |
35 | Bulgaria | Parva Liga 2016/2017 | 182 | 14 | 2.467 | 0.365 | 0.177 | 0.553 |
36 | Russia | Premier League 2016/2017 | 240 | 16 | 2.133 | 0.363 | 0.187 | 0.539 |
37 | Kazakhstan | Premier League 2017 | 198 | 12 | 2.465 | 0.361 | 0.180 | 0.542 |
38 | Belgium | Jupiler League 2016/2017 | 239 | 16 | 2.736 | 0.359 | 0.203 | 0.515 |
39 | FYR of Macedonia | First League 2016/2017 | 180 | 10 | 2.539 | 0.349 | 0.163 | 0.535 |
40 | Senegal | Ligue 1 2016/2017 | 181 | 14 | 2.204 | 0.348 | 0.149 | 0.547 |
41 | Azerbaijan | Premier League 2016/2017 | 111 | 8 | 2.234 | 0.346 | 0.094 | 0.599 |
42 | Moldova | Divizia Nationala 2016/2017 | 165 | 11 | 2.539 | 0.341 | 0.147 | 0.536 |
43 | Slovakia | Fortuna liga 2016/2017 | 184 | 12 | 2.690 | 0.339 | 0.159 | 0.518 |
44 | Cameroon | Elite One 2017 | 303 | 18 | 1.795 | 0.337 | 0.166 | 0.508 |
45 | Jamaica | Premier League 2016/2017 | 198 | 12 | 2.192 | 0.336 | 0.145 | 0.527 |
46 | Réunion | Regionale 1 2017 | 182 | 14 | 2.610 | 0.336 | 0.153 | 0.518 |
47 | Venezuela | Primera Division 2017 | 303 | 18 | 2.482 | 0.331 | 0.186 | 0.476 |
48 | Portugal | Primeira Liga 2016/2017 | 306 | 18 | 2.379 | 0.327 | 0.180 | 0.474 |
49 | South Africa | Premier League 2016/2017 | 240 | 16 | 2.242 | 0.322 | 0.151 | 0.494 |
50 | Germany | Bundesliga 2016/2017 | 306 | 18 | 2.866 | 0.315 | 0.181 | 0.449 |
51 | Uganda | Premier League 2016/2017 | 237 | 16 | 2.135 | 0.313 | 0.137 | 0.490 |
52 | Guinea | Ligue 1 2016/2017 | 181 | 14 | 2.044 | 0.312 | 0.105 | 0.519 |
53 | Thailand | Thai Premier League 2017 | 306 | 18 | 3.389 | 0.309 | 0.186 | 0.432 |
54 | Yemen | Division 1 2013/2014 | 180 | 14 | 2.322 | 0.308 | 0.114 | 0.503 |
55 | Zambia | Super League 2017 | 379 | 20 | 2.003 | 0.308 | 0.164 | 0.452 |
56 | Kyrgyzstan | Top Liga 2017 | 60 | 6 | 2.950 | 0.307 | 0.009 | 0.606 |
57 | Hungary | OTP Bank Liga 2016/2017 | 198 | 12 | 2.631 | 0.307 | 0.133 | 0.481 |
58 | Namibia | MTC Premiership 2015/2016 | 240 | 16 | 2.412 | 0.304 | 0.139 | 0.469 |
59 | China | Super League 2017 | 240 | 16 | 3.050 | 0.303 | 0.156 | 0.449 |
60 | Niger | Ligue 1 2016/2017 | 181 | 14 | 2.171 | 0.301 | 0.101 | 0.502 |
61 | Iraq | Super League 2016/2017 | 354 | 20 | 2.110 | 0.299 | 0.153 | 0.446 |
62 | Netherlands | Eredivisie 2016/2017 | 306 | 18 | 2.889 | 0.296 | 0.163 | 0.430 |
63 | Serbia | Super Liga 2016/2017 | 239 | 16 | 2.364 | 0.290 | 0.124 | 0.457 |
64 | Palestine | West Bank League 2016/2017 | 131 | 12 | 2.450 | 0.287 | 0.066 | 0.508 |
65 | England | Premier League 2016/2017 | 380 | 20 | 2.800 | 0.284 | 0.162 | 0.405 |
66 | Gabon | Championnat D1 2016/2017 | 163 | 14 | 2.307 | 0.282 | 0.077 | 0.486 |
67 | Brazil | Serie A 2017 | 380 | 20 | 2.429 | 0.281 | 0.151 | 0.412 |
68 | Turkmenistan | Yokary Liga 2017 | 143 | 9 | 2.916 | 0.278 | 0.084 | 0.472 |
69 | Spain | LaLiga 2016/2017 | 380 | 20 | 2.942 | 0.263 | 0.144 | 0.381 |
70 | Poland | Ekstraklasa 2016/2017 | 240 | 16 | 2.767 | 0.260 | 0.107 | 0.414 |
71 | Czech Republic | 1. Liga 2016/2017 | 240 | 16 | 2.488 | 0.259 | 0.098 | 0.421 |
72 | Italy | Serie A 2016/2017 | 379 | 20 | 2.955 | 0.257 | 0.139 | 0.375 |
73 | Wales | Premier League 2016/2017 | 132 | 12 | 2.970 | 0.246 | 0.047 | 0.446 |
74 | New Zealand | Football Championship 2016/2017 | 90 | 10 | 3.567 | 0.244 | 0.024 | 0.465 |
75 | Republic of the Congo | Ligue 1 2017 | 300 | 18 | 2.237 | 0.244 | 0.091 | 0.397 |
76 | Kenya | Premier League 2017 | 304 | 18 | 2.026 | 0.244 | 0.084 | 0.403 |
77 | Ukraine | Pari-Match League 2016/2017 | 132 | 12 | 2.462 | 0.241 | 0.022 | 0.460 |
78 | Austria | Tipico Bundesliga 2016/2017 | 180 | 10 | 2.711 | 0.239 | 0.060 | 0.418 |
79 | Switzerland | Super League 2016/2017 | 180 | 10 | 3.233 | 0.235 | 0.071 | 0.398 |
80 | Mexico | Primera Division 2016/2017 | 306 | 18 | 2.634 | 0.234 | 0.095 | 0.373 |
81 | Turkey | Super Lig 2016/2017 | 305 | 18 | 2.708 | 0.231 | 0.094 | 0.369 |
82 | Bosnia and Herzegovina | Premier League 2016/2017 | 132 | 12 | 2.242 | 0.231 | 0.001 | 0.460 |
83 | Romania | Liga 1 2016/2017 | 181 | 14 | 2.376 | 0.223 | 0.033 | 0.413 |
84 | Philippines | PFL 2017 | 109 | 8 | 3.202 | 0.218 | 0.007 | 0.430 |
85 | Malaysia | Super League 2017 | 132 | 12 | 3.091 | 0.217 | 0.021 | 0.412 |
86 | Australia | A-League 2016/2017 | 135 | 10 | 3.030 | 0.213 | 0.018 | 0.409 |
87 | DR Congo | Super Ligue 2016/2017 | 195 | 26 | 2.205 | 0.198 | 0.006 | 0.391 |
88 | Syria | Premier League 2016/2017 | 239 | 16 | 2.180 | 0.197 | 0.025 | 0.370 |
89 | Argentina | Primera Division 2016/2017 | 450 | 30 | 2.276 | 0.195 | 0.071 | 0.318 |
90 | Burundi | Ligue A 2016/2017 | 239 | 16 | 2.138 | 0.192 | 0.018 | 0.366 |
91 | Cyprus | First Division 2016/2017 | 182 | 14 | 2.879 | 0.191 | 0.019 | 0.363 |
92 | Sweden | Allsvenskan 2017 | 240 | 16 | 2.779 | 0.189 | 0.037 | 0.342 |
93 | Tajikistan | Vysshaya Liga 2017 | 84 | 8 | 2.702 | 0.189 | -0.077 | 0.456 |
94 | Northern Ireland | NIFL Premiership 2016/2017 | 195 | 12 | 2.933 | 0.188 | 0.023 | 0.353 |
95 | Scotland | Premiership 2016/2017 | 198 | 12 | 2.687 | 0.187 | 0.016 | 0.358 |
96 | Saudi Arabia | Saudi Professional League 2016/2017 | 182 | 14 | 3.016 | 0.186 | 0.018 | 0.354 |
97 | Iceland | Pepsideild 2017 | 132 | 12 | 3.053 | 0.184 | -0.012 | 0.380 |
98 | Nicaragua | Primera Division 2016/2017 | 179 | 10 | 3.156 | 0.182 | 0.016 | 0.347 |
99 | Denmark | Superliga 2016/2017 | 182 | 14 | 2.632 | 0.180 | 0.000 | 0.360 |
100 | Lesotho | Premier League 2016/2017 | 180 | 14 | 2.428 | 0.180 | -0.009 | 0.368 |
101 | Vietnam | V-League 2017 | 182 | 14 | 2.912 | 0.174 | 0.003 | 0.345 |
102 | Rwanda | National Football league 2016/2017 | 239 | 16 | 2.134 | 0.174 | -0.001 | 0.348 |
103 | Ireland | Premier Division 2017 | 198 | 12 | 2.773 | 0.172 | 0.004 | 0.341 |
104 | Estonia | Meistriliiga 2017 | 180 | 10 | 3.656 | 0.171 | 0.017 | 0.324 |
105 | United Arab Emirates | UAE League 2016/2017 | 182 | 14 | 3.137 | 0.165 | 0.000 | 0.330 |
106 | El Salvador | Primera Division 2016/2017 | 263 | 12 | 2.601 | 0.161 | 0.010 | 0.311 |
107 | Luxembourg | National Division 2016/2017 | 182 | 14 | 3.319 | 0.153 | -0.007 | 0.313 |
108 | Bangladesh | Premier League 2016 | 132 | 12 | 2.591 | 0.152 | -0.060 | 0.365 |
109 | Mauritania | Championnat D1 2016/2017 | 181 | 14 | 2.453 | 0.149 | -0.038 | 0.336 |
110 | Swaziland | MTN Premier League 2016/2017 | 131 | 12 | 2.634 | 0.146 | -0.065 | 0.358 |
111 | Trinidad and Tobago | Pro League 2017 | 90 | 10 | 2.922 | 0.145 | -0.098 | 0.387 |
112 | Malta | Premier League 2016/2017 | 197 | 12 | 2.878 | 0.145 | -0.021 | 0.310 |
113 | Chile | Primera Division 2016/2017 | 120 | 16 | 2.892 | 0.145 | -0.068 | 0.357 |
114 | Israel | Ligat ha'Al 2016/2017 | 182 | 14 | 2.132 | 0.145 | -0.055 | 0.344 |
115 | Botswana | Premier League 2016/2017 | 240 | 16 | 2.317 | 0.144 | -0.023 | 0.311 |
116 | Oman | Professional League 2016/2017 | 182 | 14 | 2.758 | 0.136 | -0.040 | 0.311 |
117 | Iran | Persian Gulf Pro League 2016/2017 | 240 | 16 | 2.100 | 0.135 | -0.040 | 0.310 |
118 | Bermuda | Premier League 2016/2017 | 90 | 10 | 3.144 | 0.134 | -0.099 | 0.368 |
119 | Lithuania | A Lyga 2017 | 112 | 8 | 2.580 | 0.132 | -0.099 | 0.363 |
120 | Egypt | Premier League 2016/2017 | 305 | 18 | 2.256 | 0.122 | -0.027 | 0.272 |
121 | Faroe Islands | Premier League 2017 | 134 | 10 | 3.187 | 0.122 | -0.068 | 0.312 |
122 | Burkina Faso | Premier League 2016/2017 | 240 | 16 | 1.721 | 0.121 | -0.072 | 0.314 |
123 | Finland | Veikkausliiga 2017 | 198 | 12 | 2.737 | 0.120 | -0.049 | 0.289 |
124 | Seychelles | Division One 2017 | 132 | 12 | 3.303 | 0.114 | -0.075 | 0.302 |
125 | Japan | J-League 2017 | 306 | 18 | 2.592 | 0.114 | -0.026 | 0.253 |
126 | Myanmar | National League 2017 | 128 | 12 | 2.594 | 0.113 | -0.104 | 0.329 |
127 | Ivory Coast | Ligue 1 2016/2017 | 182 | 14 | 1.802 | 0.110 | -0.107 | 0.327 |
128 | Georgia | Erovnuli Liga 2017 | 179 | 10 | 2.810 | 0.108 | -0.067 | 0.283 |
129 | Qatar | Premier League 2016/2017 | 182 | 14 | 3.132 | 0.105 | -0.059 | 0.270 |
130 | South Korea | K-League Classic 2017 | 198 | 12 | 2.737 | 0.102 | -0.067 | 0.271 |
131 | Slovenia | Prva liga 2016/2017 | 180 | 10 | 2.572 | 0.099 | -0.083 | 0.282 |
132 | Lebanon | Premier League 2016/2017 | 131 | 12 | 2.771 | 0.094 | -0.113 | 0.300 |
133 | San Marino | Campionato Sammarinese 2016/2017 | 154 | 15 | 3.143 | 0.085 | -0.095 | 0.264 |
134 | Belarus | Vysshaya Liga 2017 | 240 | 16 | 2.333 | 0.071 | -0.094 | 0.237 |
135 | Mali | Premiere Division 2016 | 162 | 19 | 2.031 | 0.064 | -0.153 | 0.281 |
136 | Gibraltar | Premier Division 2016/2017 | 132 | 10 | 3.288 | 0.064 | -0.126 | 0.253 |
137 | Hong Kong | Premier League 2016/2017 | 110 | 11 | 3.427 | 0.058 | -0.144 | 0.260 |
138 | Singapore | S.League 2017 | 108 | 9 | 2.981 | 0.056 | -0.163 | 0.275 |
139 | Sri Lanka | Champions League 2017 | 143 | 18 | 3.266 | 0.051 | -0.133 | 0.235 |
140 | Cape Verde | Campeonato Nacional 2017 | 36 | 12 | 2.389 | 0.047 | -0.376 | 0.469 |
141 | Djibouti | Division 1 2016/2017 | 90 | 10 | 3.978 | 0.045 | -0.163 | 0.252 |
142 | Uruguay | Primera Division 2017 | 240 | 16 | 2.729 | 0.034 | -0.120 | 0.187 |
143 | Gambia | GFA League 2016/2017 | 131 | 12 | 1.908 | 0.032 | -0.218 | 0.281 |
144 | Canada | CSL 2017 | 56 | 8 | 4.304 | 0.025 | -0.228 | 0.277 |
145 | Armenia | Premier League 2016/2017 | 90 | 6 | 2.200 | 0.020 | -0.258 | 0.299 |
146 | Panama | LPF 2016/2017 | 180 | 10 | 2.083 | 0.005 | -0.197 | 0.208 |
147 | Kuwait | Premier League 2016/2017 | 210 | 15 | 3.048 | -0.000 | -0.155 | 0.155 |
148 | Mauritius | Mauritian League 2016/2017 | 179 | 10 | 2.922 | -0.001 | -0.173 | 0.170 |
149 | Andorra | Primera Divisió 2016/2017 | 83 | 8 | 3.265 | -0.005 | -0.245 | 0.236 |
150 | Latvia | SynotTip Virslīga 2017 | 96 | 8 | 2.417 | -0.006 | -0.264 | 0.251 |
151 | Libya | Premier League 2017 | 83 | 28 | 2.265 | -0.015 | -0.324 | 0.293 |
152 | Dominican Republic | LDF 2017 | 90 | 10 | 2.567 | -0.021 | -0.280 | 0.239 |
153 | Cambodia | C-League 2017 | 132 | 12 | 3.864 | -0.031 | -0.205 | 0.142 |
154 | Paraguay | Primera Division 2017 | 264 | 12 | 2.534 | -0.033 | -0.184 | 0.119 |
155 | Jordan | Premier League 2016/2017 | 132 | 12 | 2.235 | -0.034 | -0.262 | 0.194 |
156 | Bahrain | Premier League 2016/2017 | 90 | 10 | 2.556 | -0.048 | -0.310 | 0.213 |
157 | Pakistan | Premier League 2014/2015 | 132 | 12 | 2.333 | -0.065 | -0.288 | 0.159 |
158 | Liberia | LFA First Division 2016/2017 | 125 | 12 | 2.248 | -0.089 | -0.323 | 0.145 |
159 | Somalia | Nation Link Telecom Championship 2016/2017 | 90 | 10 | 2.922 | -0.153 | -0.396 | 0.090 |
160 | Maldives | Dhivehi Premier League 2017 | 56 | 8 | 3.304 | -0.370 | -0.782 | 0.042 |
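For readers who want to rebuild these columns without the full notebook, the loop might look something like the sketch below. It is not the code used for the table: to stay dependency-free it takes plain dictionaries with the same keys as the dataframe, and it swaps the Poisson fit for the log goals ratio with a rough Wald interval (assuming the home and away totals behave like independent Poisson counts). The function name and output keys are illustrative.

```python
from collections import defaultdict
from math import log, sqrt


def summarise_leagues(matches, z=1.96):
    """Group matches by (country, league) and compute the summary columns.

    matches: iterable of dicts with keys country, league, HomeTeam,
    AwayTeam, FTHG and FTAG (goals already converted to int).
    """
    grouped = defaultdict(list)
    for match in matches:
        grouped[(match["country"], match["league"])].append(match)

    summary = []
    for (country, league), games in grouped.items():
        home_goals = sum(game["FTHG"] for game in games)
        away_goals = sum(game["FTAG"] for game in games)
        teams = {game["HomeTeam"] for game in games} | {game["AwayTeam"] for game in games}
        row = {
            "country": country,
            "league": league,
            "n_games": len(games),
            "n_teams": len(teams),
            "avg_goals": (home_goals + away_goals) / len(games),
        }
        if home_goals > 0 and away_goals > 0:
            # stand-in for the model coefficient: log of the goals ratio
            score = log(home_goals / away_goals)
            half_width = z * sqrt(1 / home_goals + 1 / away_goals)
            row["home_adv_score"] = score
            row["left_tail"] = score - half_width
            row["right_tail"] = score + half_width
        summary.append(row)

    # highest home advantage first; leagues without a score go last
    return sorted(summary, key=lambda r: r.get("home_adv_score", float("-inf")), reverse=True)


# toy data, invented purely to exercise the function
matches = [
    {"country": "A", "league": "L1", "HomeTeam": "X", "AwayTeam": "Y", "FTHG": 3, "FTAG": 1},
    {"country": "A", "league": "L1", "HomeTeam": "Y", "AwayTeam": "X", "FTHG": 1, "FTAG": 1},
    {"country": "B", "league": "L2", "HomeTeam": "P", "AwayTeam": "Q", "FTHG": 0, "FTAG": 2},
    {"country": "B", "league": "L2", "HomeTeam": "Q", "AwayTeam": "P", "FTHG": 1, "FTAG": 0},
]
table = summarise_leagues(matches)
for row in table:
    print(row["country"], row["n_games"], row["n_teams"], row["avg_goals"])
```

To reproduce the table properly, the log-ratio step would be replaced by a call to get_home_team_advantage on each league's subset of the dataframe.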
Focusing on the home_adv score column, teams in Nigeria by far enjoy the greatest benefit from playing at home (score = 1.195). In other words, home teams scored 3.3 (= $e^{1.195}$) times as many goals as their opponents. This isn’t new information and can be attributed to a combination of corruption (e.g. bribing referees) and violent fans. In fact, my motivation for this post was to identify more football corruption hotspots. Alas, when it comes to home turf invincibility, it seems Nigeria are the World Cup winners.
Fourteen leagues have a negative home_advantage_score, meaning that visiting teams actually scored more goals than their hosts, though none was statistically significant. By some distance, the Maldives records the most negative score. Luckily, I’ve twice researched this beautiful archipelago and I’m aware that all matches in the Dhivehi Premier League are played at the national stadium in Malé (much like the Gibraltar Premier League). So it would make sense that there’s no particular advantage gained by the home team. Libya is another interesting example. Owing to security issues, all matches in the Libyan Premier League are played in neutral venues with no spectators present. Quite fittingly, it returned a home advantage score just off zero. Generally speaking, the leagues with near zero home advantage come from small countries (minimal inconvenience for travelling teams) with a small number of teams and they tend to share stadiums.
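The "statistically significant" call here boils down to whether the 95% interval excludes zero. A tiny illustration, using three rows copied from the table (the helper name is made up):

```python
def classify_home_advantage(left_tail, right_tail):
    """Label a league by whether its confidence interval excludes zero."""
    if left_tail > 0:
        return "home advantage"
    if right_tail < 0:
        return "away advantage"
    return "not significant"


# (left_tail, right_tail) values taken from the table
leagues = {
    "Nigeria": (1.027, 1.363),
    "Libya": (-0.324, 0.293),
    "Maldives": (-0.782, 0.042),
}
labels = {name: classify_home_advantage(*interval) for name, interval in leagues.items()}
print(labels)
```

Even the Maldives, with the most negative score in the table, has an interval that just about reaches zero, so it lands in the "not significant" bucket along with Libya.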
If you sort the avg_goals column, you’ll see the semi-pro Canadian Soccer League is the place to be for goals (average = 4.304). But rather than sifting through that table or explaining the results with words, the most intuitive way to illustrate this type of data is with a map of the world. This might also help to clarify whether there’s any geographical influence on the home advantage effect. Again, I won’t go into the details (an appendix can be found in the Jupyter notebook), but I built a map using the JavaScript library, D3. And by built I mean I adapted the code from this post and this post. Though a little outdated now, I found this post quite useful too. Finally, I think this post shows off quite well what you can do with maps using D3.
And here it is! The country colour represents its home_advantage_score. You can zoom in and out and hover over a country to reveal a nice informative overlay; use the radio buttons to switch between home advantage and goals scored. I recommend viewing it on desktop (mobile’s a bit jumpy) and on Chrome (I sometimes have security issues with Firefox).
It’s not scientifically rigorous (not in academia any more, baby!), but there’s evidence for some geographical trends. For example, it appears that home advantage is stronger in Africa and South America compared to Western and Central Europe, with the unstable warzones of Libya, Somalia and Paraguay (?) being notable exceptions. As for average goals, Europe boasts stronger colours compared to Africa, though South East Asia seems to be the global hotspot for goals. North America is also quite dark, but you can debate whether Canada should be coloured grey, as the best Canadian teams belong to the American soccer system.
Conclusion
Using a previously described model and some JavaScript, this post explored the so-called home advantage in football leagues all over the world (including Réunion). I don’t think it uncovered anything particularly amazing: different leagues have different properties and don’t bet on the away team in the Nigerian league. You can play around with the Python code here. Thanks for reading!