Reflecting on 2017, I decided to return to my most popular blog topic (at least by the number of emails I get). Last time, I built a crude statistical model to predict the result of football matches. I even presented a webinar on the subject here (it’s free to sign up). During the presentation, I described a coefficient in the model that accounts for the fact that the home team tends to score more goals than the away team. This is called the home advantage (or home field advantage) and can probably be explained by a combination of psychological (e.g. familiarity with surroundings) and physical factors (e.g. travel). It occurs in various sports, including American football, baseball, basketball and soccer. Sticking to soccer/football, I mentioned in my talk how it would be interesting to see how this effect varies around the world. In which countries do the home teams enjoy the greatest advantage?

We’re going to use the same statistical model as last time, so there won’t be any new statistical features developed in this post. Instead, it will focus on retrieving the appropriate goals data for even the most obscure leagues in the world (yes, even the Irish Premier Division) and then interactively visualising the results with D3. The full code can be found in the accompanying Jupyter notebook.

Calculating Home Field Advantage

The first consideration should probably be how to calculate home advantage. The traditional approach is to look at team matchups and check whether teams achieved better, equal or worse results at home than away. For example, let’s imagine Chelsea beat Arsenal 2-0 at home and drew 1-1 away. That would be recorded as a better home result (+2 goals versus 0). This process is repeated for every opponent, so you can construct a trinomial distribution and test whether there was a statistically significant home field effect. This works for balanced leagues, where teams play each other an equal number of times home and away. While this holds for Europe’s most famous leagues (e.g. EPL, La Liga), there are various leagues where teams play each other three times (e.g. Ireland, Montenegro, Tajikistan aka The Big Leagues) or even just once (e.g. Argentina, Libya and, to a lesser extent, MLS, which is balanced for teams within the same conference). There are also issues with postponements and abandonments rendering some leagues slightly unbalanced (e.g. Sri Lanka). For those reasons, we’ll opt for a different (though not necessarily better) approach.
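To make the traditional approach a bit more concrete, here is a minimal sketch on made-up results. For each pairing with both a home and a return fixture, it checks whether a team's goal margin was better, equal or worse at home than away. As a simplification of the trinomial test (my assumption, not something from the original analysis), it then drops the "equal" pairings and runs a two-sided sign test on the rest. The fixture list and the function names are purely illustrative.

```python
from math import comb

# hypothetical results: (home_team, away_team, home_goals, away_goals)
results = [
    ("Chelsea", "Arsenal", 2, 0), ("Arsenal", "Chelsea", 1, 1),
    ("Chelsea", "Spurs", 1, 0), ("Spurs", "Chelsea", 2, 1),
    ("Arsenal", "Spurs", 0, 1), ("Spurs", "Arsenal", 0, 1),
    ("Arsenal", "Everton", 2, 1), ("Everton", "Arsenal", 0, 1),
]

def matchup_outcomes(results):
    """Count pairings where the home margin beat, matched or trailed the away margin."""
    goal_diff = {(h, a): hg - ag for h, a, hg, ag in results}
    outcomes = {"better": 0, "equal": 0, "worse": 0}
    for (home, away), margin in goal_diff.items():
        # visit each pairing once and skip pairings without a return fixture
        if home > away or (away, home) not in goal_diff:
            continue
        # home margin minus away margin against the same opponent
        diff = margin + goal_diff[(away, home)]
        if diff > 0:
            outcomes["better"] += 1
        elif diff < 0:
            outcomes["worse"] += 1
        else:
            outcomes["equal"] += 1
    return outcomes

def sign_test_p_value(better, worse):
    """Two-sided sign test: are 'better' and 'worse' equally likely?"""
    n = better + worse
    extreme = max(better, worse)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

outcomes = matchup_outcomes(results)
print(outcomes, sign_test_p_value(outcomes["better"], outcomes["worse"]))
```

Note how the sketch silently skips any pairing without a return fixture, which is exactly why this approach struggles with unbalanced leagues.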

In the previous post, we built a model for the EPL 2016/17 season, using the number of goals scored in the past to predict future results. Looking at the model coefficients again, you see the home coefficient has a value of approximately 0.3. Taking the exponential of this value ($\exp(0.3) \approx 1.35$) tells us that the home team is generally expected to score about 1.35 times as many goals as the away team. In case you don’t recall, the model accounts for team strength/weakness by including coefficients for each team (e.g. 0.07890 and -0.96194 for Chelsea and Sunderland, respectively).
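As a quick worked example of reading these coefficients, the snippet below exponentiates the values quoted above. Because the model is fitted on a log scale, exponentiating a coefficient (or a difference of coefficients) gives a multiplier on expected goals. I'm assuming here that the two team values are the attacking ("team") coefficients from the previous post's model.

```python
import math

home_coef = 0.3              # approximate home coefficient quoted above
chelsea_coef = 0.07890       # team coefficients quoted above
sunderland_coef = -0.96194   # (assumed to be the attacking terms)

# multiplier on expected goals from playing at home
home_multiplier = math.exp(home_coef)
print(round(home_multiplier, 2))  # 1.35

# with venue and opponent held fixed, how many times more goals
# Chelsea are expected to score than Sunderland
attack_ratio = math.exp(chelsea_coef - sunderland_coef)
print(round(attack_ratio, 2))
```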

Let’s see how this value compares with the lower divisions in England over the past 10 years. We’ll pull the data from football-data.co.uk, which can be loaded in directly using the url of each csv file. First, we’ll design a function that will take a dataframe of match results as an input and return the home field advantage (plus confidence interval limits) for that league.

# importing the tools required for the Poisson regression model
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn

def get_home_team_advantage(goals_df, pval=0.05):
    
    # extract relevant columns
    model_goals_df = goals_df[['HomeTeam','AwayTeam','FTHG','FTAG']]
    # rename goal columns
    model_goals_df = model_goals_df.rename(columns={'FTHG': 'HomeGoals', 'FTAG': 'AwayGoals'})

    # reformat dataframe for the model
    goal_model_data = pd.concat([model_goals_df[['HomeTeam','AwayTeam','HomeGoals']].assign(home=1).rename(
                columns={'HomeTeam':'team', 'AwayTeam':'opponent','HomeGoals':'goals'}),
               model_goals_df[['AwayTeam','HomeTeam','AwayGoals']].assign(home=0).rename(
                columns={'AwayTeam':'team', 'HomeTeam':'opponent','AwayGoals':'goals'})])

    # build poisson model
    poisson_model = smf.glm(formula="goals ~ home + team + opponent", data=goal_model_data, 
                            family=sm.families.Poisson()).fit()
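    # call poisson_model.summary() here if you want to inspect the full model output

    # return the home coefficient followed by its confidence interval limits,
    # selected by name rather than by position in the parameter list
    home_coef = poisson_model.params['home']
    home_ci = poisson_model.conf_int(alpha=pval).loc['home'].values
    return np.concatenate(([home_coef], home_ci))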

I’ve essentially combined various parts of the previous post into one convenient function. If it looks a little strange, then I suggest you consult the original post. Okay, we’re ready to start calculating some home advantage scores.

# home field advantage for EPL 2016/17 season
get_home_team_advantage(pd.read_csv("http://www.football-data.co.uk/mmz4281/1617/E0.csv"))
array([ 0.2838454,  0.16246  ,  0.4052308])

It’s as easy as that. Feed a url from football-data.co.uk into the function and it’ll quickly tell you the statistical advantage enjoyed by home teams in that league. Note that the latter two values represent the left and right limits of the 95% confidence interval around the estimate. The first value in the array is actually just the log of the total number of goals scored by home teams divided by the total number of away goals.

temp_goals_df = pd.read_csv("http://www.football-data.co.uk/mmz4281/1617/E0.csv")
[np.exp(get_home_team_advantage(temp_goals_df)[0]),
 np.sum(temp_goals_df['FTHG'])/float(np.sum(temp_goals_df['FTAG']))]
[1.3282275711159723, 1.3282275711159737]

The goals ratio calculation is obviously much simpler and definitely more intuitive. But it doesn’t allow me to reference my previous post as much (link link link) and it fails to provide any uncertainty around the headline figure. Let’s plot the home advantage figure for the top 5 divisions of the English league pyramid since 2005. You can remove those hugely informative confidence interval bars by switching the toggle.
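For anyone wanting to reproduce that plot's inputs, here is one way the csv links could be generated. It's only a sketch: the url pattern is taken from the EPL example above, but the division codes (E0 to E3 plus EC for the fifth tier) are my assumption about the site's naming, and not every division is guaranteed to have a file for every season.

```python
BASE_URL = "http://www.football-data.co.uk/mmz4281/{season}/{division}.csv"

# assumed division codes for the top five tiers of the English pyramid
ENGLISH_DIVISIONS = ["E0", "E1", "E2", "E3", "EC"]

def season_code(start_year):
    """Turn a season's starting year into a four-digit code, e.g. 2016 -> '1617'."""
    return "{:02d}{:02d}".format(start_year % 100, (start_year + 1) % 100)

def league_urls(first_year, last_year, divisions=ENGLISH_DIVISIONS):
    """Map (starting year, division) to the csv url for that season."""
    return {
        (year, division): BASE_URL.format(season=season_code(year), division=division)
        for year in range(first_year, last_year + 1)
        for division in divisions
    }

urls = league_urls(2005, 2016)
print(urls[(2016, "E0")])

# each csv could then be fed to the function defined earlier, for example:
# scores = {key: get_home_team_advantage(pd.read_csv(url)) for key, url in urls.items()}
```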

[Figure: home advantage score by season for the top five English divisions since 2005, with optional error bars]

It’s probably more apparent without those hugely informative confidence interval bars, but it seems that the home advantage score decreases slightly as you move down the pyramid (analysis by Sky Sports produced something similar). This might make sense for two reasons. Firstly, bigger teams generally have larger stadiums and more supporters, which could strengthen the home field advantage. Secondly, as you go down the leagues, I suspect the quality gap between teams narrows. Taking it to an extreme, when I used to play Sunday league football, it didn’t really matter where we played… we still lost. In that sense, one must be careful comparing the home advantage between leagues, as it will be affected by the relative team strengths within those leagues. For example, a league with a very dominant team (or teams) will record a lower home advantage score, as that dominant team will score goals home and away with little difference (Man Utd would probably beat Cork City 6-0 at Old Trafford and Turners Cross!).

Having warned about the dangers of comparing different leagues with this approach, let’s now compare the top five leagues in Europe over the same time period as before.

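[Figure: home advantage score by season for the top five European leagues over the same period, with optional error bars]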

Honestly, there’s not much going on there. With the possible exception of the Spanish La Liga since 2010, the home field advantage enjoyed by the teams in each league is broadly similar (and that’s before we bring in the idea of confidence intervals and hypothesis testing).

Home Advantage Around the World

To find more interesting contrasts, we must venture to crappier and more corrupt leagues. My hunch is that home advantage would be negligible in countries where the overall quality (team, infrastructure, etc.) is very low. And by low, I mean leagues worse than the Irish Premier Division (yes, they exist). Unfortunately, the historical results for such leagues are not available on football-data.co.uk. Instead, we’ll scrape the data off betexplorer. I’m extremely impressed by the breadth of this site. You can even retrieve past results for the French overseas department of Réunion. Fun fact: Dimitri Payet spent the 2004 season at AS Excelsior of the Réunion Premier League.

We’ll use Scrapy to pull the appropriate information off the website. If you’ve never used Scrapy before, then you should check out this post. I won’t spend too long on this part, but you can find the full code here.

You don’t actually need to run your own spider, as I’ve shared the output to my GitHub account. We can import the json file in directly using pandas.

all_league_goals = pd.read_json(
    "https://raw.githubusercontent.com/dashee87/blogScripts/master/files/all_league_goals.json")
# reorder the columns to make it a bit more logical
all_league_goals = all_league_goals[['country', 'league', 'date', 'HomeTeam', 
                                     'AwayTeam', 'FTHG', 'FTAG', 'awarded']]
all_league_goals.head()
   country  league                  date        HomeTeam               AwayTeam    FTHG  FTAG  awarded
0  Albania  Super League 2016/2017  2017-05-27  Korabi Peshkopi        Flamurtari  0     3     False
1  Albania  Super League 2016/2017  2017-05-27  Laci                   Teuta       2     1     False
2  Albania  Super League 2016/2017  2017-05-27  Luftetari Gjirokastra  Kukesi      1     0     False
3  Albania  Super League 2016/2017  2017-05-27  Skenderbeu             Partizani   2     2     False
4  Albania  Super League 2016/2017  2017-05-27  Vllaznia               KF Tirana   0     0     False

Hopefully, that’s all relatively clear. You’ll notice that it’s very similar to the format used by football-data, which means that we can feed this dataframe into the get_home_team_advantage function. Sometimes, matches are awarded due to one team fielding an ineligible player or crowd trouble. We should probably exclude such matches from the home field advantage calculations.

# little bit of data cleansing to remove fixtures that were abandoned/awarded/postponed
all_league_goals = all_league_goals[~all_league_goals['awarded']]
all_league_goals = all_league_goals[all_league_goals['FTAG']!='POSTP.']
all_league_goals = all_league_goals[all_league_goals['FTAG']!='CAN.']
all_league_goals[['FTAG', 'FTHG']] = all_league_goals[['FTAG', 'FTHG']].astype(int)

We’re ready to put it all together. I’ll omit the code (though it can be found here), but we’ll loop through each country and league combination (just in case you decide to include multiple leagues from the same country) and calculate the home advantage score, plus its confidence limits as well as some other information for each league (number of teams, average number of goals in each match). I’ve converted the pandas output to a datatables table that you can interactively filter and sort.


country league # games # teams avg_goals home_adv score left_tail right_tail
1 Nigeria Premier League 2017 379 20 2.011 1.195 1.027 1.363
2 Haiti Championnat National 2017 237 16 1.717 0.741 0.533 0.949
3 Algeria Ligue 1 2016/2017 238 16 2.092 0.698 0.512 0.884
4 Ghana Premier League 2017 238 16 2.202 0.676 0.494 0.857
5 Bolivia Liga de Futbol Prof 2016/2017 132 12 3.432 0.624 0.431 0.818
6 Guatemala Liga Nacional 2016/2017 264 12 2.155 0.620 0.448 0.792
7 Benin Championnat National 2017 162 19 1.778 0.571 0.330 0.811
8 USA MLS 2017 374 22 2.968 0.538 0.416 0.660
9 Peru Primera Division 2017 238 16 2.681 0.520 0.359 0.680
10 Indonesia Liga 1 2017 304 18 2.888 0.515 0.378 0.651
11 Togo Championnat National 2016/2017 181 14 1.934 0.510 0.293 0.726
12 Uzbekistan Professional Football League 2017 233 16 2.571 0.503 0.338 0.668
13 Mozambique Mocambola 2017 240 16 1.867 0.501 0.310 0.692
14 Angola Girabola 2017 239 16 2.151 0.499 0.321 0.678
15 Greece Super League 2016/2017 240 16 2.317 0.499 0.328 0.671
16 Tunisia Ligue Professionnelle 1 2016/2017 112 16 2.098 0.495 0.231 0.759
17 Albania Super League 2016/2017 180 10 1.889 0.488 0.269 0.707
18 Sudan Premier League 2017 306 18 2.261 0.486 0.332 0.639
19 Tanzania Ligi Kuu Bara 2016/2017 239 16 1.971 0.480 0.294 0.665
20 Colombia Liga Aguila 2017 400 20 2.145 0.465 0.328 0.603
21 Ecuador Serie A 2017 263 12 2.605 0.454 0.300 0.608
22 Honduras Liga Nacional 2016/2017 180 10 2.828 0.452 0.273 0.630
23 Ethiopia Premier League 2016/2017 239 16 1.837 0.433 0.241 0.625
24 Morocco Botola Pro 2016/2017 240 16 2.229 0.405 0.232 0.578
25 India I-League 2017 90 10 2.500 0.405 0.139 0.672
26 Montenegro Prva Crnogorska Liga 2016/2017 197 12 2.020 0.398 0.196 0.599
27 Croatia 1. HNL 2016/2017 180 10 2.417 0.396 0.204 0.588
28 Zimbabwe Premier Soccer League 2017 305 18 2.023 0.389 0.228 0.549
29 Kosovo Superliga 2016/2017 196 12 2.383 0.386 0.200 0.571
30 Sierra Leone Premier League 2014 90 14 1.833 0.376 0.060 0.691
31 France Ligue 1 2016/2017 379 20 2.615 0.375 0.248 0.502
32 Malawi Super League 2017 239 16 2.331 0.372 0.203 0.541
33 Costa Rica Primera Division 2016/2017 264 12 2.689 0.370 0.221 0.520
34 Norway Eliteserien 2017 240 16 2.842 0.368 0.215 0.520
35 Bulgaria Parva Liga 2016/2017 182 14 2.467 0.365 0.177 0.553
36 Russia Premier League 2016/2017 240 16 2.133 0.363 0.187 0.539
37 Kazakhstan Premier League 2017 198 12 2.465 0.361 0.180 0.542
38 Belgium Jupiler League 2016/2017 239 16 2.736 0.359 0.203 0.515
39 FYR of Macedonia First League 2016/2017 180 10 2.539 0.349 0.163 0.535
40 Senegal Ligue 1 2016/2017 181 14 2.204 0.348 0.149 0.547
41 Azerbaijan Premier League 2016/2017 111 8 2.234 0.346 0.094 0.599
42 Moldova Divizia Nationala 2016/2017 165 11 2.539 0.341 0.147 0.536
43 Slovakia Fortuna liga 2016/2017 184 12 2.690 0.339 0.159 0.518
44 Cameroon Elite One 2017 303 18 1.795 0.337 0.166 0.508
45 Jamaica Premier League 2016/2017 198 12 2.192 0.336 0.145 0.527
46 Réunion Regionale 1 2017 182 14 2.610 0.336 0.153 0.518
47 Venezuela Primera Division 2017 303 18 2.482 0.331 0.186 0.476
48 Portugal Primeira Liga 2016/2017 306 18 2.379 0.327 0.180 0.474
49 South Africa Premier League 2016/2017 240 16 2.242 0.322 0.151 0.494
50 Germany Bundesliga 2016/2017 306 18 2.866 0.315 0.181 0.449
51 Uganda Premier League 2016/2017 237 16 2.135 0.313 0.137 0.490
52 Guinea Ligue 1 2016/2017 181 14 2.044 0.312 0.105 0.519
53 Thailand Thai Premier League 2017 306 18 3.389 0.309 0.186 0.432
54 Yemen Division 1 2013/2014 180 14 2.322 0.308 0.114 0.503
55 Zambia Super League 2017 379 20 2.003 0.308 0.164 0.452
56 Kyrgyzstan Top Liga 2017 60 6 2.950 0.307 0.009 0.606
57 Hungary OTP Bank Liga 2016/2017 198 12 2.631 0.307 0.133 0.481
58 Namibia MTC Premiership 2015/2016 240 16 2.412 0.304 0.139 0.469
59 China Super League 2017 240 16 3.050 0.303 0.156 0.449
60 Niger Ligue 1 2016/2017 181 14 2.171 0.301 0.101 0.502
61 Iraq Super League 2016/2017 354 20 2.110 0.299 0.153 0.446
62 Netherlands Eredivisie 2016/2017 306 18 2.889 0.296 0.163 0.430
63 Serbia Super Liga 2016/2017 239 16 2.364 0.290 0.124 0.457
64 Palestine West Bank League 2016/2017 131 12 2.450 0.287 0.066 0.508
65 England Premier League 2016/2017 380 20 2.800 0.284 0.162 0.405
66 Gabon Championnat D1 2016/2017 163 14 2.307 0.282 0.077 0.486
67 Brazil Serie A 2017 380 20 2.429 0.281 0.151 0.412
68 Turkmenistan Yokary Liga 2017 143 9 2.916 0.278 0.084 0.472
69 Spain LaLiga 2016/2017 380 20 2.942 0.263 0.144 0.381
70 Poland Ekstraklasa 2016/2017 240 16 2.767 0.260 0.107 0.414
71 Czech Republic 1. Liga 2016/2017 240 16 2.488 0.259 0.098 0.421
72 Italy Serie A 2016/2017 379 20 2.955 0.257 0.139 0.375
73 Wales Premier League 2016/2017 132 12 2.970 0.246 0.047 0.446
74 New Zealand Football Championship 2016/2017 90 10 3.567 0.244 0.024 0.465
75 Republic of the Congo Ligue 1 2017 300 18 2.237 0.244 0.091 0.397
76 Kenya Premier League 2017 304 18 2.026 0.244 0.084 0.403
77 Ukraine Pari-Match League 2016/2017 132 12 2.462 0.241 0.022 0.460
78 Austria Tipico Bundesliga 2016/2017 180 10 2.711 0.239 0.060 0.418
79 Switzerland Super League 2016/2017 180 10 3.233 0.235 0.071 0.398
80 Mexico Primera Division 2016/2017 306 18 2.634 0.234 0.095 0.373
81 Turkey Super Lig 2016/2017 305 18 2.708 0.231 0.094 0.369
82 Bosnia and Herzegovina Premier League 2016/2017 132 12 2.242 0.231 0.001 0.460
83 Romania Liga 1 2016/2017 181 14 2.376 0.223 0.033 0.413
84 Philippines PFL 2017 109 8 3.202 0.218 0.007 0.430
85 Malaysia Super League 2017 132 12 3.091 0.217 0.021 0.412
86 Australia A-League 2016/2017 135 10 3.030 0.213 0.018 0.409
87 DR Congo Super Ligue 2016/2017 195 26 2.205 0.198 0.006 0.391
88 Syria Premier League 2016/2017 239 16 2.180 0.197 0.025 0.370
89 Argentina Primera Division 2016/2017 450 30 2.276 0.195 0.071 0.318
90 Burundi Ligue A 2016/2017 239 16 2.138 0.192 0.018 0.366
91 Cyprus First Division 2016/2017 182 14 2.879 0.191 0.019 0.363
92 Sweden Allsvenskan 2017 240 16 2.779 0.189 0.037 0.342
93 Tajikistan Vysshaya Liga 2017 84 8 2.702 0.189 -0.077 0.456
94 Northern Ireland NIFL Premiership 2016/2017 195 12 2.933 0.188 0.023 0.353
95 Scotland Premiership 2016/2017 198 12 2.687 0.187 0.016 0.358
96 Saudi Arabia Saudi Professional League 2016/2017 182 14 3.016 0.186 0.018 0.354
97 Iceland Pepsideild 2017 132 12 3.053 0.184 -0.012 0.380
98 Nicaragua Primera Division 2016/2017 179 10 3.156 0.182 0.016 0.347
99 Denmark Superliga 2016/2017 182 14 2.632 0.180 0.000 0.360
100 Lesotho Premier League 2016/2017 180 14 2.428 0.180 -0.009 0.368
101 Vietnam V-League 2017 182 14 2.912 0.174 0.003 0.345
102 Rwanda National Football league 2016/2017 239 16 2.134 0.174 -0.001 0.348
103 Ireland Premier Division 2017 198 12 2.773 0.172 0.004 0.341
104 Estonia Meistriliiga 2017 180 10 3.656 0.171 0.017 0.324
105 United Arab Emirates UAE League 2016/2017 182 14 3.137 0.165 0.000 0.330
106 El Salvador Primera Division 2016/2017 263 12 2.601 0.161 0.010 0.311
107 Luxembourg National Division 2016/2017 182 14 3.319 0.153 -0.007 0.313
108 Bangladesh Premier League 2016 132 12 2.591 0.152 -0.060 0.365
109 Mauritania Championnat D1 2016/2017 181 14 2.453 0.149 -0.038 0.336
110 Swaziland MTN Premier League 2016/2017 131 12 2.634 0.146 -0.065 0.358
111 Trinidad and Tobago Pro League 2017 90 10 2.922 0.145 -0.098 0.387
112 Malta Premier League 2016/2017 197 12 2.878 0.145 -0.021 0.310
113 Chile Primera Division 2016/2017 120 16 2.892 0.145 -0.068 0.357
114 Israel Ligat ha'Al 2016/2017 182 14 2.132 0.145 -0.055 0.344
115 Botswana Premier League 2016/2017 240 16 2.317 0.144 -0.023 0.311
116 Oman Professional League 2016/2017 182 14 2.758 0.136 -0.040 0.311
117 Iran Persian Gulf Pro League 2016/2017 240 16 2.100 0.135 -0.040 0.310
118 Bermuda Premier League 2016/2017 90 10 3.144 0.134 -0.099 0.368
119 Lithuania A Lyga 2017 112 8 2.580 0.132 -0.099 0.363
120 Egypt Premier League 2016/2017 305 18 2.256 0.122 -0.027 0.272
121 Faroe Islands Premier League 2017 134 10 3.187 0.122 -0.068 0.312
122 Burkina Faso Premier League 2016/2017 240 16 1.721 0.121 -0.072 0.314
123 Finland Veikkausliiga 2017 198 12 2.737 0.120 -0.049 0.289
124 Seychelles Division One 2017 132 12 3.303 0.114 -0.075 0.302
125 Japan J-League 2017 306 18 2.592 0.114 -0.026 0.253
126 Myanmar National League 2017 128 12 2.594 0.113 -0.104 0.329
127 Ivory Coast Ligue 1 2016/2017 182 14 1.802 0.110 -0.107 0.327
128 Georgia Erovnuli Liga 2017 179 10 2.810 0.108 -0.067 0.283
129 Qatar Premier League 2016/2017 182 14 3.132 0.105 -0.059 0.270
130 South Korea K-League Classic 2017 198 12 2.737 0.102 -0.067 0.271
131 Slovenia Prva liga 2016/2017 180 10 2.572 0.099 -0.083 0.282
132 Lebanon Premier League 2016/2017 131 12 2.771 0.094 -0.113 0.300
133 San Marino Campionato Sammarinese 2016/2017 154 15 3.143 0.085 -0.095 0.264
134 Belarus Vysshaya Liga 2017 240 16 2.333 0.071 -0.094 0.237
135 Mali Premiere Division 2016 162 19 2.031 0.064 -0.153 0.281
136 Gibraltar Premier Division 2016/2017 132 10 3.288 0.064 -0.126 0.253
137 Hong Kong Premier League 2016/2017 110 11 3.427 0.058 -0.144 0.260
138 Singapore S.League 2017 108 9 2.981 0.056 -0.163 0.275
139 Sri Lanka Champions League 2017 143 18 3.266 0.051 -0.133 0.235
140 Cape Verde Campeonato Nacional 2017 36 12 2.389 0.047 -0.376 0.469
141 Djibouti Division 1 2016/2017 90 10 3.978 0.045 -0.163 0.252
142 Uruguay Primera Division 2017 240 16 2.729 0.034 -0.120 0.187
143 Gambia GFA League 2016/2017 131 12 1.908 0.032 -0.218 0.281
144 Canada CSL 2017 56 8 4.304 0.025 -0.228 0.277
145 Armenia Premier League 2016/2017 90 6 2.200 0.020 -0.258 0.299
146 Panama LPF 2016/2017 180 10 2.083 0.005 -0.197 0.208
147 Kuwait Premier League 2016/2017 210 15 3.048 -0.000 -0.155 0.155
148 Mauritius Mauritian League 2016/2017 179 10 2.922 -0.001 -0.173 0.170
149 Andorra Primera Divisió 2016/2017 83 8 3.265 -0.005 -0.245 0.236
150 Latvia SynotTip Virslīga 2017 96 8 2.417 -0.006 -0.264 0.251
151 Libya Premier League 2017 83 28 2.265 -0.015 -0.324 0.293
152 Dominican Republic LDF 2017 90 10 2.567 -0.021 -0.280 0.239
153 Cambodia C-League 2017 132 12 3.864 -0.031 -0.205 0.142
154 Paraguay Primera Division 2017 264 12 2.534 -0.033 -0.184 0.119
155 Jordan Premier League 2016/2017 132 12 2.235 -0.034 -0.262 0.194
156 Bahrain Premier League 2016/2017 90 10 2.556 -0.048 -0.310 0.213
157 Pakistan Premier League 2014/2015 132 12 2.333 -0.065 -0.288 0.159
158 Liberia LFA First Division 2016/2017 125 12 2.248 -0.089 -0.323 0.145
159 Somalia Nation Link Telecom Championship 2016/2017 90 10 2.922 -0.153 -0.396 0.090
160 Maldives Dhivehi Premier League 2017 56 8 3.304 -0.370 -0.782 0.042
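Since the looping code was omitted, here is a rough sketch of how a table like the one above could be assembled. It is not the notebook's code. The scoring function is passed in as an argument; in the post that would be get_home_team_advantage, but to keep this example free of model fitting I use a stand-in that returns the log goals ratio with no confidence limits. The toy dataframe, the stand-in and the output column names are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

def log_goal_ratio(league_df):
    """Stand-in scorer: log(home goals / away goals), with no confidence limits."""
    score = np.log(league_df["FTHG"].sum() / league_df["FTAG"].sum())
    return score, np.nan, np.nan

def summarise_leagues(goals_df, score_func):
    """Build one summary row per (country, league) combination."""
    rows = []
    for (country, league), league_df in goals_df.groupby(["country", "league"]):
        score, left_tail, right_tail = score_func(league_df)
        teams = pd.concat([league_df["HomeTeam"], league_df["AwayTeam"]])
        rows.append({
            "country": country,
            "league": league,
            "# games": len(league_df),
            "# teams": teams.nunique(),
            "avg_goals": (league_df["FTHG"] + league_df["FTAG"]).mean(),
            "home_adv_score": score,
            "left_tail": left_tail,
            "right_tail": right_tail,
        })
    summary = pd.DataFrame(rows)
    # strongest home advantage first, as in the table
    return summary.sort_values("home_adv_score", ascending=False).reset_index(drop=True)

# made-up results for two tiny leagues
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "league": ["L1", "L1", "L2", "L2"],
    "HomeTeam": ["x", "y", "p", "q"],
    "AwayTeam": ["y", "x", "q", "p"],
    "FTHG": [2, 1, 0, 3],
    "FTAG": [1, 1, 1, 2],
})
summary = summarise_leagues(toy, log_goal_ratio)
print(summary)
```

Passing the scorer as an argument means the same loop works whether you want the full Poisson model or just the quick goals ratio.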

Focusing on the home_adv score column, teams in Nigeria enjoy by far the greatest benefit from playing at home (score = 1.195). In other words, home teams scored about 3.3 (= $\exp(1.195)$) times as many goals as their opponents. This isn’t new information and can be attributed to a combination of corruption (e.g. bribing referees) and violent fans. In fact, my motivation for this post was to identify more football corruption hotspots. Alas, when it comes to home turf invincibility, it seems Nigeria are the World Cup winners.
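The scores in the table are on a log scale, so it can help to convert them (and their confidence limits) back into goal ratios. A small sketch using Nigeria's row from the table:

```python
import math

# Nigeria's row from the table: score and confidence limits (log scale)
score, left_tail, right_tail = 1.195, 1.027, 1.363

# exponentiate to get the ratio of home goals to away goals
ratio, ratio_low, ratio_high = (math.exp(v) for v in (score, left_tail, right_tail))
print("{:.1f} ({:.1f} to {:.1f})".format(ratio, ratio_low, ratio_high))  # 3.3 (2.8 to 3.9)
```

Note that the interval is symmetric around the score on the log scale but not once it's converted into a ratio.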

Fourteen leagues in the table have a negative home_advantage_score, meaning that visiting teams actually scored more goals than their hosts, though none was statistically significant. By some distance, the Maldives records the most negative score. Luckily, I’ve twice researched this beautiful archipelago and I’m aware that all matches in the Dhivehi Premier League are played at the national stadium in Malé (much like the Gibraltar Premier League). So it would make sense that there’s no particular advantage gained by the home team. Libya is another interesting example. Owing to security issues, all matches in the Libyan Premier League are played in neutral venues with no spectators present. Quite fittingly, it returned a home advantage score just off zero. Generally speaking, the leagues with near zero home advantage come from small countries (minimal inconvenience for travelling teams) with a small number of teams and they tend to share stadiums.
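By "statistically significant" I simply mean that the confidence interval excludes zero. A minimal sketch of that check, using a few rows copied from the table (the function name and labels are just for illustration):

```python
# (country, score, left_tail, right_tail) copied from the table
rows = [
    ("Nigeria", 1.195, 1.027, 1.363),
    ("England", 0.284, 0.162, 0.405),
    ("Libya", -0.015, -0.324, 0.293),
    ("Maldives", -0.370, -0.782, 0.042),
]

def classify(left_tail, right_tail):
    """Label a league by whether its confidence interval excludes zero."""
    if left_tail > 0:
        return "home advantage"
    if right_tail < 0:
        return "away advantage"
    return "not significant"

labels = {country: classify(left, right) for country, _, left, right in rows}
print(labels)
```

Even the Maldives, with the most negative score, has an interval that just about crosses zero, which is not too surprising for a league with only 56 games.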

If you sort the avg_goals column, you’ll see the semi-pro Canadian Soccer League is the place to be for goals (average = 4.304). But rather than sifting through that table or explaining the results with words, the most intuitive way to illustrate this type of data is with a map of the world. This might also help to clarify whether there’s any geographical influence on the home advantage effect. Again, I won’t go into the details (an appendix can be found in the Jupyter notebook), but I built a map using the JavaScript library, D3. And by built I mean I adapted the code from this post and this post. Though a little outdated now, I found this post quite useful too. Finally, I think this post shows off quite well what you can do with maps using D3.

And here it is! The country colour represents its home_advantage_score. You can zoom in and out and hover over a country to reveal a nice informative overlay; use the radio buttons to switch between home advantage and goals scored. I recommend viewing it on desktop (mobile’s a bit jumpy) and on Chrome (I sometimes have security issues with Firefox).

[Interactive map: countries coloured by home advantage score or by average goals per game]

It’s not scientifically rigorous (not in academia any more, baby!), but there’s evidence for some geographical trends. For example, it appears that home advantage is stronger in Africa and South America compared to Western and Central Europe, with the unstable warzones of Libya, Somalia and Paraguay (?) being notable exceptions. As for average goals, Europe boasts stronger colours compared to Africa, though South East Asia seems to be the global hotspot for goals. North America is also quite dark, but you can debate whether Canada should be coloured grey, as the best Canadian teams belong to the American soccer system.

Conclusion

Using a previously described model and some JavaScript, this post explored the so-called home advantage in football leagues all over the world (including Réunion). I don’t think it uncovered anything particularly amazing: different leagues have different properties and don’t bet on the away team in the Nigerian league. You can play around with the Python code here. Thanks for reading!
