The predictable Premier League

Top Premier League teams, including champions Man United, aren't prone to upsets (Image: Flickr/socialBedia)
A couple of weeks ago I looked down the Premier League table and thought: there’s nothing particularly interesting about this season. The teams you’d expect to be near the top occupied the top six spaces, the teams with weak squads were struggling near the bottom, and the rest filled the space in between. This remains true as I write.
On the weekend of 17th-18th December no team beat a club higher-placed in the league table. An isolated case, maybe, but hardly an advert for the Premier League.
I then decided to look at Man United and Man City’s results from the last two seasons – how many times had they been on the receiving end of a surprise result in the league? I found that since 5th February, when Man United lost to Wolves, only once had either team been beaten by a side outside the ‘big six’. That’s nearly 11 months of predictability for the league’s strongest two teams.
By way of comparison, Real Madrid and Barcelona have lost five matches to clubs outside the top six over the same period (a quick test suggests this rate is significantly higher than that of England’s top two, though it doesn’t account for the opposition played or the number of home games).
Measuring unpredictability
Both Soccer by the Numbers and Decision Technology have looked at competition levels across European Leagues, and I’ve adopted a similar approach using data from Infostrada Sports and their Euro Club Index.
The Euro Club Index generates probabilities for match outcomes throughout the season (match odds available here and the methodology is explained here). For example, for Fulham v Man United on 21st December, the ECI predicted a 17% chance of a Fulham victory, a 56% chance for United, and a 28% chance of a draw.
We define an upset as a win by a team whose chance of winning was at least 10 percentage points lower than the match favourite’s.
So had Fulham won, given their chance of winning was 39 percentage points lower than the favourites Man United (56% – 17%), they would have caused an upset. In the end a 0-5 defeat suggests they came nowhere near.
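As a rough illustration, the upset rule above could be sketched in Python as follows. The function names and the 'H'/'D'/'A' result codes are my own assumptions for the example, not part of the Euro Club Index; probabilities are passed as whole percentages to keep the arithmetic exact.

```python
def win_gap(home_win_pct, away_win_pct):
    """Difference in win probability between the two teams, in percentage points."""
    return abs(home_win_pct - away_win_pct)


def is_upset(home_win_pct, away_win_pct, result, threshold=10):
    """Return True if the underdog won a match with a big enough gap.

    result is 'H' (home win), 'D' (draw) or 'A' (away win).
    These codes and the function name are hypothetical, for illustration only.
    """
    if result == "D":
        return False  # a draw is not a win, so it cannot be an upset
    if win_gap(home_win_pct, away_win_pct) < threshold:
        return False  # tight match: neither side is a clear favourite
    underdog = "H" if home_win_pct < away_win_pct else "A"
    return result == underdog


# Fulham (home, 17%) v Man United (away, 56%)
print(win_gap(17, 56))        # gap in percentage points
print(is_upset(17, 56, "H"))  # had Fulham won
print(is_upset(17, 56, "A"))  # the actual result, a United win
```

With the Fulham v Man United figures, a home win would count as an upset, while the actual away win does not.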
Leagues with a high rate of upsets can therefore be termed ‘unpredictable’, whilst leagues with a low rate of such results can be considered predictable.
European comparisons
The percentage of unpredictable results is the proportion of matches won by teams whose chance of winning was at least 10 percentage points lower than the favourite’s. So far this season (correct up to 20 December) Europe’s biggest four leagues, highlighted in red, have all shown similar levels of unpredictability.
The leagues shown are those in the top 20 of UEFA’s country coefficient at the end of the 2010/11 season, including UEFA’s two continental competitions. Winter 2010 denotes the rate of upsets at this time last year.
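For anyone wanting to reproduce the percentage of unpredictable results on their own data, here is a minimal sketch. It assumes the rate is taken over matches where the gap in win probability was at least 10 percentage points, so tight matches are left out of the denominator. The match list is made up purely for illustration, and the tuple layout and function name are assumptions of mine.

```python
def upset_rate(matches, threshold=10):
    """Share of eligible matches won by the underdog.

    matches: iterable of (home_win_pct, away_win_pct, result) tuples,
             where result is 'H', 'D' or 'A' (hypothetical layout).
    Only matches with a win-probability gap of at least `threshold`
    percentage points are counted. Returns None if there are none.
    """
    eligible = 0
    upsets = 0
    for home_pct, away_pct, result in matches:
        if abs(home_pct - away_pct) < threshold:
            continue  # tight match, excluded from the comparison
        eligible += 1
        underdog = "H" if home_pct < away_pct else "A"
        if result == underdog:
            upsets += 1
    if eligible == 0:
        return None
    return upsets / eligible


# Invented example data, not real ECI probabilities (apart from the first row)
sample = [
    (17, 56, "A"),  # favourite wins away
    (45, 30, "A"),  # away underdog wins: upset
    (40, 35, "H"),  # gap of 5 points: excluded
    (60, 15, "D"),  # draw: eligible but not an upset
    (25, 50, "H"),  # home underdog wins: upset
]
print(upset_rate(sample))
```

Here four of the five matches are eligible and two of them are upsets, giving a rate of one in two. Draws stay in the denominator in this sketch; whether the published figures treat them the same way is not something I can confirm.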
Portugal lie bottom of the pile, whilst lesser-followed leagues in Israel, Denmark and Switzerland can all say they’ve had a number of upsets this season (around one in every five games where the difference in victory probability between the teams was at least 10 percentage points). Curiously the Europa League is more predictable than the Champions League, though this difference may be insignificant.
Germany’s fall in upsets this season is remarkable given its reputation as a strong but competitive league; it is the biggest ‘mover’ on this time last year.
Elsewhere, no leagues can really claim to have become particularly more unpredictable this season, with only marginal changes in the other three major European leagues.
Last season Germany had the best of both worlds, a strong and unpredictable league. Nearly three in every ten matches could be classified as upsets (Edit: it should be noted that this applies only to matches where the difference in win probabilities was at least 10 percentage points), and the league moved into third in UEFA’s country coefficient.
The Bundesliga was very much in a league of its own last winter, with the Premier League painfully predictable despite a strong coefficient. It’s a similar case for England’s top flight this season, but this time they’re joined by Germany, as no league can really claim to be both strong and unpredictable so far this season.
No great surprises
The vast majority of matches in the Premier League see the favourites win; even small upsets are relatively scarce. That said, this is in keeping with major European leagues so far this season.
Despite marketing to suggest otherwise, it’s not really a case of expecting the unexpected in England. However, if the Premier League’s top clubs were more prone to upsets, would it still be able to maintain its position at the top of UEFA’s coefficient rankings through performance in Europe? In other words, is an unpredictable but strong league sustainable?
Data provided by Infostrada Sports from the Euro Club Index (powered by Hypercube). You can follow both Infostrada Sports accounts on Twitter: @InfostradaLive, @EuroClubIndex.
Nice piece.
I am very curious how the charts look for a longer period, say 10 years. Further, did you gain insight in the spread of clubs responsible for the upsets?
Following ECI’s methodology, the Bundesliga had at the time (relatively?) a lot of overperformers (Dortmund, Mainz, Hannover) and underperformers (Stuttgart, Schalke, Bremen). I wonder how many upsets FSV Mainz “scored” in their first 7 matches.
I am curious about the strength of any historical pattern. I remember 1. FC Kaiserslautern in 97/98, who must have scored a lot of points in the methodology used to determine predictability. Same goes for Hoffenheim in recent years.
My hypothesis would be that the Bundesliga’s long-term predictability would lie somewhere between last year’s and this year’s, and that it is significantly more unpredictable than the EPL.
I’m afraid I only have collated data so I’m really not sure about how it would look broken down further. I agree with your last line though.
Then I’m glad I didn’t suggest to introduce different upset levels and let Excel do the work 🙂
That’s hopefully the next step!
very interesting! Nice work! …hoping for a predictable result at the Emirates today :p
Thanks! 🙂
I jinxed it.
hi there, i’m not sure whether your method is valid. in order to have upsets you have to have the possibility for them to occur. if for example you have a league of 20 teams with the same strength, you won’t get any upsets…
i believe a comparison of the standard deviation of points or goals scored per match, or maybe those team strength coefficients, would be more reliable
or am i overlooking something?
cheers robs
I’m only analysing matches which have the ‘potential’ to be an upset – so tight matches are excluded from the data. Thus I’m only looking at comparable matches across the leagues.
ok, that makes sense i guess.
so how many tight matches have there been in the different leagues (it somehow strikes me as an indicator of league competitiveness) and how did you come up with this 10% threshold (it looks a bit arbitrary).
i’m not judging or anything, just being interested
on another matter. how did you acquire this massive data for your blog. i’m thinking about doing something similar to the eci, but the amount of data needed is impossible.
cheers robs
I’m afraid I don’t have access to more detailed data – all received from Infostrada Sports. You’re right, unpredictability is related to competition, though generally the work done on competition looks at season data as opposed to individual match data. The 10% cut off was suggested by Infostrada – from the data it appears to be a good cut off for upsets. Thanks for the interest, questions welcome 🙂