Four years in football analytics
Four years ago this week, sitting on the third floor of my university’s library, I launched 5 Added Minutes. For a couple of months, the blog was simply a place to express frustrations at media clichés and received wisdom in football, but it quickly evolved into a medium for me to use data and analytics to challenge the conversations that were taking place in press conferences, ‘expert’ columns and television analyses.
I wasn’t the first to join this ‘revolution’ – not my word, but Simon Kuper’s in a June 2011 Financial Times piece, one of the first mainstream articles to take a view on the impact of statistics on professional football – and since 2011 there have been many more who have taken an interest, and even a lead, in the development of football analytics. But when a WordPress notification reminded me that it had been four years since the creation of my blog, I felt obliged to reflect on the journey of both amateurs and professionals in this industry. How far have we come, and has some of the promise of the Kuper article materialised into results?
To my knowledge, in 2011 there were only a handful of active analytics blogs*: Chris Anderson’s Soccer by the Numbers, Zach Slaton’s A Beautiful Numbers Game, James Grayson’s eponymous blog, Sarah Rudd’s On Footy, Sander IJtsma’s 11tegen11 and Mark Taylor’s The Power of Goals. This diverse and disparate group wrote in a time before performance data was readily available on websites, and yet they produced many pioneering insights, in Anderson’s case forming the basis of a bestselling book.
The size of this group swelled during the next 18 months thanks to increasingly accessible data and growing statistical literacy amongst fans; Manchester City released a dataset in partnership with Opta in 2012 in the wake of Moneyball’s Hollywood adaptation, and sites like EPL Index, WhoScored and Squawka provided further data to interrogate.
As such, the focus shifted from broader, ‘macro’ analyses of trends and team or league traits to more ‘micro’ analysis of individual player actions and interactions; just take a look at some of the archived posts in the blogs mentioned above. In many ways this is reflected in the first two books written on data in professional football: Kuper and Szymanski’s Soccernomics / Why England Lose looked at topics like spending, player development and the sport’s popularity, whilst Anderson and Sally’s more recent The Numbers Game delved more (but not exclusively) into on-field actions such as goalscoring, corners and possession.
Increasingly, the work of analytical bloggers was being picked up and used in mainstream media, with some writers in particular, such as Mike Goodman and Richard Whittall, using this work to inform their writing. The proliferation of expected goals models was a big step in attempting to find meaning, in a statistical sense, in a game with enormous unexplained, uncontrollable and possibly uncoachable variation. It would also appear to be the first step on the way to the ‘goal probability added’ stat described in the aforementioned Kuper FT article.
Given the constraints – namely time and availability of data – the amateur community has grown impressively in four years. There is also a sense of collaboration, aided perhaps by sites like StatsBomb, and many bloggers look to other sports for inspiration. Whilst the ultimate goal of this group is to influence decision-making at professional clubs, what it has achieved so far is to quietly help raise the level of conversation around the game. Of course, it is only a small cog in the wider context, and the phrase ‘expected goals’ is unlikely to feature on Match of the Day at any point soon, but there has been a marked shift in attention and attitude towards data in the last four years, and for that the work of the analytics bloggers should not be ignored.
If the amateur analytics community has taken five strides forward in four years, the professional clubs have shuffled awkwardly in the vague direction of north.
This isn’t to criticise the clubs entirely. In the grand scheme of things, many are still small or medium-sized businesses that view analytics as a leap into the unknown, and don’t have the time to investigate further. Even if a club did have the inclination to invest in a team of “quants”, would they know who to hire? And could they ensure that this department would have a voice in the club?
This has surely been a challenge even for the 6 to 10 clubs in England who have taken the leap, in some form or another. Publicly, they include Arsenal with StatDNA, Tottenham Hotspur with Decision Technology and Liverpool with Ian Graham. Whilst it would be misleading to suggest that other clubs aren’t using data, for virtually everyone in the top two divisions has access and refers to vast amounts of statistics, very few are engaging in the “systematic computation” of this information that would qualify them as users of analytics.
Slowly though – and this is a trend that extends beyond the last four years – there is a group of younger and arguably savvier individuals who have taken leading positions at football clubs, many of whom have non-football backgrounds dealing with data. They attend the ever-growing number of sports analytics conferences that are taking place in the US and Europe, though often these events reveal disappointingly little about the work actually being done in clubs.
I’ve been fortunate enough to work for two companies – previously Prozone and now 21st Club – which have offered, and still offer, varying products and services that involve analytics, which is a reflection of the simmering interest in this area. From experience, there is sometimes a bigger sense of urgency outside the moneyed shores of England to use analytics; a few MLS clubs have made strategic hires in this area (some of those hires being former amateur bloggers, like Ravi Ramineni) whilst Bayern Munich announced SAP as their “official partner for sport analytics and enterprise software” last year. Rasmus Ankersen, chairman of table-topping Danish Superliga team FC Midtjylland and co-founder of 21st Club, also has analytics resources that he taps into to try to gain a competitive edge.
These are all relatively quiet stories that are threatening to grow into more widely recognised cases of best practice. No clubs are stubborn enough to think analytics holds all the answers or that a “holy grail” exists, and over four years most have come to recognise that it has a role to play in on- and off-field decision making. The problem is that the next step can often lead to inaction: where do we start, who do we start with, and can we take someone’s lead?
The next four years
An optimistic prediction would be that by January 2019, half of the clubs in the top two divisions in England will be using and applying analytics in all aspects of the club, either through internal hires or external agents. More likely is that analytics will grow organically in a handful of clubs, perhaps starting in player recruitment or asset management, and slowly seeping into other areas like tactics or coaching. There are plenty of signs that this is already happening.
Either way, it’s a tremendously exciting industry to be a part of, and I would recommend that anyone who wants to be a part of it set up a blog and get writing. There are lots of questions to be answered, and plenty of growth to come.
*I can only apologise to those bloggers that I’ve failed to mention in this piece; I’ve primarily tried to celebrate the overall work and progression of amateur bloggers. The fact that I can’t mention them all is a reflection of the size of the community and the rate at which it has grown in four years. For those who want to know which bloggers are leading the way, Colin Trainor’s list is as good a resource as any (the one exception being that Colin himself is not on the list).