Bridging the Gap: Improving the Coach-Analyst Relationship (Part 2)

Executive Summary

  1. Analytical results are usually presented most effectively to coaches by using data visualisation and story-telling.
  2. Don’t ignore external commercial data if it is available and affordable.
  3. Data analysts can make a vital contribution to the organisation of training sessions.
  4. Data analytics is only one input into decision making by coaches, albeit a potentially very important one if used effectively.

 

  1. Analytical results are usually presented most effectively to coaches by using data visualisation and story-telling.

As well as the imperative of translating analytical results into practical recommendations framed in the language of coaches, a number of speakers stressed the importance of data visualisation and story-telling as communication devices. “A picture is worth a thousand words” has become even truer in the age of data analytics where effective data visualisation has become a vital tool for the analyst. Rob Carroll (The Video Analyst) illustrated this very well with his graphics on the quality of shooting opportunities in Gaelic football, a form of expected goals model. Ann Bruen (Metrifit) suggested that we should always have in mind the story we are going to tell as we collect and analyse the data. Ben Mackriell from Opta, whose core business is providing performance data, made the same point when he said that it is possible to have a conversation about data without actually mentioning the data (or the analytical techniques). Of course when it comes to evidence-based story-telling we must remain open-minded and allow the precise details and ending of the story to emerge from the analysis. There is always a danger of not allowing the data to get in the way of a good story, of pre-judging the results of the data analysis; it is what cognitive psychologists call confirmation bias. A good evidence-based story is a story that conveys analytical results in the language of coaches, focusing on the practical implications with explanations of athlete and team performance framed in terms of skill technique and tactical decisions. As Edward Metgod (Royal Dutch Football Association) pointed out, coaches are interested in causality not correlation. Analysts must translate the evidence of statistical associations into credible stories of cause and effect with clear implications for targeted interventions to improve performance. When all is said and done analytics is actionable insight.

The discussion of the importance of story-telling reminded me of the advice of Alfred Marshall on the use of mathematics in economics. Marshall probably did more than anyone to systemise economics as a subject and much of his mathematics and diagrams still remain in the textbooks. Marshall was very aware of the uses and abuses of mathematics. Economics was intended to be a practical subject about the everyday business of life but Marshall became increasingly concerned that economists assumed good mathematics meant good economics. He advised that if the mathematics could not be translated into English and then illustrated with important real-life examples (i.e. a good story), then it should be burnt. Apart from the health and safety issues (perhaps safer to shred than burn), Marshall’s advice holds good for data analytics too. If it doesn’t produce actionable insight, it is worthless.

 

  2. Don’t ignore external commercial data if it is available and affordable.

Any discussion of data analytics must include a discussion of the nature of the data being used. The Forum was a great place for this type of discussion given that it brought together external and internal data providers, data analysts and end-users. In the past there has been too much emphasis on different types of data as substitutes whereas now there is greater acceptance of the complementarity of data. And that complementarity will get even better as there is more and more cross-over in personnel between teams and commercial data providers. Ben Mackriell at Opta is a good case in point, now in charge of OptaPro but with years of experience working with teams in rugby union and football. External commercial data offers consistency and coverage whereas internal data is team-specific and often includes expert coach evaluation of skill technique and tactical decision-making relative to the game plan. The differences between these two types of data are variously described as objective vs subjective, frequency vs evaluation, general vs expert, small data vs big data. The differences were well illustrated in the Q&A that followed Edward Metgod’s presentation when he was asked how he would define the transition phase of play. Edward replied as a coach and scout with a subjective/evaluation/expert definition that the transition represents the period of play after a team loses possession but has not gained its defensive shape. Transition is determined by tactical factors in contrast to more objective definitions in terms of a specific time period (e.g. the first five seconds after possession is lost) or the number of passes made by the team gaining possession. What is important to recognise is that these different types of data have different but complementary functions. For example, external data, possibly in the form of a player rating system, can be used at the first stage of player recruitment to identify a target group of players for whom at the second stage internal data is then produced by the team’s scouts. This is exactly the system of e-screening of potential player acquisitions that I recommended to Bolton Wanderers in 2005. Increasingly I am finding that my greatest and most interesting challenge as an analyst is to generate expert insight from non-expert data, particularly in opposition analysis. Can I get inside the minds of the opposition coaches by studying the patterns in their data?

 

  3. Data analysts can make a vital contribution to the organisation of training sessions.

There were a number of speakers at the Forum whose specialism lay in strength and conditioning, and sports science. In addition the Forum also included presentations from coach educators. Both of these groups shared a concern with the optimal use of training time. As a qualified coach and university professor, I want to gain a deeper understanding of the skill-acquisition process whether it be how players learn to perform in games or how data analysts learn to be effective in teams. Nick Winkelman (IRFU) was the lead-off speaker at the Forum and made some great points on both skill acquisition and the role of analytics. As both Nick and several other speakers stressed, when it comes to effective learning, “context is everything” and randomised but relevant learning opportunities provide the most effective way of acquiring and retaining new skills. Blocked repetitions of a specific skill will improve the accuracy with which a skill is performed in a training session but this does not necessarily transfer to a game context when the player must not only accurately execute the skill but also make the right decision as to when to execute that particular skill. Nick argued, rightly to my mind, that too much of the data analysis linked to training is focused on workload when what is also needed is a greater input into creating the appropriate game-related contexts.

 

  4. Data analytics is only one input into decision making by coaches, albeit a potentially very important one if used effectively.

The Forum brought together a diversity of specialisms involved in high performance sport. All agreed, albeit with greater or lesser conviction, that data analytics is potentially a very important coaching tool but its effectiveness has often been limited by poor communication, particularly the failure of analysts to translate analytical results into actionable insight framed in the language of coaches. I came away from the Forum feeling positive about the future of data analytics in high performance sport. Data analytics is now being seen as another tool to complement scouting, video analysis and reporting. But analysts must guard against complacency. There is still much to do in many sports and in many teams to create a thorough-going commitment to evidence-based coaching. And we will only do that by “bridging the gap” and producing actionable insight relevant to day-to-day coaching decisions.

Bridging the Gap: Improving the Coach-Analyst Relationship (Part 1)

Executive Summary

  1. The analyst must be able to translate analytical results into coaching recommendations.
  2. Data analytics can only be effective in organisations with a cultural commitment to evidence-based practice.
  3. Start simple when first introducing data analytics as a coaching tool.

 

Last week I attended the Sportdata & Performance Forum held at University College Dublin in Ireland. The Forum is in its third year, having previously been held in Berlin in 2014 and 2015. The organiser, Edward Abankwa, and his colleagues are to be congratulated on yet again putting together an interesting and varied programme with a good mix of speakers. Frequently European sports conferences are dominated by (association) football but this gathering was again pretty diverse with Olympic sports, rugby union, rugby league and the Gaelic sports all well represented. And crucially the Forum is not a purely sports analytics event but draws speakers and delegates involved in all aspects of sports performance – coaches, coach educators, performance analysts, data analysts, sports scientists, academics, consultants and commercial data providers. I presented an overview of developments in spatial analytics which I will discuss in a later post. In this post (split into two parts) I want to draw together the various contributions around the theme of how to make data analytics more effective in elite sports.

 

  1. The analyst must be able to translate analytical results into coaching recommendations.

A recurring theme throughout the Forum was that the impact of data analytics in elite sports is often limited by a language problem. Brian Cunniffe (English Institute of Sport) talked about the need to bridge the language gap between the coach and the analyst/scientist. So often analysts and coaches do not speak the same language. Analysts see the world as a modelling problem formulated in the language of statistics and other data analytical techniques. Coaches see the world as a performance problem formulated in the language of skill technique and tactics. My very strong view is that it is solely the analyst’s responsibility to resolve the language problem. Analytics always starts and ends with the coaches. Coaches have to make a myriad of coaching decisions. Analysts are trying to provide an evidential base to support these coaching decisions. The analysts must start by trying to understand the coaching decision problem and then translate that into a modelling problem to be analysed. The analyst must then translate the analytical results into practical, action-focused recommendations framed in the language of coaching, not the language of analytics. Denise Martin, a performance analyst consultant with massive experience in a number of sports in Ireland, summed it up very succinctly when she said that the task of the analyst is to “make the abstract tangible”. To do this the analyst must spend time with the coaches, learning how coaches see the world, in just the same way as performance analysts do in order to produce effective video analysis.

 

Martin Rumo (Swiss Federal Institute of Sports) provided a great example of the coaching-analytics process working effectively. He described his experience collaborating with a football coach who wanted to evaluate how well his players were putting pressure on the ball. In order to build an algorithm to measure the degree of pressure on the ball Martin started by having a conversation with the coach to identify the key characteristics of situations in which the coach considered there was pressure on the ball. This conversation provided the bridge from the coaching problem to the modelling problem and increased the likelihood that the analytical results would have practical relevance to the coach.

 

One of the most interesting speakers at the Forum was Edward Metgod, the former Dutch goalkeeper and now a scout and analyst with the Dutch national team. Edward has a playing and coaching background, a deep commitment to self-improvement and an open mind to using the best available tools to do his job effectively. He is precisely the type of football person with whom a data analyst would want to work. Edward started his talk by recounting how he had read a number of books on data analytics which he had found interesting, but when he came to books on football analytics he was quickly turned off. The problem with the football analytics books is the language (although I also sensed that he had found nothing new in these books to advance his knowledge of football in any practical way). Edward then explained that in Dutch football there is a common coaching language which breaks the game down into four moments – defensive transition, offensive transition, ball possession, and opponent ball possession. All of Edward’s reports are structured around these four moments. The clear implication for any data analyst, like myself, working in Dutch football is that you must learn this coaching language if you want to communicate effectively with coaches. I should add that I have subscribed to the four-moments perspective for several years and apply it as a way of structuring my analysis in any invasion-territorial team sport.

 

  2. Data analytics can only be effective in organisations with a cultural commitment to evidence-based practice.

The importance of having the right organisational culture to support data analytics was stressed by many of the speakers. Rob Carroll (The Video Analyst) defined culture very neatly as what a team does every day. A common characteristic of every sports organisation with which I have worked and in which data analytics has a real impact is a cultural commitment to creating an evidential base for their decisions. And that cultural commitment is led from the top by the performance director and head coach with buy-in from all of the coaching staff. As I have discussed in a previous post, Saracens epitomise an elite team in which data analytics has become part of how they do things day to day, and that culture has been built over a number of years led by their directors of rugby, initially Brendan Venter and then his successor, Mark McCall. Many European sports organisations still have a long way to go in their analytical development and some remain staunchly “knowledge-allergic”. Analysts themselves have been part of the problem by not learning the language needed to communicate with coaches. But the organisations bear much of the responsibility for the lack of progress compared to many leading teams in the North American major leagues which have used evidence-based practice to gain a competitive advantage, with the 2016 World Series champions, the Chicago Cubs, just the latest case study of how to do evidence-based practice effectively. Too often teams have appointed analysts without any real strategic purpose other than that it seemed the right thing to do and was what other teams were doing. Data analytics must be seen as a strategic choice by the sporting leadership of the team, a point made eloquently by Daniel Stenz, who has extensive experience in applying analytics in football in Germany, Hungary and Canada. It can also require buy-in from the team ownership particularly since, as Denise Martin explained, evidence-based practice thrives in a culture that emphasises the process not the outcome. But of course an emphasis on process requires that the team ownership adopts a long-term perspective on their sporting investment which is always difficult in sports organised as merit hierarchies with promotion and relegation (and play-offs and European qualification). When the financial risk is so dependent on sporting results the team ownership inevitably tends to become increasingly short term in judging performance so that quick-fix solutions such as signing new players or firing the head coach prevail. Analytics is unlikely ever to be a quick fix.

 

  3. Start simple when first introducing data analytics as a coaching tool.

Another common message at the Forum for teams starting out on the use of data analytics is to start simple, a point made by Denise Martin and Ann Bruen (Metrifit) amongst others. Analysts are often guilty of putting more emphasis on the sophistication of their techniques than on the practical relevance of their results. Analytics must always be decision-driven. Providing some simple useful input into a specific coaching decision will help build credibility, respect and coach buy-in, all vital ingredients to the successful evolution of an analytical capability in a team. Complexity can come later. As Ann reminded us, avoid the TMI/NEK problem of “too much information, not enough knowledge”. Elite teams are drowning in data these days and every day it gets worse. Just try to imagine how much data on the physical performance of athletes in a single training session can be produced with wearable technology. The function of an analyst is to solve the data overload problem. Analysts are in the business of reducing (i.e. simplifying) a complex and chaotic mass of data into codified patterns of variation with practical importance. Start simple, and always finish simple.

A Simple Approach to Player Ratings

Executive Summary

  • The principal advantage of a statistical approach to player ratings is to ensure that information on performance is used in a consistent way.
  • However there are numerous difficulties in using statistical techniques such as regression analysis to estimate the weightings to construct an algorithm for combining performance metrics into a single player rating.
  • But research in decision science shows that there is little or no gain in using sophisticated statistical techniques to estimate weightings. Using equal weights works just as well in most cases.
  • I recommend a simple approach to player ratings in which performance metrics are standardised using Z-scores and then added together (or subtracted in the case of negative contributions) to yield a player rating that can then be rescaled for presentational purposes.

 

The basic analytical problem in contributions-based player ratings, particularly in the invasion-territorial team sports, is how to reduce a multivariate set of performance metrics to a single composite index. A purely statistical approach combines the performance metrics using weightings derived from a team-level win-contributions model of the relationship between the performance metrics and match outcomes, with these weightings usually estimated by regression analysis. But, as I have discussed in previous posts, numerous estimation problems arise with win-contributions models so much so that I seriously question whether or not a purely statistical approach to player ratings is viable. Those who have tried to produce player ratings based on win-contributions models in the invasion-territorial team sports have usually ended up adopting a “mixed-methods” approach in which expert judgment plays a significant role in determining how the performance metrics are combined. The resulting player ratings may be more credible but can lack transparency and so have little practical value for decision makers.

 

Decision science can provide some useful insights to help resolve these problems. In particular there is a large body of research on the relative merits of expert judgment and statistical analysis as the basis for decisions in complex (i.e. multivariate) contexts. The research goes back at least to Paul Meehl’s book, Clinical versus Statistical Prediction, published in 1954. Meehl subsequently described it as “my disturbing little book” in which he reviewed 20 studies in a wide range of areas, not just clinical settings, and found that statistical analysis in all cases provided at least as good predictions, and in most cases, more accurate predictions. More than 30 years later Dawes reviewed the research instigated by Meehl’s findings and concluded that “the finding that linear combination is superior to global judgment is strong; it has been replicated in diverse contexts, and no exception has been discovered”. More recently, the Nobel Prize laureate, Daniel Kahneman, in his best-selling book, Thinking, Fast and Slow, surveyed around 200 studies and found that 60% showed statistically-based algorithms produced more accurate predictions, with the rest of the studies showing algorithms to be as good as experts. There is a remarkable consistency in these research findings unparalleled elsewhere in the social sciences, yet the results have been ignored for the most part so that in practice confidence in the superiority of expert judgment remains largely undiminished.

 

What does this tell us about decision making? Decisions always involve prediction about uncertain future outcomes since we choose a course of action with no certainty over what will actually happen. We know the past but decide the future. We try to recruit players to improve future team performance using information on the player’s current and past performance levels. What decision science has found is that experts are very knowledgeable on the factors that will influence future outcomes but experts, like the rest of us, are no better, and indeed often worse, when it comes to making consistent comparisons between alternatives in a multivariate setting. Decision science shows that human beings tend to be very inconsistent, focusing attention on a small number of specific aspects of one alternative but then often focusing on different specific aspects of another alternative, and so on. Paradoxically experts are particularly prone to inconsistency in the comparison of alternatives because of their depth of knowledge of each alternative. Statistically-based algorithms guarantee consistency. All alternatives are compared using the same metrics and the same weightings. The implication for player ratings is very clear. Use the expert judgment of coaches and scouts to identify the key performance metrics but rely on statistical analysis to construct an algorithm (i.e. a player rating system) to produce consistent comparisons between players.

 

So far so good but this still does not resolve the statistical estimation problems involved in using regression analysis to determine the weightings to be used. However decision science offers an important insight in this respect as well. Back in the 1970s Dawes undertook a comparison of the predictive accuracy of proper and improper linear models. By a proper linear model he meant a model in which the weights were estimated using statistical methods such as multiple regression. In contrast improper linear models use weightings determined non-statistically, such as equal-weights models where it is just assumed that every factor has the same importance. Dawes traces the equal-weights approach back to Benjamin Franklin, who adopted a very simple method for deciding between different courses of action. Franklin’s “prudential algebra” was simply to count up the number of reasons for a particular course of action and subtract the number of reasons against, then choose the course of action with the highest net score. It is very simple but consistent and transparent, with a crucial role for expert judgment in identifying the reasons for and against a particular course of action. Using 20,000 simulations, Dawes found that equal weightings performed better than statistically-based weightings (and even randomly generated weightings worked almost as well). The conclusion is that it is consistency that really matters, more so than the particular set of weightings used. And as well as ensuring consistency, an equal-weights approach avoids all statistical estimation problems. Equal weights are also more likely to provide a method of general application that avoids the problem of overfitting, i.e. weightings that are very specific to the sample and model formulation.
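To see how little may be lost by abandoning estimated weightings, the toy simulation below compares “proper” least-squares weights fitted on a small sample against a simple equal-weights composite. It is only an illustrative sketch in Python, not a reproduction of Dawes’ study: the sample sizes, the range of true weights and the noise level are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_trial(n_train=30, n_test=1000, k=5, noise=1.0):
    """Fit least-squares ('proper') weights on a small training sample and compare
    their out-of-sample correlation with the outcome against equal weights."""
    true_w = rng.uniform(0.5, 1.5, k)              # every metric genuinely contributes
    X_tr = rng.normal(size=(n_train, k))           # standardised predictors
    X_te = rng.normal(size=(n_test, k))
    y_tr = X_tr @ true_w + rng.normal(scale=noise, size=n_train)
    y_te = X_te @ true_w + rng.normal(scale=noise, size=n_test)

    ols_w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)   # proper linear model
    equal_w = np.ones(k)                                   # improper (equal-weights) model

    corr = lambda w: np.corrcoef(X_te @ w, y_te)[0, 1]
    return corr(ols_w), corr(equal_w)

results = np.array([one_trial() for _ in range(2000)])
print("Mean out-of-sample correlation - OLS weights: %.3f, equal weights: %.3f"
      % tuple(results.mean(axis=0)))
```

The exact numbers depend on the assumptions, but in settings where all the chosen metrics genuinely contribute, the gap between the two is typically small, which is Dawes’ point: consistency matters more than the precise weightings.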

 

Applying these insights from decision science to the construction of player rating systems provides the justification for what I call a simple approach to player ratings. There are five steps (a code sketch follows the list):

  1. Identify an appropriate set of performance metrics, drawing on the expert judgment of GMs, sporting directors, coaches and scouts.
  2. Standardise the performance metrics to ensure a common measurement scale – my suggested standardisation is to calculate Z-scores. Z-scores have been very widely used to standardise performance metrics with very different scales of measurement; in golf, for example, they have been used to convert very different types of metrics such as driving distance (yards), accuracy (%) and number of putts into comparable measures that can be added together.
  3. Allocate weights of +1 to positive contributions and -1 to negative contributions (i.e. Franklin’s prudential algebra).
  4. Calculate the total Z-score for every player.
  5. Rescale the total Z-scores to make them easier to read and interpret. I usually advise avoiding negative ratings and reducing the dependency on decimal places to differentiate players.
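As a concrete illustration of steps 2–5, here is a minimal sketch in Python (pandas). The DataFrame layout, the metric column names and the rescaling constants (a mean of 100 and a standard deviation of 50) are hypothetical choices for illustration, not the details of the Championship analysis reported below.

```python
import pandas as pd

# Step 1 (expert judgment): split the chosen metrics into positive and negative
# contributions. These column names are purely illustrative.
POSITIVE = ["goals", "shots", "successful_passes", "successful_dribbles",
            "successful_crosses", "duels_won", "blocks", "interceptions", "clearances"]
NEGATIVE = ["unsuccessful_passes", "unsuccessful_dribbles", "unsuccessful_crosses",
            "duels_lost", "fouls_conceded", "yellow_cards", "red_cards"]

def simple_rating(players: pd.DataFrame, mean: float = 100.0, sd: float = 50.0) -> pd.Series:
    """Steps 2-5: standardise each metric as a Z-score, add the positive and
    subtract the negative contributions, then rescale for presentation."""
    metrics = players[POSITIVE + NEGATIVE]
    z = (metrics - metrics.mean()) / metrics.std(ddof=0)           # step 2
    total = z[POSITIVE].sum(axis=1) - z[NEGATIVE].sum(axis=1)      # steps 3 and 4
    return mean + sd * (total - total.mean()) / total.std(ddof=0)  # step 5 (illustrative rescaling)

# players["rating"] = simple_rating(players)   # players: one row per player, metric totals as columns
```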

 

I have applied the simple approach to produce player ratings for 535 outfield players in the English Championship covering the first 22 rounds of games in season 2015/16. I have used player totals for 16 metrics: goals scored, shots at goal, successful passes, unsuccessful passes, successful dribbles, unsuccessful dribbles, successful open-play crosses, unsuccessful open-play crosses, duels won, duels lost, blocks, interceptions, clearances, fouls conceded, yellow cards and red cards. The total Z-score for every player has been rescaled to yield a mean rating of 100 (and a range 5.1 – 234.2). Below I have reported the top 20 players.

 

Player                Team                        Player Rating
Shackell, Jason       Derby County                234.2
Flint, Aden           Bristol City                197.9
Keogh, Richard        Derby County                196.0
Keane, Michael        Burnley                     195.7
Morrison, Sean        Cardiff City                193.8
Duffy, Shane          Blackburn Rovers            191.1
Davies, Curtis        Hull City                   184.3
Onuoha, Nedum         Queens Park Rangers         183.2
Morrison, Michael     Birmingham City             179.1
Duff, Michael         Burnley                     175.6
Hanley, Grant         Blackburn Rovers            175.2
Tarkowski, James      Brentford                   171.1
McShane, Paul         Reading                     169.8
Collins, Danny        Rotherham United            168.3
Stephens, Dale        Brighton and Hove Albion    167.4
Lees, Tom             Sheffield Wednesday         166.0
Judge, Alan           Brentford                   164.4
Blackman, Nick        Reading                     161.9
Bamba, Sol            Leeds United                160.1
Dawson, Michael       Hull City                   159.7

 

I hasten to add that these player ratings are not intended to be definitive. As always they are a starting point for an evaluation of the relative merits of players and should always be considered alongside a detailed breakdown of the player rating into the component metrics to identify the specific strengths and weaknesses of individual players. They should also be categorised by playing position and playing time but those are discussions for future posts.

 

 

Some Key Readings in Decision Science

Meehl, P., Clinical versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence, Minneapolis: University of Minnesota Press, 1954.

Dawes, R. M. ‘The robust beauty of improper linear models in decision making’, American Psychologist, vol. 34 (1979), pp. 571– 582.

Dawes, R. M., Rational Choice in an Uncertain World, San Diego: Harcourt Brace Jovanovich, 1988.

Kahneman, D., Thinking, Fast and Slow, London: Penguin Books, 2012.

 

More on the Problems of Win-Contribution Player Rating Systems and a Possible Mixed-Methods Solution

Executive Summary

  • There are three main problems with the win-contribution approach to player ratings: (i) statistical estimation problems; (ii) the sample-specific and model-specific nature of the weightings; and (iii) measuring contribution importance as statistical predictive power.
  • A possible solution to these problems is to adopt a mixed-methods approach combining statistical analysis and expert judgment.
  • The EA Sports Player Performance Index and my own STARS player rating system are both examples of the mixed-methods approach.
  • Decision makers require credible data analytics but credibility does not depend solely on producing results that look right. Some of the most important results look wrong by defying conventional wisdom.
  • A credible player ratings system for use by decision makers within teams requires that differences in player ratings are explicable simply but precisely as specific differences in player performance.

 

In my previous post I discussed some of the problems of adopting a win-contribution approach to player ratings in the invasion-territorial team sports. Broadly speaking, there are three main issues: (i) statistical estimation problems; (ii) the sample-specific and model-specific nature of the weightings used to combine the different skill-activities into a single player rating; and (iii) measuring contribution importance as statistical predictive power. The first issue arises because win-contribution models in the invasion-territorial team sports are essentially multivariate models to be estimated using, for example, linear regression methods. Estimation problems abound with these types of models as exemplified by the regression results reported in my previous post for the Football League Championship 2015/16 which included “wrong” signs, statistically insignificant estimates, excessive weightings for actions most closely connected with goals scored/conceded (i.e. shots and saves), and low goodness of fit. These problems can often be resolved by restructuring the win-contribution model. In particular, a multilevel model can take account of the sequential nature of different contributions. Also it often works better to combine attacking and defensive contributions in a single model by treating goals scored (and/or shots at goal) as the outcome of own-team attacking play and opposition defensive play.

 

The second issue with the win-contribution approach is that the search for a better statistical model to avoid the various estimation problems may yield estimated contributions for the different skill-activities that are not generalisable beyond the specific sample used and the specific model estimated. The estimated weightings derived from regression models can be very unstable and sensitive to which other skill-activities are included. This instability problem occurs when there is a high degree of correlation between some skill-activities (i.e. multicollinearity). The generalisability of the estimated weightings will be improved by using larger samples that include multiple seasons and multiple leagues.

 

The final issue with win-contribution models estimated using statistical methods such as regression analysis is that the weightings reflect statistical predictive power. But is the value of a skill-activity as a statistical predictor of match outcomes the appropriate definition of the value of the win-contribution of that skill-activity? I do not think that we give enough explicit attention to this issue. Too often we only consider it indirectly when, for example, we try to resolve the problem of certain skill-activities having excessive weightings because of the sequential nature of game processes. Actions near the end of a sequence tend naturally to have much greater predictive power for the final outcome. Typically shots at goal is the best single predictor of goals scored while the goalkeeper’s save-shot ratio is the best single predictor of goals conceded. Using multilevel models is, when all is said and done, just an attempt to reduce the predictive power of these close-to-outcome skill-activities. The issue is of particular importance in low-scoring, more unpredictable team sports such as (association) football.

 

All of these issues with win-contribution models raise severe doubts about the usefulness of relying on a purely statistical approach such as linear regression both to identify the relevant skill-activities to be included in the player rating system, and to determine the appropriate weighting system to combine the selected skill-activities. As a result some player rating systems have tended to adopt a more “mixed-methods” approach combining statistical analysis and expert judgment. One example of this approach is my own STARS player rating system that I developed around 12 years ago, initially applied to the English Premiership and then subsequently recalibrated for the MLS. The STARS player (and team) ratings were central to the work I did for Billy Beane and the Oakland A’s ownership group on investigating the scope for data analytics in football. The STARS player rating system is summarised in the graphic below.

[Graphic: summary of the STARS player rating system]

Regression analysis was used to estimate a multilevel model which provided the basic weightings for the skill-activities within the five identified groupings. Expert judgment was used to decide which skill-activities to include, the functional form for the metrics, and the weightings used to combine the five groupings. Essentially this weighting scheme was based on a 4-4-2 formation with attack and defence groupings each weighted as 4/11, striking as 2/11, and goalkeeping as 1/11. (Negative contributions were reassigned to the attack and defence groupings.) Expert judgment was also used to determine the weightings of some skill-activities for which regression analysis proved unable to provide reliable estimates.

 

A very detailed account of the problems of constructing a win-contribution player rating system in football using regression analysis is provided by Ian McHale who developed the EA Sports Player Performance Index (formerly the Actim Index) for the English Premiership and Championship (see I. G. McHale, P. A. Scarf and D. E. Folker, Interfaces, July-August 2012). McHale’s experience is also discussed in David Sumpter’s recently published book, Soccermatics: Mathematical Adventures in the Beautiful Game (Bloomsbury Sigma, 2016), a must-read for all of us with an interest in applying mathematics and statistics to football. At the core of the EA Sports player rating system is a match-contribution model in which regression analysis is used to estimate a model of shots as a function of crosses, dribbles and passes as well as opposition defensive actions (interceptions, clearances and tackle-win ratio) and opposition discipline (yellow cards and red cards). The estimated model of shots is combined with shot effectiveness and then rescaled in terms of league points. In their 2012 article McHale and his co-authors report the top 20 Premiership players for season 2008/09 based on the match-contribution model and show that the list is dominated by goalkeepers (7) and defenders (11) with Fulham’s goalkeeper, Mark Schwarzer, topping the list. Only the Aston Villa midfielder, Gareth Barry (ranked 2nd), and the Chelsea striker, Nicolas Anelka (ranked 10th), break the goalkeeper-defender domination of the top ratings.

 

McHale deals with the problems of a purely statistical approach by adopting what I am calling a mixed-methods approach combining statistical analysis and expert judgement. The final version of the EA Sports Player Performance Index consists of a weighted combination of six separate indices. The match-contribution model has a weighting of only 25%. There are two indices based on minutes played which have a combined weighting of 50%, with most of that weighting (37.5%) allocated to the point-sharing index which takes into account the final league points of the player’s team, thereby increasing the rating of players playing for more successful teams. The other indices capture goal-scoring, assists and clean sheets and have a combined weighting of 25%. All the indices are measured in terms of league points. For comparison McHale reports the top 20 Premiership players for 2008/09 using the final index and finds that the list is now much more evenly distributed across playing positions, with Anelka now topping the list and Schwarzer ranked only 17th.

 

McHale’s mixed-methods approach is a great example of the problems faced by win-contribution player rating systems and how statistical analysis and expert judgment need to be combined to produce a credible player rating system. Credibility is absolutely fundamental to data analytics. Decision makers will ignore evidence that does not appear credible, and the use of sophisticated statistical techniques does not confer credibility on the analysis, often quite the opposite. McHale recognises that a purely statistical approach using predictive power to weight different skill-activities does not provide credible player ratings and, in consultation with his clients, introduces other performance metrics using expert judgment not statistical estimation.

 

I have one further concern over the credibility of player rating systems and that is the importance of transparency when the player ratings are to be used as an input for coaching, recruitment and remuneration decisions. This is not really an issue for McHale since the EA Sports Player Performance Index is primarily directed at the media and fans (although interestingly McHale shows that there was a very close match-up between a hypothetical England team based on the player ratings and England’s starting line-up in their first game in the 2010 World Cup Finals). McHale achieves credibility through a rigorous development process that produces ratings that “look right” to his clients and to the knowledgeable fan. But such a system of player ratings would have limited value for coaches because of the lack of immediate transparency. For example, it is not immediately clear how much of the difference between the ratings of two players is due to differences in the players’ own contributions and how much is due to differences between the league performances of their respective teams. Credibility for decision makers is not just about results that “look right”. At times the data analyst will throw up surprising results which “look wrong” by defying conventional wisdom but such surprises, if they can be substantiated, may provide a real source of competitive advantage. In such cases the rigour of the analysis is unlikely to be enough. The results will need to be transparent in the sense of being explicable to the decision maker in practical terms. A credible player ratings system for use by GMs, sporting directors and coaches requires that differences in player ratings are explicable simply but precisely as specific differences in player performance. My next post will set out a simple approach to constructing player rating systems to support coaching, recruitment and remuneration decisions.

The Problems of Estimating Win-Contributions in Football (Soccer)

Executive Summary

  • The practical problems of obtaining regression-based estimates of win-contributions are illustrated using data for the English Championship regular season in 2015/16.
  • The estimated regression models for both attacking and defensive play are subject to various problems – low explanatory power, “wrong” signs, statistical insignificance, and sequence effects.
  • But the ultimate problem with regression-based approaches to player rating systems is that they reflect statistical predictive power of individual skill-activities and this may not coincide with game importance from an expert coaching perspective.
  • My conclusion is that a regression-based approach to player rating systems in the invasion-territorial team sports is not recommended.

 

Developing a player rating system in the invasion-territorial team sports using win-contributions seems, at least in principle, a straightforward procedure involving two stages. The first stage is to estimate the team-level relationship between skill-activities and match outcomes in order to get the weightings to be applied to each type of contribution. The most obvious statistical procedure to use is multiple regression analysis. The second stage is to calculate the overall win-contributions of individual players as a linear combination of their skill-activity contributions using the weightings estimated in the first stage. But although seemingly a straightforward multivariate problem statistically, this approach is fraught with practical difficulties. Indeed I will argue that it is often so difficult to obtain an appropriate set of weightings that a regression-based approach to estimating player win-contributions is just not viable.

 

To demonstrate the difficulty of a regression-based player rating system in the invasion-territorial team sports, I am going to use football (soccer) and specifically data from the English Championship last season (2015/16). In the table below I have reported the results for four regression models estimated using Opta data for the 552 regular-season matches (i.e. 1,104 team performances). These four estimated regression models illustrate many of the problems that bedevil regression models of team performance in football.

 

The first issue is to decide on the appropriate measure of team performance. Using league points for individual matches would imply an outcome variable with only three possible values (win = 3, draw = 1, loss = 0) which is highly restrictive and not really amenable to linear regression. It would be more appropriate to use a form of limited dependent variable (LDV) estimation technique such as logistic regression. To avoid this problem I typically use goals scored and goals conceded as measures of attacking and defensive performance, respectively, estimating two separate regression models which can be combined subsequently. Given the low-scoring nature of football and the Poisson distribution of goals, linear regression remains a rather crude statistical tool but has the advantages of simplicity and ease of interpretation.

 

| Model                        | Attack (1)                | Attack (2)              | Attack (3)              | Defence                  |
| Outcome                      | Goals Scored              | Goals Scored            | Total Shots             | Goals Conceded           |
| Total Shots                  | 0.0805894 (0.006886)**    | 0.0698052 (0.006165)**  |                         |                          |
| Shot Accuracy                | 3.08113 (0.1916)**        | 3.43328 (0.1940)**      |                         |                          |
| Attempted Passes             | -0.000960860 (0.0005349)  |                         | 0.000752025 (0.002346)  | 0.000554711 (0.0006363)  |
| Pass Completion              | 0.842777 (0.6249)         |                         | 10.1551 (2.729)**       | -0.146641 (0.6778)       |
| Dribbles                     | 0.00485571 (0.005885)     |                         | 0.0466076 (0.02582)     |                          |
| Dribble Success Rate         | 0.141340 (0.1925)         |                         | 0.334019 (0.8458)       |                          |
| Open Play Crosses            | -0.0277967 (0.005021)**   |                         | 0.213755 (0.02090)**    |                          |
| Open Play Cross Success Rate | 0.251270 (0.2396)         |                         | 6.75974 (1.032)**       |                          |
| Attacking Duels              | -0.0109001 (0.002935)**   |                         | 0.0237501 (0.01283)     |                          |
| Attacking Duel Success Rate  | 0.365114 (0.3961)         |                         | 6.69654 (1.727)**       |                          |
| Yellow Cards                 | -0.0745656 (0.02191)**    |                         | -0.226067 (0.09603)*    | 0.00894283 (0.02583)     |
| Red Cards                    | -0.403386 (0.1045)**      |                         | -0.755729 (0.4587)      | 0.376279 (0.1230)**      |
| Total Clearances             |                           |                         |                         | -0.0126510 (0.003719)**  |
| Blocks                       |                           |                         |                         | -0.0103887 (0.01644)     |
| Interceptions                |                           |                         |                         | -0.0136031 (0.006029)*   |
| Defensive Duels              |                           |                         |                         | -0.00894992 (0.003265)** |
| Defensive Duel Success Rate  |                           |                         |                         | 1.69258 (0.4240)**       |
| Goodness of Fit (R2)         | 32.74%                    | 26.94%                  | 26.20%                  | 6.26%                    |

* = significant at 5% level; ** = significant at 1% level

 

The Attack (1) model uses goals scored as the outcome variable with five skill-activities – shots, passes, dribbles, crosses and attacking duels – plus two disciplinary metrics (yellow cards and red cards). The five skill-activities are each measured by two metrics – an activity-level metric (i.e. number of attempts) and an effectiveness-ratio metric (i.e. proportion of successful outcomes). So, for example, in the case of shots the activity-level metric is total shots and the effectiveness-ratio metric is shot accuracy (i.e. the proportion of shots on target).

 

The Attack (1) model exemplifies a number of the problems in using regression analysis to derive a set of weightings for player rating systems:

  • Low goodness of fit – the R2 statistic is only 32.7% indicating that less than a third of the variation in goals scored can be explained by the five skill-activities and discipline
  • “Wrong” signs – the estimated coefficients for attempted passes, open play crosses and attacking duels are all negative
  • Statistical insignificance – half of the estimated coefficients are not statistically different from zero
  • Sequence effects – most of the goodness of fit in the Attack (1) model is due to the two end-of-sequence metrics, total shots and shot accuracy. As the Attack (2) model shows, total shots and shot accuracy jointly account for 26.9% of the variation in goals scored.

Similar problems of wrong signs and statistical insignificance occur in the Defence model which only captures 6.3% of the variation in goals conceded across matches in part because no goalkeeping metrics have been included. But of course if goalkeeping metrics such as the saves-to-shots ratio are included, these dominate in much the same way as shooting metrics dominate estimated regression models of goals scored.

 

One solution to the problem that regression models will tend to attribute the highest weight to the end-of-sequence variables is to break the causal sequence into components to be estimated separately. The Attack (2) and Attack (3) models are an example of this approach with the Attack (2) model estimating the relationship between goals scored (final outcome) and shots (total shots and shot accuracy), and then the Attack (3) model estimating the relationship between total shots (intermediate outcome) and passes, dribbles, crosses, attacking duels and discipline. This approach resolves some of the problems encountered in the Attack (1) model. Although goodness of fit remains low with only 26.2% of the variation in total shots across matches explained by the Attack (3) model, all of the variables now have the expected signs so that attempted passes, open play crosses and attacking duels now have positive coefficients. In addition pass completion, open play cross success rate and attacking duel success rate are now statistically significant. But attempted passes, although now attributed a positive contribution, has a very small and statistically insignificant coefficient which reflects the underlying playing characteristic of the English Championship that ball possession has little predictive power for goals scored and match outcomes. And this remains the core problem with regression-based estimates of the weightings to be used in win-contributions player rating systems. Regression-based weightings reflect statistical predictive power not game importance. Ultimately I have been driven to the conclusion that regression-based player rating systems are not to be recommended for the invasion-territorial team sports. An alternative approach is the subject of my next post.

The Practical Problems of Constructing Win-Contribution Player Rating Systems in the Invasion-Territorial Sports

Executive Summary

  • Effective data-based assessment of individual player performance in team sports must resolve the three basic conceptual problems of separability, multiplicity and measurability. These problems are most acute in the invasion-territorial sports.
  • In statistical terms, the win-contribution approach to player rating systems can be seen as a multivariate problem of identifying and combining a set of skill-activity performance metrics to model team performance.
  • Regression analysis is the simplest statistical method for estimating the skill-activity weightings to be used in a win-contribution player ratings system with multiple skill-activities.
  • There are three practical problems widely encountered when using the regression method: (i) defining an appropriate measure of team performance; (ii) the skill-activity coefficients often have the wrong sign and/or are statistically insignificant; and (iii) the weightings reflect relative predictive power which may not necessarily coincide with the relative game importance of the specific skill-activity.

 

Evaluating individual player performance in team sports using a systematic data-based approach faces three basic conceptual problems:

 

  1. Separability – team performance needs to be decomposed into individual player performances but the degree of separability of individual player performances depends crucially on the basic game structure of the sport. Separability is highest in the striking-and-fielding sports such as baseball and cricket in which the core of the game is a one-to-one contest between the batter and pitcher/bowler. In the invasion-territorial sports such as the various codes of football, hockey and basketball the interdependency of player actions and the necessity for tactical coordination of players makes separability much more problematic.
  2. Multiplicity – if the game structure is such that individual players specialise in one specific skill-activity which is the dominant component of their performance (e.g. pitching and hitting in baseball with fielding treated as of only secondary importance) then evaluating player performance comes down to identifying the best metric to measure the specific skill-activity performance. However, particularly in many of the invasion-territorial sports, players undertake a multiplicity of skill-activities so that the evaluation of player performance requires finding the appropriate combination of a set of performance metrics.
  3.  Measurability – by definition, data-based player rating systems focus only on those aspects of player performance that are directly observable and measurable. To some this isn’t an issue and they will justify their position with the well-known dictum: “If you can’t measure it, you can’t manage it”. But this just isn’t true. Coaching and managing is about knowing the people for whom you are responsible and how they are performing, and learning how best to facilitate improvements in their performance. You are likely to be less effective as a coach and manager if you ignore available data on performance but likewise you will also be less effective if you focus only on the measurable aspects of performance. As always it is about using all the available evidence as best you can to improve performance. Motivation and resilience may not be directly observable and easily measurable but I doubt that there are many coaches who would argue that they are not important aspects of player performance.

As I have discussed in my previous post, there are two broad approaches to constructing player rating systems – the win-attribution approach and the win-contribution approach. The win-attribution approach, principally plus-minus scores, effectively finesses all three conceptual problems – separability, multiplicity and measurability – by focusing on outcome not process, and attributing the match score pro rata based on players’ game time. By contrast, the win-contribution approach focuses on the process of how the team performance is generated by individual player performance. And as a consequence, the win-contribution approach has to deal with the separability, multiplicity and measurability problems. Ultimately it comes down to:

  • Identifying the appropriate set of specific skill-activity performance metrics; and
  • Determining the best way of combining this set of performance metrics particularly the weightings to be used to produce an overall composite index of player performance

 

From a statistical perspective the win-contribution approach to player rating systems is just a standard multivariate problem of determining the relationship between team performance (the outcome) and the aggregate contributions of players by skill-activity (the predictors). The simplest approach is to estimate a linear regression model of team performance:

Team Performance = a + b1P1 + b2P2 + … + bkPk + u

where

P1, P2, …, Pk = skill-activity metrics (team totals)

b1, b2, …, bk = skill-activity weightings

a = intercept

u = random error term capturing non-systematic influences on team performance

The estimated regression coefficients can then be used to combine the skill-activity metrics for individual players to produce an overall measure of player performance.
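As a minimal sketch of the two stages in code (Python with pandas and statsmodels, which is a tooling assumption rather than anything prescribed here), suppose `team_matches` holds one row per team-match with the outcome and the skill-activity team totals, and `players` holds one row per player with the same metric columns:

```python
import pandas as pd
import statsmodels.api as sm

def fit_weights(team_matches: pd.DataFrame, outcome: str, metrics: list) -> pd.Series:
    """Stage 1: regress the team outcome (e.g. goals scored per match) on the team
    totals of each skill-activity metric and return the estimated weightings b1..bk."""
    X = sm.add_constant(team_matches[metrics])     # adds the intercept a
    model = sm.OLS(team_matches[outcome], X).fit()
    print(model.summary())                         # inspect signs, significance and R-squared
    return model.params.drop("const")

def rate_players(players: pd.DataFrame, weights: pd.Series) -> pd.Series:
    """Stage 2: combine each player's skill-activity totals using the estimated weightings."""
    return players[list(weights.index)].mul(weights, axis=1).sum(axis=1)
```

For defensive performance the same function can be re-run with goals conceded as the outcome and the defensive metrics as predictors, and the two sets of weightings combined afterwards (the approach discussed under Practical Problem 1 below).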

 

In principle regression analysis offers a very straightforward method of creating a win-contribution player rating system for the invasion-territorial sports. However there are a number of practical problems in implementing the method to produce meaningful and useful player ratings.

 

Practical Problem 1: Defining an appropriate measure of team performance

This is not as straightforward as it might seem. If the regression model of team performance is to be estimated using season totals in a league then total league points or win-percentage are the obvious outcome measures to use but it is highly likely that data for several seasons will need to be combined in order to have enough degrees of freedom if you intend to use a large number of skill-activity metrics. The alternative approach is to use individual match data. In this case using a measure of match outcome is too restrictive. You run into all of the usual problems associated with limited dependent variable (LDV) models and are better advised to use logistic regression (or related approaches) rather than linear regression. If you want to keep using linear regression with individual match data, it is better to model team performance using scores, either a single model of the final margin or two separate models of scores for and scores against. In my work on player ratings in rugby union and rugby league, I have used individual match data and estimated two separate models for points scored and points conceded, and then combined these two models to create a model of the final margin. I found that this worked better than just estimating a single model for the final margin and seemed better able to identify the impact of different skill-activity metrics. Of course any score-based approach is more problematic in (association) football because it is such a low-scoring sport. I still tend to use goals scored and goals conceded as my outcome measures but I have also used own and opposition shots on target as outcome measures.

 

Practical Problem 2: The estimated regression coefficients may have the “wrong” sign and/or be statistically insignificant

When regression models of team performance are estimated it is more likely than not that several of the skill-activity metrics will have coefficients with the “wrong” sign and/or coefficients that are not statistically significantly different from zero. There are two common reasons for wrong signs and/or statistical insignificance. First, skill-activity metrics usually suffer from a multicollinearity problem where individual variables are highly correlated with each other either directly (i.e. simple bivariate correlations) or in linear combinations. For example, teams which defend more and make more tackles also tend to make more interceptions, clearances and blocks. High levels of multicollinearity can make estimated coefficients unstable including being more prone to switching sign, as well as being more imprecise (i.e. higher standard errors) and hence more likely to be statistically insignificant. Another reason for wrong signs is that some skill-activity variables may be acting as a proxy for opposition skill-activities. For example, more defending partly reflects more attacking play by the opposition, and the more the opposition attacks, the more goals are likely to be conceded. As a consequence, defensive variables may be positively correlated with goals conceded even though more (and better) defending should be negatively correlated with goals conceded.
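A quick diagnostic for the multicollinearity problem is the variance inflation factor (VIF) of each metric. The sketch below is a minimal illustration using Python and statsmodels, assuming a DataFrame of team-level skill-activity metrics; the 5-10 threshold mentioned in the comment is only a conventional rule of thumb.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_table(metrics: pd.DataFrame) -> pd.Series:
    """Variance inflation factor for each skill-activity metric; values well above
    roughly 5-10 are a conventional warning sign of damaging multicollinearity."""
    X = sm.add_constant(metrics)          # include an intercept, as in the team model
    return pd.Series(
        [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
        index=metrics.columns,
    )
```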

 

Practical Problem 3: Regression coefficients define the relative importance of contributions purely in terms of predictive power

Ultimately regression analysis is a technique for finding the linear combination of a set of variables that can provide the best predictions of the outcome variable. So the estimated coefficients are indicative of the relative predictive power of each variable. However predictive power does not necessarily equate to the relative game importance of contributions when you are dealing with processes comprising a sequence of different skill-activities. For example, in football the best predictor of goals scored is shots on target inside the box and so inevitably in any linear regression model of goals scored, the number of shots on target (especially inside the box) will have the highest weighting. But of course shots depend on passing and moving the ball forward successfully to create shooting opportunities, all of which in turn depends on winning possession of the ball in the first place. But all of these skill-activities provide much less predictive power for goals scored because they are further back in the causal chain. Similarly when it comes to goals conceded the dominant predictor is the goalkeeper’s saves per shot ratio but the number of opposition shots allowed depends on defensive play such as tackles, interceptions, clearances and blocks. Defensive play is critical as a contribution to match success but statistically will always tend to be treated as of only secondary importance as a predictor of match outcomes. One way around this within the linear regression method is to estimate hierarchical models to capture the sequential nature of the game.
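One way to make the hierarchical idea concrete is to chain two regressions: build-up metrics predict total shots, and total shots (together with shot accuracy) predict goals, so each build-up metric receives an implied goal weighting equal to its shot coefficient scaled by the goals-per-shot coefficient. The sketch below is only my illustration of that idea in Python/statsmodels; the column names are hypothetical and the way the two stages are combined is an assumption, not a prescribed method.

```python
import pandas as pd
import statsmodels.api as sm

def two_stage_weights(team_matches: pd.DataFrame, build_up_metrics: list) -> pd.Series:
    """Stage A: total shots as a function of build-up play (passes, crosses, duels, ...).
    Stage B: goals scored as a function of total shots and shot accuracy.
    Each build-up metric's implied goal weighting is its shot coefficient
    multiplied by the goals-per-shot coefficient from stage B."""
    stage_a = sm.OLS(team_matches["total_shots"],
                     sm.add_constant(team_matches[build_up_metrics])).fit()
    stage_b = sm.OLS(team_matches["goals_scored"],
                     sm.add_constant(team_matches[["total_shots", "shot_accuracy"]])).fit()
    goals_per_shot = stage_b.params["total_shots"]
    implied = stage_a.params.drop("const") * goals_per_shot
    return pd.concat([stage_b.params.drop("const"), implied])
```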

 

Despite the practical problems, it may still be possible to use the regression method to produce a meaningful and useful player rating system. After estimating the initial regression model of team performance using the basic skill-activity metrics, it is vital to undertake a specification search to find a model with better properties, specifically statistically significant coefficients with the “correct” signs as well as good diagnostics (i.e. random residual variation). The specification search may involve the use of different functional forms such as logarithms and quadratics. It can also involve the transformation of the basic skill-activity metrics. For example, suppose you have data on the total number of successful passes and the total number of unsuccessful passes. Instead of using the data in this form, it might be better to transform the two variables into a total activity measure (i.e. the total number of attempted passes = successful passes + unsuccessful passes) and a success rate (successful passes as a % of attempted passes). A more radical solution would be to use factor analysis to reconstruct the original set of metrics into a smaller set of factors based on the collinearity between the initial variables.
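
Both kinds of transformation are straightforward to implement. The sketch below (again synthetic data and hypothetical metric names) recasts the pass counts as a volume measure plus a success rate, and then uses factor analysis to reduce a correlated set of metrics to two latent factors whose scores could replace the raw metrics in the regression.

    # A minimal sketch of transforming skill-activity metrics: synthetic data,
    # hypothetical metric names, purely for illustration.
    import numpy as np
    import pandas as pd
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(2)
    n = 300
    df = pd.DataFrame({
        "successful_passes": rng.poisson(350, n),
        "unsuccessful_passes": rng.poisson(80, n),
        "tackles": rng.poisson(18, n),
        "interceptions": rng.poisson(12, n),
    })

    # 1. Recast raw pass counts as a volume measure plus a success rate
    df["attempted_passes"] = df["successful_passes"] + df["unsuccessful_passes"]
    df["pass_success_rate"] = df["successful_passes"] / df["attempted_passes"]

    # 2. More radically, reduce a correlated set of metrics to a smaller number
    #    of latent factors whose scores can be used as regressors instead
    metrics = ["attempted_passes", "pass_success_rate", "tackles", "interceptions"]
    fa = FactorAnalysis(n_components=2, random_state=0)
    factor_scores = fa.fit_transform(df[metrics])
    print(fa.components_)  # loadings of each metric on the two factors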

 

The best way forward, as with all practical problems, is to investigate alternatives and find out what works best in a specific context. So, in that spirit, my next post will explore alternative regression-based player rating systems to identify the “best” outfield players in the Football League Championship last season.

 

29th September 2016

Player Rating Systems in the Invasion-Territorial Team Sports – What are the Issues?

Executive Summary

  • Player rating systems are important as a means of summarising the overall match performance of individual players.
  • Player performance in the invasion-territorial team sports is multi-dimensional so that a player rating system needs to be able to combine metrics for a number of skill-activities.
  • There are two broad approaches to the construction of player rating systems – win-attribution (or top-down/holistic) approaches and win-contribution (or bottom-up/atomistic) approaches.
  • The plus-minus approach is the most widely used win-attribution approach based on the points margin when a player is playing.
  • A player’s plus-minus score is sensitive to context especially the quality of team mates and opponents. This can be controlled using regression analysis to estimate adjusted plus-minus scores.
  • The plus-minus approach offers a relatively simple way of measuring player performance without the need for detailed player performance data. But the approach works best in high scoring sports with frequent player switches such as basketball and ice hockey.

 

A central issue in sports analytics is the construction of player rating systems particularly in the invasion-territorial team sports. Player rating systems are important as a means of summarising the overall match performance of individual players. Teams can use player rating systems to review performances of their own players as well as tracking the performance levels of potential acquisitions. Moneyball highlighted the possibilities of using performance metrics to inform player recruitment decisions. But the relatively simple game structure of baseball, in essence a series of one-to-one contests between hitters and pitchers, means that the analytical problem is reduced to finding the best metrics to capture hitting and pitching performances.

 

Once we move into invasion-territorial team sports, we are dealing with sports which involve the tactical coordination of players and player performance becomes multi-dimensional. The analytical problem is no longer restricted to identifying the best metric for a single skill-activity per player (i.e. pitching or hitting in baseball) but now involves identifying the full set of relevant skill-activities and creating appropriate metrics for each identified skill-activity.

 

There are essentially two broad approaches to constructing player rating systems when player performances are multi-dimensional. One approach is the win-contribution (or bottom-up or atomistic) approach which involves identifying all of the relevant skill-activities that contribute to the team’s win ratio, developing appropriate metrics for each of these skill-activities, and then combining the set of skill-activity metrics into a single composite measure of performance. Over the years many technical and practical problems have emerged in constructing win-contribution player rating systems. I plan to discuss these in more detail in a future blog. Suffice to say, the most general criticism of the win-contribution approach is the difficulty of identifying all of the relevant skill-activities particularly those that are not directly and/or easily observable such as teamwork and resilience.

 

The alternative approach is a more holistic or top-down approach that uses the match outcome as the ultimate summary metric for measuring team performance and then attributes the match outcome to those involved in its production. I call this the win-attribution approach to player rating systems. The analytical problem is now the choice of an attribution rule.

 

Plus-Minus Player Ratings

The best-known win-attribution approach is plus-minus which has been used for many years in both basketball and ice hockey. It is a very simple method. Just total up the points scored and the points conceded whenever a specific player is on court (or on the ice), and then subtract points conceded from points scored to give the points margin. This represents the player’s plus-minus score.

 

For those of you not familiar with the plus-minus approach, here’s a simple example. Consider the following fictitious data for the first three games of a basketball team with a roster of 10 players.

The results of the three games are:

Game 1: Won, 96 – 73

Game 2: Lost, 68 – 102

Game 3: Won, 109 – 57

The minutes played (Mins) for each player, and points scored (PS) and points conceded (PC) while each player is on court, are as follows:

 

Player    Game 1              Game 2              Game 3
          Mins   PS   PC      Mins   PS   PC      Mins   PS   PC
P1          32   54   58        28   35   64        12   27   18
P2          29   63   45        25   33   56        13   30   21
P3          27   48   43        20   36   47        13   29   23
P4          33   58   52        27   32   63        15   33   22
P5          35   63   54        36   37   82        25   54   33
P6          22   49   24        28   44   43        33   72   30
P7          20   45   20        22   35   37        35   76   32
P8          16   37   27        24   38   51        33   77   36
P9          15   35   23        23   36   50        35   82   38
P10         11   28   19         7   14   17        26   65   32

 

A player’s plus-minus score is just the points margin (= PS – PC). So in the case of player P1 in Game 1, he was on court for 32 minutes during which time 54 points were scored and 58 points were conceded. Hence his plus-minus score is -4 (= 54 – 58). Given that the team won the game with a points margin of 23, the plus-minus score indicates a well below average performance. The full set of plus-minus scores is as follows:

 

Player    Plus-Minus Scores                        Average      Benchmark
          Game 1   Game 2   Game 3   Total         Benchmark    Deviation
P1            -4      -29        9     -24           8.50        -32.50
P2            18      -23        9       4          10.27         -6.27
P3             5      -11        6       0          12.85        -12.85
P4             6      -31       11     -14          12.94        -26.94
P5             9      -45       21     -15          18.35        -33.35
P6            25        1       42      68          26.46         41.54
P7            25       -2       44      67          31.92         35.08
P8            10      -13       41      38          26.42         11.58
P9            12      -14       44      42          28.81         13.19
P10            9       -3       33      39          28.48         10.52

 

As well as the plus-minus scores for each player in each game, I have also reported the total plus-minus score for each player over the three games. I have also calculated an average benchmark for each player by allocating the final points margin for each game pro rata based on minutes played. So, for example, player P1 played 32 out of 48 minutes in Game 1 which ended with a 23 winning margin. An average performance would have implied a plus-minus score of 15.33 (= 23 x 32/48). His average benchmarks in Games 2 and 3 were -19.83 (= -34 x 28/48) and 13.00 (= 52 x 12/48), respectively. Summing the average benchmarks for each game gives an overall average benchmark of 8.50 for player P1. The final column reports the deviation from benchmark of the player’s actual plus-minus score.
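
For anyone who wants to reproduce the arithmetic, the short sketch below recalculates the plus-minus score, pro-rata benchmark and benchmark deviation for players P1 and P6 from the data in the first table; the same logic extends to the full roster.

    # Plus-minus, pro-rata benchmark and benchmark deviation for two players
    # from the worked example above (P1 and P6 only, to keep the sketch short).
    game_margins = [23, -34, 52]   # final margins of Games 1-3
    game_length = 48               # minutes per game

    players = {
        # player: [(minutes, points scored, points conceded) for each game]
        "P1": [(32, 54, 58), (28, 35, 64), (12, 27, 18)],
        "P6": [(22, 49, 24), (28, 44, 43), (33, 72, 30)],
    }

    for name, games in players.items():
        plus_minus = sum(ps - pc for _, ps, pc in games)
        benchmark = sum(margin * mins / game_length
                        for (mins, _, _), margin in zip(games, game_margins))
        print(f"{name}: plus-minus {plus_minus:+d}, benchmark {benchmark:.2f}, "
              f"deviation {plus_minus - benchmark:+.2f}")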

 

In this example players P1 – P5 were given the most game time in Games 1 and 2 but all five players have negative benchmark deviations. The allocation of game time in Game 3 better reflects the benchmark deviations with players P6 – P10 given much more game time.

 

Limitations and Extensions to Plus-Minus Player Ratings

The advantage of the plus-minus approach is its simplicity. It is not dependent on detailed player performance data but only requires information on the starting line-ups, the timing of player switches, and the timing of points scored and conceded. The very first piece of work that I did for Saracens in March 2010 was to rate their players using a plus-minus approach. I focused on positional combinations – front row, locks, back row, half backs, centres, and backs – and calculated the plus-minus scores for each combination. Brendan Venter, the Director of Rugby, was very positive about the results and commented that “your numbers correspond to our intuitions”. It was on the basis of this report that I was engaged to work as their data analyst for five years. The plus-minus approach was used for player ratings in the early stages of the 2010/11 season but was eventually discarded in favour of a win-contribution approach.

 

One of the problems with the simple plus-minus approach is that it will give high scores to players who regularly play with very good players. So, if a particular player was fortunate enough to be playing regularly alongside Michael Jordan, they would have had a high plus-minus score but this reflects the exceptional ability of their team mate more than their own performance. My dear friend, the late Trevor Slack, one of the top people in sport management and a prof at the University of Alberta in Edmonton, used to call it the Wayne Gretzky effect. Those of you who know their ice hockey history will know exactly what Trevor meant. Gretzky was one of the true greats of the NHL and brought the best out of his team mates whenever he was on the ice. The Edmonton Oilers won four Stanley Cups with Gretzky in the 1980s.

 

Similarly it can be argued that the basic plus-minus approach does not make any allowance for the quality of the opposing players. Rookie players given more game time against weaker opponents will have their plus-minus scores inflated, just as players who get proportionately more game time against stronger opponents will see their plus-minus scores reduced. One way around the problems of controlling for the quality of team mates and opponents is to use Adjusted Plus-Minus, which involves using regression analysis to model the points margin during a “stint” (i.e. a time interval when no player switches are made) as a function of the own and opposing players on court. The estimated coefficients represent the adjusted plus-minus scores. There have also been various attempts to include other performance data to create real adjusted plus-minus scores which represent a hybrid of the win-attribution and win-contribution approaches.
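
A minimal sketch of the adjusted plus-minus idea is given below. Each row is a stint, the dependent variable is the points margin in that stint, and each player enters as +1 when on court for the team and -1 when on court for the opposition; the data are toy values purely for illustration, and ridge regression (a common choice in practice) keeps the estimates stable when the same line-ups repeatedly appear together.

    # A minimal sketch of adjusted plus-minus with toy stint data: each player
    # indicator is +1 (on court for the team), -1 (on court for the opposition)
    # or 0 (not on court); the target is the points margin in the stint.
    import pandas as pd
    from sklearn.linear_model import Ridge

    stints = pd.DataFrame({
        "A": [1, 1, 0, -1, 0],
        "B": [1, 0, 1, 0, -1],
        "C": [0, -1, -1, 1, 1],
        "margin": [6, -2, 3, -4, 1],
    })

    X = stints[["A", "B", "C"]]
    model = Ridge(alpha=1.0).fit(X, stints["margin"])

    # The estimated coefficients are the adjusted plus-minus scores,
    # net of who else was on court during each stint.
    adjusted_pm = pd.Series(model.coef_, index=X.columns)
    print(adjusted_pm)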

 

Overall the plus-minus approach offers a relatively simple way of measuring player performance without the need for detailed player performance data. But the approach works best in high scoring sports with frequent player switches such as basketball and ice hockey. The plus-minus approach is not well suited to football (soccer) which is low scoring and teams are restricted to only three substitutions.

 

15th September 2016