The Problems of Estimating Win-Contributions in Football (Soccer)

Originally Written: October 2016

Executive Summary

  • The practical problems of obtaining regression-based estimates of win-contributions are illustrated using data for the English Championship regular season in 2015/16.
  • The estimated regression models for both attacking and defensive play are subject to various problems – low explanatory power, “wrong” signs, statistical insignificance, and sequence effects.
  • But the ultimate problem with regression-based approaches to player rating systems is that they reflect the statistical predictive power of individual skill-activities, which may not coincide with game importance from an expert coaching perspective.
  • My conclusion is that a regression-based approach to player rating systems in the invasion-territorial team sports is not recommended.

Developing a player rating system in the invasion-territorial team sports using win-contributions seems, at least in principle, a straightforward procedure involving two stages. The first stage is to estimate the team-level relationship between skill-activities and match outcomes in order to get the weightings to be applied to each type of contribution. The most obvious statistical procedure to use is multiple regression analysis. The second stage is to calculate the overall win-contributions of individual players as a linear combination of their skill-activity contributions using the weightings estimated in the first stage. But although this seems a straightforward multivariate problem statistically, the approach is fraught with practical difficulties. Indeed, I will argue that it is often so difficult to obtain an appropriate set of weightings that a regression-based approach to estimating player win-contributions is just not viable.
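The two stages can be sketched in a few lines of Python. This is only a minimal illustration on synthetic data, not the Opta data used in this post: the three metric columns, the "assumed" weights used to generate the data, and the two players are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1: team-level regression on SYNTHETIC data (hypothetical metrics).
n_team_performances = 200
team_activity = rng.normal(size=(n_team_performances, 3))
assumed_weights = np.array([0.5, 0.1, -0.2])  # made up, used only to generate data
team_outcome = (
    1.0
    + team_activity @ assumed_weights
    + rng.normal(scale=0.1, size=n_team_performances)
)

# Ordinary least squares with an intercept column.
design = np.column_stack([np.ones(n_team_performances), team_activity])
coefficients, *_ = np.linalg.lstsq(design, team_outcome, rcond=None)
intercept, weights = coefficients[0], coefficients[1:]

# Stage 2: each player's contribution is a linear combination of that
# player's own skill-activity tallies, using the stage-1 weights.
player_activity = np.array([
    [4.0, 30.0, 2.0],  # hypothetical player A
    [1.0, 55.0, 6.0],  # hypothetical player B
])
player_contributions = player_activity @ weights
print(player_contributions)
```

Note that the intercept is left out of the player-level sum in this sketch; how (or whether) to share it out among players is a separate design choice.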

To demonstrate the difficulty of a regression-based player rating system in the invasion-territorial team sports, I am going to use football (soccer) and specifically data from the English Championship last season (2015/16). In the table below I have reported the results for four regression models estimated using Opta data for the 552 regular-season matches (i.e. 1,104 team performances). These four estimated regression models illustrate many of the problems that bedevil regression models of team performance in football.

The first issue is to decide on the appropriate measure of team performance. Using league points for individual matches would imply an outcome variable with only three possible values (win = 3, draw = 1, loss = 0), which is highly restrictive and not really amenable to linear regression. It would be more appropriate to use a limited dependent variable (LDV) estimation technique such as logistic regression. To avoid this problem I typically use goals scored and goals conceded as measures of attacking and defensive performance, respectively, estimating two separate regression models which can be combined subsequently. Given the low-scoring nature of football and the approximately Poisson distribution of goals, linear regression remains a rather crude statistical tool, but it has the advantages of simplicity and ease of interpretation.
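In symbols (the notation here is introduced purely for illustration), the two linear regressions for team performance i can be written with x_ik as the k-th attacking metric and z_im as the m-th defensive metric. The third line shows, for comparison, the form a Poisson regression of goals scored would take.

```latex
\begin{align}
\text{GoalsScored}_i   &= \alpha_0 + \sum_{k} \alpha_k x_{ik} + u_i \\
\text{GoalsConceded}_i &= \beta_0 + \sum_{m} \beta_m z_{im} + v_i \\
\log \mathbb{E}\left[\text{GoalsScored}_i \mid x_i\right] &= \gamma_0 + \sum_{k} \gamma_k x_{ik}
\end{align}
```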

| Model                        | Attack (1)               | Attack (2)             | Attack (3)             | Defence                  |
|------------------------------|--------------------------|------------------------|------------------------|--------------------------|
| Outcome                      | Goals Scored             | Goals Scored           | Total Shots            | Goals Conceded           |
| Total Shots                  | 0.0805894 (0.006886)**   | 0.0698052 (0.006165)** |                        |                          |
| Shot Accuracy                | 3.08113 (0.1916)**       | 3.43328 (0.1940)**     |                        |                          |
| Attempted Passes             | -0.000960860 (0.0005349) |                        | 0.000752025 (0.002346) | 0.000554711 (0.0006363)  |
| Pass Completion              | 0.842777 (0.6249)        |                        | 10.1551 (2.729)**      | -0.146641 (0.6778)       |
| Dribbles                     | 0.00485571 (0.005885)    |                        | 0.0466076 (0.02582)    |                          |
| Dribble Success Rate         | 0.141340 (0.1925)        |                        | 0.334019 (0.8458)      |                          |
| Open Play Crosses            | -0.0277967 (0.005021)**  |                        | 0.213755 (0.02090)**   |                          |
| Open Play Cross Success Rate | 0.251270 (0.2396)        |                        | 6.75974 (1.032)**      |                          |
| Attacking Duels              | -0.0109001 (0.002935)**  |                        | 0.0237501 (0.01283)    |                          |
| Attacking Duel Success Rate  | 0.365114 (0.3961)        |                        | 6.69654 (1.727)**      |                          |
| Yellow Cards                 | -0.0745656 (0.02191)**   |                        | -0.226067 (0.09603)*   | 0.00894283 (0.02583)     |
| Red Cards                    | -0.403386 (0.1045)**     |                        | -0.755729 (0.4587)     | 0.376279 (0.1230)**      |
| Total Clearances             |                          |                        |                        | -0.0126510 (0.003719)**  |
| Blocks                       |                          |                        |                        | -0.0103887 (0.01644)     |
| Interceptions                |                          |                        |                        | -0.0136031 (0.006029)*   |
| Defensive Duels              |                          |                        |                        | -0.00894992 (0.003265)** |
| Defensive Duel Success Rate  |                          |                        |                        | 1.69258 (0.4240)**       |
| Goodness of Fit (R²)         | 32.74%                   | 26.94%                 | 26.20%                 | 6.26%                    |

* = significant at 5% level; ** = significant at 1% level
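As a rough check on how the significance markers relate to the reported figures, the sketch below recomputes them for a handful of entries copied from the table. It rests on two assumptions that the table itself does not state: that the figures in parentheses are standard errors, and that large-sample normal critical values (1.960 and 2.576) are appropriate for 1,104 team performances.

```python
# (coefficient, figure in parentheses, marker reported in the table)
reported = {
    "Attack (1) Total Shots": (0.0805894, 0.006886, "**"),
    "Attack (1) Attempted Passes": (-0.000960860, 0.0005349, ""),
    "Attack (3) Yellow Cards": (-0.226067, 0.09603, "*"),
    "Defence Interceptions": (-0.0136031, 0.006029, "*"),
    "Defence Defensive Duels": (-0.00894992, 0.003265, "**"),
}


def stars(coefficient, standard_error):
    """Return the marker implied by a two-sided large-sample normal test."""
    t_ratio = abs(coefficient / standard_error)
    if t_ratio > 2.576:  # 1% critical value
        return "**"
    if t_ratio > 1.960:  # 5% critical value
        return "*"
    return ""


for name, (coef, se, marker) in reported.items():
    implied = stars(coef, se)
    print(f"{name}: t = {coef / se:.2f}, implied '{implied}', reported '{marker}'")
```

Under those assumptions, the implied markers agree with the reported ones for these five entries.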

The Attack (1) model uses goals scored as the outcome variable with five skill-activities – shots, passes, dribbles, crosses and attacking duels – plus two disciplinary metrics (yellow cards and red cards). The five skill-activities are each measured by two metrics – an activity-level metric (i.e. number of attempts) and an effectiveness-ratio metric (i.e. proportion of successful outcomes). So, for example, in the case of shots the activity-level metric is total shots and the effectiveness-ratio metric is shot accuracy (i.e. the proportion of shots on target).

The Attack (1) model exemplifies a number of the problems in using regression analysis to derive a set of weightings for player rating systems:

  • Low goodness of fit – the R² statistic is only 32.7%, indicating that less than a third of the variation in goals scored can be explained by the five skill-activities and discipline
  • “Wrong” signs – the estimated coefficients for attempted passes, open play crosses and attacking duels are all negative
  • Statistical insignificance – half of the estimated coefficients are not statistically different from zero
  • Sequence effects – most of the goodness of fit in the Attack (1) model is due to the two end-of-sequence metrics, total shots and shot accuracy. As the Attack (2) model shows, total shots and shot accuracy jointly account for 26.9% of the variation in goals scored.
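The size of the sequence effect can be read straight off the two reported R² figures. The short calculation below is only a sketch: because the regressors are correlated, the split depends on the shooting metrics being entered first and is not a unique decomposition of the fit.

```python
r2_attack_1 = 32.74  # all five skill-activities plus discipline (%)
r2_attack_2 = 26.94  # total shots and shot accuracy only (%)

# Extra fit gained by adding passes, dribbles, crosses, duels and cards
# on top of the two shooting metrics.
incremental_points = r2_attack_1 - r2_attack_2

# Fit of the shots-only model relative to the full Attack (1) model.
share_from_shots = r2_attack_2 / r2_attack_1

print(f"Added by non-shooting metrics: {incremental_points:.2f} percentage points")
print(f"Shots-only fit as a share of Attack (1) fit: {share_from_shots:.0%}")
```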

Similar problems of wrong signs and statistical insignificance occur in the Defence model, which captures only 6.3% of the variation in goals conceded across matches, in part because no goalkeeping metrics have been included. But if goalkeeping metrics such as the saves-to-shots ratio are included, they dominate in much the same way as shooting metrics dominate the estimated regression models of goals scored.

One solution to the tendency of regression models to attribute the highest weights to end-of-sequence variables is to break the causal sequence into components that are estimated separately. The Attack (2) and Attack (3) models are an example of this approach. The Attack (2) model estimates the relationship between goals scored (the final outcome) and shots (total shots and shot accuracy); the Attack (3) model then estimates the relationship between total shots (an intermediate outcome) and passes, dribbles, crosses, attacking duels and discipline.

This approach resolves some of the problems encountered in the Attack (1) model. Although goodness of fit remains low, with only 26.2% of the variation in total shots across matches explained by the Attack (3) model, all of the variables now have the expected signs: attempted passes, open play crosses and attacking duels all have positive coefficients. In addition, pass completion, open play cross success rate and attacking duel success rate are now statistically significant. But attempted passes, although now attributed a positive contribution, has a very small and statistically insignificant coefficient, which reflects an underlying playing characteristic of the English Championship: ball possession has little predictive power for goals scored and match outcomes.

And this remains the core problem with regression-based estimates of the weightings to be used in win-contributions player rating systems. Regression-based weightings reflect statistical predictive power, not game importance. Ultimately I have been driven to the conclusion that regression-based player rating systems are not to be recommended for the invasion-territorial team sports. An alternative approach is the subject of my next post.
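To make the two-step decomposition described above concrete, one possible way of chaining the estimates is to multiply each Attack (3) coefficient (extra shots per unit of activity) by the Attack (2) coefficient on total shots (goals per extra shot). This is only a sketch of one way to combine the two models, not necessarily how the weightings would be combined in practice: it holds shot accuracy fixed and ignores the effectiveness-ratio metrics and the statistical insignificance of some coefficients.

```python
# Attack (2): goals scored per additional shot, holding shot accuracy fixed.
goals_per_shot = 0.0698052

# Attack (3): additional total shots per unit of each activity-level metric.
shots_per_unit = {
    "Attempted Passes": 0.000752025,
    "Dribbles": 0.0466076,
    "Open Play Crosses": 0.213755,
    "Attacking Duels": 0.0237501,
}

# Chained weighting: implied goals per unit of each activity.
implied_goals_per_unit = {
    metric: shots * goals_per_shot for metric, shots in shots_per_unit.items()
}

for metric, goals in implied_goals_per_unit.items():
    print(f"{metric}: {goals:.5f} goals per additional unit")
```

On these figures an additional open play cross is worth roughly 0.015 goals, while an additional attempted pass is worth close to nothing, which illustrates the point about predictive power versus game importance.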