Improving Performance Ratio Analysis Part 2: The Structured Hierarchy Approach

 

Executive Summary
• Practitioners are often criticised for using performance ratios in a piecemeal and fragmented fashion
• The problem is often compounded by more holistic approaches such as the balanced scorecard, which encourage the reporting of a very diverse set of performance metrics, often with little understanding of their interdependencies and links to strategic goals
• The structured hierarchy approach provides a systematic framework for the forensic investigation of performance trends and benchmark comparisons
• The structured hierarchy approach can use a formalised mathematical structure, but this is not necessary and may not be appropriate in some contexts
• The structured hierarchy approach requires that the analyst has a clear understanding of the overall structure of the process being analysed and of how the performance ratios relate to the various components of the process
• Statistically significant differences between ratios at one level are not necessarily evident at other levels

As well as ignoring the various statistical and other methodological issues with performance ratio analysis (discussed in Part 1), practitioners have also been criticised for using performance ratios in a very piecemeal and fragmented fashion. Again this has been a very common criticism of financial ratio analysis. And in some ways the problem was made worse by the response to another criticism: that there is too much emphasis on financial performance in assessing business performance. This led Kaplan and Norton to propose the balanced scorecard approach, in which four dimensions of business performance are identified – financial, customer, business processes, and learning and growth – with businesses encouraged to monitor a set of KPIs for each dimension. Although the need for a more holistic approach to performance is well taken, in practice the balanced scorecard has just compounded the problem, leading to a greater range of performance metrics being reported but still used in a very piecemeal and fragmented fashion. Ittner and Larcker, in particular, have been very critical of the balanced scorecard approach, arguing that the performance metrics are seldom linked to the strategic goals of a business, that the supposed links between the metrics and overall performance are not validated and tend to be articles of faith rather than evidence-based, and that, as a consequence, the balanced scorecard does not lead to the right performance targets being set.
But again the problem of a fragmented approach to performance ratio analysis has been recognised in finance, where a more structured approach has been adopted by some, often referred to as the Du Pont system in recognition of the chemical conglomerate that first popularised it. Others have called it the pyramid-of-ratios approach. The basic idea is to take an overall performance ratio and decompose it into constituent ratios. For example, the return on assets (ROA) is calculated as the ratio of profit to assets. ROA can be decomposed into two constituent ratios – asset turnover (= sales/assets) and profit margin (= profit/sales). These two ratios capture the two fundamentals of any business – the ability to “sweat the assets” to generate sales (measured by asset turnover) and the ability to extract profit from those sales (measured by the profit margin). So if you want to understand changes in a company’s ROA over time, or to explain differences in ROA between companies, you can use this structured approach to determine whether the changes/differences are due mainly to asset turnover, which reflects external market conditions, or to the profit margin, which reflects internal production conditions. The simple ROA pyramid is summarised in Figure 1, with a worked numerical sketch below it. The pyramid can be extended in both directions: upwards by relating ROA to other rates of return, and downwards by further decomposing asset turnover and profit margin.
Figure 1: The ROA Pyramid

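To make the decomposition concrete, here is a minimal sketch in Python with made-up figures for two hypothetical companies (the numbers are purely illustrative, not drawn from any real accounts):

```python
# ROA = profit/assets, decomposed as asset turnover * profit margin.
def decompose_roa(sales, profit, assets):
    asset_turnover = sales / assets       # ability to "sweat the assets"
    profit_margin = profit / sales        # ability to extract profit from sales
    roa = asset_turnover * profit_margin  # equals profit / assets
    return roa, asset_turnover, profit_margin

# Hypothetical figures (in millions): identical ROA, very different drivers.
for name, sales, profit, assets in [("Co A", 200, 10, 100),
                                    ("Co B", 50, 10, 100)]:
    roa, turnover, margin = decompose_roa(sales, profit, assets)
    print(f"{name}: ROA={roa:.1%}, turnover={turnover:.2f}, margin={margin:.1%}")
```

Both companies post the same 10% ROA but for very different reasons: Co A sweats its assets (turnover of 2.0, margin of 5%) while Co B extracts a large margin from modest sales (turnover of 0.5, margin of 20%) – exactly the distinction the decomposition is designed to expose.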

The Du Pont/pyramid-of-ratios approach is an example of what I call the structured hierarchy approach and provides a systematic framework for the forensic investigation of performance trends and benchmark comparisons. In particular the hierarchical structure facilitates a more efficient analysis of performance by first identifying which aspects of performance primarily account for the differences/changes in performance overall and then tunnelling down into those specific aspects of performance in more detail.
Although the structured hierarchy approach as applied in financial performance analysis often uses a multiplicative decomposition, in which performance ratios are decomposed into a sequence (or “chain”) of ratios whose product equals the higher-level ratio, there is no need to impose such a formalised mathematical structure. You don’t need to adopt a “one-size-fits-all” approach to creating a structured hierarchy. Multiplicative decomposition is particularly useful when dealing with processes that can be broken down into a sequence of sub-processes in which the output of one sub-process provides the input for the next in the sequence. In some cases it might be more useful to apply a linear decomposition, in which a ratio is broken down into the sum of a set of constituent ratios. Linear decomposition is useful when a higher-level performance ratio depends on two or more activities that are separable and relatively independent of each other. But in many cases the structured hierarchy approach is best seen as a much more informal structure, without any specific mathematical form imposed on the relationships between performance ratios. The key point is that the structured hierarchy approach requires a clear understanding of the overall structure of the process being analysed and of how the performance ratios are related to the various components of the process.
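A hedged sketch of the two decompositions, using hypothetical football numbers of my own rather than figures from any real dataset:

```python
# Multiplicative decomposition: each stage's output is the next stage's
# input, so the stage ratios chain together and their product equals the
# top-level ratio.
entries_per_game = 40.0    # final-third entries per game (assumed)
shots_per_entry = 0.25     # shots per final-third entry (assumed)
goals_per_shot = 0.12      # conversion rate (assumed)
goals_per_game = entries_per_game * shots_per_entry * goals_per_shot
print(goals_per_game)      # ~1.2 goals per game

# Linear decomposition: separable, relatively independent activities
# sum to the higher-level ratio.
conceded_open_play = 0.8   # goals conceded per game from open play (assumed)
conceded_set_piece = 0.4   # goals conceded per game from set pieces (assumed)
conceded_per_game = conceded_open_play + conceded_set_piece
print(conceded_per_game)   # ~1.2 goals conceded per game
```

The multiplicative chain suits a sequential process (entries feed shots, shots feed goals); the linear sum suits separable activities (open play and set pieces) that each contribute independently to the total.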


Most of my work is in the invasion-territorial team sports, mainly (association) football and rugby union. When putting together a system of KPIs to track performance, I always adopt a structured hierarchy approach. The approach is quite generic across both sports, as I have summarised in Figure 2. The win percentage depends on the score difference, that is, the gap between scores made and scores conceded (a linear decomposition). Typically these performance metrics are reported as game averages to facilitate comparisons between teams and over time. Scores made represents attacking effectiveness and naturally leads you to tunnel down into the different aspects of attacking play. In football I tend to separate attacking play into three dimensions – passing, other attacking play (e.g. crosses and dribbles), and shooting. When it comes to scores conceded I tend to separate this into exit play and defence. Exit play is a familiar term in rugby union but little used in football.

Working across these sports I am particularly interested in their tactical commonalities, especially the territorial dimension, and I plan to post in more detail on this in the near future. Suffice to say for the moment that my experience working in rugby union, particularly with Brendan Venter, has made me even more acutely aware of the importance of play in possession deep in your own half. Lose possession there and you are going to cause yourself trouble – it’s what I call a SIW (self-inflicted wound). The same tactical considerations are equally applicable in football and underpin the use of a deep pressing game to maximise the number of times opponents can be pressurised into losing possession deep in their own half. A pressing game is all about reducing the effectiveness of opposition exit play. As I said, I will pursue this line of thinking in more detail in a subsequent post.
Figure 2: A Generic Structured Hierarchy Approach for Invasion-Territorial Sports


One final point should be borne in mind when working with performance ratios as a structured hierarchy: statistically significant differences between ratios at one level are not necessarily evident at other levels. Again this is a problem that has bedevilled research in financial performance analysis. For example, research on the impact of location on business performance usually found significant differences in profitability between urban and rural locations, but the urban-rural differences were often no longer statistically significant when profitability was decomposed. The existence of statistically significant differences in ratios at one level of a structured hierarchy does not in any way imply that the same differences will be observed at other levels. One sporting example of this is the score difference. By definition, if you analyse differences between winning and losing performances, the score difference will always be statistically significant – positive when a team wins, negative when a team loses. However, when you break this down for individual teams it does not always follow that there are statistically significant differences in both scores made and scores conceded. For some teams winning and losing is much more about the variation in their attacking effectiveness than the effectiveness of their exit play or defence. So in a win-loss analysis these teams will tend to have statistically significant differences in scores made but not in scores conceded. It can go the other way for teams where defensive effectiveness is the crucial performance differentiator.
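To illustrate, here is a minimal win-loss sketch in Python using simulated per-game scores for a hypothetical attack-driven team; the numbers are invented, and a real analysis would use the team’s actual match records:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical team whose results hinge on attacking effectiveness:
# scores made differ sharply between wins and losses; scores conceded do not.
made_wins = rng.normal(30, 5, 25)
made_losses = rng.normal(18, 5, 15)
conceded_wins = rng.normal(20, 6, 25)
conceded_losses = rng.normal(22, 6, 15)

for label, wins, losses in [("scores made", made_wins, made_losses),
                            ("scores conceded", conceded_wins, conceded_losses)]:
    t, p = stats.ttest_ind(wins, losses, equal_var=False)  # Welch's t-test
    print(f"{label}: t = {t:.2f}, p = {p:.4f}")
```

With data like these, the win-loss difference in scores made comes out highly significant while the difference in scores conceded does not – the pattern described above for attack-driven teams.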

Improving Performance Ratio Analysis Part 1: Some Lessons from Finance

Executive Summary
• Performance ratios are widely used because they are easy to interpret and enhance comparability by controlling for scale effects on performance
• But performance ratios are susceptible to a number of potential problems that can seriously undermine their usefulness and even lead to misleading recommendations on how to improve performance
• The problems with performance ratios are well known in finance but are largely ignored by practitioners
• Crucially, performance ratio analysis assumes that scale effects are linear
• Before using performance ratios, analysts should explore the shape of the relationship between performance and scale, and check for linearity
• If the performance relationship is non-linear, group performances by scale and use appropriate scale-specific benchmarks for each group
• Remember effective performance ratio analysis is always trying to compare like with like

It is very common for KPIs to be formulated as performance ratios. The reason for this is very simple. Ratios can enhance comparability when there are significant scale effects. For example, it tells us very little if we compare the total activity levels of two players with very different amounts of game time. We would naturally expect that players with more game time will tend to do more. In this situation it makes more sense to control for game time and compare instead activity levels per minute played.
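As a minimal sketch, the adjustment is just a division; the players and counts below are made up for illustration:

```python
# Raw activity totals mislead when game time differs, so normalise to a rate.
players = {"Player A": {"touches": 840, "minutes": 1200},
           "Player B": {"touches": 510, "minutes": 600}}

for name, p in players.items():
    rate = p["touches"] / p["minutes"]   # touches per minute played
    print(f"{name}: {p['touches']} touches, {rate:.2f} per minute")

# Player A has more touches in total (840 v 510) but the lower rate
# (0.70 v 0.85 per minute) once game time is controlled for.
```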

Blog 18.03 Graphic (Box)

As well as controlling for scale effects on performance levels, ratios can also control for scale effects on the degree of dispersion, which can create problems for more sophisticated statistical modelling such as regression (the so-called heteroscedasticity problem).
However, despite the very widespread use of performance ratios, there are a number of potential problems with using ratios, some of which can seriously affect the validity of any conclusions drawn about performance and even lead to misleading recommendations on interventions to improve performance. The problems with performance ratios are well known in finance where financial ratio analysis is the standard method for analysing the financial performance of businesses. Hence I believe that there are lessons to be learnt from financial ratio analysis that can be applied to improve the use of performance ratios in sport.
One of the key messages in the debates on the use of performance ratios in finance is the importance of recognising that ratio analysis assumes strict proportionality. What this means is best explained diagrammatically. Suppose that we want to compare two performances, A and B, where B is a performance associated with a larger scale. Suppose also that we know the expected (or benchmark) relationship between scale and outcome, and that both of the observed performances lie on the benchmark relationship. In this case performance ratio analysis would be valid only if the outcome-to-scale ratio is equal for A and B. Graphically, the outcome-to-scale ratio represents the slope of the line from the origin to the performance point. It follows that A and B can only have the same performance ratio if they both lie on the same line through the origin. This is strict proportionality and is shown in Figure 1(a). Comparing performance ratios against a single benchmark ratio value presupposes that the scale-outcome relationship is linear with a zero intercept. If either of these assumptions does not hold, then it is no longer valid to draw conclusions about performance by comparing performance ratios. This is a really important point, but one ignored by the vast majority of users of performance ratios.
The problems of non-zero intercepts and non-linear relationships are illustrated in Figures 1(b) and 1(c). In both cases A and B are on the benchmark relationship but their performance ratios (represented by the slopes of the blue lines) differ. In these cases performance ratios become much more difficult to interpret. It is no longer necessarily the case that differences between performance ratios can be interpreted as deviations from the benchmark, implying better/worse performance after controlling for scale effects. Effectively, the problem is that the scale effects have not been fully controlled, so differences in performance ratios still partly reflect scale effects on performance.
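The intercept case can be put in one line. As a short algebraic sketch (the notation is mine, not taken from the figures): write the benchmark relationship as y = a + bx, with y the outcome, x the scale and a a non-zero intercept. Then

```latex
\frac{y}{x} \;=\; \frac{a + bx}{x} \;=\; b + \frac{a}{x}
```

so the performance ratio varies systematically with scale even for performances lying exactly on the benchmark; only under strict proportionality (a = 0) is y/x constant and equal to b.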

Blog 18.03 Graphic (Fig 1)
So what is to be done? It becomes even more important to undertake exploratory data analysis to understand the shape of the relationship between performance and the relevant scale measure. At the very least you should always plot a scatter graph of performance against scale. If it looks as if there is a non-zero intercept (i.e. there is a non-scale-related component in performance), then re-calculate the performance ratio using the deviation of performance from the non-zero intercept. If the performance relationship looks to be non-linear, then categorise your performances into different scale classes and use a range of values for the benchmark ratio appropriate for different scales. For example, in association football, the number of passes is often used as a scale measure for performance ratios. But we would expect very different ratio values for teams playing a possession-based, tiki-taka passing style compared to teams adopting a more direct style. Unless the underlying benchmark relationships exhibit strict proportionality, different benchmarks should be used to evaluate the performances of possession-based teams and direct-play teams. Always try to compare like with like.
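A minimal sketch of those checks in Python; the data are simulated with a non-zero intercept deliberately built in, and the variable names and bin choices are purely illustrative:

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulated performances lying around the benchmark y = 20 + 0.08x.
rng = np.random.default_rng(0)
scale = rng.uniform(300, 700, 80)                    # e.g. passes per match
outcome = 20 + 0.08 * scale + rng.normal(0, 5, 80)

# Step 1: always look at the picture first.
plt.scatter(scale, outcome)
plt.xlabel("scale (e.g. passes)")
plt.ylabel("outcome")
plt.show()

# Step 2: fit a straight line; a clearly non-zero intercept means the
# naive outcome/scale ratio is still contaminated by scale effects.
slope, intercept = np.polyfit(scale, outcome, 1)
naive_ratio = outcome / scale                    # partly reflects scale
adjusted_ratio = (outcome - intercept) / scale   # deviation from the intercept

# Step 3: if the relationship looks non-linear instead, bin by scale and
# benchmark within bins so that like is compared with like.
bins = np.quantile(scale, [0, 1/3, 2/3, 1])
group = np.digitize(scale, bins[1:-1])           # 0/1/2 = small/medium/large
```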
There are two other statistical problems with ratio analysis that should also be noted. First, if the same scale measure is used as the denominator in several performance ratios, this can distort the degree of association between the ratios. This is called the spurious correlation problem and was first identified in the late 19th century in studies of evolutionary biology. Using common denominators can create the appearance of a much stronger relationship between different aspects of performance than actually exists; in other circumstances common denominators can obscure the degree of relationship between different aspects of performance. Second, ratios can exaggerate the degree of variation when the denominator gets close to zero and the ratio becomes very large. It is crucial to be aware of these outliers since they can have undue influence on the results of any statistical analysis of the performance ratios.
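The spurious correlation problem is easy to demonstrate with simulated data; a minimal sketch, with all numbers invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x = rng.normal(100, 10, n)     # aspect of performance 1
y = rng.normal(100, 10, n)     # aspect of performance 2 (independent of x)
z = rng.uniform(20, 80, n)     # common denominator (shared scale measure)

# The two numerators are independent by construction...
print(np.corrcoef(x, y)[0, 1])          # ~0: no real association

# ...but appear strongly related once both are divided by the common
# denominator, purely because both ratios inherit the variation in z.
print(np.corrcoef(x / z, y / z)[0, 1])  # large and positive
```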
Some researchers in finance have recommended abandoning financial ratio analysis and using regression analysis. But regression analysis brings its own methodological issues and is not always applicable. It also ignores the reasons for the widespread use of performance ratios, mainly their simplicity. What is needed is better use and better interpretation of performance ratios informed by an awareness of the potential problems. In addition, we need to use performance ratio analysis in a more systematic fashion which is the subject of my next post.

Ranking Teams by Performance Rather than Results: Another Perspective on International Rugby Union Rankings for 2017

Executive Summary

• Competitor ranking systems tend to be results-based
• Performance-based ranking systems are more useful for coaches by providing a diagnostic tool for investigating the relative strengths and weaknesses of their own team/athletes and opponents
• Performance-based rankings can be calculated using a structured hierarchy in which KPIs are combined into function-based factors and overall performance scores
• A performance-based ranking of international rugby union teams in 2017 suggests that the All Blacks are still significantly ahead of England mainly due to their more effective running game

Most competitor ranking systems are results-based and use either generic ranking algorithms such as the Elo ratings (first developed to rank chess players) or sport-specific algorithms often developed by the governing bodies. As well as their general interest to fans and the media, these rating systems can often be of real practical significance when used to seed competitors in tournaments. These results-based ranking systems can be very sophisticated mathematically and usually incorporate adjustments for the quality of the opponent as well as home advantage and the status of matches/tournaments. These ranking systems also tend to include results from both the current season and previous seasons, usually with declining weights so that current results are more heavily weighted. A good example of an official results-based ranking system in team sports is the World Rugby rankings.
From a coaching perspective, results-based ranking systems are of very limited value beyond providing an overall comparison of competitor quality. What coaches really need to know is why their own team/athlete and their opponents are ranked where they are. Opposition analysis is about identifying the strengths and weaknesses of opponents in order to devise a game plan that maximises the opportunities created by opponent weaknesses and minimises the threats from opponent strengths (i.e. SWOT analysis). Opposition SWOT analysis requires a performance-based approach that brings together a set of KPIs covering the various aspects of performance. A performance-based ranking system can provide a very useful diagnostic tool that allows coaches to investigate systematically the relative strengths and weaknesses of their own team/athletes or opponents, and helps inform decisions on which areas to focus on in the more detailed observation-based analysis (i.e. video analysis and/or scouting).
As an example of a performance-based ranking system, I have produced a set of rankings for the 10 Tier 1 teams in international men’s rugby union (i.e. the teams comprising the Six Nations and the Rugby Championship) for 2017. These rankings are based on 36 KPIs calculated for every match involving a Tier 1 team between 1st January 2017 and 31st December 2017. In total the rankings use 118 Tier 1 team performances from 69 matches. The ranking system comprises a three-level structured hierarchy: a bottom-up approach in which the 36 KPIs are combined into five function-based factors which, in turn, are combined into an overall performance score.

Blog 18.02 Graphic (Fig 1)

There are several alternative ways of combining the KPIs into function-based factors and an overall performance score. Broadly speaking the choice is between using expert judgment or statistical methods (as I have discussed in previous posts on player rating systems). In the case of my performance rankings for international rugby union, I have used a statistical technique, factor analysis, to identify five factors based on the degree of correlation between the 36 KPIs. Effectively factor analysis is a method of data reduction that exploits the common information across variables (as measured by the pairwise correlations). If two KPIs are highly correlated, this suggests that they are essentially providing two measures of the same information and so could usefully be combined into a single metric. Factor analysis extracts the different types of common information from the 36 KPIs and restructures them into a smaller set of independent factors. The five factors can be easily interpreted in tactical/functional terms (with the dominant KPIs indicated in parentheses):
Factor 1: Attack (metres gained, defenders beaten, line breaks, Opp 22 entry rate)
Factor 2: Defence (tackles made, tackle success rate, metres allowed)
Factor 3: Exit Play, Kicking and Errors (Own 22 exit rate, kicks in play, turnovers conceded)
Factor 4: Playing Style (carries, passes, phases per possession)
Factor 5: Discipline (penalties conceded)
The factors are calculated for every Tier 1 team performance in 2017, averaged for each Tier 1 team, adjusted for the quality of the opposition, rescaled to a 0–100 scale on which a performance score of 50 represents the average performance level of Tier 1 teams in 2017, and normalised so that around 95% of match performances lie in the 30–70 range. The results are reported in Table 1, with the results-based official World Rugby rankings included for comparison. (It should be noted that the official World Rugby rankings cover all the rugby-playing nations, allow for home advantage and include pre-2017 results, but exclude the tests between New Zealand and the British and Irish Lions.)
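For readers who want to see the mechanics, here is a hedged sketch of the pipeline in Python using scikit-learn. The KPI matrix is a random stand-in (the real data cannot be reproduced here), and the varimax rotation is an illustrative choice rather than a statement of the exact method behind the published rankings:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

# Stand-in for the real 118-performance x 36-KPI matrix.
rng = np.random.default_rng(7)
kpis = rng.normal(size=(118, 36))

# Factor analysis on standardised KPIs; five factors as in the rankings.
X = StandardScaler().fit_transform(kpis)
fa = FactorAnalysis(n_components=5, rotation="varimax", random_state=0)
scores = fa.fit_transform(X)       # 118 x 5 match-level factor scores

# Inspect the loadings to name each factor by its dominant KPIs.
loadings = fa.components_          # 5 x 36

# Rescale each factor so that 50 = Tier 1 average and roughly 95% of
# performances fall in the 30-70 range (mean 50, s.d. 10, i.e. +/- 2 s.d.).
scaled = 50 + 10 * (scores - scores.mean(axis=0)) / scores.std(axis=0)

# Team rankings would then average the match-level scores by team,
# after an opposition-quality adjustment (omitted here).
```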

Blog 18.02 Graphic (Table 1)

Despite the differences in approach between my performance rankings and the official World Rugby rankings, there is a reasonable amount of agreement. Based only on 2017 performances, the gap between New Zealand and England in terms of performances remains greater than suggested by the official rankings. Also Ireland rank above England in performance but not in the official rankings, suggesting that Ireland’s narrow win in Dublin in March to deny England consecutive Grand Slams was consistent with the relative performances of the two teams over the whole calendar year.
Of course, the advantage of the performance-based approach is that it can be used to investigate the principal sources of the performance differentials between teams. For example, England rank above New Zealand on three of the five factors (Factors 2, 3 and 5) and lag only slightly behind on another (Factor 4). The performance gap between England and the All Blacks is largely centred on Factor 1, Attack, and principally reflects the much more effective running game of the All Blacks, which averaged 517m gained per game in 2017 (the best Tier 1 game average) compared to a game average of 471m gained by England (which ranks only 5th best). It should also be noted that the All Blacks had a significantly more demanding schedule in 2017 in terms of opposition quality, with 8 out of 14 of their matches against top-5 teams (with the Lions classified as a top-5 equivalent), whereas England had only 2 out of 10 matches against top-5 opponents.

 

Small is Beautiful: Big-Data Analytics and the Big-to-Small Translation Problem

Happy New Year. And apologies for the lack of posts on Winning With Analytics over the last year. Put it down to my Indiana-Jones-type existence: a university prof by day, and a sports data analyst by night. This duality of roles became even more hectic in 2017 as I returned to rugby union to work again with Brendan Venter, now Technical Director at London Irish, as well as assisting South Africa and Italy. I have also continued my work with AZ Alkmaar in Dutch football. To some I might seem to be a bit of a dilettante, trying to work simultaneously at an elite level in two different sports. Far from it. Many of the insights on game tactics and analytical methods are very transferable across the two sports. The last 12 months have probably been one of my most productive periods in developing my understanding of how best to use data analytics as part of an evidence-based approach to coaching. I hope to share much of my latest thinking with you over the coming months with regular posts.

Executive Summary
• Data analytics is suffering from a fixation with big-data analytics.
• Big-data analytics can be a very powerful signal-extraction tool to discover regularities in the data.
• But big-data exacerbates the big-to-small translation problem; big-data, context-generic statistical analysis must be translated into practical solutions to small-data (i.e. unique), context-specific decision problems.
• Sports analytics is most effective when the analyst understands the specific operational context of the coach, produces relevant data analysis and translates that analysis into practical recommendations.

The growth in data analytics has been closely associated with the emergence of big data. Originally “big data” referred to those really, really big databases that were so big as to create significant hardware capacity problems and required clusters of computers to work together. But these days the “big” in big data is, much like beauty, in the eye of the beholder. IBM categorise big-data analytics in terms of the four V’s – Volume (scale of data), Velocity (analysis of streaming data), Variety (different forms of data), and Veracity (uncertainty of data). The four V’s capture the core problems of big-data analytics – trying to analyse large datasets that are growing exponentially, with data captured from multiple sources of varying quality and reliability. I always like to add a fifth V – Value. Big-data analytics must be relevant to the end-user, providing an evidential base to support the decision-making process.

Sports analytics, just like other applications of data analytics, seems to have been bitten by the big-data bug. In my presentation last November at the 4th Annual Sportdata & Performance Forum held in Zurich, I called it the “big-data analytics fixation”. I don’t work with particularly big datasets, certainly not big in the sense of exceeding the capacity of a reasonably powerful PC or laptop. The basic XML file produced by Opta for a single football match has around 250k data points so that a database covering all matches in a football league for one season contains around 100m data points. This is pretty small compared to some of the datasets used in business analytics but sizeable enough to have totally transformed the type of data analysis I am now able to undertake. But I would argue very strongly that the basic principles of sports analytics remain unchanged irrespective of the size of the dataset with which the analyst is working.

Big-data analytics exacerbates what I call the big-to-small translation problem. Big-data analytics is a very powerful signal-extraction tool for discovering regularities in the data. Like all statistical modelling, it attempts to decompose observed data into systematic variation (signal) and random variation (noise). The systematic variation captures the context-generic factors common to all the observations in a dataset, while the random variation represents the context-specific factors unique to each individual observation. But while analytical modelling is context-generic, decisions are always unique and context-specific. So it is important to consider both the context-generic signal and the context-specific noise. This is the big-to-small translation problem. Understanding the noise can often be just as important as understanding the signal, if not more so, when making a decision in a specific context. Noise is random variation relative to the dataset as a whole, but random does not necessarily mean inexplicable.
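In standard statistical notation (added here for clarity), the decomposition is simply

```latex
y_i \;=\; \underbrace{f(x_i)}_{\text{signal: context-generic}} \;+\; \underbrace{\varepsilon_i}_{\text{noise: context-specific}}
```

Big-data analytics estimates f from the dataset as a whole; translating that into advice for one specific match still requires making sense of that match’s ε, which is precisely the big-to-small step.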

I disagree profoundly with the rather grandiose end-of-theory and end-of-statistics claims made for big-data analytics. Chris Anderson, in an article on the Wired website back in 2008, claimed that the data deluge was making the scientific method obsolete. He argued that there was no longer any need for theory and models since, in the world of big data, correlation supersedes causation. Indeed some have argued that big-data analytics represents the end of statistics: statistics is all about trying to make inferences about a population from a sample, and big data renders sampling irrelevant because we are now working with population data, not small samples. But evidence-based practice always requires an understanding of causation. Recommendations that do not take into account the specific operational context and the underlying behavioural causal processes are unlikely to carry much weight with decision-makers.

There is a growing awareness in sports analytics of the big-to-small translation problem. In fact the acceptance by coaches of data analytics as an important source of evidence to complement video analysis and scouting is crucially dependent on analysts being able to translate the results of their data analysis into context-specific recommendations such as player recruitment targets, game tactics against specific opponents, or training session priorities. It was one of the themes to emerge from the presentations and discussions at the Sportdata & Performance Forum in November 2017 (yet again an excellent and very informative event organised by Edward Abankwa and his team at the Pinnacle Group). As one participant put it so well, “big data is irrelevant unless you can contextualise it”. And in a similar vein, a representative of a company supplying wearable technologies commented that their objective is “making big data personally relevant”. Sports analytics is most effective when the analyst understands the specific operational context of the coach, produces relevant data analysis that provides an appropriate evidential base to support the specific decision, and translates that analysis into practical recommendations to the coach on the best course of action.