Advancements in technology have recently allowed us to collect and analyse large-scale fine-grained data about human performance, drastically changing the way we approach sports. Here, we provide the first comprehensive analysis of individual and team performance in One-Day International cricket, one of the most popular sports in the world. We investigate temporal patterns of individual success by quantifying the location of the best performance of a player and find that they can happen at any time in their career, surrounded by a burst of comparable top performances. Our analysis shows that long-term performance can be predicted from early observations and that temporary exclusions of players from teams are often due to declining performances but are also associated with strong comebacks. By computing the duration of streaks of winning performances compared to random expectations, we demonstrate that teams win and lose matches consecutively. We define the contributions of specialists such as openers, all-rounders and wicket-keepers and show that a balanced performance from multiple individuals is required to ensure team success. Finally, we measure how transitioning to captaincy in the team improves the performance of batsmen, but not that of bowlers. Our work emphasizes how individual endeavours and team dynamics interconnect and influence collective outcomes in sports.
Received: 30 January 2024 Accepted: 8 June 2024
Subject Category:
Computer science and artificial intelligence
Subject Areas: complexity
Keywords: sports analytics, cricket, team science, science of success
1. Introduction
The inception of sports, notably the Olympic Games in ancient Greece, played a pivotal role in cultural and societal bonding [1,2]. As societies evolved, sports mirrored changes in social structures, becoming more organized and diverse [3,4]. Recent digital technology advancements and enhanced data acquisition capabilities have ushered in a new era of sports analytics, providing valuable insights into athlete and team performance [5-9]. In baseball, the 'Moneyball' revolution popularized the strategic use of data analytics, profoundly altering team management and player evaluation [10-12]. Premier leagues in other sports such as basketball (NBA) and American football (NFL) have similarly embraced analytics to optimize player performance, team strategies and in-game decisions, leading to stylistic shifts in play [13-15]. Intellectual games like chess have also advanced with the introduction of sophisticated chess engines [8,16,17].
One-Day International (ODI) cricket, the world's second most-followed sport after soccer [18,19], enjoys widespread popularity primarily in Commonwealth countries, including India, Australia, New Zealand, the United Kingdom, South Africa, the West Indies, Sri Lanka and Pakistan. The availability of match data, driven by amateur and professional enthusiasts, has fostered various analyses. One of the major lines of research has been predicting the outcomes of ODI matches using a variety of techniques such as machine learning [20], logistic regression using pre-match covariates [21,22] and logistic regression using in-game dynamic variables [23]. Other studies have tried to uncover specific patterns based on performance such as the hot-hand effect [24] and ranking of the players [25] or modelling the dynamics of the game [26]. A few studies have tried to investigate in detail the batting [27,28] and bowling [29] aspects of the game. With the advent of shorter and faster formats of the game such as T20, some attention has been devoted to investigating the effect of premier leagues on social media activity [30] as well as on other formats of the game [31]. Going beyond the specificities of ODI cricket, researchers have tried to examine the role of injuries in cricket [32-34]. Despite this, there remains a substantial gap in the understanding of individual performances and their contributions to team success in ODI cricket. In this work, we track players' careers, unveiling universal patterns of performance and identifying correlates of team-level success.
The progression of a player's skill level enhances their likelihood of surpassing previous performance peaks [24]. Conversely, as players age, their athletic prowess may diminish, potentially impeding their ability to exceed past achievements [32]. This contrast of skill development against physical decline poses a critical question: at which point in their careers do players deliver their best performance? We address this question in §3.1 by making use of tools and methodologies from data science and extreme value theory [35,36] already deployed in diverse fields like the science of science [37-39], arts [40-42] and sports [8,43,44].
Early identification of talent and skills can provide key advantages in many competitive settings, from firm growth [45] and information spreading [46] to sports, where nurturing talent in young players can lead to higher returns [47-49]. In §3.2, we extend this inquiry to investigate the relationship between a player's initial performance and their overall career trajectory to capture whether it is possible to see hints of future performance based on early-career display.
Fluctuations in team composition frequently arise due to injuries and variations in player performance [50]. While injuries often occur unpredictably, a decline in performance typically manifests more gradually and may not be immediately apparent. A fitting inquiry in this context is whether it is possible to discern patterns in a player's performance preceding their expulsion from the team. We study this aspect in §3.3.
The composition of an effective team encompasses skilled players under proficient leadership. Case studies across various domains, including sports, science and business, have illustrated scenarios where moderately weak teams achieve success under adverse conditions, guided by strategic leadership [51-54]. Here, we ask the reverse question in §3.4: is a player's performance affected by the burden of leadership? However, from a collective perspective, the strategic composition of a team is crucial for its effective functioning. The concept of using specialists (individuals who perform highly specific roles) extends beyond sports into various domains of human activity, including business organizations, scientific research, software development and even hunter-gatherer societies [55-57]. We analyse the role of three specialists (openers, all-rounders and wicket-keepers) in §3.5.
So far, we have predominantly focused on individual contributions to team success in ODI cricket. We now broaden our perspective to analyse team success, setting aside specific individualistic factors that contribute to victory. Given that each cricket match culminates in a definitive outcome, our interest lies in discerning patterns of wins and losses for teams. We apply established metrics from the literature to quantify patterns in team success in §3.6. Collectively, our work presents the first comprehensive analysis of individual and team performance in ODI cricket.
2. Material and methods
2.1. Data collection
We extracted data on 4418 ODI matches in men's cricket, involving 2863 unique players, by web scraping howstat.com [58], an open-access repository for cricket statistics, using BeautifulSoup and urllib, two Python libraries. This dataset encompasses records of all ODI matches played from their inception in 1971 until March 2024. For each match, we extracted information including date, teams, runs scored, wickets taken and overs played by each team, along with player names and their contributions in terms of batting and bowling. This included details like batting position, number of balls faced, number of fours and sixes hit, number of overs bowled, number of runs conceded, number of maiden overs and number of wickets taken. Furthermore, data on the captain and wicket-keeper for each match were also collected.
Although the primary aggregation of data is at the team level for each match, the dataset is also suitable for an in-depth exploration of individual player trajectories and careers. This allows for a multi-faceted analysis of performance trends, the impact of various factors on player and team success, and the evolution of the sport over more than five decades.
2.2. Classification of players
We give a brief description of our methodology for classifying players. Interested readers can consult the Wikipedia article on cricket [59] for the exact role of each type of player. Note that a player can be simultaneously classified into multiple categories. For example, a batsman can be an all-rounder, a captain, an opener and a wicket-keeper.
Batsmen: We consider players as batsmen who have played at least 25 matches and batted at position seven or above in at least 50% of their matches. We have 580 batsmen in our dataset after applying this criterion.
Bowlers: We classify players as bowlers who have played at least 25 matches and bowled in at least 50% of the matches they played. We have 551 bowlers in our dataset after applying this criterion.
All-rounders: We consider all-rounders as players who have played at least 25 matches and are classified as both batsmen and bowlers using the above definitions. We have 117 all-rounders in our dataset.
Captains: In our analysis, we consider players as captains if they have captained the team in at least 15 matches. The information about captaincy is available as metadata on the website.
Openers and non-openers: We characterize openers as players who have batted at first or second position at least 10 times in their career, and non-openers as players who have batted at positions three to six at least 10 times in their career.
Wicket-keepers: In our analysis, we consider players as wicket-keepers based on the metadata available for each match on the website.
Fielders: All players on the bowling side who are not bowling or wicket-keeping are classified as fielders in our analysis.
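The career-level thresholds above can be sketched as a small rule-based classifier. This is only an illustration: the `CareerSummary` record and its field names are hypothetical, and "position seven or above" is assumed to mean batting positions 1 to 7. Match-level roles (wicket-keeper, fielder) come from per-match metadata and are not covered.

```python
from dataclasses import dataclass


@dataclass
class CareerSummary:
    """Hypothetical per-player career counts (field names are illustrative)."""
    matches: int                  # matches played
    top_seven_innings: int        # matches batted at positions 1-7 (assumed reading)
    matches_bowled: int           # matches in which the player bowled
    matches_as_captain: int       # matches captained
    innings_as_opener: int        # times batted at position 1 or 2
    innings_at_three_to_six: int  # times batted at positions 3-6


def classify(career):
    """Return the set of roles implied by the thresholds stated in the text."""
    roles = set()
    eligible = career.matches >= 25
    is_batsman = eligible and career.top_seven_innings >= 0.5 * career.matches
    is_bowler = eligible and career.matches_bowled >= 0.5 * career.matches
    if is_batsman:
        roles.add("batsman")
    if is_bowler:
        roles.add("bowler")
    if is_batsman and is_bowler:
        roles.add("all-rounder")
    if career.matches_as_captain >= 15:
        roles.add("captain")
    if career.innings_as_opener >= 10:
        roles.add("opener")
    if career.innings_at_three_to_six >= 10:
        roles.add("non-opener")
    return roles


example = CareerSummary(matches=40, top_seven_innings=30, matches_bowled=25,
                        matches_as_captain=0, innings_as_opener=12,
                        innings_at_three_to_six=3)
print(sorted(classify(example)))
```

Because the roles are stored in a set, a single player can hold several of them at once, as the text notes.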
2.3. Data normalization
Team scores per match have generally increased over the decades (see electronic supplementary material, SM1). This variation is potentially due to a confluence of factors, including modifications in game regulations (fielding rules, for example) [60], advancements or changes in the equipment used (notably the cricket bat) [61] and the reduction in the dimensions of the playing field to 65-75 m, compared to 80-85 m in earlier stadiums, as spectators are interested in high-scoring matches [62]. To facilitate a robust and equitable comparison of player and team performances across distinct time periods, we account for the inflation in run-scoring by implementing a normalization procedure on all batting statistics. Specifically, we multiplied the runs scored by a batsman and the runs conceded by a bowler in a given year by a normalization factor $n_f = \langle \text{Team runs} \rangle_{\text{all}} / \langle \text{Team runs} \rangle_{\text{year}}$, such that the average team score is constant over the examined time period. Here, $\langle \text{Team runs} \rangle_{\text{all}}$ is the average team score across all years, while $\langle \text{Team runs} \rangle_{\text{year}}$ is the average team score in the given year. This procedure was originally introduced to correctly assess the impact of scientific papers published in different years [63]. Our approach ensures that the comparative analysis of players' performance from different eras is conducted in an unbiased manner that controls for background temporal trends.
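As a minimal sketch of this rescaling, the function below computes one factor per year as the overall average team score divided by that year's average. The input layout (a list of `(year, team_runs)` pairs) and the function name are assumptions made for illustration. The overall average is taken over all team innings; averaging the yearly means instead would be an equally plausible reading of the text.

```python
from collections import defaultdict


def normalization_factors(innings):
    """Map each year to (overall mean team score) / (mean team score that year).

    `innings` is an iterable of (year, team_runs) pairs, one per team innings.
    """
    totals = defaultdict(lambda: [0.0, 0])  # year -> [sum of runs, number of innings]
    for year, runs in innings:
        totals[year][0] += runs
        totals[year][1] += 1
    all_runs = sum(s for s, _ in totals.values())
    all_count = sum(n for _, n in totals.values())
    overall_mean = all_runs / all_count
    return {year: overall_mean / (s / n) for year, (s, n) in totals.items()}


# Toy data: scores are lower in the early year and higher in the late year.
toy_innings = [(1980, 200), (1980, 220), (2020, 280), (2020, 300)]
factors = normalization_factors(toy_innings)

# A batsman's 50 runs in 1980 are scaled up; the same 50 in 2020 are scaled down.
print(round(50 * factors[1980], 1), round(50 * factors[2020], 1))
```

Multiplying every team score by its year's factor makes each year's average equal to the overall average, which is the constancy property the text asks for.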
2.4. Statistical data analysis
2.4.1. Statistical tests
Linear regression is used to estimate the linear relationship between two variables and returns a correlation coefficient. We use it in §3.2.
The Kolmogorov-Smirnov test (K-S test for short) is a non-parametric test used to determine if two unpaired samples come from the same underlying distribution [64]. The null hypothesis is that the two distributions are the same, and the p-value gives the probability of observing a difference between the samples at least as large as the one measured if they were drawn from the same distribution. We use the K-S test in §§3.1 and 3.5.
The Wilcoxon signed-rank test is a non-parametric test used to determine if two paired distributions have the same location [65]. Under the null hypothesis that the locations are the same, it returns the probability of observing a test statistic at least as extreme as the one measured. We use the Wilcoxon signed-rank test in §3.1.
The Mann-Whitney U test is a non-parametric test of the null hypothesis that the unpaired distributions underlying the two samples are the same [66], against the alternative that one of the distributions is stochastically larger or smaller than the other. We use the Mann-Whitney U test in §§3.4-3.6.
Welch's t-test is used to determine if two populations have similar means when the variances and sample sizes may or may not be equal [67]. We use this test in §3.5.
All the above tests are conducted using the SciPy package of Python [68].
2.4.2. Fractional contribution
To quantitatively and equitably assess the impact of each player, we introduce the fractional contribution $f_c$ as
so that a player's maximum possible contribution ($f_c$) is normalized within the range of 0-1. It is important to note that, given this framework, specialized batsmen and bowlers have an upper limit of $f_c = 0.5$ in a best-case scenario. We use this definition in §§3.5 and 3.6.
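One form consistent with the properties stated above (contributions range from 0 to 1, and a pure batsman or pure bowler is capped at 0.5) is the equal-weight average of a player's share of team runs and share of team wickets. The sketch below assumes that form; it is an illustration and may differ from equation (2.1). The function and argument names are hypothetical.

```python
def fractional_contribution(runs, wickets, team_runs, team_wickets):
    """Assumed form: f_c = 0.5 * (runs / team_runs + wickets / team_wickets).

    A share is treated as 0 when the team total is 0 (no runs or no wickets).
    """
    run_share = runs / team_runs if team_runs else 0.0
    wicket_share = wickets / team_wickets if team_wickets else 0.0
    return 0.5 * (run_share + wicket_share)


# A specialist batsman who scores every team run but takes no wickets hits the cap.
print(fractional_contribution(runs=200, wickets=0, team_runs=200, team_wickets=8))

# An all-rounder with a quarter of the runs and a quarter of the wickets.
print(fractional_contribution(runs=50, wickets=2, team_runs=200, team_wickets=8))
```

Under this assumed form, the contributions of all players in a team sum to 1 whenever the team scores at least one run and takes at least one wicket.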
2.4.3. Effective team size
We employ the concept of effective team size, a well-established metric in the team analysis literature [69,70], to characterize the heterogeneous contributions of individuals to the team. Effective team size quantifies the essential number of players in a team, effectively measuring the redundancy in team composition. Mathematically, we denote the contribution of player c to the team score as $f_c$, such that $\sum_c f_c = 1$, as per equation (2.1). The effective team size $S_{\mathrm{eff}}$ is defined in such a way that it equals the actual team size if contributions are evenly distributed among all players and reduces to 1 if a single player is solely responsible for the entire performance. A common approach to calculate $S_{\mathrm{eff}}$ is to use the formula $S_{\mathrm{eff}} = 2^{H}$, where $H = -\sum_c f_c \log_2 f_c$. Statistically, $H$ is the entropy of the distribution of contributions $f_c$ from team players, and $S_{\mathrm{eff}}$ is its exponential. We use this concept in §3.6.
2.5. Null models
... significant and meaningful patterns. Here, we use two null models for detecting individual and team performance signatures.
2.5.1. Best individual performance
In §3.1, we characterize the temporal patterns of top performances. In order to establish a baseline for the measures, we shuffle the timestamps of individual performances for each player 100 times, keeping the actual values of performance the same. In this way, we are breaking the temporal correlation between different performances. For each shuffling, we find the timestamp for the best (N*), second best (N**) and third best (N***) performances of a player. We compare these randomly shuffled data to the original dataset to establish the similarities and differences.
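The shuffling baseline can be sketched as follows for a single player. The function names are hypothetical, and ties are broken by taking the earliest match rather than by the balls-faced or runs-conceded rule used in the analysis, so this is a simplified illustration.

```python
import random


def best_match_number(performances):
    """1-based match number of the best performance (earliest match on ties)."""
    best_index = max(range(len(performances)), key=lambda i: performances[i])
    return best_index + 1


def shuffled_relative_peaks(performances, n_shuffles=100, seed=0):
    """Relative peak position N*/N in each of `n_shuffles` shuffled careers.

    Shuffling keeps the performance values but destroys their temporal order.
    """
    rng = random.Random(seed)
    values = list(performances)
    peaks = []
    for _ in range(n_shuffles):
        rng.shuffle(values)
        peaks.append(best_match_number(values) / len(values))
    return peaks


career = [10, 55, 3, 120, 47, 0, 62, 18]  # toy runs per match
print(best_match_number(career) / len(career))  # observed N*/N
print(len(shuffled_relative_peaks(career)))     # one null value per shuffle
```

Pooling the null values over all players gives the randomized distribution against which the observed distribution of N*/N is compared.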
2.5.2. Hot and cold streaks for teams
In §3.6, we show the existence of hot and cold streaks for teams. We consider all the teams that have played at least 200 matches. By keeping the number of wins constant, we shuffle the timestamps of the match results $10^4$ times for each team. For each reshuffling, we compute the number of consecutive wins (hot) and losses (cold), denoted by $L_{\mathrm{streak}}$. By taking the average across the $10^4$ reshuffles, we compute $N_{\mathrm{rand}}$ to highlight the significance of $N_{\mathrm{data}}$, the number of streaks of length $l$ or higher in the actual dataset.
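A minimal sketch of this streak null model is given below. It assumes that a streak is a maximal run of identical results and that results are encoded as a sequence of `"W"` and `"L"`; both the encoding and the function names are illustrative choices.

```python
import random
from itertools import groupby


def count_streaks(results, outcome, min_length):
    """Number of maximal runs of `outcome` whose length is at least `min_length`."""
    return sum(
        1
        for value, run in groupby(results)
        if value == outcome and len(list(run)) >= min_length
    )


def streak_ratio(results, outcome, min_length, n_shuffles=10_000, seed=0):
    """Observed streak count divided by its average over shuffled sequences."""
    rng = random.Random(seed)
    n_data = count_streaks(results, outcome, min_length)
    shuffled = list(results)  # shuffling preserves the number of wins and losses
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += count_streaks(shuffled, outcome, min_length)
    n_rand = total / n_shuffles
    return n_data / n_rand if n_rand > 0 else float("inf")


toy_results = "WWWLWWLLLLW"
print(count_streaks(toy_results, "W", 2))  # runs WWW and WW qualify
print(count_streaks(toy_results, "L", 3))  # only the run LLLL qualifies
```

A ratio above 1 means that streaks of at least the chosen length occur more often in the real sequence than in sequences with the same win count but random order.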
3. Results
3.1. Temporal patterns of top performances
Methods: We designate N*, N** and N*** as the match numbers corresponding to a player's best, second-best and third-best performances, respectively. For batsmen, this is marked as their highest run score. For bowlers, it corresponds to the highest number of wickets taken. In instances of identical runs or wickets, the performance involving respectively fewer balls played or fewer runs conceded is considered. Additionally, to account for variations in the career lengths of the individual players, we normalize the timing of peak performances (N*, N** and N***) with overall career lengths (N).
We are interested in two questions: (i) does the best performance (N*) occur at a specific time within a player's career? and (ii) are the best performances closely related in time? To answer the first question, we calculate the probability distribution function of N* as well as N*/N and compare it with the randomized dataset described in §2.5. This analysis was originally introduced in Liu et al. [38] to test the presence of hot streaks in time series data. We use the K-S test to check if the distributions obtained from the original and randomized datasets differ. To investigate the second question, we divide the differences between the top performances ΔN, i.e. |N* - N**| and |N* - N***|, normalized by the career length (N), into five bins. Within each bin, we compute the ratio of the number of players in the data ($n_{\mathrm{data}}$) to the number of players in the randomized data ($n_{\mathrm{rand}}$). We run the Wilcoxon signed rank test to determine if the deviations of the ratios from 1 are significant.
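The binned comparison for the gap between the best and second-best performance can be sketched as below. The helper names are hypothetical, ties are broken by match order, and the randomized count in each bin is taken as the average over shuffles, which is one reasonable reading of the procedure.

```python
import random


def relative_gap(performances):
    """|N* - N**| / N for one career (ties broken by earlier match)."""
    ranked = sorted(range(len(performances)), key=lambda i: -performances[i])
    return abs(ranked[0] - ranked[1]) / len(performances)


def bin_counts(gaps, n_bins=5):
    """Count gaps falling in each of `n_bins` equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for gap in gaps:
        counts[min(int(gap * n_bins), n_bins - 1)] += 1
    return counts


def gap_ratios(careers, n_shuffles=100, n_bins=5, seed=0):
    """Per-bin ratio of observed player counts to the average shuffled counts."""
    rng = random.Random(seed)
    data = bin_counts([relative_gap(c) for c in careers], n_bins)
    rand = [0.0] * n_bins
    for _ in range(n_shuffles):
        gaps = []
        for career in careers:
            shuffled = list(career)
            rng.shuffle(shuffled)
            gaps.append(relative_gap(shuffled))
        for b, count in enumerate(bin_counts(gaps, n_bins)):
            rand[b] += count / n_shuffles
    return [d / r if r > 0 else float("nan") for d, r in zip(data, rand)]


toy_career = [5, 90, 88, 10, 2, 7, 30, 1, 4, 6]
print(relative_gap(toy_career))  # best and second best are adjacent matches
```

A ratio above 1 in the first bin and below 1 in the last bins would indicate that top performances cluster in time more than chance predicts.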
Results: The probability distribution function P(N*) for all players is shown in figure 1a. We observe that P(N*) is a monotonically decreasing function, suggesting a much higher likelihood of peak performance occurring earlier rather than later in a career. However, this analysis does not account for variations in career lengths. When normalizing the timing of peak performance relative to overall career length, we observe a uniform distribution, as shown in figure 1a (inset). This pattern, previously dubbed the 'random impact rule' for scientific careers [37], suggests that the timing of peak performance is generally unpredictable and can occur at any point in a career. The K-S test comparing the data to the null model gives p > 0.05 in both cases, indicating that the null hypothesis of the two distributions being statistically the same cannot be rejected.
In figure 1b, we plot the ratios of the number of players in the bins of ΔN/N for the original and the randomized dataset. For ΔN/N ∈ [0.0, 0.2) (indicating top performances closely related in time), we observe a ratio > 1, signifying that the number of players having small gaps between their best performances is higher than expected by chance on randomized data. In other words, they exhibit hot streaks: brief periods of time accentuated by best performances of comparable magnitude. In contrast, the ratio of the number of players for ΔN/N ∈ [0.6, 1.0] (indicating that best performances are far apart in time) is always less than one. This implies that fewer players have long gaps between best performances compared to random expectations. The Wilcoxon signed rank test indicates a significant deviation of ratios from 1 (p < 0.001) for ΔN/N ∈ [0.0, 0.2) and ΔN/N ∈ [0.6, 1.0] for both |N* - N**| and |N* - N***|. Our results are robust to the number of bins, as seen in electronic supplementary material, SM2.
3.2. Long-term performance from early career observations
We are interested in determining the relationship between a player's early career trajectory and their overall career performance. Methods: In figure 2a,b, we correlate early and overall career statistics. Our analysis considers players who have participated in a minimum of 50 matches. We focus on their mean performance during the initial 25 matches. For batsmen, we look at the average runs scored per match in this early phase, whereas for bowlers, we look at the average number of wickets taken per match within the same period. Concurrently, we examine the full-career performance of these players.
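The comparison described above can be sketched in a few lines: compute each player's average over the first 25 matches and over the full career, then correlate the two across players. The function names and the toy careers are illustrative, and the Pearson coefficient is written out by hand to keep the example dependency-free.

```python
from math import sqrt


def early_and_full_average(per_match, early=25):
    """Mean performance over the first `early` matches and over the whole career."""
    return sum(per_match[:early]) / early, sum(per_match) / len(per_match)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy careers of 50 matches each (runs per match), purely illustrative.
careers = [
    [10] * 25 + [30] * 25,
    [25] * 25 + [35] * 25,
    [40] * 25 + [38] * 25,
]
early, full = zip(*(early_and_full_average(c) for c in careers))
print(pearson_r(early, full))
```

The fraction of players whose full-career average exceeds their early average can be read off the same pairs.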
Results: The scatter plots in figure 2 reveal a correlation between early and full career average performance metrics. We calculate the linear regression coefficients R = 0.45 for batsmen (figure 2a) and $R^2 = 0.66$ for bowlers (figure 2b). This finding suggests that a player's initial performance may serve as an indicator of their subsequent career performance. However, notable differences emerge when comparing batsmen and bowlers. In general, approximately 55% of batsmen are observed to improve upon their early career averages, whereas this figure drops to about 45% for bowlers.
3.3. Effect of drop and re-entry
Persistence (or lack thereof) of team composition can have a substantial impact on individual as well as team performance. Methods: Figure 2c,d shows the average performance trajectories of batsmen and bowlers, respectively, both prior to their removal from the team and subsequent to their rejoining. We consider all players who experienced a temporary exclusion from the team for a minimum of three matches before making a return.
Results: We observe a consistent downtrend in performance during the pre-removal phase, with average runs and wickets demonstrating a monotonic decline, amounting to an approximate 19% reduction for both batsmen and bowlers compared to five matches before the exclusion. Notably, the lowest performance levels are recorded in the match just before the player's exclusion. In contrast, players tend to exhibit a strong comeback performance after reinstatement to the team. In particular, the initial performance post-return exhibits a substantial elevation, with batsmen registering an approximate 36% improvement and bowlers showing a 30% enhancement compared to their final pre-removal performance. This enhanced performance appears to be stable in subsequent matches, consistently displaying higher performance compared to the pre-removal phase, suggesting that a short temporary exclusion may improve performance in the longer run.
3.4. Role of leadership
Proficient leadership can enhance performance in teams of skilled individuals. Methods: We consider captains who have led their team in a minimum of 15 matches (§2.2). Within our dataset, we identify 172 captains, comprising 71% batsmen and 29% bowlers, indicating a higher propensity for batsmen to be chosen as captains, more than twice as often as bowlers. Despite these differences, captain-bowlers remain in this position for longer than the batsmen on average (see electronic supplementary material, SM5). We first compare the career performance of captains with other players for both batsmen and bowlers. We use the Mann-Whitney U test to determine if the distribution of captain performances is the same as that of the players' performance distribution. To investigate the influence of leadership, we divide the career trajectories of captains into three phases: before, during and after the captaincy. We are interested in whether the performance qualitatively changes across the three phases.
Results: Figure 3a presents a comparison of the career average performances between captains and non-captains (referred to as players). Quantitatively, the average runs scored by captain-batsmen (31) exceed those of players (26) by approximately 16%. Conversely, captain-bowlers secure, on average, 18% fewer wickets than non-captain-bowlers (0.96 versus 1.13). We have considered only the matches where players get a chance to perform. For both batsmen and bowlers, the Mann-Whitney U test yields p < 0.001. This disparity suggests differing pathways to captaincy, where batsmen may need to consistently outperform peers, while bowlers' captaincy seems less dependent on individual performance.
We observe that batsmen (figure 3b) typically enhance their performance upon assuming captaincy, whereas bowlers (figure 3c) exhibit a decline. A closer examination of figure 3c reveals a bimodal distribution during the captaincy phase for bowlers, indicating a dichotomy in skill sets among captain-bowlers. Post-captaincy, the trajectories diverge significantly for batsmen and bowlers. For batsmen, average performance markedly decreases post-captaincy, even falling below their pre-captaincy levels. The decrease is even more substantial for bowlers.
3.5. Contribution of specialists
Specialists providing focused contributions to group endeavours are often key to team success. Methods: Here, we consider three types of specialists: openers, all-rounders and wicket-keepers. Batsmen are categorized as openers if they occupy the first or second position in the batting lineup, while positions 3 to 6 are considered non-openers (§2.2). Besides the runs they score, we focus on the pace at which they accumulate these runs. The run-scoring rate is quantified as the number of runs per 100 balls faced, commonly referred to as the strike rate in cricket terminology. This metric is frequently used to assess a player's defensive or aggressive playing style. We use the Mann-Whitney U test to compare the distributions of runs (and strike rate) scored by openers and non-openers.
All-rounders are players who make significant contributions in both aspects of the game, batting and bowling (§2.2). In contrast, specialized batsmen and bowlers are characterized by their significant contribution in only one of these areas. We quantify the fractional contribution for each type of player as shown in §2.4.2. We use the K-S test to analyse the differences between the distributions of fractional contributions of batsmen and bowlers against all-rounders.
Among various fielding roles, the wicket-keeper is considered to hold a pivotal position. Typically, a batsman can be dismissed in three primary ways: (i) if the bowler successfully hits the stumps or the batsman obstructs the stump with their leg (ii) if a fielder catches the ball hit by the batsman before it touches the ground or (iii) if the stump is dislodged by the fielders before the completion of a run between the wickets. The first method relies exclusively on the bowler's skill, whereas the latter two require strategic collaboration between the bowler and fielders. Our interest lies in the last two dismissal types. Specifically, we define dt as the number of dismissals executed by a fielder of type (ii) and (iii) per match. We use Welch's t-test to compute the statistical significance of differences between the number of dismissals by the wicket-keepers and fielders respectively.
Results: Figure 4a elucidates the variations in batting patterns between opening batsmen and their non-opening counterparts. Our analysis reveals that openers average a strike rate of around 63 runs per 100 balls, whereas non-openers exhibit a higher average of 69 runs per 100 balls. This suggests that non-openers score approximately 9% more rapidly than their opening counterparts. However, the average score for openers stands at approximately 31 runs, in contrast to 26 runs for non-openers, indicating a reduction of about 17% for batsmen coming later in the batting order. The Mann-Whitney U test yields p < 0.001.
Figure 4b presents the distribution of the contributions of all players across all matches. Among these, batsmen demonstrate the lowest mean contribution ($f_c \approx 0.06$), with bowlers exhibiting a slightly higher average contribution of $f_c \approx 0.1$. Notably, all-rounders surpass both groups with an average contribution of $f_c \approx 0.11$. Furthermore, there are instances where all-rounders singularly account for more than 50% of the team's performance, evidenced by instances of $f_c > 0.5$. The K-S test reveals that the contributions from batsmen and bowlers are significantly different (p < 0.001) from those of all-rounders. We also note that non-specialists contribute significantly less than the specialists in both departments (see electronic supplementary material, SM8).
Figure 4c presents a boxplot depicting the distribution of dismissal fractions attributed to wicket-keepers and fielders. Our observations reveal that wicket-keepers play a vital role in dismissals, accounting for approximately 0.7 dismissals per match. In contrast, fielders contribute to a relatively minor number of dismissals, averaging around 0.3. The Welch t-test gives p < 0.001 implying that wicket-keepers are consistently better than fielders when the opportunity arises.
3.6. Patterns of team success
Methods: Similar to the bursts of exceptional performance observed in individual players (as discussed in §3.1), we assess the propensity of teams to consecutively win (or lose) a series of matches. We quantify this tendency by defining the ratio $N_{\mathrm{data}}/N_{\mathrm{rand}}$, where $N_{\mathrm{data}}$ represents the frequency of a team winning (or losing) $l$ or more consecutive matches. In contrast, $N_{\mathrm{rand}}$ is the analogous frequency within a randomized null model (see §2.5.2). Thus, the ratio $N_{\mathrm{data}}/N_{\mathrm{rand}}$ indicates the likelihood of a team experiencing a streak of at least $l$ consecutive wins or losses compared to a random outcome.
To gain more insight into the team composition, we employ the concept of effective size $S_{\mathrm{eff}}$ as discussed in §2.4.3. To compute the statistical differences between the $S_{\mathrm{eff}}$ of winning and losing teams, we use the Mann-Whitney U test. Additionally, we also compute the effect size $r = U/(n_1 n_2)$, where $U$ is the test statistic and $n_1$ and $n_2$ are the sample sizes, respectively.
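The two quantities used here can be sketched as follows. The sketch assumes that the effective size is 2 raised to the Shannon entropy (in bits) of the players' contributions (§2.4.3), and that the effect size is the Mann-Whitney U statistic divided by the product of the sample sizes, with ties counted as one half. Function names and toy numbers are illustrative.

```python
from math import log2


def effective_team_size(contributions):
    """2 ** H, with H the Shannon entropy (bits) of the normalized contributions.

    Contributions are renormalized to sum to 1; zero entries add nothing to H.
    """
    total = sum(contributions)
    shares = [c / total for c in contributions if c > 0]
    entropy = -sum(p * log2(p) for p in shares)
    return 2 ** entropy


def effect_size(sample_x, sample_y):
    """U / (n1 * n2): share of (x, y) pairs with x > y, ties counted as 0.5."""
    u = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in sample_x
        for y in sample_y
    )
    return u / (len(sample_x) * len(sample_y))


print(effective_team_size([1 / 11] * 11))      # evenly shared: close to 11
print(effective_team_size([1.0] + [0.0] * 10)) # one player does everything: 1.0

winners = [7.0, 8.0, 9.0]  # toy effective sizes of winning sides
losers = [5.0, 6.0, 8.0]   # toy effective sizes of losing sides
print(effect_size(winners, losers))
```

In this form, an effect size of 0.5 means neither group tends to have the larger value, while values above 0.5 favour the first sample.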
Results: Figure 5a reveals that teams are more prone to winning or losing matches in sequence than would be expected by chance. Notably, the probability of winning seven or more consecutive matches is nine times higher than chance, while the likelihood of losing seven or more consecutive matches is about three times the random expectation. This observation suggests that teams exhibit hot streaks in their winning performances. Intriguingly we also observe cold streaks, where teams undergo multiple successive losses.
As illustrated in figure 5b via a boxplot, the distributions of $S_{\mathrm{eff}}$ for winning and losing teams differ notably. The median $S_{\mathrm{eff}}$ for winning teams (≈ 6.9) is approximately 4% greater than that for losing teams (≈ 6.6). This indicates that successful teams require a broader array of individual contributions. The Mann-Whitney U test indicates that the differences are significant (p < 0.001), and the effect size r = 0.56 corresponds to a 56% probability that a randomly chosen winning side has a higher $S_{\mathrm{eff}}$ than a randomly chosen losing side.
4. Discussion
In the last few years, the advent of detailed sports analytics has revolutionized our understanding of human behaviour in sports [10-12]. It offers insights into physical performance, cognitive processes and team dynamics, moving beyond traditional training methods [13,14]. This shift towards a data-driven approach in sports reflects a broader societal trend in optimizing human potential, combining historical practices with modern scientific methodologies.
In this study, we quantified the individual and team performance in ODI cricket, examining various aspects of the game. Due to the absence of quantifiable performance metrics such as the Elo rating system in chess, we validated the use of runs scored and wickets taken as indicators of batting and bowling performance, respectively. Our analysis revealed a significant increase in average runs scored over time. To negate this inflation in run-scoring and facilitate fair comparisons, we implemented a normalization of all performances involving runs (see electronic supplementary material, SM1).
This rescaled data corroborated the 'random impact rule' observed in other domains, indicating that a player's peak performance can occur at any point during their career [37,38]. However, this best performance often co-occurs with other comparable exemplary performances, pointing to the existence of hot streaks in individual cricket players. Across a range of domains such as science [38,39], arts [41,42] and sports [8,24], individual careers display hot streaks, where exceptional performances tend to occur in bursts that are clustered in time. Our analysis showed that, while the best performance of a cricket player's career may manifest at any point within their career, peak performances tend to occur in rapid sequence over a short period of time. This is further confirmed by the probability of performing well in consecutive matches growing exponentially, as evidenced by the recurrence time distribution in electronic supplementary material, SM3.
Our exploration into the predictors of individual performance unveiled a strong correlation between early career achievements and overall career trajectory. However, noteworthy differences were observed between batsmen and bowlers, with batsmen improving on their early-career performance more than bowlers. Such disparities may be attributed to the distinct elements inherent in the roles of batsmen and bowlers. Batsmen, for instance, may leverage accumulated experience and refined skills to augment run-scoring, while bowlers may depend more on innovation and strategic creativity to increase their wicket tally. This distinction potentially accounts for the observed trend of a higher proportion of batsmen surpassing their initial career performance compared to bowlers. Alternatively, our observations can also be explained if some players are simply better than the rest, so that their performance at various stages of their careers correlates with their full career performance. This effect is also known as the Q-model in the context of scientific careers [37]. Further analyses supporting this hypothesis are presented in electronic supplementary material, SM4. Additionally, we confirmed that players are often excluded from teams due to declining performances, yet they typically exhibit a sustained enhancement in performance following their comeback.
We investigated the relationship between player performance and captaincy. Our data suggested that batsmen who ascend to captaincy roles demonstrate higher performance levels, both overall and during their tenure as captain, in contrast to non-captain-batsmen. This pattern was not mirrored among bowlers, highlighting the differential impact of captaincy on distinct player types. These observations suggest a close correlation between individual performance and captaincy tenure for batsmen, where they often lead through exemplary performance. Conversely, bowlers may experience captaincy as a burden, reflected in their individual statistics. This could account for the observed differences in performance between captains and non-captains among batsmen and bowlers. We verified in electronic supplementary material, SM6, that this phenomenon does not arise because captain-bowlers bowl for more time. An alternative explanation might be that captains tend to bowl at the wrong time to relieve the team pressure. However, to test this hypothesis, one would need fine-grained temporal information about the progression of a match, which is not present in our data.
Focusing on the role of specialists, our findings indicated that opening batsmen typically score more runs but at a slower rate, whereas subsequent batsmen tend to adopt more aggressive strategies. Collectively, these findings illustrate a strategic balance between defensive and attacking approaches contingent on a player's batting position. Openers tend to adopt a more conservative approach, possibly due to the initial uncertainty of the pitch conditions, while subsequent batsmen often adopt a more aggressive strategy, building upon the foundational efforts of the openers. Further analysis reveals that this disparity may be partially attributed to the differing number of balls faced by players based on their batting order (see electronic supplementary material, SM7).
All-rounders enhance the team's score by applying their skills across both facets of the game. They were observed to contribute more to team performance compared to batsmen and bowlers. While bowlers generally contribute more towards the team's collective efforts than batsmen, all-rounders emerge as pivotal players, often driving the team's success with contributions that exceed 50% of the total team scores. In the realm of fielding, wicket-keepers emerged as pivotal in effecting dismissals, thereby bolstering the impact of bowlers on team success. In summary, team success depends on the successful coordination of different types of (often specialized) contributions. Although most players are predominantly recognized for their batting or bowling abilities, fielding is an integral aspect of a team's overall success. Indeed, a commonly reiterated phrase in cricket states, 'catches win matches', emphasizing the significance of player dismissals through effective fielding. Wicket-keepers, by contributing to a substantial number of total dismissals, enhance the bowlers' efforts and, consequently, the overall team performance. These findings underscore the vital role of specialists in ODI cricket. Future research could look into a more detailed examination of their role and their strategic impact on the dynamics of the game.
Additionally, our analysis reveals that, similarly to individual performances, teams also tend to win or lose matches in clusters, thus identifying both hot and cold streaks during seasons. We observed that winning teams were characterized by more evenly distributed player performances, as indicated by higher effective team sizes. A lack of such collective effort often results in teams losing matches. Moreover, the data reveal a redundancy of 3 to 4 players in most teams.
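How an effective team size captures the evenness of contributions can be illustrated with a minimal sketch. It assumes an entropy-based definition, in which the effective size is 2 raised to the Shannon entropy (in bits) of the players' shares of the team total; the formula is not stated in this section, so this choice is an assumption, and the function name and toy contributions are hypothetical.

```python
import math


def effective_team_size(contributions):
    """Entropy-based effective number of contributors.

    contributions: non-negative per-player contributions to the team total.
    Assumption: effective size = 2 ** H, with H the Shannon entropy (bits)
    of the contribution shares. Equal shares among n players give n.
    """
    total = sum(contributions)
    if total <= 0:
        return 0.0
    # Players with zero contribution add nothing to the entropy.
    shares = [c / total for c in contributions if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 2 ** entropy


balanced = [25, 25, 25, 25]   # four players contribute equally
dominated = [97, 1, 1, 1]     # one player carries the team

balanced_size = effective_team_size(balanced)
dominated_size = effective_team_size(dominated)
```

Under this definition the balanced team has an effective size of four, whereas the dominated team has an effective size only slightly above one, even though both field the same number of players.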
However, our supplementary analysis also reveals that the top performers from the winning team contribute more to their team than their top-performing counterparts from the losing team (see electronic supplementary material, SM9).
Our results might be subject to certain limitations. Cricket is a multi-faceted game, and while our analysis focuses on a wide range of indicators, it still does not consider all aspects of the game, such as home advantage and differences between spinners and fast bowlers. Furthermore, the volume of ODI matches, and hence the size of the resulting dataset, is limited. Future research could benefit from incorporating data from diverse cricket formats such as test matches, T20 games and franchise leagues like the IPL, which might yield deeper insights into varied player performance patterns. Supplementing our analysis with data from various professional levels may reveal further patterns in players' mobility and skill levels. Additionally, our dataset focuses on end-match statistics, omitting the nuanced temporal dynamics within individual games. While the final scorecard offers substantial information, some facets of performance may only become apparent with a more granular data analysis.
All in all, our work reveals intriguing patterns of individual and team performance in ODI cricket. We believe that the methodologies developed here could be readily applied without substantial modifications to other formats of the game such as T20 and test cricket, as well as to other team sports. Particularly suitable are sports with highly specialized teams, such as baseball, American football and volleyball. A comparative analysis of our results with the broader literature on sports can improve our understanding of human behaviour. We hope that our analysis motivates further research into synergistic individual efforts in sports and, more broadly, in team dynamics.
Ethics. This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility. Data and relevant code for this research work are stored in GitHub [71] and have been archived within the Zenodo repository [72]. Supplementary material is available online [73].
Declaration of AI use. We have not used AI-assisted technologies in creating this article.
Authors' contributions. O.S.: conceptualization, data curation, formal analysis, investigation, methodology, visualization, writing-original draft, writing-review and editing; S.C.: investigation, methodology, visualization, writing-review and editing; M.S.S.: investigation, methodology, writing-review and editing; F.B.: conceptualization, funding acquisition, investigation, methodology, project administration, supervision, writing-original draft, writing-review and editing. All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interests. We declare we have no competing interests.
Funding. F.B. acknowledges support from the Air Force Office of Scientific Research under award number FA8655-22-1-7025.
Acknowledgements. The authors thank Aanjaneya Kumar and Lorenzo Betti for the discussions which led to an improvement in the manuscript.
References
1. Roche M. 2000 Mega-events and modernity: Olympics and Expos in the growth of global culture, 1st edn. Oxford, UK: Routledge.
2. Guttmann A. 2002 The Olympics, a history of the modern games. In Illinois history of sports, 2nd edn. Urbana, IL: University of Illinois Press.
3. Bryson B. 2008 A really short history of nearly everything. New York, NY: Delacorte.
4. Horne J, Whannel G. 2020 Understanding the Olympics. Oxford, UK: Routledge.
5. Nevill A, Atkinson G, Hughes M. 2008 The growing trend of scientific interest in sports science research. J. Sports Sci. 26, 413. (doi:10.1080/02640410701705108)
6. Radicchi F. 2011 Who is the best player ever? A complex network analysis of the history of professional tennis. PLoS One 6, e17249. (doi:10.1371/journal.pone.0017249)
7. Pappalardo L, Cintia P, Ferragina P, Massucco E, Pedreschi D, Giannotti F. 2019 Playerank. ACM Trans. Intell. Syst. Technol. 10, 1-27. (doi:10.1145/3343172)
8. Chowdhary S, Iacopini I, Battiston F. 2023 Quantifying human performance in chess. Sci. Rep. 13, 2113. (doi:10.1038/s41598-023-27735-9)
9. Zappala C, Biondo AE, Pluchino A, Rapisarda A. 2023 The paradox of talent: how chance affects success in tennis tournaments. Chaos Solitons Fractals 176, 114088. (doi:10.1016/j.chaos.2023.114088)
10. Lewis M. 2004 Moneyball: the art of winning an unfair game. New York, NY: W. W. Norton.
11. Brown DT, Link CR, Rubin SL. 2017 Moneyball after 10 years. J. Sports Econ. 18, 771-786. (doi:10.1177/1527002515609665)
12. Pappalardo L, Cintia P. 2018 Quantifying the relation between performance and success in soccer. Advs. Complex Syst. 21, 1750014. (doi:10.1142/S021952591750014X)
13. Mason D, Foster W. 2007 Putting Moneyball on ice? Int. J. Sport Finance 2, 206-213.
14. Dawson M. 2023 The iron cage of efficiency: analytics, basketball and the logic of modernity. Sport Soc. 26, 1785-1801. (doi:10.1080/17430437.2023.2208040)
15. Neuhaus T, Thomas N. 2024 Interdisciplinary analyses of professional basketball, pp. 11-39. New York, NY: Springer.
16. Roring RW, Charness N. 2007 A multilevel model analysis of expertise in chess across the life span. Psychol. Aging 22, 291-299. (doi:10.1037/0882-7974.22.2.291)
17. Vaci N, Bilalic M. 2017 Chess databases as a research vehicle in psychology: modeling large data. Behav. Res. Methods 49, 1227-1240. (doi:10.3758/s13428-016-0782-5)
18. 2023 The most popular sports in the world. See https://worldatlas.com.
19. 2024 Most popular Google sports trends. See https://topendsports.com.
20. Subasingha S, Premaratne SC, Jayaratne KL, Sellappan P. 2019 Novel method for cricket match outcome prediction using data mining techniques. Int. J. Eng. Adv. Technol. 8, 15-21. (doi:10.35940/ijeat.F1004.0986S319)
21. Bandulasiri A. 2008 Predicting the winner in One Day International cricket. J. Math. Sci. Math. Educ. 3.
22. McEwan K, Pote L, Radloff S, Nicholls SB, Christie C. 2023 The role of selected pre-match covariates on the outcome of One-Day International (ODI) cricket matches. S. Afr. J. Sports Med. 35, v35i1a15012. (doi:10.17159/2078-516X/2023/v35i1a15012)
23. Asif M, McHale IG. 2016 In-play forecasting of win probability in One-Day International cricket: a dynamic logistic regression model. Int. J. Forecast. 32, 34-43. (doi:10.1016/j.ijforecast.2015.02.005)
24. Ram SK, Nandan S, Boulebnane S, Sornette D. 2022 Synchronized bursts of productivity and success in individual careers. Sci. Rep. 12, 7637. (doi:10.1038/s41598-022-10837-1)
25. Premkumar P, Chakrabarty JB, Chowdhury S. 2020 Key performance indicators for factor score based ranking in One Day International cricket. IIMB Manag. Rev. 32, 85-95. (doi:10.1016/j.iimb.2019.07.008)
26. Swartz TB, Gill PS, Muthukumarana S. 2009 Modelling and simulation for one-day cricket. Can. J. Statistics 37, 143-160. (doi:10.1002/cjs.10017)
27. C. G. K. 1905 Science and art of cricket. Nature 73, 82-84. (doi:10.1038/073082a0)
28. Kimber AC, Hansford AR. 1993 A statistical analysis of batting in cricket. J. R. Stat. Soc. Ser. A 156, 443-455. (doi:10.2307/2983068)
29. Mehta RD, Bentley K, Proudlove M, Varty P. 1983 Factors affecting cricket ball swing. Nature New Biol. 303, 787-788. (doi:10.1038/303787a0)
30. Arora M, Gupta R, Kumaraguru P. 2014 Indian Premier League (IPL), cricket, online social media. (doi:https://arxiv.org/abs/1405.5009)
31. Nicholls S, Pote L, Thomson E, Theis N. 2023 The change in test cricket performance following the introduction of T20 cricket. Sport. Innov. J. 4, 1-16. (doi:10.18060/26438)
32. Orchard J, James T, Alcott E, Carter S, Farhart P. 2002 Who owns the information? Databases of injuries in professional sport are valuable resources which should not suffer confidentiality restraints. Br. J. Sports Med. 36, 16-18. (doi:10.1136/bjsm.36.1.16)
33. Finch CF, Elliott BC, McGrath AC. 1999 Measures to prevent cricket injuries: an overview. Sports Med. 28, 263-272. (doi:10.2165/00007256-199928040-00004)
34. Orchard J, Newman D, Stretch R, Frost W, Mansingh A, Leipus A. 2005 Methods for injury surveillance in international cricket. J. Sci. Med. Sport 8, 1-14. (doi:10.1016/s1440-2440(05)80019-2)
35. Coles S. 2001 An introduction to statistical modeling of extreme values, pp. 92-104. London, UK: Springer. (doi:10.1007/978-1-4471-3675-0_5)
36. Beirlant J, Goegebeur Y, Teugels J, Segers J. 2004 Statistics of extremes: theory and applications. In Wiley series in probability and statistics, 1st edn. Hoboken, NJ: Wiley. (doi:10.1002/0470012382)
37. Sinatra R, Wang D, Deville P, Song C, Barabasi AL. 2016 Quantifying the evolution of individual scientific impact. Science 354, aaf5239. (doi:10.1126/science.aaf5239)
38. Shi M et al. 2018 The evolutionary history of vertebrate RNA viruses. Nature 556, 197-202. (doi:10.1038/s41586-018-0012-7)
39. Liu Y, Cao L, Wu B. 2022 General non-linear imitation leads to limit cycles in eco-evolutionary dynamics. Chaos Solitons Fractals 165, 112817. (doi:10.1016/j.chaos.2022.112817)
40. Fraiberger SP, Sinatra R, Resch M, Riedl C, Barabasi AL. 2018 Quantifying reputation and success in art. Science 362, 825-829. (doi:10.1126/science.aau7224)
41. Janosov M, Battiston F, Sinatra R. 2020 Success and luck in creative careers. EPJ Data Sci. 9. (doi:10.1140/epjds/s13688-020-00227-w)
42. Williams OE, Lacasa L, Latora V. 2019 Quantifying and predicting success in show business. Nat. Commun. 10, 2256. (doi:10.1038/s41467-019-10213-0)
43. Bar-Eli M, Avugos S, Raab M. 2006 Twenty years of 'hot hand' research: review and critique. Psychol. Sport Exerc. Judgm. Decis. Mak. Sport Exerc. 7, 525-553. (doi:10.1016/j.psychsport.2006.03.001)
44. Gilovich T, Vallone R, Tversky A. 1985 The hot hand in basketball: on the misperception of random sequences. Cogn. Psychol. 17, 295-314. (doi:10.1016/0010-0285(85)90010-6)
45. Golder PN, Tellis GJ. 1997 Will it ever fly? Modeling the takeoff of really new consumer durables. Mktg. Sci. 16, 256-270. (doi:10.1287/mksc.16.3.256)
46. Bak-Coleman JB, Kennedy I, Wack M, Beers A, Schafer JS, Spiro ES, Starbird K, West JD. 2022 Combining interventions to reduce the spread of viral misinformation. Nat. Hum. Behav. 6, 1372-1380. (doi:10.1038/s41562-022-01388-6)
47. Shaw JM, Johnson DD, Nygaard IE. 2018 Engaging undergraduate kinesiology students in clinically-based research. Quest 70, 292-303. (doi:10.1080/00336297.2017.1380054)
48. Till K, Baker J. 2020 Challenges and [possible] solutions to optimizing talent identification and development in sport. Front. Psychol. 11, 664. (doi:10.3389/fpsyg.2020.00664)
49. Hussain Z, Mata R, Wulff DU. 2024 Novel embeddings improve the prediction of risk perception. EPJ Data Sci. 13, 38. (doi:10.1140/epjds/s13688-024-00478-x)
50. Pasarakonda S, Maynard T, Schmutz JB, Lüthold P, Grote G. 2023 How team familiarity mitigates negative consequences of team composition disruptions: an analysis of Premier League teams. Group Organiz. Manag. (doi:10.1177/10596011231193176)
51. Guimera R, Uzzi B, Spiro J, Amaral LAN. 2005 Team assembly mechanisms determine collaboration network structure and team performance. Science 308, 697-702. (doi:10.1126/science.1106340)
52. Zhang J, Yin K, Li S. 2022 Leader extraversion and team performance: a moderated mediation model. PLoS One 17, e0278769. (doi:10.1371/journal.pone.0278769)
53. Hancock AJ, Gellatly IR, Walsh MM, Arnold KA, Connelly CE. 2023 Good, bad, and ugly leadership patterns: implications for followers' work-related and context-free outcomes. J. Manage. 49, 640-676. (doi:10.1177/01492063211050391)
54. Betti L, Gallo L, Wachs J, Battiston F. 2024 The dynamics of leadership and success in software development teams. (doi:https://arxiv.org/abs/2404.18833)
55. Burke CS, Georganta E, Marlow S. 2019 A bottom up perspective to understanding the dynamics of team roles in mission critical teams. Front. Psychol. 10, 1322. (doi:10.3389/fpsyg.2019.01322)
56. Salcinovic B, Drew M, Dijkstra P, Waddington G, Serpell BG. 2022 Factors influencing team performance: what can support teams in high-performance sport learn from other industries? A systematic scoping review. Sports Med. Open. 8, 25. (doi:10.1186/s40798-021-00406-7)
57. Wallrich L, Opara V, Wesołowska M, Barnoth D, Yousefi S. 2024 The relationship between team diversity and team performance: reconciling promise and reality through a comprehensive meta-analysis registered report. PsyArXiv. (doi:10.31234/osf.io/nscd4)
58. 2024 HowSTAT! The cricket statisticians. See http://howstat.com/cricket/home.asp.
59. 2024 Cricket-Wikipedia. See https://en.wikipedia.org/w/index.php?title=Cricket.
60. Smyth R. 2011 Fifteen-over field restrictions. ESPN cricinfo. See https://www.espncricinfo.com/story/cricket-s-turning-points-fifteen-over-field-restrictions-513169.
61. Sankar R. 2019 The evolution of the cricket bat: from then to now. sportskeeda. See https://www.sportskeeda.com/cricket/evolution-of-cricket-bats.
62. Silgardo D. 2023 More runs, longer careers, fewer breaks: how cricket has changed over the past 30 years. ESPN cricinfo. See https://www.espncricinfo.com/story/a-statistical-look-at-how-cricket-has-changed-over-the-past-30-years-more-runs-longer-careers-fewer-breaks-1367458.
63. Radicchi F, Fortunato S, Castellano C. 2008 Universality of citation distributions: toward an objective measure of scientific impact. Proc. Natl Acad. Sci. USA 105, 17268-17272. (doi:10.1073/pnas.0806977105)
64. Hodges JL. 1958 The significance probability of the Smirnov two-sample test. Ark. Mat. 3, 469-486. (doi:10.1007/BF02589501)
65. Wilcoxon F. 1945 Individual comparisons by ranking methods. Bio. Bull. 1, 80. (doi:10.2307/3001968)
66. Mann HB, Whitney DR. 1947 On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50-60. (doi:10.1214/aoms/1177730491)
67. Welch BL. 1947 The generalisation of Student's problems when several different population variances are involved. Biometrika 34, 28-35. (doi:10.1093/biomet/34.1-2.28)
68. Virtanen P et al. 2020 SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261-272. (doi:10.1038/s41592-019-0686-2)
69. Klug M, Bagrow JP. 2016 Understanding the group dynamics and success of teams. R. Soc. Open Sci. 3, 160007. (doi:10.1098/rsos.160007)
70. Delice F, Rousseau M, Feitosa J. 2019 Advancing teams research: what, when, and how to measure team dynamics over time. Front. Psychol. 10, 1324. (doi:10.3389/fpsyg.2019.01324)
71. Onkar S. 2024 Cricket_Public. GitHub. See https://github.com/sadekar-onkar/cricket_public.
72. Onkar S. 2024 sadekar-onkar/cricket_public: individual and team performance in cricket. See https://zenodo.org/records/11581222.
73. Sadekar O, Chowdhary S, Santhanam MS, Battiston F. 2024 Supplementary material from: Individual and team performance in cricket. Figshare. (doi:10.6084/m9.figshare.c.7303158)
© 2024. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Details
1 Department of Network and Data Science, Central European University, Vienna 1100, Austria
2 Department of Physics, Indian Institute of Science Education and Research, Dr Homi Bhabha Road, Pune 411008, India