Comparison of imputation methods for missing rate of perceived exertion data in rugby
Date
2022
Authors
Epp-Stobbe, Amarah
Tsai, Ming-Chang
Klimstra, Marc
Publisher
Machine Learning & Knowledge Extraction
Abstract
Rate of perceived exertion (RPE) is used to calculate athlete load. Incomplete load data,
due to missing athlete-reported RPE, can increase injury risk. The current standard for missing RPE
imputation is daily team mean substitution. However, RPE reflects an individual’s effort; group
mean substitution may be suboptimal. This investigation sought to identify the best method for imputing
RPE. A total of 987 datasets were collected from women’s rugby sevens competitions. Daily team
mean substitution, k-nearest neighbours, random forest, support vector machine, neural network,
linear, stepwise, lasso, ridge, and elastic net regression models were assessed at different missingness
levels. Statistical equivalence of true and imputed scores by model was evaluated. An ANOVA of
accuracy by model and missingness was completed. While all models were equivalent to the true RPE,
differences by model existed. Daily team mean substitution was the poorest performing model, and
random forest, the best. Accuracy was low in all models, affirming RPE as multifaceted and requiring
quantification of potentially overlapping factors. While group mean substitution is discouraged,
practitioners are advised to scrutinize any imputation method relating to athlete load.
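For readers unfamiliar with the baseline the abstract refers to, daily team mean substitution can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the record layout (`date`, `athlete`, `rpe`), the function name `impute_daily_team_mean`, and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def impute_daily_team_mean(records):
    """Fill missing RPE values with the mean RPE reported by the team that day.

    `records` is a list of dicts with the (assumed) keys 'date', 'athlete'
    and 'rpe', where a missing RPE is represented as None.
    """
    # Collect the RPE values actually reported on each day.
    reported_by_day = defaultdict(list)
    for rec in records:
        if rec["rpe"] is not None:
            reported_by_day[rec["date"]].append(rec["rpe"])

    imputed = []
    for rec in records:
        filled = dict(rec)  # copy so the input is left untouched
        day_values = reported_by_day.get(rec["date"])
        # Only impute when the value is missing and at least one
        # teammate reported an RPE on the same day.
        if filled["rpe"] is None and day_values:
            filled["rpe"] = mean(day_values)
        imputed.append(filled)
    return imputed


# Hypothetical sample data: two days, three athletes, two missing values.
sample = [
    {"date": "day1", "athlete": "A", "rpe": 6},
    {"date": "day1", "athlete": "B", "rpe": 8},
    {"date": "day1", "athlete": "C", "rpe": None},
    {"date": "day2", "athlete": "A", "rpe": 4},
    {"date": "day2", "athlete": "B", "rpe": None},
    {"date": "day2", "athlete": "C", "rpe": 5},
]
result = impute_daily_team_mean(sample)
```

Every athlete with a missing value on a given day receives the same number, regardless of their own history. That is the limitation the abstract points to when it notes that RPE reflects an individual's effort.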
Keywords
sports, football, athletic performance, statistical models, machine learning
Citation
Epp-Stobbe, A., Tsai, M., & Klimstra, M. (2022). “Comparison of imputation methods for missing rate of perceived exertion data in rugby.” Machine Learning & Knowledge Extraction, 4(4), 827-838. https://doi.org/10.3390/make4040041