On Leveraging Machine Learning in the Sports Sciences in the Hypothetico-deductive Framework
Supervised machine learning (ML) offers an exciting suite of algorithms that could benefit research in the sports sciences. In principle, supervised ML approaches were designed for pure prediction, as opposed to explanation, leading to a rise in powerful, but opaque, algorithms. Recently, two subdomains of ML have grown in popularity: explainable ML, which allows us to “peek into the black box,” and interpretable ML, which encourages using algorithms that are inherently interpretable. Given this increased transparency into ML algorithms, can supervised ML be used in place of statistical methods in the hypothetico-deductive framework? This paper shows why ML algorithms are fundamentally different from statistical methods, even when using explainable or interpretable approaches. While supervised ML cannot be used in place of statistical methods, we propose ways in which the sports science community can take advantage of supervised ML in the hypothetico-deductive framework.

In this manuscript we argue that supervised machine learning can and should augment our exploratory investigations in the sports sciences, but should not replace statistical reasoning in the hypothetico-deductive framework. We justify our position through a careful examination of supervised machine learning, and provide an analogy to help elucidate our findings. Three case studies demonstrate how supervised machine learning can be integrated into exploratory analysis. Supervised machine learning should be integrated into the scientific workflow with requisite caution. The approaches described in this paper provide ways to safely leverage the strengths of machine learning while avoiding potential pitfalls.