A Survey of Baseball Machine Learning: A Technical Report

Koseler, Kaan; Stephan, Matthew

Files

TechReport__An_SLR_on_the_use_of_Machine_Learning_in_Baseball_Analytics.pdf (1.23 MB)

Date

2017-11-22

Authors

Koseler, Kaan

Stephan, Matthew

Abstract

Statistical analysis of baseball has long been popular, albeit only in a limited capacity until relatively recently. The recent proliferation of computers has added tremendous power and opportunity to this field. Even an amateur baseball fan can perform types of analyses that were unimaginable decades ago. In particular, analysts can easily apply machine learning algorithms to large baseball data sets to derive meaningful and novel insights into player and team performance. These algorithms fall mostly under three problem classes: regression, binary classification, and multiclass classification. Professional teams have made extensive use of these algorithms, funding analytics departments within their own organizations and creating a thriving multi-million dollar industry. To stimulate new research and to serve as a go-to resource for academic and industrial analysts, we have performed a systematic literature review of machine learning algorithms and approaches that have been applied to baseball analytics. We also provide our insights on possible future applications. We categorize all the approaches we encountered during our survey and summarize our findings in two tables. We find that two algorithms dominate the literature: 1) Support Vector Machines for classification problems and 2) Bayesian Inference for both classification and regression problems. These algorithms are often implemented manually, but they can also be used through existing software, such as WEKA or the Scikit-learn Python library. We speculate that the current popularity of neural networks in the general machine learning literature will soon carry over into baseball analytics, although we found relatively few existing articles using this approach when compiling this report.

URI

http://hdl.handle.net/2374.MIA/6218

This item appears in the following collections

Stephan, Matthew
Computer Science and Software Engineering Technical Reports
