Scholarly Commons at Miami University

    A Survey of Baseball Machine Learning: A Technical Report

    Author
    Koseler, Kaan
    Stephan, Matthew
    Abstract
    Statistical analysis of baseball has long been popular, albeit only in a limited capacity until relatively recently. The proliferation of computers has added tremendous power and opportunity to this field. Even an amateur baseball fan can perform types of analyses that were unimaginable decades ago. In particular, analysts can easily apply machine learning algorithms to large baseball data sets to derive meaningful and novel insights into player and team performance. These algorithms fall mostly under three problem classes: regression, binary classification, and multiclass classification. Professional teams have made extensive use of these algorithms, funding analytics departments within their own organizations and creating a thriving multi-million-dollar industry. To stimulate new research and to serve as a go-to resource for academic and industrial analysts, we have performed a systematic literature review of machine learning algorithms and approaches that have been applied to baseball analytics. We also provide our insights on possible future applications. We categorize all the approaches we encountered during our survey and summarize our findings in two tables. We find that two algorithms dominate the literature: 1) Support Vector Machines for classification problems and 2) Bayesian inference for both classification and regression problems. These algorithms are often implemented manually, but can also be easily utilized by employing existing software, such as WEKA or the Scikit-learn Python library. We speculate that the current popularity of neural networks in the general machine learning literature will soon carry over into baseball analytics, although we found relatively few existing articles utilizing this approach when compiling this report.
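As a rough illustration of the kind of Bayesian inference the abstract mentions, the sketch below updates a Beta prior on a hitter's batting average with observed hits and at-bats. It is not taken from the report: the `BetaPrior` class, the prior strength, and the sample numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPrior:
    """Beta(alpha, beta) belief about a hitter's true batting average (hypothetical helper)."""

    alpha: float
    beta: float

    def update(self, hits: int, at_bats: int) -> "BetaPrior":
        """Return the posterior after observing `hits` in `at_bats` (conjugate Beta-Binomial update)."""
        if not 0 <= hits <= at_bats:
            raise ValueError("hits must be between 0 and at_bats")
        return BetaPrior(self.alpha + hits, self.beta + (at_bats - hits))

    @property
    def mean(self) -> float:
        """Mean of the Beta distribution, used here as the point estimate."""
        return self.alpha / (self.alpha + self.beta)


# Assumed prior: a .260 hitter, weighted like 300 at-bats of prior evidence.
prior = BetaPrior(alpha=78.0, beta=222.0)

# Assumed observation: a hot start of 12 hits in 30 at-bats (.400 observed).
posterior = prior.update(hits=12, at_bats=30)

print(f"prior mean:     {prior.mean:.3f}")
print(f"posterior mean: {posterior.mean:.3f}")
```

With a prior this strong, a short hot streak moves the estimate only modestly: the posterior mean lands between the prior mean and the observed average, much closer to the prior.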
    URI
    http://hdl.handle.net/2374.MIA/6218
    Collections
    • Computer Science and Software Engineering Technical Reports
    • Stephan, Matthew
