Machine learning for pattern discovery in management research
Supervised machine learning (ML) methods are a powerful toolkit for discovering robust patterns in quantitative data. The patterns identified by ML can be used for exploratory inductive or abductive research, or for post-hoc analysis of regression results to detect patterns that may have gone unnoticed. However, ML models should not be treated as the result of a deductive causal test. To demonstrate the application of ML for pattern discovery, we implement ML algorithms to study employee turnover at a large technology company. We interpret the relationships between variables using partial dependence plots, which uncover surprising nonlinear and interdependent patterns that may have gone unnoticed using traditional methods. To guide readers evaluating ML for pattern discovery, we explain how to assess model performance, highlight human decisions in the process, and warn of common misinterpretation pitfalls. An online appendix provides code and data to implement the algorithms demonstrated in the paper.
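For readers unfamiliar with partial dependence, the computation behind such a plot can be sketched in a few lines. This is a minimal illustration and not the authors' code, which is in their online appendix. The function `partial_dependence`, the stand-in model `toy_turnover_risk`, the feature names, and the four data rows are all hypothetical and chosen only to show the mechanics: force one feature to each grid value for every observation, leave the other features as observed, and average the model's predictions.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        # Overwrite the feature for every row; other features stay as observed.
        modified = [{**row, feature: value} for row in rows]
        curve.append(mean(predict(r) for r in modified))
    return curve


def toy_turnover_risk(row):
    """Hypothetical stand-in for a fitted ML model (an assumption, not real data).

    Risk is U-shaped in tenure and higher for below-median pay.
    """
    tenure_effect = 0.02 * (row["tenure_years"] - 4) ** 2
    pay_effect = 0.1 if row["pay_percentile"] < 50 else 0.0
    return min(1.0, 0.05 + tenure_effect + pay_effect)


# Made-up observations, for illustration only.
rows = [
    {"tenure_years": 1, "pay_percentile": 30},
    {"tenure_years": 3, "pay_percentile": 70},
    {"tenure_years": 6, "pay_percentile": 45},
    {"tenure_years": 8, "pay_percentile": 90},
]
grid = [0, 2, 4, 6, 8]

curve = partial_dependence(toy_turnover_risk, rows, "tenure_years", grid)
for tenure, risk in zip(grid, curve):
    print(f"tenure={tenure}: average predicted risk={risk:.3f}")
```

Plotting `curve` against `grid` gives the partial dependence plot for tenure. The curve describes how the model's predictions vary with the feature, so it shows an association learned by the model and not a causal effect.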
Managerial Summary
Supervised machine learning (ML) methods are a powerful toolkit that can help managers and researchers discover interesting patterns in large and complex data. We demonstrate this by using several ML algorithms to investigate the drivers of employee turnover at a large technology company. We evaluate the performance of the models and use visual tools to interpret the patterns they reveal. These patterns can be useful in understanding turnover, but we caution against confusing correlation with causation. These methods should be viewed as "exploratory" and not as conclusive proof of relationships in the data. Our guidance can help managers evaluate analyses conducted by data scientists in their organizations.
Publisher URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/smj.3215
DOI: 10.1002/smj.3215