Searching for Gazelles in the Forest: The Potential of Random Forests to Identify High-Growth Firms
2025 (English). In: AI, Analytics and Strategic Decision-Making / [ed] Anna Ujwary-Gil & Anna Florek-Paszkowska, London: Routledge, 2025, p. 25-56. Chapter in book (Refereed)
Abstract [en]
This chapter discusses the recent rise of machine learning (ML) techniques, specifically decision trees and Random Forests (RFs). RFs do not pursue causal inference but are powerful techniques for making predictions. RFs are discussed from various angles and compared with regression techniques, drawing on concepts such as Classification and Regression Trees (CART), the Synthetic Minority Oversampling Technique (SMOTE), the confusion matrix, and the Receiver Operating Characteristic (ROC) curve. As a running example, the value of RFs is illustrated throughout the chapter in the context of predicting high-growth firms (HGFs, also known as “gazelles”) using Swedish data (limited liability firms, 2004–2015). RFs appear to outperform previously used regression techniques when the task is predicting HGFs. ML techniques such as RFs have “raised the bar” in terms of the accuracy of prediction tasks. Improved prediction accuracy can be expected to lead to better decision-making by various stakeholders, such as (actual and potential) entrepreneurs, investors, academics, and policy-makers. We contribute to the emergence of a new class of quantitative analysis tools for making predictions that are expected to have superior predictive power in many contexts, including the prediction of HGFs.
Place, publisher, year, edition, pages
London: Routledge, 2025. p. 25-56
Series
Routledge Studies in Innovation Organizations and Technology
Keywords [en]
Curve fitting, Decision trees, Forestry, Investments, Learning systems, Random forests, Regression analysis, Causal inferences, Classification trees, Confusion matrix, High growth, Machine learning techniques, Receiver operating characteristic curves, Regression techniques, Regression trees, Synthetic minority over-sampling techniques, Forecasting
National Category
Computer and Information Sciences
Economics and Business
Identifiers
URN: urn:nbn:se:hj:diva-70603
DOI: 10.4324/9781003507840-3
ISI: 001577272700002
Scopus ID: 2-s2.0-105015683619
ISBN: 9781032831107 (print)
ISBN: 9781032831138 (print)
ISBN: 9781003507840 (electronic)
OAI: oai:DiVA.org:hj-70603
DiVA, id: diva2:2028843
2026-01-15, 2026-01-15, 2026-01-15. Bibliographically approved