Identifying the Factors Influencing IPO Underpricing using Explainable Machine Learning Techniques

We study the underpricing phenomenon in the U.S. Initial Public Offering (IPO) market using an interpretative machine learning approach. Inadequate conventional modelling of IPO efficiency can lead to suboptimal investment decisions and a poor understanding of the underlying factors that drive IPO underpricing. We employ interpretative machine learning both to predict numeric underpricing levels and to classify IPOs into underpriced and overpriced categories. We use the SHapley Additive exPlanations method to provide concise insights into the underlying factors that contribute to underpricing. Our numerical study reveals that the offer price has the most predictive power for the models' output, followed by equity retained and assets. Additionally, underpricing is more pronounced in technology-based sectors, and a higher dispersion in quality among firms during IPO surges results in higher levels of underpricing. Our analyses emphasize the importance of considering both the industry sector and market conditions when evaluating firms going public.


Introduction
An evergreen question in the Initial Public Offering (IPO) market is what factors drive the phenomenon of underpricing, i.e., the tendency for IPOs to be offered at a discount, resulting in abnormal returns on the first trading day. One well-known theory suggests that underpricing, as a signalling tool for high-quality firms, is deliberate and relates to information asymmetry about the intrinsic value of an IPO among participants before the IPO day [1]. As such, a rich body of literature relies on a set of information that is mainly related to firm and deal characteristics to model underpricing [2]. However, the lack of data and the large set of unknown predictors make it challenging to accurately evaluate firms going public and explain the phenomenon. Furthermore, traditional models, such as ordinary least squares and logit, have weak modelling power and are unable to efficiently handle problems such as outliers, multicollinearity, and nonlinearity in the data [3]. Consequently, the obtained results are often conflicting, attributing underpricing to various factors that are difficult to justify and leading to unreliable advice based on the models' decisions. Previous studies consider a wide range of methodologies to investigate the IPO underpricing problem, including genetic algorithms [4], tree ensembles [5,6], and neural networks [7,8]. These works mainly rely on tabular data extracted from the financial records of the associated company as well as from macroeconomic indicators and the stock market. Different from these approaches, Katsafados et al. [9] incorporate textual data into the numeric financial inputs to classify U.S. IPOs as overpriced and underpriced using machine learning (ML) methods.
The practicality and reliability of ML methods in financial economics are highly dependent on the knowledge of the models' decisions. Nonetheless, previous studies on IPO underpricing largely neglect the analysis of the impact of fundamentals and the associated participants' behaviours. A few recent studies provide post-hoc analysis for their prediction models to identify the most important underpricing drivers. Bastı, Kuzey, and Delen [10] and Baba and Sevil [11] develop tree-based methods for predicting the underpricing of IPOs issued on Borsa Istanbul. Both studies employ variable importance methods to investigate the major factors affecting IPOs' performance and identify the most important predictors of IPO initial returns. In another study, Colak, Fu, and Hasan [3] use random forest (RF) and gradient boosted trees (GBT) to predict the underpricing and the post-issuance stock underperformance of Chinese companies listed on U.S. stock exchanges. They sort the most influential factors using variable importance scores and calculate the marginal effect of a feature by comparing the prediction performances of the models with and without that feature. However, these studies provide limited results for feature importance based on the generic understanding gained from the entire sample. Moreover, they do not employ state-of-the-art artificial intelligence (AI) interpretability methods, such as LIME and SHAP [12], which present enhanced capabilities for interpretability, e.g., identifying the direction of each factor's effect, and hence can contribute to a theoretically supported analysis of IPO underpricing prediction.
In this study, we consider ML models and AI interpretability methods for the prediction of underpricing in the U.S. IPO market. We extend the previous studies and contribute to the literature in several ways. First, we examine an extensive set of ML models, including linear, tree-based, and deep-learning models, for which, to the best of our knowledge, there is no comprehensive comparative analysis in the literature. Second, regarding the input data, in addition to well-documented factors related to the company, its performance, and the IPO deal, we incorporate two sets of factors/features into the ML models: (1) several market conditions, to reconsider the well-established timing theory about the dependency of IPO efficiency on current economic conditions, and (2) industry, to account for disparities in terms of risk and investment prospects across different sectors. Third, we adopt a new explainability method, namely SHapley Additive exPlanations (SHAP), which has the ability to identify the contribution of each feature to individual predictions, hence providing a fair and transparent explanation for the ML model's decision. We utilize SHAP values to draw out the impact, contribution, and sign of the most important drivers of IPO underpricing, considering their distribution over all predictions. To the best of our knowledge, this is the first study that leverages SHAP as an interpretability method in this domain and examines the implications of the decisive factors on IPO underpricing accordingly. Lastly, our dataset consists of all IPOs issued in the U.S. over a long period, rather than focusing only on foreign companies in this market.

Methodology
In this section, we first describe our IPO dataset and provide details on its characteristics. Subsequently, we elaborate on the regression and classification models employed in our numerical study, as well as the experimental design and setup.

Dataset
We use several sources to obtain the data, including the Securities Data Company database to identify new issues in the U.S. market, Compustat for financial statement data, the Center for Research in Security Prices database for trading data, and Jay Ritter's website to gather information on the age of firms at the time of listing, underwriter reputation rankings, and average first-day returns. We also obtain data on private nonresidential fixed investment from the Federal Reserve Bank of St. Louis to reflect firms' need for capital. Our final IPO dataset contains 3,456 IPOs listed from 1979 to 2018. In this dataset, the initial first-day return serves as a measure of mispricing; it is computed by subtracting the offer price from the closing price on the first day of trading and dividing the result by the offer price. If the mispricing is positive, the offer price is lower than the closing price and the corresponding IPO is classified as "Underpriced"; otherwise, it is labeled as "Overpriced". In our dataset, there are 2,631 underpriced and 825 overpriced IPOs. Panel A in Table 1 demonstrates the prevalence of underpricing, with a highly positive mean initial return of 17% and over 76% of the IPOs being underpriced. We also utilize a range of factors that potentially affect IPO underpricing as inputs for the models. These variables fall into five categories: performance, firm-specific, deal characteristics, industry, and market condition. Panel B in Table 1 provides the variable definitions and descriptive statistics, including details on how they are measured.
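As a concrete illustration, the mispricing measure and the derived label can be computed as follows (the prices are hypothetical examples, not drawn from our dataset):

```python
# Initial first-day return as defined above: (close - offer) / offer.
def initial_return(offer_price: float, first_day_close: float) -> float:
    return (first_day_close - offer_price) / offer_price

# Positive mispricing -> "Underpriced"; otherwise -> "Overpriced".
def label_ipo(offer_price: float, first_day_close: float) -> str:
    return "Underpriced" if initial_return(offer_price, first_day_close) > 0 else "Overpriced"

# Hypothetical example: a $15 offer closing at $18 on the first trading day.
print(initial_return(15.0, 18.0))  # 0.2, i.e., a 20% first-day return
print(label_ipo(15.0, 18.0))       # Underpriced
```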
** In the market condition variables, the values within parentheses indicate the statistics for the 'cold' state.

Machine Learning Methods
We consider nine models that can be grouped into three categories: linear models, tree-based ensembles, and neural networks. The linear models include Ordinary Least Squares (OLS), Lasso regression, and Ridge regression for the regression task, and Logistic Regression (LR) for the classification task. The tree-based ensembles include Random Forest (RF), Extremely Randomized Trees (ETR), XGBoost (XGB), Category Boosting (CatB), and LightGBM (LGBM). We consider a Multi-Layer Perceptron (MLP) model as a representative neural network model that can capture complex non-linear relationships in the input data. These models are used to predict the underpricing value and to classify firms into underpriced and overpriced IPOs. To interpret the predictions of these models, we employ SHAP (SHapley Additive exPlanations), a game-theoretic approach to explaining the output of the models. Specifically, SHAP assigns to each feature an importance value that corresponds to its contribution to the prediction. These values account for the interactions between the features and provide a comprehensive explanation of the model's decision. The SHAP method is model-agnostic and can be applied to a variety of ML models.
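To make the game-theoretic intuition concrete, the following self-contained sketch computes exact Shapley values for a toy two-feature model by averaging each feature's marginal contribution over all coalitions of the remaining features. The model f and the baseline are hypothetical and serve only to illustrate the principle that SHAP approximates efficiently for the models above.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of features of x for model f, relative to a baseline input."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Coalition weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += w * (f(with_i) - f(without))
    return phi

# Toy linear model: here Shapley values reduce to w_j * (x_j - baseline_j).
f = lambda z: 2.0 * z[0] - 1.0 * z[1]
phi = shapley_values(f, x=[3.0, 2.0], baseline=[1.0, 1.0])
print(phi)  # [4.0, -1.0]; contributions sum to f(x) - f(baseline) = 3.0
```

The efficiency property demonstrated here (contributions summing to the deviation of the prediction from the baseline) is what makes the per-prediction decompositions in our analysis add up consistently.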

Experimental Setup
We evaluate the robustness of the models by dividing the dataset into 5 folds, with each fold in turn serving as the test set and the rest as the training set, providing 5 different out-of-sample results. To prevent overfitting and underfitting, we conduct extensive hyperparameter tuning using a grid search over a range of plausible values for the important parameters of each model. We implement cross-validation on the training sets to select the best model parameters. Additionally, for the classification task, we take into account the data imbalance between the "Underpriced" and "Overpriced" classes by adjusting the corresponding parameters in the ML models (e.g., "class_weight = balanced" in RF). The regression models are evaluated using three performance metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R²). The classification methods are evaluated using Precision, Recall, and F1 score as the performance metrics. Note that, to account for the data imbalance issue, we also report the Weighted and Macro F1 scores.
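A minimal sketch of this protocol in Scikit-learn is shown below, with synthetic data and an illustrative parameter grid rather than our actual dataset or search space: 5 outer folds produce the out-of-sample results, an inner grid search selects hyperparameters on each training split, and class weighting addresses the imbalance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic imbalanced data standing in for the IPO features and labels.
X, y = make_classification(n_samples=300, weights=[0.25, 0.75], random_state=0)

# Inner loop: cross-validated grid search on each training split.
inner = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="f1_macro",
    cv=3,
)

# Outer loop: 5 folds, each serving once as the held-out test set.
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(inner, X, y, cv=outer, scoring="f1_macro")
print(scores.mean(), scores.std())  # reported as "mean ± standard deviation"
```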

Results
The results from our detailed numerical study are provided below. We first compare the predictive performances of a large number of regression and classification models for IPO underpricing. Then, we examine the model interpretability results. All implementations are done in Python using the Scikit-learn library.

Performance Comparison
The performances of the various regression and classification models are provided in Table 2. In Panel A, tree-based models, on average, outperform the others in regression, followed by MLP. Specifically, CatB demonstrates the best performance, with the lowest prediction errors (MSE: 0.065 and MAE: 0.152) and the highest coefficient of determination (R²) of 0.323. The superior performance of CatB can be attributed to its robustness in handling outliers compared to other gradient boosting algorithms. The weak performance of the linear models suggests that the relationships between the variables are not linear. In the classification of underpriced and overpriced IPOs, LGBM demonstrates, on average, the highest Weighted F1 score compared to the other models. In terms of Macro F1, the linear models exhibit competitive results, particularly RIDGE, which shows the highest value of 0.557. Note that in this table, each linear classifier refers to the LR model with the corresponding penalty (e.g., the LASSO classifier is LR with an ℓ1-regularization penalty).
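The gap between Weighted and Macro F1 under class imbalance can be illustrated with a small synthetic example (the labels below are illustrative, not our models' predictions): a majority-biased classifier scores well on Weighted F1, while Macro F1 exposes its weakness on the minority "Overpriced" class.

```python
from sklearn.metrics import f1_score

# Roughly 76% "Underpriced" (label 1), as in our dataset.
y_true = [1] * 76 + [0] * 24
# A classifier biased toward the majority class: it catches all
# underpriced IPOs but misses most overpriced ones.
y_pred = [1] * 76 + [1] * 18 + [0] * 6

# Weighted F1 averages per-class F1 weighted by class support;
# Macro F1 averages the two classes equally.
print(f1_score(y_true, y_pred, average="weighted"))
print(f1_score(y_true, y_pred, average="macro"))
```

Because Macro F1 gives the minority class equal weight, it is the more informative of the two metrics for the "Overpriced" class discussed next.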
We select the best-performing models in the regression and classification tasks and evaluate their class-specific performance in Panel B of Table 2. The CatB model proves to be more effective at accurately predicting negative mispricing values for overpriced IPOs. This is attributed to the fewer outliers on the negative side of the initial return distribution. However, the model exhibits a low R² value, indicating a limited correlation between the factors used in this study and overpricing. In the classification problem, the LGBM algorithm demonstrates proficiency in capturing underpriced IPOs while showing lower performance values for the "Overpriced" class. This discrepancy suggests that data imbalance is a significant factor contributing to the inconsistent performance across the classes.

Interpreting Model Predictions
In this section, we provide an analysis of the relationships among the variables using the entire IPO sample. Figure 1 shows the results obtained from the SHAP method for the regression and classification models in the left and right panels, respectively. The two plots are consistent about the importance of "Offer price" for the underpricing values and for the "Underpriced" class predictions, despite its ranking as sixth in [11]. The positive association between "Offer price" and initial return seems counterintuitive, weakening the main argument that relates abnormally high initial returns to low offer prices. Our finding suggests that a higher offer price signals to the market that the company is more confident about its prospects and creates larger price appreciation potential in investors' view. This leads to increased investor demand and, as a result, higher initial returns. "Assets" (similar to [11] and in contrast to [10]) and "RET" are the next most important factors in both plots. The positive relationship between "RET" and underpricing supports the previous theories that firm insiders, having more insight into the firm's potential than outside investors, utilize equity retention as a means of conveying information about the firm's value to outside investors. Consequently, the high returns generated by IPOs with retained shares are more reflective of aftermarket overvaluation than of deliberate premarket underpricing. "Assets" can be considered a measure of size and, consequently, a risk factor, and it is found to have the strongest inverse relationship with underpricing in both plots of Figure 1. This corroborates the fact that smaller IPOs tend to be riskier than larger ones and need to offer their shares at a discount to tempt investors. "Leverage" is another risk factor that shows a clear negative association with underpricing. Among industry sectors, we find that "Computers" and "Scientific instrument" have a higher probability of positive returns. This is explained by the demand for these sectors, where investors are willing to pay a premium for a portion of the earnings of unique business models. Thus, the more positive mispricing in these sectors is attributed to the high demand on the first day rather than to underpricing in the premarket. On the other hand, IPOs in "Chemical products" have higher valuations due to higher earnings potential, leading to lower underpricing. For the market conditions, we observe that the 'hot' and 'cold' demand-for-capital cycles are the most important, with positive and negative effects on underpricing, respectively. The increased demand for capital among private firms during 'hot' periods results in a higher inflow of firms entering the IPO market. During these waves of IPOs, there is a high degree of variation in firm quality, exacerbating the asymmetric information problem and underpricing during these cycles.

Conclusion
In this paper, we investigate IPO underpricing using a comprehensive set of ML techniques and adopt an interpretative approach to uncover the most important variables and their impacts on underpricing predictions. Our findings indicate that tree-based models outperform the other models in both the regression and classification tasks. Moreover, we validate the theoretical implications of the predictions by analyzing the outputs of the models, thereby increasing the robustness and practicality of the novel methods in the context of the IPO markets. Adopting an interpretative framework enables investors, institutions, and the financial supervisors of companies that are preparing to go public to anticipate the extent of underpricing before the start of trading. As a result, the overall process can be better managed to ensure a successful IPO. Future research directions for this work include extending our analysis to feature-rich proprietary datasets, incorporating textual data from relevant stock news, and applying other interpretability methods such as omission and LIME.

Figure 1. Contribution of variables to the predictions by CatB (left) and LGBM (right).

Table 1. Variable definitions and summary statistics.

Table 2. Regression and classification results. The values are averaged over 5 folds and are reported as "mean ± standard deviation".