# A Study on the Insolvency Prediction Model for Korean Shipping Companies

## Abstract

To develop an insolvency prediction model for shipping companies, we sampled shipping companies that closed between 2005 and 2023. Each closed company was paired with a normal company of similar asset size. In total, data on 84 companies were obtained: 42 closed companies and 42 normal companies. These data were randomly divided into a training set (2/3 of the data) and a test set (1/3 of the data). The training data were used to develop the model and the test data to measure its accuracy. The prediction model for Korean shipping insolvency was developed using financial ratio variables frequently used in previous studies. First, the LASSO technique was used to reduce the 24 independent variables to 9 main variables. Next, insolvent companies were coded as 1 and normal companies as 0, and logistic regression, LDA, and QDA models were fitted. The test accuracy was 82.14% for the QDA model, 78.57% for the logistic regression model, and 75.00% for the LDA model. In addition, ‘Current ratio’, ‘Interest expenses to sales’, ‘Total assets turnover’, and ‘Operating income to sales’ were identified as major variables affecting corporate insolvency.

**Keywords:** Korean shipping companies; insolvency prediction; financial ratio analysis; logistic regression; LDA; QDA

## 1. Introduction

According to the Korean Shipowner’s Association(2023), Korea's degree of dependence on foreign trade is 63.51%, and 99% of import and export cargo is transported by ship. The Association also notes that the shipping industry plays an important role in the Korean economy because 100% of Korea's raw materials, such as crude oil, steel, and coal fuel, are transported by sea. According to the Bank of Korea(2023), the total foreign exchange earnings of the service industry were about $131.6 billion, of which shipping services accounted for about $48.3 billion, or approximately 36.7% of the total. The Korean merchant fleet stood at 97.14 million DWT as of 2022, making Korea the 6th largest shipowning country in the world with a global market share of 4.3%(UNCTAD, 2023). The Korean shipping industry has also led the development of forward and backward linked industries such as ports, shipbuilding, and finance. Additionally, the shipping industry transports national strategic materials in the event of a national emergency such as war.

Given this important role, risk management is essential for the shipping industry to secure its competitiveness. In particular, shipping is a capital-intensive industry, so it reacts sensitively to environmental changes and carries high risks. Therefore, identifying the variables that affect the failure of shipping companies and developing an insolvency prediction model will help shipping companies manage their risk.

Since Beaver(1966) and Altman(1968) developed corporate insolvency prediction models using financial ratios, much research has been conducted on corporate insolvency prediction. This study reviews that research and develops an insolvency prediction model for shipping companies using financial ratio variables. In particular, there are few recent studies on predicting the insolvency of Korean shipping companies, and this study attempts to fill that gap. We select companies that became insolvent from 2005 to 2023 and develop a model to predict the insolvency of shipping companies, using normal companies as paired samples. This paper proceeds as follows. Section 2 reviews previous research on corporate insolvency prediction models. Section 3 presents the research design, Section 4 performs the empirical analysis, and Section 5 summarizes the conclusions and implications of this study.

## 2. Literature Review

Beaver(1966) used univariate discriminant analysis to predict corporate insolvency and selected variables such as Net profit/Total assets, Current assets/Current liabilities, Cash flow/Current liabilities, and Total assets/Borrowings as major financial variables. Altman(1968) applied multivariate discriminant analysis to build a failure prediction model. Net profit/Total assets, Operating profit/Total capital, Net working capital/Total capital, Equity capital/Current liabilities, and Sales/Total capital were selected as major financial variables. Ohlson(1980) developed a corporate insolvency prediction model by applying logit analysis. Net profit/Total assets, Net working capital/Total capital, Current assets/Current liabilities, Equity capital/Current liabilities, and Total assets/Borrowings are included in the model.

Since then, various studies have sought to predict corporate insolvency. Many of them develop insolvency prediction models for listed companies or for specific industries.

Several papers develop corporate insolvency prediction models targeting KOSPI or KOSDAQ listed companies(Kang and Hong, 1999; Song, 1999; Jang, 2000; Kim et al., 2001; Kim et al., 2003; Cho and Ryu, 2007; Cho and Kang, 2007; Park and Kang, 2009; Jeon et al., 2011; Lee, 2023). In this case, the analysis includes market-related variables in addition to general financial ratio variables.

Many papers take an industry-specific approach to predicting corporate insolvency. There are studies focusing on the information and communication technology industry(Lee and Park, 2001), the land passenger transportation industry(Jeong and Choi, 2006), technologically innovative small and medium-sized enterprises(Nam, 2008; Lee and Yu, 2022), the construction industry(Yu et al., 2009), and the restaurant industry(Kim, 2019).

Studies on corporate insolvency prediction models for the shipping companies are as follows.

Haider et al.(2019) developed a corporate insolvency prediction model targeting global listed shipping companies. Their sample of 40 companies consisted of 20 companies that failed between 2007 and 2014, each matched with a normal company of similar asset size. They treated delisted companies as insolvent and analyzed the relationship between corporate failure and financial risk using a binary logit model. The estimated coefficient of Total Debt/Total Assets was significant at the 5% level and had a positive effect on corporate insolvency.

Kim and Lee(2016) selected 32 companies that closed down between 2000 and 2014 and 32 normal companies, and fitted insolvency prediction models. Financial statements were collected from the Financial Supervisory Service's Data Analysis, Retrieval and Transfer System (DART), 22 financial ratios were selected, and a logistic regression model and a linear discriminant analysis model were applied. Current ratio, Interest expenses to sales, Operating income to sales, and Growth rate of equity were found to be significant variables. Using data from one year before business closure, the test accuracy was higher for the logit model (75%) than for the linear discriminant model (70%).

Park et al.(2021) used the logit model to predict the default risk of Korean shipping companies. The data were the financial data of Korean shipping companies from 2001 to 2019 provided by the Korean Shipowner’s Association. More than 30 candidate variables were considered and then reduced. Ultimately, the Ratio of net income to total assets, Ratio of operating income to sales, Ratio of interest expenses to liabilities, Current ratio, Debt ratio, Ratio of freight income to chartering cost, and Ratio of chartering cost to sales were found to be significant for the insolvency of shipping companies.

Park and Oh(2022) examined the difference in cash flow between insolvent companies and normal companies. They found a difference in operating cash flow between normal and insolvent companies. They also divided the bankruptcies into two groups based on when the bankruptcy occurred. Analyzing cash flow for the three years before insolvency, they found a difference in cash flow from investment and financing activities between the two groups depending on the time of insolvency.

Kwon and Park(2022) attempted to develop a failure prediction model for Korean shipping companies using five machine learning models. They used Korean shipping company data from 2000 to 2019. They compared the forecasting power using financial ratios known to have the ability to predict shipping companies' insolvency. As a result of comparing each model, Lasso and Logit showed better performance than other technologies, suggesting that the failure prediction ability can be improved by applying machine learning technology.

Kwon and Park(2023) analyzed the impact of macroeconomic shocks on the financial stability of Korean shipping companies using a stress testing framework. They measured risk with Altman's K-score and performed scenario analysis using a panel regression model. As a result, they found that Korean shipping companies were vulnerable to GDP, BDI and oil price shocks.

In summary, previous studies on predicting the insolvency of shipping companies have used financial variables, macroeconomic variables, and shipping market variables as independent variables affecting corporate failure. The prediction models used include LDA, the binary logit model, the panel regression model, and machine learning techniques.

## 3. Methodology ^{1)}

### 3.1 Shrinkage for variables

LASSO(least absolute shrinkage and selection operator) is a regression method that performs both variable selection and regularization to improve the predictive accuracy and interpretability of statistical models. With LASSO, the regression coefficients of uninformative variables among the p independent variables are set to exactly 0, so unnecessary variables are removed from the regression model. The objective is to minimize the sum of the RSS and the LASSO penalty:

$$\min_{\beta}\left\{ RSS + \lambda \sum_{j=1}^{p} |\beta_{j}| \right\} \qquad (1)$$

Here, *RSS* is the residual sum of squares of the multiple regression model with p predictors, λ∑|β_{j}| is the LASSO penalty, and λ≥0 is a tuning parameter. When λ=0, the penalty term has no effect and LASSO regression produces the least squares estimates. As λ→∞, the impact of the shrinkage penalty grows and the LASSO coefficient estimates all shrink to zero.
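The shrinkage behaviour described above can be illustrated with a minimal sketch. It minimizes RSS plus the LASSO penalty by cyclic coordinate descent with soft-thresholding, which is one common way to solve this problem and is not necessarily the procedure used in this study (the study applies a LASSO-penalized logistic regression; the squared-error form is used here because it matches the objective described in this section). The function names and the toy data are hypothetical, and the predictors are assumed to be centred so that no intercept is needed.

```python
def soft_threshold(rho, t):
    """Soft-thresholding operator: shrink rho toward zero by t."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise RSS + lam * sum(|beta_j|) by cyclic coordinate descent.

    X is a list of rows (assumed centred, no intercept); y is a list of
    responses. This is an illustrative sketch, not a production solver.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                pred_without_j = sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            # The penalty lam*|beta_j| on an un-halved RSS gives threshold lam/2.
            beta[j] = soft_threshold(rho, lam / 2.0) / z
    return beta


# Hypothetical toy data: y depends on the first column only (y = 2 * x1).
X = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.5], [1.0, -1.5], [2.0, 1.0]]
y = [-4.0, -2.0, 0.0, 2.0, 4.0]

beta_ols = lasso_coordinate_descent(X, y, lam=0.0)     # least squares fit
beta_lasso = lasso_coordinate_descent(X, y, lam=4.0)   # moderate shrinkage
beta_big = lasso_coordinate_descent(X, y, lam=1000.0)  # everything set to 0
```

With λ=0 the sketch returns the least squares coefficients, a moderate λ shrinks the informative coefficient while keeping the irrelevant one at zero, and a very large λ sets every coefficient to zero.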

### 3.2 Logistic regression

Logistic regression models the probability of bankruptcy directly. In this study, the model is fitted to the closed companies and their paired sample of normal companies of similar size. The logistic function is expressed as:

$$p(X) = \frac{e^{\beta_{0} + \beta_{1}X_{1} + \cdots + \beta_{n}X_{n}}}{1 + e^{\beta_{0} + \beta_{1}X_{1} + \cdots + \beta_{n}X_{n}}} \qquad (2)$$

where X=(X_{1}, X_{2}, ⋯ , X_{n}) are n predictors and p(X) is the probability of bankruptcy. A company is predicted as Default if the fitted probability p(X)>0.5 and as Normal otherwise. Equation (2) can be rewritten as:

$$\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_{0} + \beta_{1}X_{1} + \cdots + \beta_{n}X_{n} \qquad (3)$$

The left-hand side is called the log odds (logit), and the right-hand side is linear in X.
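As a small numerical illustration of the relationship between the logistic function and the log odds, the sketch below evaluates both for one observation. The coefficient values and predictor values are hypothetical and are not estimates from this study; the 0.5 cut-off follows the classification rule described above.

```python
import math


def bankruptcy_probability(x, intercept, coefs):
    """Logistic function: p(X) = e^z / (1 + e^z), z = b0 + sum(b_j * x_j)."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Log odds (logit) of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def classify(p, threshold=0.5):
    """Return 1 (insolvent) if p exceeds the threshold, else 0 (normal)."""
    return 1 if p > threshold else 0


# Hypothetical coefficients and predictor values, for illustration only.
intercept = 0.5
coefs = [-1.0, 2.0]
x = [0.5, 0.5]  # linear predictor: 0.5 - 0.5 + 1.0 = 1.0

p = bankruptcy_probability(x, intercept, coefs)
label = classify(p)
```

Applying `log_odds` to the resulting probability recovers the linear predictor, which is what it means for the log odds to be linear in X.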

### 3.3 LDA(Linear Discriminant Analysis)

LDA estimates P(Y=k|X=x) through Bayes’ theorem, using the distribution of X given each class Y, that is, P(X|Y). With multiple independent variables, LDA assumes that X=(X_{1}, X_{2}, ⋯ , X_{n}) follows a multivariate normal distribution within each group, with a group-specific mean vector and a covariance matrix common to all groups.

Let f_{k}(x)=P(X=(x_{1}, x_{2}, ⋯ , x_{n}) | Y=k) denote the density of X in the kth group, and assume that X~N(μ_{k}, Σ) in that group. π_{k} represents the prior probability that a randomly chosen observation comes from the kth group, and μ_{k} is the mean vector of the kth group. The posterior probability used by the Bayes classifier then follows from Bayes' theorem:

$$P(Y=k \mid X=x) = \frac{\pi_{k} f_{k}(x)}{\sum_{l=1}^{K} \pi_{l} f_{l}(x)} \qquad (4)$$

Taking logarithms and dropping the terms that are common to all K groups, an observation is assigned to the group for which the discriminant function

$$\delta_{k}(x) = x^{T}\Sigma^{-1}\mu_{k} - \frac{1}{2}\mu_{k}^{T}\Sigma^{-1}\mu_{k} + \log \pi_{k} \qquad (5)$$

is largest. Because δ_{k}(x) is a linear function of x, the method is called LDA (linear discriminant analysis).

### 3.4 QDA(Quadratic Discriminant Analysis)

Like LDA, QDA assumes a multivariate normal distribution for P(X=(x_{1}, x_{2}, … , x_{n}) | Y=k)=f_{k}(x) and performs classification using Bayes’ theorem. However, QDA allows each group k to have its own covariance matrix, that is, X~N(μ_{k}, Σ_{k}). The terms that were dropped from the LDA discriminant function because they were identical across groups, namely the quadratic term in x and the log-determinant of the covariance matrix, now differ by group and must be retained. The discriminant function becomes:

$$\delta_{k}(x) = -\frac{1}{2}(x-\mu_{k})^{T}\Sigma_{k}^{-1}(x-\mu_{k}) - \frac{1}{2}\log|\Sigma_{k}| + \log \pi_{k} \qquad (6)$$

An observation is assigned to the group for which δ_{k}(x) is largest. Because this discriminant function is quadratic rather than linear in x, the method is called QDA.
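To make the contrast between the two discriminant rules concrete, the following sketch evaluates both in the one-predictor special case, where the covariance matrix reduces to a single variance. This simplification, the function names, and all parameter values are assumptions made for illustration only; they are not quantities estimated in this study.

```python
import math


def lda_score(x, mean_k, var_common, prior_k):
    """One-predictor LDA discriminant: linear in x, shared variance."""
    return (
        x * mean_k / var_common
        - mean_k ** 2 / (2.0 * var_common)
        + math.log(prior_k)
    )


def qda_score(x, mean_k, var_k, prior_k):
    """One-predictor QDA discriminant: quadratic in x, group variance."""
    return (
        -((x - mean_k) ** 2) / (2.0 * var_k)
        - 0.5 * math.log(var_k)
        + math.log(prior_k)
    )


def predict(x, score_fn, groups):
    """Return the group label whose discriminant score is largest."""
    return max(groups, key=lambda label: score_fn(x, *groups[label]))


# Hypothetical parameters: label -> (mean, variance, prior).
# 0 = normal company, 1 = insolvent company.
lda_groups = {0: (0.0, 2.5, 0.5), 1: (2.0, 2.5, 0.5)}  # pooled variance
qda_groups = {0: (0.0, 1.0, 0.5), 1: (2.0, 4.0, 0.5)}  # separate variances

lda_far_left = predict(-3.0, lda_score, lda_groups)
qda_far_left = predict(-3.0, qda_score, qda_groups)
lda_right = predict(1.5, lda_score, lda_groups)
qda_right = predict(1.5, qda_score, qda_groups)
```

With equal priors the LDA rule reduces to a single cut-off midway between the two group means, so an observation far to the left is assigned to the normal group. The QDA rule assigns the same observation to the insolvent group because that group's larger variance makes extreme values more plausible, which is the kind of non-linear boundary that separate covariance structures allow.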

In this study, we fit insolvency prediction models for Korean shipping companies using three representative classification methods: logistic regression, LDA, and QDA.

## 4. Empirical Analysis

### 4.1 Data

The Financial Supervisory Service's Electronic Disclosure System(DART) provides the overall status and financial information of companies. Because DART also contains information on companies that are no longer in business, for reasons such as mergers and acquisitions, business closure, or a change of company, the process of classifying the companies subject to empirical analysis is important.

For this study, we sampled shipping companies that closed between 2005 and 2023 and for which financial data were available in DART. Each closed company was paired with a normal company of similar asset size. Financial data were collected for one and two years prior to business closure. The sample consisted of a total of 84 companies: 42 closed companies and 42 normal companies.

These data were randomly divided into a training set (2/3 of data) and a testing set (1/3 of data). Training data were used to develop the model while test data were used to measure the accuracy of the model.
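The random split described above can be sketched as follows. The paper does not state the random seed or whether the split was stratified by insolvency status, so the seed and the simple unstratified shuffle below are assumptions, and the company identifiers are placeholders.

```python
import random


def split_train_test(items, train_fraction=2 / 3, seed=0):
    """Shuffle the items and split them into training and test sets."""
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers for the 42 closed and 42 normal companies.
company_ids = list(range(84))
train_ids, test_ids = split_train_test(company_ids)
```

With 84 companies, a 2/3 to 1/3 split yields 56 training observations and 28 test observations.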

### 4.2 Variables

Table 1 shows the independent variables constructed through previous research.

In addition to the company's size, sales, and business history used in previous studies, a total of 24 financial ratio variables representing major financial indicators such as liquidity, stability, activity, profitability, and growth were used as independent variables(Haider et al., 2019; Kim and Lee, 2016; Park and Oh, 2022; Kwon and Park, 2023). The dependent variable is whether insolvency occurred, and the independent variables are the financial ratios measured one year before the company's insolvency.

### 4.3 Selection of the variables

The results of selecting key variables that affect corporate insolvency using financial data from one year before the occurrence of corporate insolvency are as follows. According to LASSO logistic regression, Assets, Current ratio, Non-current assets to equity and non-current liabilities, Interest expenses to sales, Total assets turnover, Net income to sales, Operating income to sales, Growth rate of total assets, and Growth rate of equity were selected as important variables.

### 4.4 Results of analysis

#### 1) Logistic regression

The results of fitting the logistic regression model are shown in Table 2 below.

Among the 9 independent variables, those that affect the insolvency of Korean shipping companies at the 5% significance level are Current ratio, Interest expenses to sales, Total assets turnover, and Operating income to sales. The lower the Current ratio and Operating income to sales, and the higher the Interest expenses to sales and Total assets turnover, the higher the probability of corporate insolvency. Unlike previous studies, this study finds that a higher total assets turnover is associated with a higher probability of bankruptcy. This can be interpreted to mean that, even when operating activity is strong, companies bearing a heavy burden of financial costs such as interest expenses have a high possibility of failure.

The accuracy of the model on the training data was 89.29%; the confusion matrix is shown in Table 3.

The accuracy of the model on the test data was 78.57%; the confusion matrix is shown in Table 4.

#### 2) LDA

Tables 5 and 6 show the confusion matrices of the training and test data for LDA. The accuracy on the training data is 87.50% and the accuracy on the test data is 75.00%.

#### 3) QDA

Tables 7 and 8 present the confusion matrices of the training and test data for QDA. The accuracy on the training data is 91.07% and the accuracy on the test data is 82.14%.

In this study, logistic regression, LDA, and QDA were applied as classification models to predict the insolvency of Korean shipping companies. As shown in Table 9, QDA showed the highest test accuracy at 82.14%, followed by logistic regression at 78.57% and LDA at 75.00%.
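Accuracy here is the share of correctly classified companies, that is, the sum of the diagonal of the confusion matrix divided by the number of observations. The sketch below checks the reported test accuracies under the assumption that the test set holds 28 companies (one third of 84). The counts of correct classifications are not reported in the text; they are the integer counts out of 28 that reproduce the reported percentages and should be read as implied values rather than figures taken from the tables.

```python
def accuracy_pct(n_correct, n_total):
    """Classification accuracy in percent, rounded to two decimals."""
    return round(100.0 * n_correct / n_total, 2)


N_TEST = 28  # assumption: one third of the 84 sampled companies

# Implied numbers of correctly classified test companies (not reported).
implied_correct = {"QDA": 23, "Logistic regression": 22, "LDA": 21}

# Test accuracies reported in the text.
reported_pct = {"QDA": 82.14, "Logistic regression": 78.57, "LDA": 75.00}

computed_pct = {
    model: accuracy_pct(n_correct, N_TEST)
    for model, n_correct in implied_correct.items()
}
```

Under this assumption, the gap between adjacent models corresponds to a single test company classified differently, which is worth keeping in mind when comparing the three accuracies.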

Among the five models that Kwon and Park(2022) compared for predicting the insolvency of shipping companies, the logit model showed the highest accuracy, at 78.35%. The logistic regression in this study achieved a similar accuracy (78.57%). Park et al.(2021) reported a prediction accuracy of 73.71% for the logit model that includes only financial variables, and 75.6% and 81.98% for the models that consider both financial and market variables.

## 5. Conclusion

In this study, a prediction model for Korean shipping insolvency was developed using financial ratio variables frequently used in previous studies. First, the LASSO method was used to reduce the 24 independent variables to 9 main variables. Next, logistic regression, LDA, and QDA models were fitted with insolvent companies coded as 1 and normal companies as 0. The test accuracy was highest for QDA (82.14%), followed by logistic regression (78.57%) and LDA (75.00%).

QDA drops the assumption of a common covariance structure that LDA imposes. In other words, it is the discriminant analysis used when the covariance structures of the groups differ. The better performance of QDA compared with LDA in this study suggests that the insolvent and normal groups have different covariance structures.

From the perspective of risk management, the prediction accuracy of the insolvency prediction model is important, but it is equally important to infer and interpret which variables have a significant impact on insolvency. The interpretation of the variables in the logistic regression model therefore matters. In this analysis, Current ratio, Interest expenses to sales, Total assets turnover, and Operating income to sales were identified as major variables that affect corporate insolvency. If shipping companies and their stakeholders monitor and manage these variables, it will help manage risk not only for individual shipping companies but also for the shipping industry as a whole. In other words, this study on predicting the insolvency of shipping companies is expected to be useful as a risk management tool for various stakeholders.

Nevertheless, this study has several limitations. As models with improved prediction and inference are developed, there is a growing need to build models using a wider range of methods. In addition, future research should identify further variables, such as macroeconomic and market variables, and include them in the model. In this study, the sample was constructed by selecting bankrupt companies and then matching each with a normal company, so the numbers of normal and bankrupt companies were the same in each period, and market variables were therefore not included. Moreover, market variables are not available from DART, which made it difficult to reflect them in this study. This is an area that needs improvement in the future.

## Notes

^{1)} This part was written with reference to James et al.(2021).