Establishment of an ESG Index for Shipping Companies Using Big Data Analysis
Abstract
This study aims to assess the ESG index of listed Korean shipping companies utilizing text mining techniques. ESG assessment methods greatly differ among institutions, resulting in inconsistent evaluations for the same company. To enhance ESG measurement reliability, this study benchmarks the MSCI ESG Leaders Index. The analysis utilizes structured data from ESG reports and applies text mining techniques. We employ MSCI’s classification framework to calculate ESG scores in the categories of Environment (E), Social (S), and Governance (G). The findings suggest that shipping companies generally have lower ESG scores than other industries, particularly in the environmental and social categories, highlighting areas for potential improvement. Nevertheless, governance performance aligns with that seen in other sectors. Moreover, although the ESG performance in the shipping industry has gradually improved, there remains a need for more robust strategies for sustainable management. This research enhances ESG evaluation methodologies and underscores the necessity for shipping companies to fortify their ESG strategies, including reducing carbon emissions, adopting eco-friendly vessels, and improving labor conditions.
1. Introduction
ESG scores have gained significant attention both socially and academically, yet challenges persist in constructing reliable ESG indices. The Korea Corporate Governance Service, Sustinvest, and Daishin Economic Research Institute measure and release ESG scores for listed Korean companies. One major issue is the absence of a universally accepted, consistent, and standardized evaluation methodology. As a result, the same company may receive contrasting ESG ratings from different assessment organizations. A notable example is the electric vehicle manufacturer Tesla, which received the lowest ESG rating from the UK-based FTSE (Financial Times Stock Exchange Index) due to the high carbon emissions associated with its vehicle production process. In contrast, MSCI (Morgan Stanley Capital International) assigned Tesla the highest ESG score, focusing on the environmental benefits of electric vehicles.
ESG scores from MSCI are widely utilized by foreign investors to evaluate investment opportunities. However, MSCI discloses ESG scores for fewer than ten Korean companies, which makes it difficult to obtain comprehensive ESG ratings for the Korean market. Assessing the relative ESG performance of Korean firms on a global scale therefore remains challenging. This lack of extensive ESG disclosure for Korean companies presents a barrier for foreign investors seeking to allocate their assets in the Korean stock market.
This study measures the ESG scores of Korean listed firms including shipping companies by benchmarking the MSCI ESG Leaders Index, one of the most influential ESG indices. Originally introduced in 1999, MSCI’s ESG index assesses approximately 2,800 companies worldwide using publicly available corporate data, government databases, and macroeconomic indicators. To develop an efficient and reliable ESG index, we adopt MSCI’s methodology, which evaluates ESG performance based on voluntary ESG disclosures through big data analysis.
The purpose of this study is to measure and establish an ESG index for shipping companies using text mining techniques. Big data analysis with text mining techniques in financial markets was first introduced by Kim and Joh (2019). Studies on ESG controversies in Korean listed companies include Bang and Ryu (2022), but their analysis is limited to unstructured text data. It remains difficult to find prior research that utilizes structured text data for ESG evaluation. Seo (2013) highlights the necessity of utilizing big data, and Kim (2014) develops a financial product recommendation system using big data technology. Yang (2019) emphasizes the need for big data utilization in building an account monitoring system. Despite the growing attention to big data applications, studies on their use in ESG assessment remain limited.
Notably, studies on ESG in the shipping industry remain extremely limited. This lack of attention may itself be one reason why shipping companies tend to engage in ESG activities less actively than firms in other industries. Hong (2024) argues that shipping companies demonstrate a relatively passive approach to ESG initiatives, particularly in the Environmental and Social dimensions. Despite the shipping industry’s crucial role in the global supply chain, it exhibits structural vulnerabilities concerning environmental responsibilities (e.g., carbon emissions, marine pollution, and ship waste management) and social responsibilities (e.g., seafarer working conditions, human rights protection, and compliance with safety regulations). As global ESG regulations continue to tighten, shipping companies must enhance their commitment not only to governance but also to environmental and social responsibilities.
ESG-related research on the broader shipping, logistics, and port industries has also gained increasing attention. Existing studies, such as Lee and Lee (2022), have employed text mining techniques to identify key ESG-related terms in logistics firms but have primarily focused on mapping corporate priorities rather than establishing a systematic ESG evaluation framework. Similarly, Kim (2023) examines the impact of ESG management on the corporate performance of shipping and logistics companies, focusing on both liner and tramp shipping firms as well as integrated logistics companies. However, the study relies on survey-based ESG measurement rather than developing concrete ESG indicators, which limits its methodological approach. Sohn (2022) explores ESG management strategies to strengthen competitiveness in shipping and logistics; however, the study is limited to reporting the current status of ESG management practices in shipping and logistics companies without proposing a structured evaluation methodology. Seo (2023) also examines ESG strategies for port authorities, but the study remains limited to describing current ESG practices rather than developing an actionable framework for implementation. These studies highlight the significant challenge of objectively evaluating and measuring ESG value. Since ESG covers non-financial environmental, social, and governance factors, quantifying them through objective numerical metrics is inherently difficult. This indicates the necessity of further research on constructing a robust ESG index to enhance the accuracy and reliability of ESG evaluation.
This study addresses these limitations by constructing an ESG index tailored to the shipping industry, incorporating the MSCI ESG Leaders Index methodology to ensure consistency and reliability in assessment. By leveraging text mining techniques and structured data analysis, our approach not only provides a standardized framework for evaluating ESG performance but also enhances the comparability of ESG scores across firms. This study focuses on five shipping companies, representing a diverse cross-section of the industry, including firms specializing in different types of maritime transport. This contribution is particularly significant given the absence of universally accepted ESG evaluation methodologies in the shipping, logistics, and port industries, and it offers a benchmark for ESG performance assessment within the shipping sector.
2. Data and Methodology
2.1 Sample data
This study conducts a big data analysis to measure ESG scores based on ESG reports from publicly listed companies. We obtain ESG reports from 89 companies listed on the Korea Exchange (KRX) over a three-year period, from 2022 to 2024. In particular, we focus on shipping companies operating container ships, bulk carriers, and tankers. To evaluate the ESG performance of shipping companies, five major firms are selected for analysis. These include one firm operating both container ships and tankers, one specializing in container shipping, two operating bulk carriers, and one focused on tanker operations (Kim, 2024).
Table 1 presents the distribution of sample firms used in this study, categorized by industry. A total of 89 publicly listed companies were analyzed, with the largest proportion belonging to the manufacturing sector (44.9%), followed by service (12.4%), finance (9%), and construction (9%). The transportation & warehousing sector, which includes shipping companies, accounts for 7.9% of the sample.
2.2 Methodology
We conduct a big data analysis to measure ESG pillars and establish an ESG index. Big data analytics has been widely applied across various domains, including climate forecasting, financial risk assessment, healthcare diagnostics, and market trend analysis, demonstrating its ability to extract meaningful insights from large-scale datasets (Moreno and Caminero, 2022; Gupta et al., 2020; Moradi and Mokhatab Rafiei, 2019).
Among the various big data techniques, text mining is particularly effective for analyzing ESG-related data, as it enables the identification of patterns and relationships within vast amounts of textual information. The primary function of text mining is to extract and structure meaningful information from unstructured data, allowing for deeper interpretation and hypothesis generation (Gaikwad et al., 2014; Kiriu and Nozaki, 2020; Inzalkar and Sharma, 2015; Buehlmaier and Whited, 2018; Li, 2010). These developments enhance the ability to analyze ESG disclosures, corporate sustainability reports, and regulatory filings, ensuring a more systematic and data-driven approach to ESG index construction.
Table 2 presents the ESG classification framework provided by MSCI. MSCI categorizes ESG factors into three pillars (Environmental, Social, and Governance), 10 sub-pillars, and 35 key issues. The 10 sub-pillars are mid-level themes: climate change, natural resources, pollution and waste, environmental opportunities, human capital, product liability, stakeholder conflicts, social opportunities, corporate governance, and corporate behavior. Together, these themes are further divided into the 35 key issues, which form the basis for the ESG evaluation.
We establish an ESG index through a detailed text mining process, structured as follows: First, we use a web crawler to systematically collect ESG reports from each company's official website. Subsequently, Intelligent Document Processing (IDP; Mohammadshirazi, 2024) is utilized to extract textual data from the ESG reports provided in PDF format. After extraction, we perform noise removal to eliminate headers, footers, page numbers, and other non-content elements that might affect analysis quality.
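The noise-removal step can be illustrated with a minimal sketch. The paper does not specify the rules used, so the heuristics below are assumptions: a line is treated as noise if it looks like a bare page number, or if the same line recurs on a large share of pages (a typical sign of a running header or footer). The function name remove_noise, the regular expression, and the repeat_ratio threshold are hypothetical choices for illustration only.

```python
import re
from collections import Counter

# Assumed pattern for bare page numbers such as "12", "Page 12" or "12 / 80".
PAGE_NUMBER = re.compile(r"^\s*(page\s*)?\d+(\s*/\s*\d+)?\s*$", re.IGNORECASE)


def remove_noise(pages, repeat_ratio=0.6):
    """Drop page numbers and lines repeated across pages.

    pages: list of strings, one per extracted PDF page.
    repeat_ratio: assumed share of pages above which a repeated line
                  is treated as a header or footer.
    """
    split_pages = [
        [ln.strip() for ln in page.splitlines() if ln.strip()]
        for page in pages
    ]

    # Count on how many distinct pages each line appears.
    pages_per_line = Counter()
    for lines in split_pages:
        pages_per_line.update(set(lines))

    # Require at least two occurrences so short documents are not emptied.
    threshold = max(2, repeat_ratio * len(split_pages))

    cleaned = []
    for lines in split_pages:
        kept = [
            ln for ln in lines
            if not PAGE_NUMBER.match(ln) and pages_per_line[ln] < threshold
        ]
        cleaned.append("\n".join(kept))
    return cleaned


sample_pages = [
    "ACME ESG Report 2023\nWe reduced emissions.\n1",
    "ACME ESG Report 2023\nSeafarer safety improved.\n2",
    "ACME ESG Report 2023\nBoard independence.\n3",
]
cleaned_pages = remove_noise(sample_pages)
```

In this toy input the running header appears on every page and is dropped together with the page numbers, leaving only the body sentences. A real pipeline would likely need report-specific rules in addition to these two.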
The extracted text is then processed through morphological analysis using the Korean NLP library (KoNLPy; Park & Cho, 2014), specifically employing the Mecab tokenizer for its accuracy with domain-specific terminology. For English text segments commonly found in Korean companies' reports, we apply separate English tokenization. Due to the unique nature of the Korean language—where words can appear differently depending on post-positions and endings—both morpheme-based N-grams (ranging from 1-gram to 4-gram) and word-based N-grams (also from 1-gram to 4-gram) are generated, filtering out stopwords and non-meaningful combinations. The frequency of occurrence for each N-gram is then calculated.
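As a rough illustration of the N-gram counting step, the sketch below takes an already tokenized report and counts every 1-gram to 4-gram. It assumes tokenization has been done upstream (for Korean text, for example, by a morphological analyzer such as the Mecab tagger in KoNLPy), and it assumes one particular filtering rule: any N-gram containing a stopword is discarded. The stopword list and the function name ngram_counts are hypothetical; the paper does not state its exact filtering criteria.

```python
from collections import Counter

# Illustrative stopword list (assumption); a real list would be much longer
# and would include Korean particles and endings.
STOPWORDS = {"and", "the", "of"}


def ngram_counts(tokens, n_min=1, n_max=4, stopwords=STOPWORDS):
    """Count all n-grams with n_min <= n <= n_max in a token sequence.

    N-grams that contain at least one stopword are skipped.
    Returns a Counter keyed by tuples of tokens.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if any(tok in stopwords for tok in gram):
                continue
            counts[gram] += 1
    return counts


tokens = ["carbon", "emission", "reduction", "and", "carbon", "emission"]
counts = ngram_counts(tokens)
```

Here the bigram ("carbon", "emission") is counted twice, while any N-gram spanning the stopword "and" is dropped. Running the same function once on morphemes and once on whitespace-separated words would give the two N-gram families described above.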
The text mining process can be mathematically formulated as follows, where we calculate ESG scores based on the frequency of ESG-related terms and their respective weights:
1. Term Frequency Calculation:
freq(t,d) = frequency of term t in document d
2. Document-specific ESG Category Score:
ESG_score(d,c) = ∑(t∈C) freq(t,d) × w(t,c)
where:
- d is the ESG report being analyzed
- c is the ESG category (Environment, Social, Governance)
- C is the set of all terms related to category c
- w(t,c) is the industry-specific term weight (detailed later)
3. Final ESG Score:
ESG_final(company) = normalize(∑(c∈{E,S,G}) ESG_score(d,c) × w(c))
where:
- w(c) is the category weight
- normalize() is a score normalization function (0-10 scale)
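The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: terms are plain strings, the term weights and category weights are made-up values, and normalize() is assumed to be min-max scaling across companies onto the 0-10 range, since the text does not define it. All function names are hypothetical.

```python
def category_score(freq, term_weights):
    """ESG_score(d, c): sum of freq(t, d) * w(t, c) over the terms of category c."""
    return sum(freq.get(term, 0) * w for term, w in term_weights.items())


def raw_esg(freq, weights_by_category, category_weights):
    """Weighted sum of the E, S and G category scores before normalization."""
    return sum(
        category_score(freq, weights_by_category[c]) * category_weights[c]
        for c in ("E", "S", "G")
    )


def normalize_0_10(raw_scores):
    """Assumed normalization: min-max scaling across companies to 0-10."""
    lo, hi = min(raw_scores.values()), max(raw_scores.values())
    if hi == lo:
        # All companies identical: place them at the midpoint.
        return {name: 5.0 for name in raw_scores}
    return {name: 10.0 * (v - lo) / (hi - lo) for name, v in raw_scores.items()}


# Illustrative weights w(t, c) and w(c); not taken from the paper's tables.
weights_by_category = {
    "E": {"carbon emissions": 3, "green finance": 1},
    "S": {"labor standards": 3},
    "G": {"corporate ethics": 3},
}
category_weights = {"E": 1.0, "S": 1.0, "G": 1.0}

# Made-up term frequencies freq(t, d) for three hypothetical reports.
reports = {
    "A": {"carbon emissions": 4, "labor standards": 2, "corporate ethics": 1},
    "B": {"carbon emissions": 1, "green finance": 2, "corporate ethics": 1},
    "C": {"labor standards": 1},
}

raw = {
    name: raw_esg(freq, weights_by_category, category_weights)
    for name, freq in reports.items()
}
final = normalize_0_10(raw)
```

With these toy inputs the raw scores are 21, 8, and 3, so company A maps to 10 and company C to 0. A different normalization (for example, scaling against fixed bounds) would change the final values but not the ranking.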
MSCI assigns weights to pillars of ESG by considering three key elements: (1) the materiality of key issues, (2) their expected realization period (short-term, medium-term, long-term), and (3) industry-specific adjustments. These factors ensure that the weighting process accurately reflects both the financial significance and urgency of ESG issues across industries.
First, MSCI evaluates the materiality of ESG issues by assessing their impact on a company’s financial performance. ESG pillars with greater materiality receive higher weights, as they pose more significant financial risks or opportunities for businesses. Second, the expected realization period of ESG issues is considered, categorizing them as short-term, medium-term, or long-term based on their urgency. Issues with immediate financial implications (short-term) are assigned higher weights, while those with longer realization periods receive lower weights, reflecting their gradual influence on corporate sustainability. Third, industry-specific adjustments are applied to capture sectoral differences in ESG priorities. For example, carbon emissions and climate change are weighted more heavily in the energy and manufacturing sectors, whereas privacy and data security are prioritized in the financial and IT sectors, and responsible lending in the financial sector.
By integrating these three dimensions (materiality, expected realization period, and industry-specific adjustments), MSCI ensures that ESG assessments account for both sector-specific risks and the urgency of ESG issues, resulting in a more comprehensive and financially relevant weighting approach. The weighting system is structured into three categories, shown in Table 3: Highest (High-Short-Term), High (High-Medium-Term), and Medium (Medium-Long-Term). Each category is assigned a numerical weight, with Highest receiving a weight of 3, High a weight of 2, and Medium a weight of 1.
The Highest (High-Short-Term) category includes ESG factors that pose immediate risks or opportunities with a substantial industry-wide impact. These issues have a high probability of occurring in the short term and require prompt corporate responses. For instance, carbon emissions, labor standards, data security, product safety, and corporate ethics fall into this category, as they demand urgent attention and have significant implications for business operations.
The High (High-Medium-Term) category covers ESG factors that present substantial mid-term risks, requiring continuous monitoring and strategic management. These issues are not as immediate as those in the Highest category, but they still require proactive efforts from companies to mitigate potential challenges over time.
The Medium (Medium-Long-Term) category consists of ESG factors that represent long-term risks, gradually increasing in impact over time. Although their immediate risk level is low, they may become more critical due to regulatory changes or market trends. Examples include green finance, sustainable investment, healthcare accessibility, supply chain ethics, and human capital development. These factors may not pose urgent threats today, but they are expected to gain importance as global sustainability standards evolve.
This weighting system ensures that ESG factors are assessed based on their immediacy and overall significance, allowing for a more structured and industry-specific evaluation.
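To make the tier-to-weight mapping concrete, the following sketch encodes the three tiers with the numerical weights stated above (3, 2, 1) and applies them to issue mention counts. The tier assignments use only the example issues named in the text; the actual industry-specific assignments from Table 3 are not reproduced here, and the names TIER_WEIGHT, ISSUE_TIER, and weighted_mentions are hypothetical.

```python
# Numerical weights per tier, as stated in the text.
TIER_WEIGHT = {"Highest": 3, "High": 2, "Medium": 1}

# Illustrative tier assignments based on the examples given in the text.
# In practice the tier of an issue depends on the industry (Table 3).
ISSUE_TIER = {
    "carbon emissions": "Highest",
    "labor standards": "Highest",
    "data security": "Highest",
    "green finance": "Medium",
    "supply chain ethics": "Medium",
}


def issue_weight(issue):
    """Look up the numerical weight of a key issue via its tier."""
    return TIER_WEIGHT[ISSUE_TIER[issue]]


def weighted_mentions(mentions):
    """Multiply each issue's mention count by its tier weight."""
    return {issue: n * issue_weight(issue) for issue, n in mentions.items()}


# Made-up mention counts for one hypothetical report.
example = weighted_mentions({"carbon emissions": 5, "green finance": 4})
```

In this example five mentions of carbon emissions contribute 15 points while four mentions of green finance contribute 4, which shows how a short-term, high-materiality issue dominates the weighted total.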
3. Results
This study applies weighted adjustments to ESG factors based on the materiality of key issues, their expected realization period (short-term, medium-term, long-term), and industry-specific adjustments. The scores for each pillar are estimated by industry using this framework. Table 4 compares the ESG scores before and after applying these adjustments. Initially, the scores are 3.03 for Environment, 3.85 for Social, and 2.08 for Governance. After incorporating industry-specific weightings, they increase to 6.73, 5.07, and 5.94, respectively. These changes reflect the varying importance of ESG issues across industries, ensuring that sector-specific priorities are more accurately represented.
Table 5 presents the ESG scores by industry after applying the weighted adjustments. The telecommunications sector recorded the highest ESG scores, with 8.42 for Environment, 9.65 for Social, and 9.28 for Governance. This can be attributed to the sector’s strong emphasis on data security, network sustainability, and regulatory compliance. In contrast, the transportation and warehousing sector had relatively lower ESG scores, with 3.74 for Environment, 4.87 for Social, and 4.95 for Governance. These lower scores indicate that the sector has been less proactive in addressing carbon emissions, labor conditions, and supply chain management. The construction and manufacturing sectors demonstrated relatively high scores in the environmental and social categories, reflecting the stringent ESG requirements related to energy consumption, environmental impact, and workplace safety in these industries.
Table 6 presents the average ESG scores for the shipping industry, a subcategory of the transportation and warehousing sector. The scores for the shipping industry are 3.80 for Environment, 4.13 for Social, and 6.91 for Governance. Compared to other industries, the Governance score is relatively high, while the Environment and Social scores remain low. This suggests that while corporate governance in the shipping industry is relatively well-managed, there is significant room for improvement in environmental and social responsibility, particularly in areas such as carbon emissions, labor conditions, and sustainable practices.
Table 7 illustrates the changes in the average ESG scores for the shipping industry from 2022 to 2024. The Environment (E) score gradually improved from 3.03 in 2022 to 4.65 in 2024, likely reflecting efforts to reduce carbon emissions and adopt eco-friendly fuels. The Social (S) score declined slightly from 4.55 in 2022 to 4.35 in 2023, before increasing to 5.71 in 2024. This improvement may be attributed to enhanced labor conditions for seafarers and stricter safety regulations. The Governance (G) score decreased from 5.01 in 2022 to 4.85 in 2023, then showed a slight rebound to 4.98 in 2024. This fluctuation suggests that corporate transparency policies and ESG-driven management strategies have had some influence over time.
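The year-over-year movements described above can be checked with simple arithmetic. The sketch below uses only the scores quoted in the text for Table 7; the 2023 Environment score is not quoted, so that series has a single 2022-to-2024 step. The function name yearly_changes is hypothetical.

```python
# Shipping-industry average scores as quoted in the text for Table 7.
# The 2023 Environment value is not stated in the text, so it is omitted.
SCORES = {
    "E": {2022: 3.03, 2024: 4.65},
    "S": {2022: 4.55, 2023: 4.35, 2024: 5.71},
    "G": {2022: 5.01, 2023: 4.85, 2024: 4.98},
}


def yearly_changes(series):
    """Return the change between consecutive available years, rounded to 2 dp."""
    years = sorted(series)
    return {
        (prev, curr): round(series[curr] - series[prev], 2)
        for prev, curr in zip(years, years[1:])
    }


changes = {pillar: yearly_changes(series) for pillar, series in SCORES.items()}
```

The resulting differences are +1.62 for E over the two years, -0.20 then +1.36 for S, and -0.16 then +0.13 for G, consistent with the description of a dip in 2023 followed by a recovery in the Social and Governance scores.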
This study confirms that the ESG scores of the transportation and warehousing sector, particularly the shipping industry, are relatively low compared to other industries. In particular, improvements are needed in the Environment (E) and Social (S) categories, emphasizing the necessity of ESG strategies such as reducing carbon emissions, adopting eco-friendly vessels, and improving labor conditions.
An analysis of the year-over-year trends indicates gradual improvements in the Environment (E) and Social (S) dimensions, while the Governance (G) dimension exhibits volatility. This suggests that the ongoing development of ESG regulations and corporate ESG management strategies is shaping industry trends. To enhance ESG performance in the shipping industry, it is essential to continuously monitor ESG levels, promote the adoption of green technologies, and strengthen corporate social responsibility policies.
4. Conclusion
This study utilized big data and text mining techniques to measure ESG scores and analyze ESG performance across various industries, including shipping companies. The findings indicate that the ESG scores of shipping companies are relatively lower than those of other industries, with particularly weak performance in the Environment and Social dimensions. Analyzing year-over-year trends, the E and S scores showed gradual improvement, whereas the Governance dimension exhibited volatility. This suggests that ESG regulations and corporate ESG management strategies are continuously evolving. Furthermore, industry-level ESG performance comparisons reveal that the telecommunications and manufacturing sectors achieved high ESG scores, while the transportation and warehousing sector, particularly the shipping industry, recorded relatively lower scores. This reflects the unique challenges of the shipping industry, such as high carbon emissions, supply chain risks, and a lack of robust social responsibility and sustainability strategies.
This study offers several key contributions. First, this study provides a comprehensive analysis of ESG performance in shipping companies, offering insights into areas that require improvement. Previous research has shown that shipping companies are less engaged in ESG activities compared to non-shipping firms, and this study confirms similar findings. The results highlight the need for shipping companies to adopt carbon neutrality policies, eco-friendly fuels, and improved labor conditions to enhance their ESG performance.
Second, this study introduces industry-specific ESG weighting adjustments, presenting a more realistic ESG evaluation framework. Existing ESG assessments often lack consistency, with the same company receiving different ESG ratings from various institutions. By incorporating industry-specific ESG issue prioritization, this study proposes a fairer and more objective ESG evaluation system.
Future research should explore the long-term impact of ESG scores on corporate financial performance and sustainability, further investigating whether ESG strategies contribute to long-term value creation for firms.
Notes
Acknowledgement
This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2022S1A5B5A16056785).