年報文字基礎溝通價值與公司信用評等預測:機器學習模型之應用
Other Title
Text-Based Communicative Value of Annual Reports and Corporate Credit Rating Predictions: Using Machine Learning Models
Journal
證券市場發展季刊
Journal Volume
36
Journal Issue
2
Start Page
159
End Page
206
ISSN
1023-280X
Date Issued
2024-06
Author(s)
Abstract
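In the prior literature on credit rating prediction, most studies use statistical models to classify and predict credit ratings, with financial characteristic variables as the main source of explained variation. However, according to Standard & Poor's credit rating criteria, a corporate credit rating is obtained by first setting an initial value based on the firm's business risk and financial risk and then adjusting it for multiple non-quantitative factors, so financial variables alone cannot capture the full picture of credit rating information. Unlike the prior literature, this study builds on machine learning models (random forest, support vector machine, extreme gradient boosting, and logistic regression) and, taking the existing financial accounting characteristic variables as the benchmark, additionally introduces text-based communicative value variables of annual reports (e.g., readability and semantics) to capture the non-quantitative adjustment factors considered by credit rating agencies, particularly those relating to incomplete accounting information. This study analyzes U.S. corporate credit rating data from 1994 to 2017. The empirical results show that after the text-based communicative value variables of annual reports are added, model predictive performance (e.g., F1 score) improves to some degree, with the random forest and extreme gradient boosting models performing best overall (F1 scores reach 0.76 to 0.77, an incremental improvement of about 6%). This indicates that the text-based communicative value information of annual reports has additional explanatory power for credit ratings beyond that of traditional financial variables. In addition, this study finds that this information further reduces the rate at which non-investment-grade firms are misclassified as investment-grade firms; that is, it has higher predictive power for the rating classification of non-investment-grade firms. Therefore, this study verifies that the text-based communicative value variables of annual reports capture the information content of the non-quantitative factor adjustments made by credit rating agencies.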
Unlike the previous literature, this study uses U.S. corporate credit rating data from 1994 to 2017 to examine whether adding text-based communicative value (TCV) variables of annual reports (e.g., readability and tone) improves credit rating predictions from machine learning models that otherwise take only financial characteristic variables as inputs. The empirical results show that adding the TCV variables improves model predictive performance to some degree, with the random forest and XGBoost models performing best overall (F1 scores reach 0.76-0.77, an increase of about 6%). This indicates that the TCV information of annual reports has additional explanatory power for credit ratings beyond that of traditional financial variables. In addition, this study finds that the TCV information of annual reports further reduces the rate at which non-investment-grade firms are misclassified as investment-grade firms. That is, TCV information has greater predictive power for non-investment-grade firms. Therefore, this study verifies that the TCV information of annual reports captures the information content of the non-quantitative factor adjustments made by credit rating agencies.
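The two evaluation quantities named in the abstract, the F1 score and the rate at which non-investment-grade firms are misclassified as investment grade, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the encoding (1 = investment grade, 0 = non-investment grade), the choice of investment grade as the positive class, and the toy labels, which are hypothetical and are not the study's data or results.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def misjudged_as_investment_rate(
    y_true: Sequence[int], y_pred: Sequence[int], investment: int = 1
) -> float:
    """Share of truly non-investment-grade firms predicted as investment grade."""
    preds_for_non_inv = [p for t, p in zip(y_true, y_pred) if t != investment]
    if not preds_for_non_inv:
        return 0.0
    misjudged = sum(1 for p in preds_for_non_inv if p == investment)
    return misjudged / len(preds_for_non_inv)


# Hypothetical toy labels (assumed encoding: 1 = investment grade, 0 = non-investment grade).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predictions from a model using financial variables only.
baseline_pred = [1, 1, 1, 0, 1, 1, 0, 0]
# Hypothetical predictions after adding text-based variables.
augmented_pred = [1, 1, 1, 0, 1, 0, 0, 0]

for name, pred in [("baseline", baseline_pred), ("augmented", augmented_pred)]:
    print(name, round(f1_score(y_true, pred), 3), misjudged_as_investment_rate(y_true, pred))
```

In this toy case the augmented predictions fix one non-investment-grade firm that the baseline labeled as investment grade, which raises F1 and lowers the misclassification rate together. It only shows how the two quantities are computed and says nothing about the magnitudes reported in the study.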
Subjects
年報文字基礎溝通價值
信用評等
機器學習
不完全資訊
Text-based communicative value of annual reports
Credit rating
Machine learning
Incomplete information
Type
journal article
