Abstract
Financial distress prediction helps companies take timely corrective measures, and it helps banks and investors avoid potential risk. Identifying potentially distressed companies in a fast-changing business environment is crucial for timely intervention, but it is also very challenging: in real-world scenarios, only a few companies have been assessed as financially distressed, while most remain unlabeled. Traditional supervised and unsupervised learning are therefore poorly suited to this setting. Because the problem can be viewed as anomaly detection with partially observed anomalies, this paper proposes a semi-supervised learning framework that combines a PU-learning (positive-unlabeled) method with an unsupervised method and an improved feature selection procedure. The proposed system makes full use of the limited observed data while remaining robust to unknown, novel anomalies. It outperforms traditional supervised and unsupervised methods as well as some other semi-supervised methods. The framework also provides explanations for the detected anomalous companies, which are useful for further analysis.
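The idea of scoring unlabeled companies with both a PU-style signal and an unsupervised signal can be illustrated with a small toy sketch. This is not the paper's system: the centroid-based PU score, the median/MAD outlier score, the equal weighting, the function names, and the data are all assumptions chosen for brevity, and the feature selection and explanation steps are omitted.

```python
import math
import statistics

def minmax(values):
    """Rescale scores to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def centroid(rows):
    """Feature-wise mean of a list of rows."""
    return [statistics.fmean(list(col)) for col in zip(*rows)]

def unsupervised_scores(X):
    """Mean absolute robust z-score per row (median/MAD); higher = more unusual."""
    cols = [list(c) for c in zip(*X)]
    med = [statistics.median(c) for c in cols]
    # Fall back to 1.0 when the MAD is zero to avoid division by zero.
    mad = [statistics.median([abs(v - m) for v in c]) or 1.0
           for c, m in zip(cols, med)]
    return [statistics.fmean([abs(v - m) / s for v, m, s in zip(row, med, mad)])
            for row in X]

def pu_scores(X, labeled_idx):
    """Toy PU score: closer to the known positives than to the unlabeled mass."""
    labeled = set(labeled_idx)
    pos = [X[i] for i in labeled]
    unl = [x for i, x in enumerate(X) if i not in labeled]
    c_pos, c_unl = centroid(pos), centroid(unl)
    return [math.dist(x, c_unl) - math.dist(x, c_pos) for x in X]

def rank_unlabeled(X, labeled_idx, top_k, weight=0.5):
    """Rank unlabeled rows by a weighted mix of PU and unsupervised scores."""
    p = minmax(pu_scores(X, labeled_idx))
    u = minmax(unsupervised_scores(X))
    combined = [weight * pi + (1 - weight) * ui for pi, ui in zip(p, u)]
    labeled = set(labeled_idx)
    candidates = [i for i in range(len(X)) if i not in labeled]
    candidates.sort(key=lambda i: combined[i], reverse=True)
    return candidates[:top_k]

# Hypothetical two-feature data: rows 0-5 look normal, rows 6-7 are the
# observed distressed companies, row 8 is an unobserved distressed company,
# and row 9 is a novel anomaly unlike the observed ones.
X = [
    [1.0, 1.0], [1.1, 0.9], [0.9, 1.2], [1.2, 1.1], [0.8, 0.9], [1.0, 1.1],
    [5.0, 5.0], [5.2, 4.8],
    [4.9, 5.1],
    [-4.0, 1.0],
]
labeled = [6, 7]

top = rank_unlabeled(X, labeled, top_k=2)
print(top)
```

On this toy data, the two top-ranked unlabeled rows are the unobserved distressed company (row 8) and the novel anomaly (row 9). The PU score alone gives row 9 no advantage over the normal rows, which is the motivation for mixing in an unsupervised score.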
Sponsored by the Key Laboratory of Applied Mathematics of Fujian Province University (Putian University) (No. SX202102).
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhu, X., Liu, F., Niu, Z. (2021). Financial Distress Detection and Interpretation with Semi-supervised System. In: Huang, DS., Jo, KH., Li, J., Gribova, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2021. Lecture Notes in Computer Science, vol 12837. Springer, Cham. https://doi.org/10.1007/978-3-030-84529-2_28
DOI: https://doi.org/10.1007/978-3-030-84529-2_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84528-5
Online ISBN: 978-3-030-84529-2
eBook Packages: Computer Science, Computer Science (R0)