Automated Data Mapping Based on FastText and LSTM for Business Systems

Liu, Zhibin; Hu, Huijun

doi:10.1007/978-3-031-23585-6_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13734)

Included in the following conference series:

International Conference on Cognitive Computing

Abstract

With the continuous development of information technology, processing massive amounts of information has become an important problem in business systems. However, metadata from different business systems lacks a unified, standardized description, and mapping data manually is highly inefficient. An automated data mapping method is therefore necessary. In this paper, we treat data mapping as a text classification problem for two reasons: 1) text classification has become increasingly mature in natural language processing (NLP) and is well suited to processing massive data; 2) a large amount of heterogeneous mapping data can be treated as text. To automate data mapping, we propose a classification model based on FastText and long short-term memory (LSTM) for business systems. Based on the observed characteristics of mapping data in business systems, we first use FastText to learn word representations that carry semantic information, and then use an LSTM model to automatically extract features for text classification. Experimental results show that the proposed method can automatically classify mapping data in business systems with acceptable quality.
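The two-stage idea in the abstract (subword-aware word vectors feeding an LSTM classifier) can be illustrated with a minimal, untrained, pure-Python forward pass. This is a sketch under stated assumptions and not the authors' implementation: the dimensions, the bucket count, the CRC32 hash (standing in for FastText's own hashing), the averaging of n-gram vectors only, and the target field names in `LABELS` are all hypothetical choices made for illustration.

```python
import math
import random
import zlib

DIM = 8         # embedding size (illustrative assumption)
BUCKETS = 1000  # hashed n-gram buckets (illustrative assumption)
HIDDEN = 6      # LSTM hidden size (illustrative assumption)


def char_ngrams(word, n_min=3, n_max=5):
    """FastText-style character n-grams of a word wrapped in < and >."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(wrapped) - n + 1)]


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; a trained model would learn these values."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def word_vector(word, table):
    """Average hashed n-gram vectors, so unseen field names still get a vector."""
    grams = char_ngrams(word) or [word]
    rows = [table[zlib.crc32(g.encode("utf-8")) % len(table)] for g in grams]
    return [sum(col) / len(rows) for col in zip(*rows)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, params):
    """One LSTM step; every gate reads the concatenation [x, h]."""
    z = x + h

    def gate(name, act):
        weights, bias = params[name]
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(weights, bias)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def init_model(num_classes, seed=0):
    """Randomly initialised parameters (no training is performed here)."""
    rng = random.Random(seed)
    lstm = {name: (make_matrix(HIDDEN, DIM + HIDDEN, rng), [0.0] * HIDDEN)
            for name in "ifog"}
    return {"ngrams": make_matrix(BUCKETS, DIM, rng),
            "lstm": lstm,
            "out": (make_matrix(num_classes, HIDDEN, rng), [0.0] * num_classes)}


def classify(text, model):
    """Run tokens through the LSTM and return softmax class probabilities."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for token in text.lower().split():
        h, c = lstm_step(word_vector(token, model["ngrams"]), h, c, model["lstm"])
    weights, bias = model["out"]
    logits = [sum(w * v for w, v in zip(row, h)) + b
              for row, b in zip(weights, bias)]
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical target fields that source metadata would be mapped to.
LABELS = ["customer_name", "order_amount", "ship_date"]
model = init_model(len(LABELS))
probs = classify("cust nm", model)
```

Because the weights are random, `probs` carries no meaning until the parameters are trained on labelled mapping data. The sketch only shows how a source field name such as "cust nm" would flow from character n-grams through the recurrent layer to a distribution over target fields.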

Acknowledgements

This paper is supported by the Shenzhen Development and Reform Commission subject (XMHT20200105010).

Author information

Authors and Affiliations

Kingdee Research, Kingdee International Software Group Company Limited, Shenzhen, China
Zhibin Liu & Huijun Hu

Corresponding author

Correspondence to Zhibin Liu.

Editor information

Editors and Affiliations

Tsinghua University, Shenzhen, Tsinghua, China
Yujiu Yang
University of Science and Technology Beijing, Beijing, China
Xiaohui Wang
Kingdee International Software Group Co.,Ltd, Shenzhen, China
Liang-Jie Zhang

About this paper

Cite this paper

Liu, Z., Hu, H. (2022). Automated Data Mapping Based on FastText and LSTM for Business Systems. In: Yang, Y., Wang, X., Zhang, LJ. (eds) Cognitive Computing – ICCC 2022. ICCC 2022. Lecture Notes in Computer Science, vol 13734. Springer, Cham. https://doi.org/10.1007/978-3-031-23585-6_7

DOI: https://doi.org/10.1007/978-3-031-23585-6_7
Published: 01 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23584-9
Online ISBN: 978-3-031-23585-6
eBook Packages: Computer Science, Computer Science (R0)
