
Automated Data Mapping Based on FastText and LSTM for Business Systems

  • Conference paper
  • In: Cognitive Computing – ICCC 2022 (ICCC 2022)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13734)


Abstract

With the continuous development of information technology, processing massive amounts of information has become an important problem in business systems. However, the metadata from different business systems lacks a unified, standardized description, and mapping data manually greatly reduces efficiency, so an automated data mapping method is necessary. In this paper, we treat data mapping as a text classification problem for two reasons: 1) text classification technology in natural language processing (NLP) has become increasingly mature and is well suited to processing massive data; 2) a large amount of heterogeneous mapping data can be treated as text. To implement automated data mapping, we propose a classification model based on FastText and long short-term memory (LSTM) networks for data mapping in business systems. Based on the observed characteristics of mapping data in business systems, we first use FastText to learn word representations that carry semantic information, and then adopt an LSTM model to automatically extract features for text classification. Experimental results show that the proposed method can automatically classify mapping data of common quality in business systems.
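As a rough illustration of the pipeline outlined in the abstract, the sketch below first learns subword-aware word vectors with FastText (via gensim) and then feeds the embedded token sequences into a small LSTM classifier (via PyTorch). The toy corpus, mapping categories, and all hyperparameters are illustrative assumptions, not the authors' actual data or configuration.

```python
# Minimal sketch of the FastText + LSTM pipeline outlined in the abstract.
# The corpus, labels, and hyperparameters below are illustrative placeholders,
# not the authors' data or settings.
import torch
import torch.nn as nn
from gensim.models import FastText

# Toy tokenized metadata descriptions and hypothetical mapping categories.
corpus = [
    ["customer", "id", "number"],
    ["order", "creation", "date"],
    ["customer", "full", "name"],
]
labels = torch.tensor([0, 1, 2])

# 1) Learn subword-aware word representations with FastText.
ft = FastText(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=20)

def embed(tokens, max_len=8):
    """Look up FastText vectors, then pad/truncate to a fixed sequence length."""
    vecs = [ft.wv[t].tolist() for t in tokens][:max_len]
    vecs += [[0.0] * ft.vector_size] * (max_len - len(vecs))
    return torch.tensor(vecs, dtype=torch.float32)

x = torch.stack([embed(tokens) for tokens in corpus])  # shape: (batch, seq, dim)

# 2) Classify the embedded sequences with an LSTM.
class LSTMClassifier(nn.Module):
    def __init__(self, emb_dim, hidden_dim, n_classes):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, n_classes)

    def forward(self, x):
        _, (h, _) = self.lstm(x)   # final hidden state summarizes the field description
        return self.fc(h[-1])      # logits over mapping categories

model = LSTMClassifier(emb_dim=50, hidden_dim=32, n_classes=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):               # short training loop on the toy data
    optimizer.zero_grad()
    loss = loss_fn(model(x), labels)
    loss.backward()
    optimizer.step()

print(model(x).argmax(dim=1))      # predicted category for each metadata field
```

In the paper's setting, the FastText vectors would be trained on the metadata corpus of the business systems rather than on a three-example toy set, and the label set would correspond to the target mapping categories.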



Acknowledgements

This work was supported by a project of the Shenzhen Development and Reform Commission (XMHT20200105010).

Author information


Corresponding author

Correspondence to Zhibin Liu.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Liu, Z., Hu, H. (2022). Automated Data Mapping Based on FastText and LSTM for Business Systems. In: Yang, Y., Wang, X., Zhang, LJ. (eds) Cognitive Computing – ICCC 2022. ICCC 2022. Lecture Notes in Computer Science, vol 13734. Springer, Cham. https://doi.org/10.1007/978-3-031-23585-6_7


  • DOI: https://doi.org/10.1007/978-3-031-23585-6_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23584-9

  • Online ISBN: 978-3-031-23585-6

  • eBook Packages: Computer Science, Computer Science (R0)
