
Deep learning model for unstructured knowledge classification using structural features

Original Article · Personal and Ubiquitous Computing

Abstract

Automatic text classification is widely used as a basic method for analyzing data. While classification methods such as the support vector machine (SVM) have shown impressive performance in this area, the recent use of deep learning has led to considerable progress in text classification. This study proposes a deep learning–based classification model, DEEP-I, to classify national research and development information characterized by complex structural features, large amounts of text, and a large-scale set of classes. Beyond the word–sentence structure of a simple document, the number of stacked layers in the deep model is increased to reflect the higher-level structure of the document's items. Experiments on 180,000 data records and a classification scheme of 366 classes showed that the proposed model improves classification performance by 22.7% over the traditional SVM and by 15.7% over a deep learning model that uses only word–sentence structural features. This improvement was achieved by applying a multi-layered stacking method that deepens the network to five to ten times the depth of a conventional deep learning model while effectively combining the features of heterogeneous items. The proposed model is also applicable to other datasets containing documents with complex structures.
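The abstract does not specify DEEP-I's exact architecture, so the following PyTorch sketch only illustrates the general idea it describes: stacking word-, sentence-, and item-level encoders and concatenating the vectors of heterogeneous items (e.g., title, abstract, expected effect) before a large softmax classifier. The field count, layer sizes, class count of 366, and the use of CNN encoders are illustrative assumptions, not the published model.

```python
# Minimal sketch of a hierarchically "stacked" document classifier, assuming
# CNN encoders at each level: word -> sentence -> item (field) -> document.
# All hyperparameters and field choices below are hypothetical.
import torch
import torch.nn as nn


class SentenceEncoder(nn.Module):
    """Encodes one sentence (a sequence of word ids) into a fixed-size vector with a 1-D CNN."""
    def __init__(self, vocab_size, emb_dim=128, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1)

    def forward(self, word_ids):                    # (batch, words)
        x = self.embed(word_ids).transpose(1, 2)    # (batch, emb, words)
        x = torch.relu(self.conv(x))                # (batch, hidden, words)
        return x.max(dim=2).values                  # max-pool over words -> (batch, hidden)


class ItemEncoder(nn.Module):
    """Stacks sentence vectors of one document field (e.g., title or abstract) into an item vector."""
    def __init__(self, vocab_size, hidden=128):
        super().__init__()
        self.sent = SentenceEncoder(vocab_size, hidden=hidden)
        self.conv = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)

    def forward(self, sent_ids):                    # (batch, sentences, words)
        b, s, w = sent_ids.shape
        sv = self.sent(sent_ids.reshape(b * s, w)).reshape(b, s, -1)
        x = torch.relu(self.conv(sv.transpose(1, 2)))
        return x.max(dim=2).values                  # max-pool over sentences -> (batch, hidden)


class StackedDocumentClassifier(nn.Module):
    """Concatenates the vectors of heterogeneous items and predicts one of `n_classes` labels."""
    def __init__(self, vocab_size, n_items=3, hidden=128, n_classes=366):
        super().__init__()
        self.items = nn.ModuleList(ItemEncoder(vocab_size, hidden) for _ in range(n_items))
        self.classifier = nn.Sequential(
            nn.Linear(n_items * hidden, hidden), nn.ReLU(), nn.Linear(hidden, n_classes)
        )

    def forward(self, item_batches):                # list of (batch, sentences, words) tensors
        feats = [enc(x) for enc, x in zip(self.items, item_batches)]
        return self.classifier(torch.cat(feats, dim=1))


if __name__ == "__main__":
    model = StackedDocumentClassifier(vocab_size=30000)
    # Three hypothetical items, each padded to 6 sentences of 20 words for a batch of 4 documents.
    items = [torch.randint(1, 30000, (4, 6, 20)) for _ in range(3)]
    print(model(items).shape)                       # torch.Size([4, 366])
```

The extra item-level stage is what multiplies the effective depth relative to a word–sentence model, and the concatenation step is one plausible way to combine features from heterogeneous items, as the abstract describes.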



Acknowledgments

This research reconstructs data from the 2018 PhD dissertation of WonKyun Joo. It was supported by the projects "Development of Data-driven Solution for Social Issues" and "National Science and Technology Information Service" of the Korea Institute of Science and Technology Information (KISTI).

Author information


Corresponding author

Correspondence to Young-Kuk Kim.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Joo, W., Choi, K. & Kim, YK. Deep learning model for unstructured knowledge classification using structural features. Pers Ubiquit Comput 26, 247–258 (2022). https://doi.org/10.1007/s00779-019-01244-x

DOI: https://doi.org/10.1007/s00779-019-01244-x