Stochastic gradient-CAViaR-based deep belief network for text categorization

Srilakshmi, V.; Anuradha, K.; Shoba Bindu, C.

doi:10.1007/s12065-020-00449-x

Research Paper
Published: 12 July 2020

Volume 14, pages 1727–1741, (2021)

Evolutionary Intelligence

V. Srilakshmi¹,
K. Anuradha² &
C. Shoba Bindu¹

Abstract

Text categorization is the process of assigning tags to text according to its content. Its applications include document organization, spam email filtering, and news grouping. This paper introduces a stochastic gradient-CAViaR-based deep belief network (SG-CAV-based DBN) for text categorization. The proposed approach involves four steps: pre-processing, feature extraction, feature selection, and text categorization. First, the input data is pre-processed by stemming and stop-word removal, and features are then extracted using a vector space model. Next, feature selection is carried out based on entropy. The selected features are then passed to the text categorization step, which uses the proposed SG-CAV-based DBN. The DBN is trained with the proposed SG-CAV algorithm, which is designed by combining conditional autoregressive value at risk (CAViaR) and stochastic gradient descent. The performance of the SG-CAV-based DBN is evaluated using recall, precision, F-measure, and accuracy, and is compared with that of existing methods: Naive Bayes, K-nearest neighbours, support vector machine, and a deep belief network (DBN). The analysis shows that the proposed SG-CAV-based DBN achieves a maximal precision of 0.78, a maximal recall of 0.78, a maximal F-measure of 0.78, and a maximal accuracy of 0.95. Among the existing methods, DBN achieves the highest precision, recall, F-measure, and accuracy on both the 20 Newsgroup database and the Reuter database. The precision, recall, F-measure, and accuracy of the proposed system are 10.98%, 11.54%, 11.538%, and 18.33% higher, respectively, than those of the DBN on the 20 Newsgroup database, and 2.38%, 2.38%, 2.37%, and 0.21% higher, respectively, on the Reuter database.
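The first three steps named in the abstract (pre-processing, vector space model, entropy-based feature selection) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the stop-word list, the crude suffix stripper standing in for a real stemmer, the use of raw term frequencies as the vector space model, and the specific entropy score (class entropy of each term's occurrences, lower meaning more class-specific). The abstract does not give the paper's exact formulas, and the SG-CAV training of the DBN is not sketched here because its update rule is not stated in the abstract.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on", "for"}


def naive_stem(token):
    """Crude suffix stripper standing in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def vector_space_model(docs):
    """Return (vocabulary, term-frequency matrix) for tokenized documents."""
    vocab = sorted({t for doc in docs for t in doc})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for term, count in Counter(doc).items():
            row[index[term]] = count
        matrix.append(row)
    return vocab, matrix


def class_entropy(column, labels):
    """Shannon entropy (in bits) of how a term's occurrences spread over classes."""
    mass = Counter()
    for count, label in zip(column, labels):
        mass[label] += count
    total = sum(mass.values())
    if total == 0:
        return 0.0
    probs = [m / total for m in mass.values() if m > 0]
    return -sum(p * math.log2(p) for p in probs)


def select_features(vocab, matrix, labels, k):
    """Keep the k terms with the lowest class entropy (hypothetical criterion)."""
    scores = []
    for j, term in enumerate(vocab):
        column = [row[j] for row in matrix]
        scores.append((class_entropy(column, labels), term))
    scores.sort()
    return [term for _, term in scores[:k]]
```

A possible usage: call `preprocess` on each raw document, pass the token lists to `vector_space_model`, then call `select_features` with the class labels to obtain the reduced vocabulary whose columns would be fed to the classifier. A low-entropy criterion also favours very rare terms, so a practical version would likely add a minimum-frequency threshold.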

Author information

Authors and Affiliations

CSE, JNTUA, Anantapur, India
V. Srilakshmi & C. Shoba Bindu
CSE, GRIET, Hyderabad, India
K. Anuradha

Corresponding author

Correspondence to V. Srilakshmi.

About this article

Cite this article

Srilakshmi, V., Anuradha, K. & Shoba Bindu, C. Stochastic gradient-CAViaR-based deep belief network for text categorization. Evol. Intel. 14, 1727–1741 (2021). https://doi.org/10.1007/s12065-020-00449-x

Received: 25 November 2019
Revised: 24 March 2020
Accepted: 02 July 2020
Published: 12 July 2020
Issue Date: December 2021
DOI: https://doi.org/10.1007/s12065-020-00449-x
