Maximum entropy model for mobile text classification in cloud computing using improved information gain algorithm

Yin, Chunyong; Xi, Jinwen

doi:10.1007/s11042-016-3545-5

Published: 20 April 2016

Volume 76, pages 16875–16891 (2017)

Multimedia Tools and Applications

Chunyong Yin¹ &
Jinwen Xi¹

Abstract

With the rapid spread of the Internet and of multimedia as a new mode of information transmission, people can not only easily obtain the information they want but also publish their own information to the world. At the same time, the introduction of tablet PCs, smartphones and other network terminals, together with the emergence of a variety of social networks, has greatly accelerated the growth of information on the Internet. People update text, pictures, video and other data in a variety of applications every day. Data show that the volume of information on the Internet is growing exponentially, and a news or media company will typically see hundreds or thousands of submissions every day; people now live in an era of information explosion. In the face of such huge information resources, how to manage them effectively, so that people can reach the target information more conveniently and quickly, has become a hot research topic. Text classification, a technique from text information mining, is an effective way to address this problem. We study mobile text classification based on the maximum entropy model and implement an automatic text classification system in cloud computing; through technical improvements, we provide a technical solution for handling large numbers of network documents in a mobile environment. This paper introduces text classification methods, the features of the maximum entropy model with an improved information gain feature selection method, the preprocessing method, and the MapReduce programming method. The experimental results show good accuracy and recall in classifying large amounts of text, meeting the requirements of practical application.
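For readers unfamiliar with information gain as a feature selection criterion, the following is a minimal sketch of the standard textbook definition, IG(t) = H(C) - [P(t) H(C|t) + P(not t) H(C|not t)]. It is not the improved variant proposed in the paper, whose details are not given in the abstract. The function names, the toy corpus and the set-of-tokens document representation are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Standard information gain of `term` with respect to the class labels.

    docs   : list of token sets (hypothetical representation)
    labels : list of class labels, aligned with docs
    """
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    n = len(labels)
    # Start from the class entropy H(C) and subtract the weighted
    # conditional entropies of the two partitions induced by the term.
    gain = entropy(Counter(labels).values())
    for subset in (with_term, without_term):
        if subset:
            gain -= (len(subset) / n) * entropy(Counter(subset).values())
    return gain


# Hypothetical toy corpus, not data from the paper.
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"stock", "bank"}]
labels = ["sports", "sports", "finance", "finance"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

In a feature selection step, the top-ranked terms would typically be kept as features for the classifier, and the remaining terms discarded.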

Acknowledgments

Foundation item: This work was funded by the National Natural Science Foundation of China (No. 61373134). It was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), the Jiangsu Key Laboratory of Meteorological Observation and Information Processing (No. KDXS1105) and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).

Author information

Authors and Affiliations

School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Jiangsu, Nanjing, 210044, China
Chunyong Yin & Jinwen Xi

Corresponding author

Correspondence to Chunyong Yin.

About this article

Cite this article

Yin, C., Xi, J. Maximum entropy model for mobile text classification in cloud computing using improved information gain algorithm. Multimed Tools Appl 76, 16875–16891 (2017). https://doi.org/10.1007/s11042-016-3545-5

Received: 16 February 2016
Revised: 20 March 2016
Accepted: 15 April 2016
Published: 20 April 2016
Issue Date: August 2017
DOI: https://doi.org/10.1007/s11042-016-3545-5
