Maximum entropy model for mobile text classification in cloud computing using improved information gain algorithm

Multimedia Tools and Applications

Abstract

With the rapid spread of the Internet and of multimedia as a new mode of information transmission, people can not only easily obtain the information they want but also publish their own information to the world. At the same time, the introduction of tablet PCs, smartphones, and other network terminals, together with the emergence of various social networks, has greatly accelerated the flow of information on the Internet. Every day people upload text, pictures, video, and other data through a wide range of applications. Data show that the volume of information on the Internet is growing exponentially, and a news or media company typically sees hundreds and thousands of submissions every day; we live in an era of information explosion. Faced with such vast information resources, how to manage them effectively so that people can find the information they need more conveniently and quickly has become a hot research topic. Text classification, a technique from text information mining, is an effective way to address this problem. We study mobile text classification based on the maximum entropy model and implement an automatic text classification system in a cloud computing environment; through technical improvements, we provide a solution for classifying the large number of documents on the network in a mobile environment. This paper introduces the text classification method and the features of the maximum entropy model with an improved information gain feature selection method, the preprocessing method, and the MapReduce programming method. The experimental results show good accuracy and recall in classifying large amounts of text, meeting the requirements of practical application.
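
The abstract does not spell out the improved information gain algorithm, so the following is only a minimal sketch of the standard (unimproved) information gain score that such feature selection methods typically start from: IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t). The function names, the toy documents, and the labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Standard information gain of `term` with respect to the class labels.

    docs   -- list of token sets, one per document (assumed preprocessed)
    labels -- list of class labels, aligned with docs
    """
    classes = sorted(set(labels))
    n = len(docs)
    all_counts = Counter(labels)
    with_term = Counter(l for d, l in zip(docs, labels) if term in d)
    without_term = Counter(l for d, l in zip(docs, labels) if term not in d)
    n_with = sum(with_term.values())
    n_without = n - n_with
    h_classes = entropy([all_counts[c] for c in classes])
    h_with = entropy([with_term[c] for c in classes])
    h_without = entropy([without_term[c] for c in classes])
    return h_classes - (n_with / n) * h_with - (n_without / n) * h_without


# Hypothetical toy corpus: two classes, two documents each.
docs = [
    {"goal", "match"},
    {"match", "team"},
    {"stock", "market"},
    {"market", "bank"},
]
labels = ["sports", "sports", "finance", "finance"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
print(ranked)
```

Terms that separate the classes cleanly, such as "match" or "market" in this toy corpus, receive the highest score, and a feature selector would keep the top-ranked terms as input features for the classifier.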

Acknowledgments

This work was funded by the National Natural Science Foundation of China (No. 61373134). It was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), the Jiangsu Key Laboratory of Meteorological Observation and Information Processing (No. KDXS1105), and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).

Author information

Corresponding author

Correspondence to Chunyong Yin.

About this article

Cite this article

Yin, C., Xi, J. Maximum entropy model for mobile text classification in cloud computing using improved information gain algorithm. Multimed Tools Appl 76, 16875–16891 (2017). https://doi.org/10.1007/s11042-016-3545-5

  • DOI: https://doi.org/10.1007/s11042-016-3545-5
