A User-Oriented Special Topic Generation System for Digital Newspaper

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9362)

Abstract

With the advent of digital newspapers, user-oriented special topic generation has become urgent in order to satisfy users' requirements both functionally and emotionally. We propose a practical automatic special topic generation system for digital newspapers based on users' interests. First, the system extracts the subject heading vector of the topic of interest by filtering out function words, localizing Latent Dirichlet Allocation (LDA), and training the LDA model. Second, it removes semantically repetitive vector components by constructing a synonym word map. Finally, it organizes and refines the special topic according to two criteria: the similarity between each candidate news item and the topic, and the density of topic-related terms. The experimental results show that the system is simple to operate and highly accurate, and that it is stable enough for user-oriented special topic generation in practical applications.
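The second and third steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so cosine similarity over term counts and a simple fraction-of-tokens density are assumptions, LDA training is omitted (the subject heading vector is passed in as a ready-made dictionary), and the names `FUNCTION_WORDS`, `SYNONYMS`, `merge_synonyms`, and `score` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical resources; a real system would load a function-word list
# and a thesaurus-derived synonym map instead of these toy examples.
FUNCTION_WORDS = {"the", "was", "a", "of"}
SYNONYMS = {"soccer": "football", "match": "game"}


def canonical(word, synonyms=SYNONYMS):
    """Map a word to its canonical synonym, or return it unchanged."""
    return synonyms.get(word, word)


def merge_synonyms(topic_vector, synonyms=SYNONYMS):
    """Collapse semantically repetitive components by summing their weights."""
    merged = Counter()
    for word, weight in topic_vector.items():
        merged[canonical(word, synonyms)] += weight
    return dict(merged)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(tokens, topic_vector, synonyms=SYNONYMS):
    """Return (similarity, density) of a tokenized news item for a topic.

    similarity: cosine between the topic vector and the item's term counts
                (an assumed measure, not taken from the paper).
    density:    share of content tokens that are topic terms (also assumed).
    """
    content = [canonical(t, synonyms) for t in tokens if t not in FUNCTION_WORDS]
    if not content:
        return 0.0, 0.0
    similarity = cosine(topic_vector, Counter(content))
    density = sum(1 for t in content if t in topic_vector) / len(content)
    return similarity, density


# Example: a raw topic vector with a redundant synonym component.
topic = merge_synonyms({"football": 0.5, "soccer": 0.3, "game": 0.2})
similarity, density = score(["the", "soccer", "match", "was", "great"], topic)
```

Candidate news items could then be ranked or filtered by thresholds on both values; how the two criteria are combined in the actual system is not stated in the abstract.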

Corresponding author

Correspondence to Xi Xu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Xu, X., Ye, M., Tang, Z., Xu, JB., Gao, LC. (2015). A User-Oriented Special Topic Generation System for Digital Newspaper. In: Li, J., Ji, H., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2015. Lecture Notes in Computer Science, vol 9362. Springer, Cham. https://doi.org/10.1007/978-3-319-25207-0_45

  • DOI: https://doi.org/10.1007/978-3-319-25207-0_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25206-3

  • Online ISBN: 978-3-319-25207-0

  • eBook Packages: Computer Science, Computer Science (R0)
