Author Tree-Structured Hierarchical Dirichlet Process

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11198)

Abstract

Three key aspects of online discussion venues are the multitude of participants, the underlying trends of content, and the structure of the venue, yet most models cannot account for all three. In hierarchically organized message forums, authors may participate differently at multiple levels of sections, with different interests and contributions across the hierarchy. Well-designed probabilistic models of online discussion are applicable to many tasks, such as prediction of future content or authorship attribution. However, traditional models such as Hierarchical Dirichlet Processes (HDPs) do not fully account for authors, nor for deep hierarchical venues where documents can arise at all tree nodes. We introduce the Author Tree-structured Hierarchical Dirichlet Process (ATHDP), which allows Dirichlet-process-based topic modeling of both text content and authors over a given tree structure of arbitrary size and height. Experiments on six hierarchical discussion data sets show that ATHDP outperforms traditional HDP-based alternatives in terms of perplexity and authorship attribution accuracy.
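
The abstract names perplexity as one of its evaluation measures. As a minimal sketch of the standard definition only (this is not the authors' evaluation code, and the function name and the sample log-probabilities are hypothetical), perplexity can be computed as the exponential of the negative mean per-token log-likelihood on held-out text:

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean log-likelihood per token), using natural logs.

    token_log_probs: one log-probability per held-out token, as assigned
    by some fitted topic model (assumed to be given; not computed here).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical example: a model that assigns probability 0.25 to each of
# 8 held-out tokens is as uncertain as a uniform guess over 4 words,
# so its perplexity is close to 4.
uniform_guess = [math.log(0.25)] * 8
print(perplexity(uniform_guess))
```

Lower values indicate that the model assigns higher probability to the held-out tokens.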

MHA and JP had equal contributions. The work was supported by Academy of Finland decisions 295694 and 313748.

Notes

  1. https://www.kielipankki.fi/corpora/

Author information

Corresponding authors

Correspondence to Md Hijbul Alam or Jaakko Peltonen.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Alam, M.H., Peltonen, J., Nummenmaa, J., Järvelin, K. (2018). Author Tree-Structured Hierarchical Dirichlet Process. In: Soldatova, L., Vanschoren, J., Papadopoulos, G., Ceci, M. (eds) Discovery Science. DS 2018. Lecture Notes in Computer Science, vol 11198. Springer, Cham. https://doi.org/10.1007/978-3-030-01771-2_20

  • DOI: https://doi.org/10.1007/978-3-030-01771-2_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01770-5

  • Online ISBN: 978-3-030-01771-2

  • eBook Packages: Computer Science, Computer Science (R0)
