One Step Beyond: Keyword Extraction in German Utilising Surprisal from Topic Contexts

  • Conference paper
  • First Online: Intelligent Computing (SAI 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 507)


Abstract

This paper describes a study on keyword extraction in German with a model that utilises Shannon information as a lexical feature. Lexical information content was derived from large, extra-sentential semantic contexts of words within the framework of the novel Topic Context Model. We observed that lexical information content increased the performance of a Recurrent Neural Network in keyword extraction, outperforming TextRank as well as the two other models used for comparison in this study, Named Entity Recognition and Latent Dirichlet Allocation.
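The notion underlying the abstract's "lexical information content" is Shannon information, or surprisal: the less probable a word is in its context, the more information it carries. A minimal sketch of the quantity itself (the probability values below are hypothetical illustrations, not output of the authors' Topic Context Model):

```python
import math

def surprisal(p: float) -> float:
    """Shannon information content of an event with probability p, in bits."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p)

# A rare, topic-specific word carries more information than a frequent one,
# which is the intuition behind using surprisal as a keyword feature.
p_rare, p_frequent = 0.01, 0.25
print(surprisal(p_rare))      # high information content
print(surprisal(p_frequent))  # lower information content
```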


Notes

  1. https://heise.de.

  2. https://spacy.io.

  3. https://github.com/jnphilipp/TextRank.

  4. model.components_ / model.components_.sum(axis=1)[:, np.newaxis] as suggested by https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html.
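The expression in note 4 row-normalises the raw `components_` matrix of scikit-learn's `LatentDirichletAllocation` so that each topic's word weights form a probability distribution. A self-contained sketch of the same normalisation, using a hypothetical 2×3 count matrix in place of a fitted model:

```python
import numpy as np

# Stand-in for model.components_ of a fitted
# sklearn.decomposition.LatentDirichletAllocation (shape: topics x vocabulary).
components = np.array([[2.0, 1.0, 1.0],
                       [1.0, 3.0, 4.0]])

# Divide each row by its sum (broadcast via [:, np.newaxis]) so that
# every topic's word weights sum to 1, i.e. form P(word | topic).
topic_word_dist = components / components.sum(axis=1)[:, np.newaxis]
print(topic_word_dist.sum(axis=1))  # each row now sums to 1.0
```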


Acknowledgments

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number: 357550571.

The training of the LDA and neural networks was done on the High Performance Computing (HPC) Cluster of the Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH) of the Technische Universität Dresden.

Author information

Correspondence to J. Nathanael Philipp.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Philipp, J.N., Kölbl, M., Kyogoku, Y., Yousef, T., Richter, M. (2022). One Step Beyond: Keyword Extraction in German Utilising Surprisal from Topic Contexts. In: Arai, K. (eds) Intelligent Computing. SAI 2022. Lecture Notes in Networks and Systems, vol 507. Springer, Cham. https://doi.org/10.1007/978-3-031-10464-0_53
