Statistical-Based Abbreviation Expansion

Zelinka, Jan; Romportl, Jan; Müller, Luděk

doi:10.1007/978-3-642-23538-2_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6836)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper deals with text normalization for highly inflectional languages. It focuses on abbreviation expansion and also covers the normalization of numerals. Our text normalization system uses no explicit parser or part-of-speech tagger, and can therefore be called lightly supervised. The standard rule-based text normalization method is compared with the proposed statistical method on the task of expanding Czech abbreviations.
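
To make the task concrete, here is a minimal illustrative sketch of context-driven abbreviation expansion in Python. It is not the method of the paper, whose details the abstract does not give; it is a small naive Bayes style chooser that uses only the preceding word, with no parser or part-of-speech tagger. The class name `ContextExpander`, the toy training pairs, and the smoothing constant `alpha` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy training data (hypothetical): (previous word, abbreviation, expanded form).
# "ul." abbreviates the Czech word "ulice" (street), which inflects by case.
TRAIN = [
    ("v", "ul.", "ulici"),
    ("na", "ul.", "ulici"),
    ("do", "ul.", "ulice"),
    ("z", "ul.", "ulice"),
    ("v", "ul.", "ulici"),
]


class ContextExpander:
    """Chooses an inflected expansion from counts of (context word, form) pairs."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # add-alpha smoothing constant
        self.form_counts = defaultdict(Counter)   # abbreviation -> Counter of forms
        self.ctx_counts = defaultdict(Counter)    # (abbreviation, form) -> Counter of previous words
        self.vocab = set()                        # all context words seen in training

    def fit(self, examples):
        for prev, abbr, form in examples:
            self.form_counts[abbr][form] += 1
            self.ctx_counts[(abbr, form)][prev] += 1
            self.vocab.add(prev)
        return self

    def expand(self, prev, abbr):
        forms = self.form_counts.get(abbr)
        if not forms:
            return abbr  # unknown abbreviation: leave it unchanged
        total = sum(forms.values())
        vocab_size = len(self.vocab) + 1  # +1 reserves mass for unseen context words

        def score(form):
            ctx = self.ctx_counts[(abbr, form)]
            log_prior = math.log(forms[form] / total)
            log_likelihood = math.log(
                (ctx[prev] + self.alpha)
                / (sum(ctx.values()) + self.alpha * vocab_size)
            )
            return log_prior + log_likelihood

        return max(forms, key=score)


model = ContextExpander().fit(TRAIN)
print(model.expand("do", "ul."))  # form chosen after the preposition "do"
print(model.expand("v", "ul."))   # form chosen after the preposition "v"
```

A rule-based normalizer would encode the preposition-to-case mapping by hand; a statistical chooser like this one estimates it from expanded examples, and falls back on the most frequent form when the context word was never seen.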

This research was supported by the Grant Agency of the Czech Republic (project No. GAČR 102/08/0707), the Technology Agency of the Czech Republic (project No. TA01011264), and the Ministry of Education of the Czech Republic (project No. MŠMT LC536).

Author information

Authors and Affiliations

Department of Cybernetics, University of West Bohemia, 306 14, Plzen, Czech Republic
Jan Zelinka, Jan Romportl & Luděk Müller

Editor information

Editors and Affiliations

Department of Computer Sciences, University of West Bohemia, Univerzitní 22, 306 14, Pilsen, Czech Republic
Ivan Habernal
Faculty of Applied Sciences, Dept. of Computer Science and Engineering, University of West Bohemia, Univerzitni 8, 306 14, Pilsen, Czech Republic
Václav Matoušek

About this paper

Cite this paper

Zelinka, J., Romportl, J., Müller, L. (2011). Statistical-Based Abbreviation Expansion. In: Habernal, I., Matoušek, V. (eds) Text, Speech and Dialogue. TSD 2011. Lecture Notes in Computer Science, vol 6836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23538-2_39

DOI: https://doi.org/10.1007/978-3-642-23538-2_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23537-5
Online ISBN: 978-3-642-23538-2
eBook Packages: Computer Science, Computer Science (R0)
