Abstract
HFST (Helsinki Finite-State Technology, hfst.sf.net) is a framework for compiling and applying linguistic descriptions with finite-state methods. It currently connects some of the most important finite-state tools for creating morphologies and spellers into one open-source platform, and it supports extending and improving the descriptions with weights to model statistical information. HFST offers a path from language descriptions to efficient language applications in key environments and operating systems. It also makes it possible to exchange transducers between different software providers in order to get the best out of each finite-state library.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lindén, K., Axelson, E., Hardwick, S., Pirinen, T.A., Silfverberg, M. (2011). HFST—Framework for Compiling and Applying Morphologies. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2011. Communications in Computer and Information Science, vol 100. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23138-4_5
DOI: https://doi.org/10.1007/978-3-642-23138-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23137-7
Online ISBN: 978-3-642-23138-4
eBook Packages: Computer Science, Computer Science (R0)