HFST—Framework for Compiling and Applying Morphologies

  • Conference paper
Systems and Frameworks for Computational Morphology (SFCM 2011)

Abstract

HFST (Helsinki Finite-State Technology, hfst.sf.net) is a framework for compiling and applying linguistic descriptions with finite-state methods. HFST currently connects some of the most important finite-state tools for creating morphologies and spellers into one open-source platform, and it supports extending and improving the descriptions with weights to model statistical information. HFST offers a path from language descriptions to efficient language applications in key environments and operating systems. HFST also provides an opportunity to exchange transducers between different software providers in order to get the best out of each finite-state library.
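
To make "applying a morphology with weights" concrete, here is a minimal sketch in plain Python of looking up a word in a small weighted transducer. It does not use HFST or its API. The arc table, the tag strings such as `+N+Pl`, the weights, and the function name `lookup` are all hypothetical, and the sketch assumes weights are additive penalties where the lowest total is the preferred analysis.

```python
from collections import defaultdict

# Hypothetical toy transducer, invented for illustration (not an HFST format).
# Each arc: (source state, input symbol, output string, weight, target state).
# An empty input symbol "" is an epsilon arc that consumes no input.
ARCS = [
    (0, "c", "c", 0.0, 1),
    (1, "a", "a", 0.0, 2),
    (2, "t", "t", 0.0, 3),
    (3, "", "+N+Sg", 0.5, 5),    # "cat" as a singular noun
    (3, "s", "+N+Pl", 1.0, 4),   # "cats" as a plural noun
    (3, "s", "+V+3Sg", 2.0, 4),  # "cats" as a verb form, penalised more
]
FINAL_STATES = {4, 5}

# Index arcs by source state for quick traversal.
OUTGOING = defaultdict(list)
for src, sym, out, weight, dst in ARCS:
    OUTGOING[src].append((sym, out, weight, dst))


def lookup(word):
    """Return (analysis, total weight) pairs for word, lowest weight first."""
    results = []
    # Each stack entry: (state, position in word, output so far, weight so far).
    stack = [(0, 0, "", 0.0)]
    while stack:
        state, pos, output, total = stack.pop()
        if pos == len(word) and state in FINAL_STATES:
            results.append((output, total))
        for sym, arc_out, arc_weight, target in OUTGOING[state]:
            if sym == "":
                # Epsilon arc: emit output without consuming input.
                stack.append((target, pos, output + arc_out, total + arc_weight))
            elif pos < len(word) and word[pos] == sym:
                stack.append((target, pos + 1, output + arc_out, total + arc_weight))
    return sorted(results, key=lambda pair: pair[1])


if __name__ == "__main__":
    for w in ("cat", "cats", "dog"):
        print(w, lookup(w))
```

In this toy, weights along a path are summed and the analyses are ranked by the total, so the plural-noun reading of "cats" comes before the verb reading. A real morphology would be compiled from a lexicon and rule descriptions rather than written as an arc list by hand.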

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lindén, K., Axelson, E., Hardwick, S., Pirinen, T.A., Silfverberg, M. (2011). HFST—Framework for Compiling and Applying Morphologies. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2011. Communications in Computer and Information Science, vol 100. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23138-4_5

  • DOI: https://doi.org/10.1007/978-3-642-23138-4_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23137-7

  • Online ISBN: 978-3-642-23138-4

  • eBook Packages: Computer Science, Computer Science (R0)
