Identification with Probability One of Stochastic Deterministic Linear Languages

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 2842))

Abstract

Learning context-free grammars is generally considered a very hard task. This is even more the case when learning has to be done from positive examples only. In this context one possibility is to learn stochastic context-free grammars, by making the implicit assumption that the distribution of the examples is given by such an object. Nevertheless this is still a hard task for which no algorithm is known. We use recent results to introduce a proper subclass of linear grammars, called deterministic linear grammars, for which we prove that a small canonical form can be found. This has been a successful condition for a learning algorithm to be possible. We propose an algorithm for this class of grammars and we prove that our algorithm works in polynomial time, and structurally converges to the target in the paradigm of identification in the limit with probability 1. Although this does not ensure that only a polynomial size sample is necessary for learning to be possible, we argue that the criterion means that no added (hidden) bias is present.
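To make the objects in the abstract concrete, here is a minimal sketch of a stochastic deterministic linear grammar and of how a sample of positive examples could be drawn from it. It assumes the usual definition of deterministic linear grammars, which is not spelled out on this page: every rule has the form T -> a T' u or T -> (empty string), and no nonterminal has two rules that start with the same terminal a. The toy grammar, its probabilities, and the names `GRAMMAR`, `is_deterministic`, and `sample` are hypothetical illustrations, not the authors' algorithm or notation.

```python
import random

# Hypothetical toy grammar (an assumption for illustration only).
# Each nonterminal maps to a list of (probability, rule) pairs.
# A rule is None (T -> empty string) or a triple (a, T2, u) meaning T -> a T2 u.
GRAMMAR = {
    "S": [(0.6, ("a", "S", "b")), (0.4, None)],
}

def is_deterministic(grammar):
    """Return True if no nonterminal has two rules starting with the same terminal."""
    for rules in grammar.values():
        firsts = [rule[0] for _, rule in rules if rule is not None]
        if len(firsts) != len(set(firsts)):
            return False
    return True

def sample(grammar, start="S", rng=random):
    """Draw one string by repeatedly expanding the single nonterminal."""
    prefix, suffixes, current = [], [], start
    while True:
        probs, rules = zip(*grammar[current])
        rule = rng.choices(rules, weights=probs)[0]
        if rule is None:
            # Suffixes were collected outermost first, so emit them in reverse.
            return "".join(prefix) + "".join(reversed(suffixes))
        a, current, u = rule
        prefix.append(a)
        suffixes.append(u)
```

Under these assumptions the toy grammar generates only strings of the form a^n b^n, each with probability 0.6^n * 0.4. A learner in the setting the abstract describes would see only strings drawn this way, with no negative examples.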

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

de la Higuera, C., Oncina, J. (2003). Identification with Probability One of Stochastic Deterministic Linear Languages. In: Gavaldá, R., Jantke, K.P., Takimoto, E. (eds) Algorithmic Learning Theory. ALT 2003. Lecture Notes in Computer Science, vol 2842. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39624-6_20

  • DOI: https://doi.org/10.1007/978-3-540-39624-6_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20291-2

  • Online ISBN: 978-3-540-39624-6

  • eBook Packages: Springer Book Archive
