Skip to main content

Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses

  • Conference paper
Grammatical Inference: Algorithms and Applications (ICGI 2006)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 4201))

Included in the following conference series:

  • 521 Accesses

Abstract

A tree pattern p is a first-order term in formal logic, and the language of p is the set of all the tree patterns obtainable by replacing each variable in p with a tree pattern containing no variables. We consider the inductive inference of the unions of these languages from positive examples using strategies that guarantee some forms of minimality during the learning process. By a result in our earlier work, the existence of a characteristic set for each language in a class \({\mathcal L }\) (within \({\mathcal L }\)) implies that \({\mathcal L }\) can be identified in the limit by a learner that simply conjectures a hypothesis containing the examples, that is minimal in the number of elements of up to an appropriate size. Furthermore, if there is a size ℓ such that each candidate hypothesis has a characteristic set (within the languages in \({\mathcal L }\) that intersects non-emptily with the examples) that consists only of elements of up to size ℓ, then the hypotheses containing the least number of elements of up to size ℓ are at the same time minimal with respect to inclusion. In this paper we show how to determine such a size ℓ for the unions of the tree pattern languages, and hence allowing us to learn the class using hypotheses that fulfill the two mentioned notions of minimality.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Angluin, D.: Inductive inference of formal languages from positive data. Information and Control 45, 117–135 (1980)

    Article  MATH  MathSciNet  Google Scholar 

  2. Angluin, D.: Inference of reversible languages. J. of the ACM 29, 741–765 (1982)

    Article  MATH  MathSciNet  Google Scholar 

  3. Arimura, H., Ishizaka, H., Shinohara, T.: Learning unions of tree patterns using queries. Theoretical Computer Science 185(1), 47–62 (1997)

    Article  MATH  MathSciNet  Google Scholar 

  4. Arimura, H., Shinohara, T., Otsuki, S.: A polynomial time algorithm for finding finite unions of tree pattern languages. In: Brewka, G., Jantke, K.P., Schmitt, P.H. (eds.) NIL 1991. LNCS (LNAI), vol. 659, pp. 118–131. Springer, Heidelberg (1993)

    Chapter  Google Scholar 

  5. Arimura, H., Shinohara, T., Otsuki, S.: Polynomial time inference of unions of two tree pattern languages. IEICE Transactions on Information and Systems E75-D(7), 426–434 (1992)

    Google Scholar 

  6. Arimura, H., Shinohara, T., Otsuki, S.: Finding minimal generalizations for unions of pattern languages and its application to inductive inference from positive data. In: Enjalbert, P., Mayr, E.W., Wagner, K.W. (eds.) STACS 1994. LNCS, vol. 775, pp. 649–660. Springer, Heidelberg (1994)

    Google Scholar 

  7. Arimura, H., Shinohara, T., Otsuki, S., Ishizaka, H.: A generalization of the least general generalization. Machine Intelligence 13 13, 59–85 (1994)

    Google Scholar 

  8. Chan, C., Garofalakis, M., Rastogi, R.: RE-tree: an efficient index structure for regular expressions. The VLDB Journal 12(2), 102–119 (2003)

    Article  Google Scholar 

  9. Gold, E.M.: Language identification in the limit. Information and Control 10, 447–474 (1967)

    Article  MATH  Google Scholar 

  10. W3C XML Core Working Group. Extensible Markup Language (XML) 1.0, 3rd edn. W3C Recommendation (2004)

    Google Scholar 

  11. Kobayashi, S., Yokomori, T.: Identifiability of subspaces and homomorphic images of zero-reversible languages. In: Proc. of Algorithmic Learning Theory, 8th Int. Conf., ALT 1997. LNCS (LNAI), vol. 1316, pp. 48–61. Springer, Heidelberg (1997)

    Google Scholar 

  12. Ng, Y.K., Ono, H., Shinohara, T.: Measuring over-generalization in the minimal multiple generalizations of biosequences. In: Hoffmann, A., Motoda, H., Scheffer, T. (eds.) DS 2005. LNCS (LNAI), vol. 3735, pp. 176–188. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  13. Ng, Y.K., Shinohara, T.: Inferring unions of the pattern languages by the most fitting covers. In: Jain, S., Simon, H.U., Tomita, E. (eds.) ALT 2005. LNCS (LNAI), vol. 3734, pp. 269–282. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  14. Ng, Y.K., Shinohara, T.: Finding consensus patterns in very scarce biosequence samples from their minimal multiple generalizations. In: Ng, W.-K., Kitsuregawa, M., Li, J., Chang, K. (eds.) PAKDD 2006. LNCS (LNAI), vol. 3918, pp. 540–545. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  15. Plotkin, G.D.: A note on inductive generalization. Machine Intelligence 5, 153–163 (1970)

    MathSciNet  Google Scholar 

  16. Sato, M.: Inductive inference of formal languages. Bulletin of Informatics and Cybernetics 27(1), 85–106 (1995)

    MATH  MathSciNet  Google Scholar 

  17. Shinohara, T.: Polynomial time inference of extended regular pattern languages. In: Goto, E., Nakajima, R., Yonezawa, A., Nakata, I., Furukawa, K. (eds.) RIMS 1982. LNCS, vol. 147, pp. 115–127. Springer, Heidelberg (1983)

    Google Scholar 

  18. Wright, K.: Identification of unions of languages drawn from an identifiable class. In: Proc. of the Second Annual Workshop on Computational Learning Theory, pp. 328–333. Morgan Kaufmann, San Francisco (1989)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ng, Y.K., Shinohara, T. (2006). Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science(), vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_25

Download citation

  • DOI: https://doi.org/10.1007/11872436_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45264-5

  • Online ISBN: 978-3-540-45265-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics