Robust Inference of Relevant Attributes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2842)

Abstract

Given n Boolean input variables representing a set of attributes, we consider Boolean functions f (i.e., binary classifications of tuples) that actually depend only on a small but unknown subset of these variables/attributes, called relevant in the following. The goal is to determine the relevant attributes given a sequence of examples, that is, input vectors X and corresponding classifications f(X). We analyze two simple greedy strategies and prove that they achieve this goal for various kinds of Boolean functions and various input distributions from which the examples are drawn at random.
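To make the problem setting concrete, the following is a minimal sketch of one greedy strategy of the set-cover kind commonly used for this task. It is an illustration under stated assumptions, not necessarily either of the two strategies analyzed in the paper. The function name `greedy_relevant_attributes` and the example format (a list of `(x, y)` pairs) are hypothetical.

```python
from itertools import combinations, product


def greedy_relevant_attributes(examples):
    """Greedy set-cover sketch (assumed variant, not taken from the paper).

    examples: list of (x, y), x a tuple of 0/1 values, y in {0, 1}.
    Returns a sorted list of attribute indices that explains all labels.
    """
    n = len(examples[0][0])
    # Every pair of examples with different labels must be "explained":
    # at least one selected attribute has to differ between the two inputs.
    uncovered = {
        (i, j)
        for (i, (_, yi)), (j, (_, yj)) in combinations(enumerate(examples), 2)
        if yi != yj
    }
    selected = []
    while uncovered:
        def gain(a):
            # Number of still-unexplained pairs that attribute a separates.
            return sum(
                1 for i, j in uncovered
                if examples[i][0][a] != examples[j][0][a]
            )

        best = max(range(n), key=gain)
        if gain(best) == 0:
            raise ValueError("inconsistent sample: equal inputs, different labels")
        selected.append(best)
        # Keep only the pairs that the chosen attribute does not separate.
        uncovered = {
            (i, j) for i, j in uncovered
            if examples[i][0][best] == examples[j][0][best]
        }
    return sorted(selected)


# Hypothetical target: f(x) = x0 AND x2 on n = 4 attributes, all 16 inputs.
sample = [(x, x[0] & x[2]) for x in product((0, 1), repeat=4)]
print(greedy_relevant_attributes(sample))  # [0, 2]
```

Here the sample contains every input vector, so the greedy choice recovers exactly the two relevant attributes. With randomly drawn examples the outcome depends on the sample size and the input distribution, which is the question the analysis addresses.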

This generalizes results obtained by Akutsu, Miyano, and Kuhara for the uniform distribution. The analysis also provides explicit upper bounds on the number of necessary examples. These bounds depend on the input distribution and on combinatorial properties of the function to be inferred.

Our second contribution is an extension of these results to the situation where attribute noise is present, i.e., a certain number of input bits x_i may be wrong. This is a typical situation, e.g., in medical research or computational biology, where not all attributes can be measured reliably. We show that even in such an error-prone situation, reliable inference of the relevant attributes can be performed, because our greedy strategies are robust even against a linear number of errors.
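As an illustration of attribute noise, the sketch below corrupts a fixed number of input bits while leaving the classifications untouched. The abstract does not specify the exact noise model, so the choices here are assumptions: the helper name `flip_attribute_bits` is hypothetical, and the erroneous positions are assumed to be distinct and chosen uniformly at random.

```python
import random


def flip_attribute_bits(inputs, num_errors, seed=0):
    """Return a copy of `inputs` with exactly `num_errors` distinct bits flipped.

    inputs: list of equal-length tuples of 0/1 values (labels are kept
    elsewhere and stay unchanged). The uniform choice of positions is an
    assumption made for this illustration.
    """
    rng = random.Random(seed)
    m, n = len(inputs), len(inputs[0])
    # Draw distinct (row, column) positions from the m * n available bits.
    positions = rng.sample(range(m * n), num_errors)
    noisy = [list(x) for x in inputs]
    for p in positions:
        row, col = divmod(p, n)
        noisy[row][col] ^= 1  # flip the bit at this position
    return [tuple(x) for x in noisy]


clean = [(0, 0, 1), (1, 0, 1), (1, 1, 0), (0, 1, 0)]
noisy = flip_attribute_bits(clean, num_errors=3)
# Count how many bits differ between the clean and the noisy inputs.
print(sum(a != b for x, z in zip(clean, noisy) for a, b in zip(x, z)))  # 3
```

Feeding such corrupted inputs, paired with the original classifications, to an inference procedure is one way to probe empirically how many errors it tolerates.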

References

  1. Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: Proc. 1993 ACM SIGMOD Conf., pp. 207–216 (1993)

  2. Akutsu, T., Bao, F.: Approximating Minimum Keys and Optimal Substructure Screens. In: Cai, J.-Y., Wong, C.K. (eds.) COCOON 1996. LNCS, vol. 1090, pp. 290–299. Springer, Heidelberg (1996)

  3. Akutsu, T., Miyano, S., Kuhara, S.: A Simple Greedy Algorithm for Finding Functional Relations: Efficient Implementation and Average Case Analysis. TCS 292(2), 481–495 (2003); Morishita, S., Arikawa, S. (eds.): DS 2000. LNCS (LNAI), vol. 1967, pp. 86–98. Springer, Heidelberg (2000)

  4. Angluin, D.: Queries and Concept Learning. Machine Learning 2(4), 319–342 (1988)

  5. Angluin, D., Laird, P.: Learning from noisy examples. Machine Learning 2(4), 343–370 (1988)

  6. Arora, S., Babai, L., Stern, J., Sweedyk, Z.: The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations. J. CSS 54, 317–331 (1997)

  7. Arpe, J., Reischuk, R.: Robust Inference of Relevant Attributes. Techn. Report, SIIM-TR-A 03-12, Univ. Lübeck (2003), available at http://www.tcs.mu-luebeck.de/TechReports.html

  8. Blum, A., Hellerstein, L., Littlestone, N.: Learning in the Presence of Finitely or Infinitely Many Irrelevant Attributes. In: Proc. 4th, pp. 157–166 (1991)

  9. Blum, A., Langley, P.: Selection of Relevant Features and Examples in Machine Learning. Artificial Intelligence 97(1–2), 245–271 (1997)

  10. Feige, U.: A Threshold of ln n for Approximating Set Cover. J. ACM 45, 634–652 (1998)

  11. Goldman, S., Sloan, H.: Can PAC Learning Algorithms Tolerate Random Attribute Noise? Algorithmica 14, 70–84 (1995)

  12. Johnson, D.: Approximation Algorithms for Combinatorial Problems. J. CSS 9, 256–278 (1974)

  13. Littlestone, N.: Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm. Machine Learning 4(2), 285–318 (1988)

  14. Littlestone, N.: From On-line to Batch Learning. In: Proc. 2nd COLT 1989, pp. 269–284 (1989)

  15. Mannila, H., Räihä, K.: On the Complexity of Inferring Functional Dependencies. Discrete Applied Mathematics 40, 237–243 (1992)

  16. Mossel, E., O’Donnell, R., Servedio, R.: Learning Juntas. In: Proc. STOC 2003, pp. 206–212 (2003)

  17. Valiant, L.: Projection Learning. Machine Learning 37(2), 115–130 (1999)

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Arpe, J., Reischuk, R. (2003). Robust Inference of Relevant Attributes. In: Gavaldá, R., Jantke, K.P., Takimoto, E. (eds) Algorithmic Learning Theory. ALT 2003. Lecture Notes in Computer Science, vol 2842. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39624-6_10

  • DOI: https://doi.org/10.1007/978-3-540-39624-6_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20291-2

  • Online ISBN: 978-3-540-39624-6

  • eBook Packages: Springer Book Archive
