Robust Inference of Relevant Attributes

Arpe, Jan; Reischuk, Rüdiger

doi:10.1007/978-3-540-39624-6_10

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2842)

Abstract

Given n Boolean input variables representing a set of attributes, we consider Boolean functions f (i.e., binary classifications of tuples) that actually depend only on a small but unknown subset of these variables/attributes, called relevant in the following. The goal is to determine the relevant attributes given a sequence of examples – input vectors X and corresponding classifications f(X). We analyze two simple greedy strategies and prove that they are able to achieve this goal for various kinds of Boolean functions and various input distributions according to which the examples are drawn at random.

This generalizes results obtained by Akutsu, Miyano, and Kuhara for the uniform distribution. The analysis also provides explicit upper bounds on the number of necessary examples. They depend on the distribution and combinatorial properties of the function to be inferred.

Our second contribution is an extension of these results to the situation where attribute noise is present, i.e., a certain number of input bits x_i may be wrong. This is a typical situation, e.g., in medical research or computational biology, where not all attributes can be measured reliably. We show that even in such an error-prone situation, reliable inference of the relevant attributes can be performed, because our greedy strategies are robust even against a linear number of errors.
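
The abstract does not spell out the two greedy strategies, so the following is only a minimal sketch of one plausible variant, assuming a set-cover-style greedy rule in the spirit of the algorithm by Akutsu, Miyano, and Kuhara: repeatedly pick the attribute that separates the most still-unexplained pairs of examples with different classifications. The function name `greedy_relevant_attributes` and the toy target f(x) = x_0 AND x_3 are illustrative assumptions, not taken from the paper, and the sketch handles the noise-free case only.

```python
from itertools import combinations, product


def greedy_relevant_attributes(examples):
    """Hypothetical set-cover-style greedy heuristic (assumption, not the paper's code).

    `examples` is a list of (x, y) with x a tuple of 0/1 bits and y the label f(x).
    Returns a sorted list of attribute indices that explains all label differences.
    """
    n = len(examples[0][0])
    # Every pair of examples with different labels must differ on
    # at least one selected attribute.
    uncovered = [
        (a, b)
        for (a, ya), (b, yb) in combinations(examples, 2)
        if ya != yb
    ]
    chosen = []
    while uncovered:
        # For each attribute, count the uncovered pairs it separates.
        counts = [sum(a[i] != b[i] for a, b in uncovered) for i in range(n)]
        best = max(range(n), key=counts.__getitem__)
        if counts[best] == 0:
            # Two identical inputs carry different labels: no attribute helps.
            raise ValueError("inconsistent sample")
        chosen.append(best)
        # Keep only the pairs that the chosen attribute does not separate.
        uncovered = [(a, b) for a, b in uncovered if a[best] == b[best]]
    return sorted(chosen)


# Toy target on n = 5 attributes that depends only on x_0 and x_3.
examples = [(x, x[0] & x[3]) for x in product([0, 1], repeat=5)]
result = greedy_relevant_attributes(examples)
print(result)  # prints [0, 3]
```

Here the full truth table is used as the sample so that the outcome is deterministic; in the setting of the abstract, the examples would instead be drawn at random from some input distribution, and ties between attributes (as they arise, for instance, for parity-like functions) can mislead this simple rule.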

References

Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: Proc. 1993 ACM SIGMOD Conf., pp. 207–216 (1993)
Akutsu, T., Bao, F.: Approximating Minimum Keys and Optimal Substructure Screens. In: Cai, J.-Y., Wong, C.K. (eds.) COCOON 1996. LNCS, vol. 1090, pp. 290–299. Springer, Heidelberg (1996)
Akutsu, T., Miyano, S., Kuhara, S.: A Simple Greedy Algorithm for Finding Functional Relations: Efficient Implementation and Average Case Analysis. TCS 292(2), 481–495 (2003); Morishita, S., Arikawa, S. (eds.): DS 2000. LNCS (LNAI), vol. 1967, pp. 86–98. Springer, Heidelberg (2000)
Angluin, D.: Queries and Concept Learning. Machine Learning 2(4), 319–342 (1988)
Angluin, D., Laird, P.: Learning from noisy examples. Machine Learning 2(4), 343–370 (1988)
Arora, S., Babai, L., Stern, J., Sweedyk, Z.: The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations. J. CSS 54, 317–331 (1997)
Arpe, J., Reischuk, R.: Robust Inference of Relevant Attributes. Techn. Report, SIIM-TR-A 03-12, Univ. Lübeck (2003), available at http://www.tcs.mu-luebeck.de/TechReports.html
Blum, A., Hellerstein, L., Littlestone, N.: Learning in the Presence of Finitely or Infinitely Many Irrelevant Attributes. In: Proc. 4th, pp. 157–166 (1991)
Blum, A., Langley, P.: Selection of Relevant Features and Examples in Machine Learning. Artificial Intelligence 97(1–2), 245–271 (1997)
Feige, U.: A Threshold of ln n for Approximating Set Cover. J. ACM 45, 634–652 (1998)
Goldman, S., Sloan, R.: Can PAC Learning Algorithms Tolerate Random Attribute Noise? Algorithmica 14, 70–84 (1995)
Johnson, D.: Approximation Algorithms for Combinatorial Problems. J. CSS 9, 256–278 (1974)
Littlestone, N.: Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm. Machine Learning 2(4), 285–318 (1988)
Littlestone, N.: From On-line to Batch Learning. In: Proc. 2nd COLT 1989, pp. 269–284 (1989)
Mannila, H., Räihä, K.: On the Complexity of Inferring Functional Dependencies. Discrete Applied Mathematics 40, 237–243 (1992)
Mossel, E., O’Donnell, R., Servedio, R.: Learning Juntas. In: Proc. STOC 2003, pp. 206–212 (2003)
Valiant, L.: Projection Learning. Machine Learning 37(2), 115–130 (1999)

Author information

Authors and Affiliations

Institut für Theoretische Informatik, Universität zu Lübeck, Wallstr. 40, 23560, Lübeck, Germany
Jan Arpe & Rüdiger Reischuk

Editor information

Editors and Affiliations

Universitat Politècnica de Catalunya, Barcelona, Spain
Ricard Gavaldá
Meme Media Laboratory, Hokkaido University Sapporo, Kita 13, Nishi 8, Kita-ku, 060-8628, Sapporo, Japan
Klaus P. Jantke
Eiji Takimoto

About this paper

Cite this paper

Arpe, J., Reischuk, R. (2003). Robust Inference of Relevant Attributes. In: Gavaldá, R., Jantke, K.P., Takimoto, E. (eds) Algorithmic Learning Theory. ALT 2003. Lecture Notes in Computer Science, vol 2842. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39624-6_10

DOI: https://doi.org/10.1007/978-3-540-39624-6_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20291-2
Online ISBN: 978-3-540-39624-6
eBook Packages: Springer Book Archive