Handling continuous data in top-down induction of first-order rules

Malerba, Donato; Esposito, Floriana; Semeraro, Giovanni; Caggese, Sergio

doi:10.1007/3-540-63576-9_93

Machine Learning 1
Conference paper
First Online: 01 January 2005

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1321)

Abstract

Handling numerical information is one of the most important research issues for practical applications of first-order learning systems. This paper is concerned with the problem of inducing first-order classification rules from both numeric and symbolic data. We propose a specialization operator that discretizes continuous data during the learning process. The heuristic function used to choose among different discretizations satisfies a property that can be profitably exploited to improve the efficiency of the specialization operator. The operator has been implemented and tested on the document understanding domain.
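To make the idea of discretizing during learning concrete, here is a minimal sketch of how a top-down learner might pick a numeric threshold while specializing a rule. It is not the operator described in the paper: the abstract does not name the heuristic function or the property it satisfies, so information gain is used here purely as a stand-in, and the function names (`entropy`, `best_threshold`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best test `value <= threshold`.

    Assumption: information gain stands in for the paper's heuristic.
    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, 0.0) if no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    sorted_labels = [label for _, label in pairs]
    n = len(pairs)
    base = entropy(sorted_labels)
    best = (None, 0.0)
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate equal values
        left, right = sorted_labels[:i], sorted_labels[i:]
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best[1]:
            best = ((low + high) / 2, gain)
    return best


# Hypothetical data: heights of layout blocks and their classes.
heights = [10, 12, 30, 35]
classes = ["body", "body", "title", "title"]
threshold, gain = best_threshold(heights, classes)
```

With the toy data above, the chosen threshold falls between 12 and 30, so a learner could add a literal such as `height(X) <= 21.0` to the rule body. This sketch evaluates every midpoint; a heuristic with a suitable property could allow many of these candidates to be skipped, which is the kind of efficiency gain the abstract refers to.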

Author information

Authors and Affiliations

Dipartimento di Informatica, Università degli Studi di Bari, via Orabona 4, 70125, Bari, Italy
Donato Malerba, Floriana Esposito, Giovanni Semeraro & Sergio Caggese

Editor information

Maurizio Lenzerini

Cite this paper

Malerba, D., Esposito, F., Semeraro, G., Caggese, S. (1997). Handling continuous data in top-down induction of first-order rules. In: Lenzerini, M. (eds) AI*IA 97: Advances in Artificial Intelligence. AI*IA 1997. Lecture Notes in Computer Science, vol 1321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63576-9_93

DOI: https://doi.org/10.1007/3-540-63576-9_93
Published: 07 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63576-5
Online ISBN: 978-3-540-69601-8
eBook Packages: Springer Book Archive
