Abstract
This paper presents a brief overview of the history and trends of Machine Learning, organized according to its goal and principal methodologies. More details are given for the concept learning task, one of the most mature in the field. Applications to Information Extraction tasks are discussed.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Neri, F., Saitta, L. (1997). Machine learning for information extraction. In: Pazienza, M.T. (ed.) Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology. SCIE 1997. Lecture Notes in Computer Science, vol 1299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63438-X_9
DOI: https://doi.org/10.1007/3-540-63438-X_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63438-6
Online ISBN: 978-3-540-69548-6