Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1299)

Abstract

This paper presents a brief overview of the history and trends of Machine Learning, organized according to its goal and principal methodologies. More details are given for the concept learning task, one of the most mature in the field. Applications to Information Extraction tasks are discussed.

Editor information

Maria Teresa Pazienza

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Neri, F., Saitta, L. (1997). Machine learning for information extraction. In: Pazienza, M.T. (eds) Information Extraction A Multidisciplinary Approach to an Emerging Information Technology. SCIE 1997. Lecture Notes in Computer Science, vol 1299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63438-X_9

  • DOI: https://doi.org/10.1007/3-540-63438-X_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63438-6

  • Online ISBN: 978-3-540-69548-6

  • eBook Packages: Springer Book Archive
