A Survey of Methods for Scaling Up Inductive Algorithms

Published in: Data Mining and Knowledge Discovery

Abstract

One of the defining challenges for the KDD research community is to enable inductive learning algorithms to mine very large databases. This paper summarizes, categorizes, and compares existing work on scaling up inductive algorithms. We concentrate on algorithms that build decision trees and rule sets, in order to provide focus and specific details; the issues and techniques generalize to other types of data mining. We begin with a discussion of important issues related to scaling up. We highlight similarities among scaling techniques by categorizing them into three main approaches. For each approach, we then describe, compare, and contrast the different constituent techniques, drawing on specific examples from published papers. Finally, we use the preceding analysis to suggest how to proceed when dealing with a large problem, and where to focus future research.
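To ground the abstract's topic, here is a minimal sketch of one family of scaling techniques the survey categorizes: progressive sampling, in which a learner is trained on geometrically growing subsamples and training stops once held-out accuracy plateaus, often well before the full database is consumed. The stump learner, the synthetic data, and all parameter names (`n0`, `mult`, `epsilon`) are illustrative stand-ins, not the survey's own algorithms.

```python
import random

random.seed(0)

def make_data(n):
    # Synthetic binary task: label depends on feature x, with 10% label noise.
    data = []
    for _ in range(n):
        x = random.random()
        y = (x > 0.5) if random.random() > 0.1 else (x <= 0.5)
        data.append((x, int(y)))
    return data

def train_stump(sample):
    # One-level decision tree: pick the threshold (over a small grid)
    # that minimizes training error.
    best_t, best_err = 0.5, float("inf")
    for t in (i / 20 for i in range(1, 20)):
        err = sum((x > t) != bool(y) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def accuracy(t, data):
    return sum((x > t) == bool(y) for x, y in data) / len(data)

def progressive_sample(train, test, n0=100, mult=2, epsilon=0.005):
    # Grow the training sample geometrically; stop when held-out accuracy
    # improves by no more than epsilon (or the data is exhausted).
    n, prev_acc = n0, 0.0
    while True:
        model = train_stump(train[:n])
        acc = accuracy(model, test)
        if acc - prev_acc <= epsilon or n >= len(train):
            return model, n, acc
        prev_acc, n = acc, min(n * mult, len(train))

train, test = make_data(20000), make_data(2000)
model, n_used, acc = progressive_sample(train, test)
print(n_used, round(acc, 3))  # typically stops far below the full 20,000 examples
```

The design choice illustrated here is the one the survey's taxonomy emphasizes: trading a small, measurable loss in accuracy for a large reduction in the data actually processed.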




Cite this article

Provost, F., Kolluri, V. A Survey of Methods for Scaling Up Inductive Algorithms. Data Mining and Knowledge Discovery 3, 131–169 (1999). https://doi.org/10.1023/A:1009876119989
