Data Mining as an Automated Service

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2637)

Abstract

An automated data mining service offers an outsourced, cost-effective analysis option for clients who want to leverage their data resources for decision support and operational improvement. Under the service model, the client typically provides the service with data and other information likely to aid the analysis process (e.g., domain knowledge). In return, the service provides analysis results to the client. We describe the required processes, issues, and challenges in automating the data mining and analysis process when the high-level goals are: (1) to provide the client with a high-quality, pertinent analysis result; and (2) to automate the data mining service, minimizing the amount of human analyst effort required and the cost of delivering the service. We argue that by focusing on client problems within market sectors, both of these goals may be realized.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bradley, P.S. (2003). Data Mining as an Automated Service. In: Whang, KY., Jeon, J., Shim, K., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2003. Lecture Notes in Computer Science, vol 2637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36175-8_1

  • DOI: https://doi.org/10.1007/3-540-36175-8_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-04760-5

  • Online ISBN: 978-3-540-36175-6

  • eBook Packages: Springer Book Archive
