skip to main content
10.1145/1807167.1807335acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
tutorial

Database systems research on data mining

Published:06 June 2010Publication History

ABSTRACT

Data mining remains an important research area in database systems. We present a review of processing alternatives, storage mechanisms, algorithms, data structures and optimizations that enable data mining on large data sets. We focus on the computation of well-known multidimensional statistical and machine learning models. We pay particular attention to SQL and MapReduce as two competing technologies for large scale processing. We conclude with a summary of solved major problems and open research issues.

References

  1. J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, San Francisco, 2nd edition, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. C. Ordonez. Integrating K-means clustering with a relational DBMS using SQL. IEEE Transactions on Knowledge and Data Engineering (TKDE), 18(2):188--201, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. C. Ordonez. Statistical model computation with UDFs. IEEE Transactions on Knowledge and Data Engineering (TKDE), 22, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Stonebraker, D. Abadi, D.J. DeWitt, S. Madden, E. Paulson, A. Pavlo, and A. Rasin. MapReduce and parallel DBMSs: friends or foes? Commun. ACM, 53(1):64--71, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Database systems research on data mining

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
      June 2010
      1286 pages
      ISBN:9781450300322
      DOI:10.1145/1807167

      Copyright © 2010 Copyright is held by the owner/author(s)

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 6 June 2010

      Check for updates

      Qualifiers

      • tutorial

      Acceptance Rates

      Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader