DOI: 10.1145/3207677.3277974
research-article

An Original Data Understanding Process

Published: 22 October 2018

ABSTRACT

The standard data mining process [1] divides a data mining project into six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The goal of the data understanding phase is to understand the original data, yet relatively few studies address this phase. In practice, visualization methods are usually used to understand the original data. We therefore propose a systematic process for data understanding that makes full use of visualization technology to help users understand the data. In addition, we revise the DP (Density Peaks) algorithm to identify high-density regions and integrate it into the data understanding process. Experimental results show that the proposed data understanding process is effective.
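
For orientation, the two quantities at the core of the original Density Peaks algorithm of Rodriguez and Laio (reference 2) can be sketched as follows. This is a minimal illustration of the published baseline only, not the revised algorithm proposed in this paper, whose details the abstract does not give. The function names, the cutoff parameter `d_c`, and the quantile rule used to mark a high-density region are assumptions made for this example.

```python
import numpy as np

def density_peaks_scores(points, d_c):
    """Return (rho, delta) per point, as in the original DP formulation.

    rho[i]   = number of other points within the cutoff distance d_c (> 0).
    delta[i] = distance to the nearest point with strictly higher rho;
               for a point with the highest rho, the largest distance
               from that point to any other point.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Count neighbours inside the cutoff; subtract 1 to exclude the point itself.
    rho = (dist < d_c).sum(axis=1) - 1
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            delta[i] = dist[i].max()
    return rho, delta

def high_density_region(points, d_c, quantile=0.5):
    """Hypothetical rule (an assumption, not from the paper): keep points
    whose rho is at or above the given quantile of all rho values."""
    rho, _ = density_peaks_scores(points, d_c)
    return rho >= np.quantile(rho, quantile)

if __name__ == "__main__":
    pts = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]]
    rho, delta = density_peaks_scores(pts, d_c=2.0)
    print(rho, delta)
    print(high_density_region(pts, d_c=2.0))
```

In the original algorithm, points with both large `rho` and large `delta` are taken as cluster centers. The mask returned by `high_density_region` is only one simple way to flag dense points for visual inspection.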

References

  1. Chapman P, Kerber R, Clinton J, et al. 1999. The CRISP-DM Process Model. Proceedings of Fmoods.
  2. Rodriguez A and Laio A. 2014. Clustering by fast search and find of density peaks. Science, 344(6191): 1492.
  3. Chen P, Fan X, Liu R, et al. 2015. Fiber segmentation using a density-peaks clustering algorithm. IEEE International Symposium on Biomedical Imaging, 633-637.
  4. Liu D, Cheng S F and Yang Y. 2015. Density Peaks Clustering Approach for Discovering Demand Hot Spots in City-scale Taxi Fleet Dataset. International Conference on Intelligent Transportation Systems, IEEE, 1831-1836.
  5. Liu P, Liu Y, Hou X, et al. 2016. A Text Clustering Algorithm Based on Find of Density Peaks. International Conference on Information Technology in Medicine and Education, 348-352.
  6. Li Y, Liu W, Wang Y, et al. 2015. Co-spectral clustering based density peak. IEEE International Conference on Communication Technology, 925-929.

    • Published in

      CSAE '18: Proceedings of the 2nd International Conference on Computer Science and Application Engineering
      October 2018
      1083 pages
      ISBN:9781450365123
      DOI:10.1145/3207677

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      CSAE '18 Paper Acceptance Rate: 189 of 383 submissions, 49%. Overall Acceptance Rate: 368 of 770 submissions, 48%.
