ABSTRACT
The CRISP-DM standard process for data mining (Chapman et al. 1999) divides a data mining project into six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The goal of the data understanding phase is to understand the original data. Relatively few studies address this phase, and in practical applications it is usually handled with a few visualization methods. We therefore propose a systematic process for data understanding that makes full use of visualization technology to help users understand the data. In addition, we revise the DP (Density Peaks) algorithm to identify high-density regions and integrate it into the data understanding process. Experimental results show that the proposed data understanding process is effective.
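For readers unfamiliar with DP, the following is a minimal sketch of the two quantities defined in the original Density Peaks algorithm of Rodriguez and Laio (2014): the local density rho (the number of points within a cutoff distance) and delta (the distance to the nearest point of higher density). It does not reproduce the revision proposed in this paper, which the abstract does not describe. The function names, the cutoff `d_c`, and the `min_rho` threshold used to pick out a high-density region are hypothetical choices made for illustration only.

```python
import math


def density_peaks(points, d_c):
    """Compute (rho, delta) per point, as in Rodriguez and Laio (2014).

    points: list of equal-length coordinate tuples.
    d_c: cutoff distance (an illustrative parameter, chosen by the user).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # rho[i]: number of other points strictly closer than d_c to point i.
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]

    delta = []
    for i in range(n):
        # Distances from point i to every point with strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbour gets its maximum distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def high_density_region(points, d_c, min_rho):
    """Hypothetical helper: keep points whose local density is >= min_rho."""
    rho, _ = density_peaks(points, d_c)
    return [p for p, r in zip(points, rho) if r >= min_rho]


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    rho, delta = density_peaks(sample, d_c=1.5)
    print(rho)
    print(high_density_region(sample, d_c=1.5, min_rho=3))
```

In this sketch, points with both large rho and large delta would be candidate cluster centers, while a simple threshold on rho alone is one plausible way to mark a dense region for visual inspection.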
- Chapman P, Kerber R, Clinton J, et al. 1999. The CRISP-DM Process Model. Proceedings of Fmoods.
- Rodriguez A and Laio A. 2014. Clustering by fast search and find of density peaks. Science, 344(6191): 1492.
- Chen P, Fan X, Liu R, et al. 2015. Fiber segmentation using a density-peaks clustering algorithm. IEEE International Symposium on Biomedical Imaging, 633--637.
- Liu D, Cheng S F and Yang Y. 2015. Density Peaks Clustering Approach for Discovering Demand Hot Spots in City-scale Taxi Fleet Dataset. International Conference on Intelligent Transportation Systems, IEEE, 1831--1836.
- Liu P, Liu Y, Hou X, et al. 2016. A Text Clustering Algorithm Based on Find of Density Peaks. International Conference on Information Technology in Medicine and Education, 348--352.
- Li Y, Liu W, Wang Y, et al. 2015. Co-spectral clustering based density peak. IEEE International Conference on Communication Technology, 925--929.
Index Terms
- An Original Data Understanding Process