Restricted Randomness DBSCAN : A faster DBSCAN Algorithm

Authors:
Sashakt Pathak, Jaypee Institute of Information Technology, India
Arushi Agarwal, Jaypee Institute of Information Technology, India
Ankita Ankita, Jaypee Institute of Information Technology, India
Mahendra Kumar Gurve, Jaypee Institute of Information Technology, India

IC3-2021: Proceedings of the 2021 Thirteenth International Conference on Contemporary Computing, August 2021, Pages 7–12
https://doi.org/10.1145/3474124.3474204

Published: 04 November 2021

ABSTRACT

Data mining is the process of extracting useful and accurate information or patterns from large databases using machine learning algorithms and methods. Clustering is one such method of analysis, in which similar data points are grouped together, and the DBSCAN clustering algorithm is widely used in numerous practical applications. This paper presents a more efficient density-based clustering algorithm that discovers clusters faster than the existing DBSCAN algorithm. The efficiency is achieved by restricting the randomness with which points are chosen from the dataset. The proposed algorithm, named Restricted Randomness DBSCAN (RR DBSCAN), is compared with the conventional DBSCAN algorithm over 9 datasets on the basis of Silhouette Coefficient, time taken to form clusters, and accuracy. The results show that RR DBSCAN performs better than traditional DBSCAN in terms of accuracy and time taken to form clusters.
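
The abstract names the evaluation criteria but not the procedure, so the following is only a minimal sketch of how the conventional DBSCAN baseline could be timed and scored with the Silhouette Coefficient. It is not the RR DBSCAN algorithm, whose point-selection rule is not described here. The parameter names `eps` and `min_pts`, the toy `points` list, and the choice to exclude noise points from the silhouette average are all assumptions made for illustration.

```python
import math
import time

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Conventional DBSCAN baseline; visits points in index order."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def silhouette_coefficient(points, labels):
    """Mean silhouette over clustered points (noise excluded: an assumption)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        if lab != NOISE:
            clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for lab, members in clusters.items():
        for i in members:
            if len(members) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(
                sum(math.dist(points[i], points[j]) for j in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight groups and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
          (50.0, 50.0)]

start = time.perf_counter()
labels = dbscan(points, eps=1.5, min_pts=3)
elapsed = time.perf_counter() - start
score = silhouette_coefficient(points, labels)

print("labels:", labels)
print("silhouette:", round(score, 3))
print("seconds:", elapsed)
```

This baseline uses a brute-force neighbour search, so it is quadratic in the number of points; a comparison like the one the abstract describes would run both algorithms on the same datasets and report the same two measurements for each.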



Published in

IC3-2021: Proceedings of the 2021 Thirteenth International Conference on Contemporary Computing
August 2021
483 pages
ISBN: 9781450389204
DOI: 10.1145/3474124

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Clustering
DBSCAN Algorithm
Restricted randomness
Qualifiers
- research-article
- Research
- Refereed limited