DOI: 10.1145/3299815.3314452
short-paper

An Attempt at Improving Density-based Clustering Algorithms

Published: 18 April 2019

Abstract

Clustering is an unsupervised analytical technique that groups the elements of a data set into clusters of similar items. It underlies many other tasks, including machine vision and artificial intelligence. Big data sets are challenging to cluster because of the size and complexity of the data to be processed. Previous work in this domain produced an algorithm called Fast Density-Grid Clustering, which creates a grid structure over the data and then merges cells based on local density. In this paper, we focus on an ongoing modification to this algorithm, based on adaptive grids with irregular spacing, that could also be applied to other density-based clustering algorithms. Testing shows that the first stage of implementation resulted in an average 29.728% loss in accuracy with no significant speed increase; nevertheless, considerable room remains for further experimentation and development of this approach.
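The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation: bin the points into a grid, keep the cells whose point count reaches a density threshold, and merge dense cells that share a face. The names `uniform_edges`, `quantile_edges`, `grid_cluster`, `bins`, and `min_pts` are hypothetical, and placing cut points at sample quantiles is only one assumed way to obtain irregularly spaced grid lines; the paper's adaptive rule may differ.

```python
from bisect import bisect_right
from collections import defaultdict, deque


def uniform_edges(values, bins):
    """Equally spaced interior cut points across the range of `values`."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    return [lo + step * i for i in range(1, bins)]


def quantile_edges(values, bins):
    """Irregular cut points at sample quantiles (an assumed adaptive rule)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // bins] for i in range(1, bins)]


def grid_cluster(points, edges_fn, bins=4, min_pts=3):
    """Label each point with a cluster id, or -1 if its cell is sparse."""
    dims = len(points[0])
    edges = [edges_fn([p[d] for p in points], bins) for d in range(dims)]

    # Map every point to the grid cell that contains it.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(bisect_right(edges[d], p[d]) for d in range(dims))
        cells[key].append(idx)

    # A cell is dense when it holds at least `min_pts` points.
    dense = {key for key, members in cells.items() if len(members) >= min_pts}

    # Merge dense cells that share a face, using breadth-first search.
    labels = [-1] * len(points)
    seen = set()
    cluster_id = 0
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for idx in cells[cell]:
                labels[idx] = cluster_id
            for d in range(dims):
                for delta in (-1, 1):
                    nb = cell[:d] + (cell[d] + delta,) + cell[d + 1:]
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        cluster_id += 1
    return labels
```

Calling `grid_cluster(points, uniform_edges)` and `grid_cluster(points, quantile_edges)` on the same data gives a simple way to compare regular and irregular spacing. Note that in this sketch adjacency is decided in cell-index space, so with irregular edges two neighbouring cells may cover very different extents of the data, which is one plausible source of the accuracy differences the abstract reports.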

Published In

ACMSE '19: Proceedings of the 2019 ACM Southeast Conference
April 2019
295 pages
ISBN:9781450362511
DOI:10.1145/3299815
  • Conference Chair: Dan Lo
  • Program Chair: Donghyun Kim
  • Publications Chair: Eric Gamess
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Algorithm
  2. Big Data
  3. Cluster Analysis
  4. Clustering
  5. Data Mining

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

ACM SE '19: 2019 ACM Southeast Conference
April 18 - 20, 2019
Kennesaw, GA, USA

Acceptance Rates

Overall Acceptance Rate 402 of 779 submissions, 52%

Cited By

  • (2024) Scalable decision fusion algorithm for enabling decentralized computation in distributed, big data clustering problems. International Journal of Machine Learning and Cybernetics 15:9 (3803-3827). DOI: 10.1007/s13042-024-02121-7. Online publication date: 8-Apr-2024
  • (2023) Research on the Method of Hypergraph Construction of Information Systems Based on Set Pair Distance Measurement. Electronics 12:20 (4375). DOI: 10.3390/electronics12204375. Online publication date: 23-Oct-2023
  • (2022) A MapReduce-based K-means clustering algorithm. The Journal of Supercomputing 78:4 (5181-5202). DOI: 10.1007/s11227-021-04078-8. Online publication date: 1-Mar-2022
  • (2021) A Document Clustering Approach Using Shared Nearest Neighbour Affinity, TF-IDF and Angular Similarity. Intelligent Data Communication Technologies and Internet of Things (267-276). DOI: 10.1007/978-981-15-9509-7_23. Online publication date: 13-Feb-2021
  • (2020) An Approach on Efficient Clustering Technique of High Dimensional Records. 2020 5th International Conference on Communication and Electronics Systems (ICCES) (860-865). DOI: 10.1109/ICCES48766.2020.9138020. Online publication date: Jun-2020
  • (2020) A cost effective‐ secure algorithm for work‐flow scheduling in cloud computing. Internet Technology Letters 6:1. DOI: 10.1002/itl2.233. Online publication date: 14-Oct-2020
  • (2019) Text Extraction and Clustering for Multimedia: A review on Techniques and Challenges. 2019 International Conference on Digitization (ICD) (38-43). DOI: 10.1109/ICD47981.2019.9105905. Online publication date: Nov-2019
