research-article

Improvement for Large-Scale Image Data using Fuzzy Rough C-Mean Based Unsupervised CNN Clustering: An Empirical Study on designbyhumans.com

Authors:
Anh Tuan Tran (SAP Innovation Lab, FPT University, Vietnam; ORCID 0000-0003-3197-6543)
Ban Quy Tran (SAP Innovation Lab, FPT University, Vietnam, and USTH Lab, University of Science and Technology of Hanoi, Vietnam; ORCID 0000-0002-4100-5217)
Kien Trung Luong (SAP Innovation Lab, FPT University, Vietnam, and USTH Lab, University of Science and Technology of Hanoi, Vietnam; ORCID 0009-0004-2027-5594)
ICSCA '23: Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023, Pages 1–7
https://doi.org/10.1145/3587828.3587829

Published: 20 June 2023

ABSTRACT

Clustering analysis, particularly for large-scale image data, is increasingly applied in fields such as finance, risk management, and prediction, and remains an active topic of scientific discussion. Both deep learning, a widely used approach, and classical methods address complex classification problems that arise in real-world cases. In this study, we applied several approaches to such problems and measured their effectiveness by combining different techniques and comparing the results across scenarios. Many approaches to the clustering problem have been proposed, including complex clustering methods such as hierarchical, density-based, centroid-based, and graph-theoretical methods. In real-world applications, however, these methods show significant drawbacks when the dataset contains vagueness, uncertainty, or overlapping samples that make prediction and classification impossible. Several attempts have been made to improve clustering performance, including joint CNN clustering models, but many of them inherit the drawbacks of the complex clustering method, which limits the capability of the CNN. The combined CNN clustering method is designed to address this problem in deterministic CNN clustering models. We evaluated it on a dataset collected from the website designbyhumans.com, which has enough features to represent a non-synthetic dataset. This research aims to improve on the established model by using estimation techniques to determine model parameters, and by plotting results to justify those choices and give insight into how the model performs on a non-synthetic dataset such as ours. We conclude that the model improved significantly over a popular complex clustering method, as measured by computational time and by several metrics of how well separated the clusters are. Based on the experiments conducted and on the method's prospects for future development, we discuss some of the drawbacks of this approach.
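
The abstract contrasts deterministic clustering with methods that tolerate vague or overlapping samples. As a rough illustration of the fuzzy-membership idea only, the sketch below implements plain fuzzy c-means in pure Python on toy 2-D points. It is not the fuzzy rough C-mean CNN model evaluated in the paper: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the stopping rule, and the toy data are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means on a list of equal-length tuples (illustrative only).

    Returns (centers, memberships); memberships[i][j] is the degree to which
    points[i] belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Centre update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))

        # Membership update from the relative distances to every centre.
        new_u = []
        for p in points:
            dists = [math.dist(p, ctr) for ctr in centers]
            if min(dists) == 0.0:
                # Point coincides with a centre: full membership there.
                hit = dists.index(0.0)
                new_u.append([1.0 if j == hit else 0.0 for j in range(c)])
                continue
            new_u.append([
                1.0 / sum((dists[j] / dists[k]) ** exponent for k in range(c))
                for j in range(c)
            ])

        # Stop once no membership changed by more than tol.
        shift = max(abs(a - b)
                    for old, new in zip(u, new_u)
                    for a, b in zip(old, new))
        u = new_u
        if shift < tol:
            break

    return centers, u


# Toy data (hypothetical): two tight groups plus one sample between them.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.1),
          (2.5, 2.6)]
centers, u = fuzzy_c_means(points, c=2)
for p, row in zip(points, u):
    print(p, [round(v, 3) for v in row])
```

In this toy run the six grouped points receive a membership close to 1 in one cluster, while the sample lying between the groups is split roughly evenly. A hard assignment would hide that ambiguity, which is the kind of overlap the abstract describes.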

Index Terms

1. Computing methodologies
  1. Artificial intelligence
  2. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Cluster analysis
    2. Machine learning approaches
2. Information systems
  1. Information systems applications
    1. Data mining

Published in

ICSCA '23: Proceedings of the 2023 12th International Conference on Software and Computer Applications
February 2023
385 pages
ISBN: 9781450398589
DOI: 10.1145/3587828

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Research
- Refereed limited