Abstract
The paper presents a canopy-based image clustering (CIC) algorithm that uses normalized cross-correlation between camera fingerprints as its decision criterion. The framework applies two thresholds at two different stages. Fingerprints are first sorted in descending order of their goodness; raw clusters are then built using a relaxed threshold, followed by fine clustering with a hard threshold. Both the raw and the fine clustering stages produce non-overlapping clusters, so no fingerprint is assigned to more than one raw or fine cluster. The fine clusters are further processed in an attraction phase, which improves cluster quality at the cost of some additional computation. The CIC algorithm yields high-quality clusters at a reduced computational cost. The results show that the computational complexity per fingerprint, measured relative to the reference complexity of n(n − 1)/2, decreases as the size of the dataset increases. The algorithm also does not degrade when the number of cameras is much larger than the average number of images taken with a camera, i.e., NC ≫ SC. Hence, it is suitable for large-scale clustering and for the different scenarios in which NC ≫ SC.
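The sort, raw-clustering, and fine-clustering steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the greedy first-fingerprint-as-center rule, the toy vectors, and the threshold values are all assumptions made for illustration, the goodness scores are simply passed in because the abstract does not define them, and the attraction phase is omitted.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length vectors."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0


def threshold_pass(indices, fingerprints, threshold):
    """Greedy grouping (assumed rule): the first remaining index becomes a
    center and absorbs every remaining index whose NCC with it reaches the
    threshold. Each index lands in exactly one group, so groups never overlap."""
    remaining = list(indices)
    groups = []
    while remaining:
        center = remaining.pop(0)
        group, rest = [center], []
        for i in remaining:
            if ncc(fingerprints[center], fingerprints[i]) >= threshold:
                group.append(i)
            else:
                rest.append(i)
        remaining = rest
        groups.append(group)
    return groups


def two_stage_clustering(fingerprints, goodness, relaxed, hard):
    """Sort by goodness (descending), build raw clusters with the relaxed
    threshold, then split each raw cluster with the hard threshold."""
    order = sorted(range(len(fingerprints)),
                   key=lambda i: goodness[i], reverse=True)
    fine = []
    for raw in threshold_pass(order, fingerprints, relaxed):
        fine.extend(threshold_pass(raw, fingerprints, hard))
    return fine


# Toy data (hypothetical): a and b are uncorrelated; "mixed" correlates
# moderately (about 0.71) with a, so it joins a's raw cluster but is
# separated again by the hard threshold.
a = [1, -1, 1, -1, 1, -1, 1, -1]
b = [1, 1, -1, -1, 1, 1, -1, -1]
mixed = [x + y for x, y in zip(a, b)]
toy = [a, b, [2 * x for x in a], mixed]

clusters = two_stage_clustering(toy, goodness=[0.9, 0.8, 0.7, 0.6],
                                relaxed=0.5, hard=0.9)
print(clusters)  # [[0, 2], [3], [1]]
```

In this toy run the relaxed threshold groups fingerprints 0, 2, and 3 into one raw cluster, and the hard threshold then splits fingerprint 3 off. The fine stage only compares fingerprints that share a raw cluster, which is where a saving over comparing all n(n − 1)/2 pairs would come from in a sketch like this one.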
Ethics declarations
Conflict of interest
The author declares that he has no conflict of interest.
Cite this article
Khan, S. Canopy approach of image clustering based on camera fingerprints. Multimed Tools Appl 81, 21591–21618 (2022). https://doi.org/10.1007/s11042-022-12463-5
DOI: https://doi.org/10.1007/s11042-022-12463-5