Abstract
Although paper shredding is widely used to prevent confidential papers from being misused, it still cannot be considered a convenient process. Shreds have become progressively smaller as new methods for reassembling and reconstructing shredded paper have evolved. This paper focuses on clustering, which is a possible phase in the assembly process. The work considers three kinds of input: real strip-cut shreds; images shredded by a simulator in one direction, producing strip-cut shreds of different sizes, from wide to narrow; and images shredded in two directions, possibly reflecting cross-cut and micro-cut shreds. K-means is used to cluster the shreds. The features tested are gray-level ranges, the well-known gray-level co-occurrence matrix, invariant moments, the segmentation-based fractal texture analysis algorithm, and color moments. Clustering effectiveness is indicated by the number of shreds grouped in the same cluster as their originally adjacent neighbors, in addition to the overall accuracy of strip-cut shred clustering. With 5 clusters and the k-means experiments run 100 times for 38 images, the overall accuracy of gray-level ranges on simulated strip-cut shreds is 84.87, 89.27, and 93.5 percent for the three sizes tested. For cross-cut and micro-cut shreds, gray-level ranges also yield a relatively high number of shreds with 3 and 4 originally adjacent neighbors found in the same cluster.
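The clustering step described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a "gray-level ranges" feature is the fraction of a shred's pixels falling in each of a few equal-width intensity ranges, uses a plain Lloyd-style k-means, and counts strip-cut shreds that share a cluster with at least one originally adjacent neighbor. All function names and parameters (strip_cut, gray_range_features, n_ranges, and so on) are hypothetical.

```python
import random


def strip_cut(image, n_shreds):
    """Simulate one-direction shredding: split a 2-D list of gray levels
    (rows x columns) into n_shreds vertical strips of equal width."""
    step = len(image[0]) // n_shreds
    return [[row[i * step:(i + 1) * step] for row in image]
            for i in range(n_shreds)]


def gray_range_features(shred, n_ranges=4, levels=256):
    """Assumed feature: fraction of pixels in each equal-width gray-level range."""
    counts = [0] * n_ranges
    total = 0
    for row in shred:
        for px in row:
            counts[min(px * n_ranges // levels, n_ranges - 1)] += 1
            total += 1
    return [c / total for c in counts]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def count_with_adjacent_neighbor(labels):
    """Count strip-cut shreds whose left or right original neighbor
    ended up in the same cluster (labels are in original strip order)."""
    n = len(labels)
    hits = 0
    for i, lab in enumerate(labels):
        left = i > 0 and labels[i - 1] == lab
        right = i < n - 1 and labels[i + 1] == lab
        hits += left or right
    return hits


if __name__ == "__main__":
    # Toy page: dark left half, bright right half, cut into 8 strips.
    page = [[0] * 16 + [255] * 16 for _ in range(10)]
    shreds = strip_cut(page, 8)
    features = [gray_range_features(s) for s in shreds]
    labels = kmeans(features, k=2)
    print(labels, count_with_adjacent_neighbor(labels))
```

For cross-cut or micro-cut shreds, the same idea would extend to up to four original neighbors per shred (left, right, top, bottom), which is why the abstract reports shreds with 3 and 4 adjacent neighbors in the same cluster.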
Acknowledgements
The author would like to thank the anonymous reviewers for their valuable comments.
Ethics declarations
Conflict of Interests
The author declares that there is no conflict of interest.
About this article
Cite this article
Madain, A. Clustering paper shreds of different sizes. Multimed Tools Appl 82, 19441–19461 (2023). https://doi.org/10.1007/s11042-022-13835-7
DOI: https://doi.org/10.1007/s11042-022-13835-7