Clustering paper shreds of different sizes

Multimedia Tools and Applications

Abstract

Although paper shredding is widely used to prevent confidential papers from being misused, it still cannot be considered a convenient process. Paper shreds have become smaller and smaller as new methods of shredded-paper reassembly and reconstruction evolve. This paper focuses on clustering, a possible phase in the assembly process. The work considers real strip-cut shreds, images shredded by a simulator in one direction to produce strip-cut shreds of different sizes (from wide to narrow), and images shredded in two directions, possibly reflecting cross-cut and micro-cut shreds. K-means is used to cluster the shreds. The features tested are gray-level ranges, the well-known gray-level co-occurrence matrix, invariant moments, the segmentation-based fractal texture analysis algorithm, and color moments. Clustering effectiveness is indicated by the number of shreds grouped in the same cluster as their originally adjacent neighbors, in addition to the overall accuracy of strip-cut shred clustering. With 5 clusters and k-means run 100 times on 38 images, the overall accuracy of gray-level ranges on simulated strip-cut shreds is 84.87, 89.27, and 93.5 percent for the three sizes tested. On cross-cut and micro-cut shreds, gray-level ranges also achieve a relatively high number of shreds with 3 and 4 originally adjacent neighbors found in the same cluster.
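The pipeline summarized in the abstract (simulated shredding, feature extraction, k-means, neighbor-based evaluation) can be illustrated with a minimal, standard-library Python sketch. It is not the author's implementation. In particular, the abstract does not define the gray-level ranges feature, so the sketch assumes it is the fraction of pixels falling into each of a few equal-width intensity ranges. The function names `cut_grid`, `gray_range_features`, `kmeans`, and `neighbor_histogram` are hypothetical, and the k-means is a plain Lloyd-style loop chosen for brevity.

```python
import random

def cut_grid(image, rows, cols):
    """Simulated shredder: split a 2D gray image (list of lists) into
    rows x cols shreds keyed by grid position. rows=1 gives strip-cut."""
    sh, sw = len(image) // rows, len(image[0]) // cols
    return {(r, c): [line[c * sw:(c + 1) * sw]
                     for line in image[r * sh:(r + 1) * sh]]
            for r in range(rows) for c in range(cols)}

def gray_range_features(shred, n_ranges=4, max_level=256):
    """ASSUMED feature definition: fraction of pixels in each of
    n_ranges equal-width gray-level ranges."""
    counts = [0] * n_ranges
    total = 0
    for row in shred:
        for px in row:
            counts[px * n_ranges // max_level] += 1
            total += 1
    return [c / total for c in counts]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means; returns one cluster label per point.
    Initial centers are drawn from distinct points, so k must not
    exceed the number of distinct points."""
    rng = random.Random(seed)
    unique = [list(t) for t in dict.fromkeys(map(tuple, points))]
    centers = rng.sample(unique, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(p, centers[j])))
                  for p in points]
        # Update step: move each non-empty cluster's center to its mean.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def neighbor_histogram(cluster_of):
    """hist[n] = number of shreds that share a cluster with exactly n of
    their originally adjacent (up, down, left, right) neighbors."""
    hist = [0] * 5
    for (r, c), label in cluster_of.items():
        same = sum(1 for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                   if cluster_of.get((r + dr, c + dc)) == label)
        hist[same] += 1
    return hist

# Toy demo: a 4x8 image, dark on the left half and light on the right,
# cut in one direction into 4 strips and clustered into 2 groups.
image = [[20] * 4 + [230] * 4 for _ in range(4)]
shreds = cut_grid(image, rows=1, cols=4)
positions = sorted(shreds)
labels = kmeans([gray_range_features(shreds[p]) for p in positions], k=2)
cluster_of = dict(zip(positions, labels))
print(neighbor_histogram(cluster_of))
```

Calling `cut_grid` with `rows=1` mimics strip-cut shredding, while larger `rows` and `cols` values approximate cross-cut or micro-cut shreds. In the two-direction case, the last two entries of the histogram correspond to the counts of shreds with 3 and 4 originally adjacent neighbors in the same cluster that the abstract reports.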

Acknowledgements

The author would like to thank the anonymous reviewers for their valuable comments.

Author information

Corresponding author

Correspondence to Alia Madain.

Ethics declarations

Conflict of Interests

The author declares that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Madain, A. Clustering paper shreds of different sizes. Multimed Tools Appl 82, 19441–19461 (2023). https://doi.org/10.1007/s11042-022-13835-7

  • DOI: https://doi.org/10.1007/s11042-022-13835-7