Abstract
This work applies the concept of a trie-based complete solution archive, combined with a genetic algorithm, to the Reconstruction of Cross-Cut Shredded Text Documents (RCCSTD) problem. Cross-cut shredded documents are documents that have been cut into rectangular pieces of equal size and shape; reconstructing them can be of high interest in forensic science. The archive detects duplicate solutions and converts them into new, not yet visited solutions. Two types of tries are compared as the underlying data structure: an indexed trie and a linked trie. Experiments indicate that the latter needs considerably less memory without affecting run-time. While the archive-enhanced genetic algorithm yields better results for runs with a fixed number of iterations, its advantages diminish when run-time is considered, owing to the additional overhead of the archive.
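To make the idea of a complete solution archive more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified encoding in which a solution is a fixed-length vector over the values 0 to k-1, whereas the actual RCCSTD encoding of shred arrangements is not given in the abstract. The names `TrieArchive`, `insert`, `convert`, `add` and the `COMPLETE` sentinel are hypothetical. Nodes store children in a dictionary, which loosely resembles a linked trie; an indexed trie would instead use a fixed-size array of k child slots per node. The conversion rule used here (keep the original values as long as possible, deviate only where a branch is exhausted) is one simple choice and may differ from the strategy used in the paper.

```python
import random

COMPLETE = object()  # sentinel: every solution below this branch is archived


class TrieArchive:
    """Toy solution archive for fixed-length vectors over {0, ..., k-1}."""

    def __init__(self, length, alphabet_size):
        self.length = length
        self.k = alphabet_size
        self.root = {}      # node = dict: value -> child node or COMPLETE
        self.full = False   # True once every possible solution is stored

    def insert(self, sol):
        """Store sol; return True if it was new, False if it is a duplicate."""
        if self.full:
            return False
        node, path = self.root, []
        for depth, v in enumerate(sol):
            child = node.get(v)
            if child is COMPLETE:
                return False            # sol lies in a fully explored subtree
            path.append((node, v))
            if depth == self.length - 1:
                node[v] = COMPLETE      # last position: mark the solution
            else:
                if child is None:
                    child = node[v] = {}
                node = child
        self._prune(path)
        return True

    def _prune(self, path):
        """Collapse fully explored subtrees into the COMPLETE sentinel."""
        for i in range(len(path) - 1, -1, -1):
            node = path[i][0]
            if len(node) < self.k or any(c is not COMPLETE for c in node.values()):
                break
            if i == 0:
                self.full = True
            else:
                parent, v = path[i - 1]
                parent[v] = COMPLETE

    def convert(self, sol):
        """Return an unvisited solution that stays close to the duplicate sol."""
        if self.full:
            raise RuntimeError("archive contains every possible solution")
        node, new = self.root, []
        for depth in range(self.length):
            v = sol[depth]
            if node.get(v) is COMPLETE:
                # Branch exhausted: deviate to a value with unexplored solutions.
                open_values = [c for c in range(self.k) if node.get(c) is not COMPLETE]
                v = random.choice(open_values)
            new.append(v)
            child = node.get(v)
            if child is None:
                # Unexplored subtree: the remaining positions can stay unchanged.
                new.extend(sol[depth + 1:])
                break
            node = child
        return new

    def add(self, sol):
        """Insert sol, or a converted replacement if sol is a duplicate."""
        if not self.insert(sol):
            sol = self.convert(sol)
            self.insert(sol)
        return sol
```

In a genetic algorithm, each newly generated candidate would be passed through `add` before evaluation, so that no solution is evaluated twice. Pruning keeps the invariant that every non-exhausted node has at least one open branch, which is why `convert` always finds an unvisited solution unless the whole search space has been archived.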
This work is supported by the Austrian Science Fund (FWF) under grant P24660.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Biesinger, B., Schauer, C., Hu, B., Raidl, G.R. (2013). Enhancing a Genetic Algorithm with a Solution Archive to Reconstruct Cross Cut Shredded Text Documents. In: Moreno-Díaz, R., Pichler, F., Quesada-Arencibia, A. (eds) Computer Aided Systems Theory - EUROCAST 2013. EUROCAST 2013. Lecture Notes in Computer Science, vol 8111. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53856-8_48
DOI: https://doi.org/10.1007/978-3-642-53856-8_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53855-1
Online ISBN: 978-3-642-53856-8
eBook Packages: Computer Science, Computer Science (R0)