Fast beta wavelet network-based feature extraction for image copy detection
Introduction
Nowadays, enormous numbers of images are being generated, stored, and shared by many groups all over the world, including radiologists, journalists, librarians, photographers, historians, and even the general public [1]. The illegal generation of images has several sources, all tied to innovative multimedia products: the evolution of digital storage devices, the availability of affordable digital cameras, image processing software and hardware, and the rapid growth of the Internet.
It is therefore necessary to develop efficient systems for image retrieval and to understand image-seeking behavior. The field of Multimedia Information Retrieval aims to satisfy users' queries by finding the correct results. It can be divided into two approaches: text-based searching, which uses textual information (a word or a combination of words), and Content-Based Retrieval (CBR), which is based on the multimedia content itself [2], [3]. As a part of CBR, and more specifically of CBIR (Content-Based Image Retrieval), the field of image identification aims to prove the ownership of digital media and to certify whether an image has been modified from its source. The credibility of photographic identification evidence can be extremely important in a variety of situations (insurance companies, courts, news agencies, and so on). Therefore, many studies have addressed this area by detecting copies of digital media.
The techniques of image identification can be classified into two categories: active and passive methods. Active methods such as watermarking [4] depend on prior information (a mark) embedded in the image. However, in many situations, copies of the original content can be disseminated before any mark is applied. Passive methods are then required to authenticate the image, and they are the focus of our work.
Recent content-based image copy detection (CBCD) methods do not depend on the presence of marks and are more robust in detecting image transformations. However, these methods have a higher computational cost. Our contribution in this paper aims at reducing this computational complexity.
Many works in the literature have addressed the problem of image copy detection. They can be grouped into three classes: spatial domain, frequency domain, and spatial–frequency domain.
For the spatial domain, two types of image copy detection approaches can be distinguished: techniques based on global descriptors and those based on the extraction of local descriptors. Several works rely on global descriptors, which extract a low-level description of images (e.g. color characteristics, textures, and edges) built by quantization [5]. The merit of the global description is that it reduces computational time and storage footprint. However, global descriptors are not very accurate: they can generate many false positives and lose information about the original descriptor. Furthermore, they are not very efficient against common modifications such as affine transformations, cropping, and rotation. For example, a global descriptor such as GIST is invariant to luminance transformations, blur, and resizing [6], but it is not invariant to translation, rotation, clutter, or cropping.
Local descriptors such as PBoF (Packed Bag-of-Features) [7] are more invariant and efficient than GIST, in particular against occlusions and rotation.
As far as local descriptors are concerned, we can consider the works of Berrani et al. [8] and Min [9], which are based on the extraction of interest points. Other approaches are based on the BoW (Bag-of-Words) model [10], [11], [12], which treats the image as a collection of discrete regions. The drawback of this model is that it describes only the appearance of images and ignores their spatial structure, which is very important in image representation, thus losing all the geometric information of images. This model can give good results in many fields such as image classification and object recognition, but it is still not efficient for copy detection because it cannot detect geometric transformations. Another approach, based on the search for closed patterns, was presented in [13] to encode the bag-of-words image representation into data mining transactions. This approach is similar to the BoW model, but to overcome the difficulty of the binary representation of the item (visual word), the researchers represent each image by a list of its top K Term Frequency-Inverse Document Frequency (TF-IDF) weighted visual words. This approach is robust to some transformations such as JPEG compression, scaling, rotation, slight crops, and illumination changes. Also, Ling et al. [14] proposed a fast copy detection method which uses local image fingerprints to define visual words. Other approaches are presented in [15], [16], [17].
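To make the top-K TF-IDF representation concrete, the following minimal sketch ranks the visual words of each image by a textbook weight, tf × log(N/df), and keeps the K best. It is an illustration only, not the method of [13]: the function name `top_k_tfidf`, the toy visual-word ids, and the exact weighting formula are assumptions.

```python
import math
from collections import Counter

def top_k_tfidf(images, k):
    """Return, for each image, its k visual words with the highest TF-IDF weight.

    `images` is a list of lists of visual-word ids (hypothetical quantized
    local descriptors). The weight is the textbook tf * log(N / df); this is
    an assumed formula, not necessarily the one used in the cited work.
    """
    n_images = len(images)
    # Document frequency: number of images in which each visual word occurs.
    df = Counter(word for words in images for word in set(words))
    result = []
    for words in images:
        tf = Counter(words)
        total = len(words)
        weights = {w: (c / total) * math.log(n_images / df[w]) for w, c in tf.items()}
        # Sort by descending weight; ties are broken by word id for determinism.
        ranked = sorted(weights, key=lambda w: (-weights[w], w))
        result.append(ranked[:k])
    return result

# Toy collection: each inner list is the bag of visual words of one image.
images = [
    [1, 1, 2, 3],
    [1, 4, 4, 4],
    [1, 2, 5, 5],
]
transactions = top_k_tfidf(images, k=2)
```

In this toy collection, visual word 1 occurs in every image, so its weight is zero and it is ranked last: frequent but uninformative words are pushed out of the top-K list.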
As for the frequency domain, we can cite the techniques defined by Kim [18] and Baaziz [19], which are based on the Discrete Cosine Transform (DCT) and use it to calculate the distance between the query image and the original one. Finally, for the spatial–frequency domain, we can cite the work of Chang et al. [20], which uses the Daubechies wavelet and extracts wavelet coefficients as the feature vector.
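The idea of using wavelet coefficients as a feature vector and comparing images by a distance can be sketched as follows. This is a simplified illustration and not the method of [20]: it applies a single level of the Haar transform (the simplest member of the Daubechies family), keeps only the approximation coefficients, and compares them with the Euclidean distance. The function names and the toy images are assumptions.

```python
import math

def haar_level(image):
    """One level of the orthonormal 2D Haar transform.

    `image` is a list of rows with even height and width. Returns the
    approximation coefficients and the detail coefficients, both flattened.
    """
    rows, cols = len(image), len(image[0])
    approx, details = [], []
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            approx.append((a + b + d + e) / 2)       # local average (scaled)
            details.extend([
                (a - b + d - e) / 2,                 # difference between columns
                (a + b - d - e) / 2,                 # difference between rows
                (a - b - d + e) / 2,                 # diagonal difference
            ])
    return approx, details

def distance(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Toy 4x4 "original" and a uniformly brightened copy (+2 on every pixel).
original = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
brighter = [[p + 2 for p in row] for row in original]

feat_original, _ = haar_level(original)
feat_brighter, _ = haar_level(brighter)
d = distance(feat_original, feat_brighter)
```

A uniform brightness change only shifts the approximation coefficients, so the distance between the two feature vectors stays small compared with the distance to an unrelated image.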
The above approaches to image copy detection impose either a high computational time, resulting from the complexity of the algorithm, or a loss of information accepted in order to reduce that complexity. For this reason, we turned to learning, which enabled us to extract significant and optimized features. The neural network is one of the best-established techniques of machine learning, and combining it with wavelets yields the wavelet neural network. So, in this paper, we use the Beta Wavelet Network (BWN) [21] based on the Fast Wavelet Transform (FWT) [22], [23] to propose a new multiresolution approach to image copy detection. Our choice was motivated by two considerations. First, the wavelet can properly represent the image in the spatial–frequency domain [24], [25]. Second, it can be used for learning. This network, a neural network whose hidden layer contains wavelet functions, reduces computational costs considerably, and it has proved an effective tool in various fields such as image compression [26], image retrieval [27], [28], [29], face recognition [30], image classification [31], [32], speech recognition [33] and in particular Arabic word recognition [34], and object tracking and recognition [35], [36], [37].
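The structure of a wavelet network, a neural network whose hidden units are dilated and translated wavelets, can be illustrated by the forward pass below. This is a one-dimensional sketch under stated assumptions: the Beta wavelet is not defined in this section, so the Mexican hat wavelet is used as a stand-in, and the names `mexican_hat` and `wavelet_network` as well as the parameter values are hypothetical.

```python
import math

def mexican_hat(t):
    """Mexican hat wavelet (unnormalized), used here as a stand-in for the Beta wavelet."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def wavelet_network(x, weights, scales, shifts, wavelet=mexican_hat):
    """Output of a single-input wavelet network.

    Each hidden unit i evaluates the wavelet at (x - shifts[i]) / scales[i];
    the output is the weighted sum of the hidden responses.
    """
    return sum(w * wavelet((x - b) / a) for w, a, b in zip(weights, scales, shifts))

# Hypothetical two-unit network evaluated at x = 2.
weights = [1.0, 0.5]
scales = [1.0, 2.0]
shifts = [0.0, 2.0]
output = wavelet_network(2.0, weights, scales, shifts)
```

Training such a network amounts to choosing the weights (and possibly the scales and shifts) so that the weighted sum approximates the signal; the retained weights can then serve as a compact description of it.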
CBICD (Content-Based Image Copy Detection) aims to retrieve a multimedia document even when some transformations have been applied to it, and more specifically to detect whether or not an image is a copy of some known original. Generally, the transformations include changing contrast, resizing, changing gamma values, rotating, flipping, shifting, cropping, camcording, blurring, stretching, zooming, and text or pattern insertion. In this paper, we used the Copydays dataset, generated by INRIA, which contains 18 kinds of scale & JPEG and cropping transformations, and the Holidays dataset in order to test the robustness of the proposed system in the retrieval phase. The results obtained show the efficiency of our method against various transformations, especially scale and JPEG transformations.
The paper is organized as follows: Section 2 introduces the Beta wavelet theory, the FWT concept, and the proposed wavelet network architecture. In Section 3, we present an overview of the proposed method, which is divided into two stages: indexing and copy detection. Section 4 is devoted to presenting and discussing the experimental results. Finally, the paper ends with a conclusion which summarizes the findings of our approach.
Beta wavelet and wavelet transform
Before discussing the wavelet transform, it is necessary to define the wavelet used to construct the wavelet network.
Overview of the proposed approach
The image copy detection process is divided into two parts: the first trains the system (indexing, or off-line stage) and the second runs the image copy detection algorithm (on-line stage).
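The two-stage organization can be sketched as follows. This is a minimal illustration rather than the proposed system: the descriptor `mean_rows` is a placeholder standing in for the network-derived features, and the function names, the Euclidean distance, and the threshold value are all assumptions.

```python
import math

def build_index(originals, describe):
    """Off-line stage: compute and store one descriptor per original image."""
    return {image_id: describe(image) for image_id, image in originals.items()}

def detect_copy(query, index, describe, threshold):
    """On-line stage: find the nearest indexed original.

    Returns (image_id, distance) when the nearest original lies within
    `threshold`, and (None, distance) otherwise.
    """
    q = describe(query)
    best_id, best_dist = None, math.inf
    for image_id, descriptor in index.items():
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(q, descriptor)))
        if dist < best_dist:
            best_id, best_dist = image_id, dist
    if best_dist <= threshold:
        return best_id, best_dist
    return None, best_dist

def mean_rows(image):
    """Placeholder descriptor (row means); not the descriptor of the paper."""
    return [sum(row) / len(row) for row in image]

# Off-line: index two toy originals.
originals = {"a": [[0, 0], [10, 10]], "b": [[50, 50], [50, 50]]}
index = build_index(originals, mean_rows)

# On-line: a slightly altered copy of "a" and an unrelated image.
match = detect_copy([[1, 1], [10, 10]], index, mean_rows, threshold=5.0)
miss = detect_copy([[200, 200], [200, 200]], index, mean_rows, threshold=5.0)
```

Separating the stages means the descriptors of the originals are computed once, so the on-line cost per query reduces to one descriptor extraction and a search over the index.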
Implementation overview
The system was implemented in Matlab 2012b using the image processing toolboxes. Feature extraction and image copy detection were performed on an Intel(R) Core(TM) i3-3217U CPU at 1.80 GHz with 4 GB of RAM, running a 64-bit Microsoft Windows system.
Experimental results
To assess the quality of the proposed system, we followed three essential phases: learning, validation, and detection.
So, to achieve these phases, we performed our experiments on two datasets: Copydays
Conclusion
In this paper, we presented fast descriptors, based on the Beta wavelet network, for image copy detection. To define the principle of the proposed approach, which consists essentially of two phases, learning and copy detection, we introduced the steps of the learning algorithm. The training step reduced the computational cost considerably since, for each image, we extracted fast descriptors containing only the coefficients of wavelets and scaling functions having the best contributions to the
Acknowledgement
The authors would like to acknowledge the financial support of this work by grants from General Direction of Scientific Research (DGRST), Tunisia, under the ARUB program.
References (37)
- et al., Local ternary co-occurrence patterns: a new feature descriptor for MRI and CT image retrieval, Neurocomput. J. (2013)
- et al., An efficient indexing method for content-based image retrieval, Neurocomput. J. (2013)
- et al., Color information for region segmentation, Comput. Graph. Image Process. (1980)
- et al., A statistical design approach to unsupervised codeword selection in image retrieval, Neurocomput. J. (2015)
- et al., Fast image copy detection approach based on local fingerprint defined visual words, Signal Process. (2013)
- et al., Graph-based approach for human action recognition using spatio-temporal features, J. Vis. Commun. Image Represent. (2014)
- Y. Pang, Q. Hao, Y. Yuan, T. Hu, R. Cai, L. Zhang, Summarizing tourist destinations by mining user-generated...
- et al., A multimedia application for watermarking digital images based on a content based image retrieval technique, Multimed. Tools Appl. (2010)
- M. Douze, H. Jégou, H. Sandhawalia, L. Amsaleg, C. Schmid, Evaluation of GIST descriptors for web-scale image search,...
- M. Douze, C. Schmid, Packing bag-of-features, in: IEEE 12th International Conference on Computer Vision (ICCV), Kyoto,...
- Generating descriptive visual words and visual phrases for large-scale image applications, IEEE Trans. Image Process.
- Learning regularized LDA by clustering, IEEE Trans. Neural Netw. Learn. Syst.
- Distributed object detection with linear SVMs, IEEE Trans. Cybern.
- Ranking graph embedding for learning to rerank, IEEE Trans. Neural Netw. Learn. Syst.
Asma Eladel was born in Gabes, Tunisia in 1984. She obtained her Baccalaureate degree in 2004 from Abou Loubaba School, Gabes. She obtained her diploma in computer sciences from the Higher Institute of Management of Gabes in 2007 and her master's degree in computer sciences from the High Institute of Computer Sciences and Multimedia in 2009. She is now pursuing her doctoral studies at the National Engineering School of Sfax (ENIS), Tunisia. She is a member of the REsearch Groups on Intelligent Machines (REGIM-Lab) and an IEEE student member. She is currently teaching as a contractual assistant at the National Engineering School of Gabes (ENIG).
Mourad Zaied received the Ph.D. in Computer Engineering and the Master of Science (DEA: Diploma in Higher Applied Studies) from the National Engineering School of Sfax (ENIS) in 2008 and 2003, respectively. He obtained the degree of Computer Engineer from the National Engineering School of Monastir (ENIM) in 1995. Since 1997 he has served in several institutes and faculties of the University of Gabes as a teaching assistant. In 2007 he joined the National Engineering School of Gabes (ENIG), where he is currently an associate professor in the Department of Electrical Engineering. He has been a member of the REsearch Group on Intelligent Machines laboratory (REGIM) http://www.regim.org at the National Engineering School of Sfax (ENIS) since 2001. His research interests include computer vision and image and video analysis. These research activities are centered on wavelets and wavelet networks and their applications to data classification and approximation, pattern recognition, and image, audio and video coding and indexing. He organized two Winter Schools, on Matlab toolkits (2004) and on wavelets and their applications (2005). He is an IEEE senior member. He was the chair of the Workshop on Intelligent Machines: Theories & Applications (WIMTA II 2009) and chair of the Neural Networks & Applications session of the International Conference on Systems, Man, and Cybernetics (SMC'2014, CA, USA), and he is a member of the scientific committee of the International Conference on Communications and Information Technology (ICCIT) and the International Conference on Machine Vision (ICMV).
Chokri Ben Amar received the B.S. degree in Electrical Engineering from the National Engineering School of Sfax (ENIS) in 1989, and the M.S. and Ph.D. degrees in Computer Engineering from the National Institute of Applied Sciences in Lyon, France, in 1990 and 1994, respectively. He spent one year at the University of “Haute Savoie” (France) as a teaching assistant and researcher before joining the Higher School of Sciences and Techniques of Tunis as Assistant Professor in 1995. In 1999, he joined Sfax University (USS), where he is currently a professor in the Department of Electrical Engineering of the National Engineering School of Sfax (ENIS) and the Vice Director of the REsearch Group on Intelligent Machines (REGIM). His research interests include computer vision and image and video analysis. These research activities are centered on wavelets and wavelet networks and their applications to data classification and approximation, pattern recognition, and image and video coding, indexing and watermarking. He is a senior member of IEEE and has been the chair of the IEEE SPS Tunisia Chapter since 2009.
He was the Vice-chair of the International Conference on Image Processing, Applications and Systems (IPAS’2014), the Honorary chair of the International Conference on Information Assurance and Security (IAS’2013), the Program chair of the International Conference on Individual and Collective Behaviors in Robotics (ICBR’2013), the Doctoral Consortium Workshop Organizer on behalf of the International Conference on Advanced Logistics and Transport (ICALT’2013), the chair of the IEEE NGNS'2011 (IEEE Third International Conference on Next Generation Networks and Services) and of the Workshop on Intelligent Machines: Theories & Applications (WIMTA 2008), and the chairman of the organizing committees of the “Traitement et Analyse de l'Information : Methodes et Applications (TAIMA 2009)” conference, the International Conference on Machine Intelligence ACIDCA-ICMI'2005, and the International Conference on Signals, Circuits and Systems SCS'2004.