Elsevier

Neurocomputing

Volume 173, Part 2, 15 January 2016, Pages 306-316

Fast beta wavelet network-based feature extraction for image copy detection

https://doi.org/10.1016/j.neucom.2015.04.113

Abstract

Applications that protect authors' rights against the illegal copying of images continue to evolve. However, most of them impose a very high computational cost, especially when working with a database containing thousands of images. This paper addresses the problem of authors' rights violation and presents an original scheme for Content-Based Image Copy Detection (CBICD) based on two screens: an approximation screen and a details screen. These two screens, both based on the Fast Beta Wavelet Transform (FBWT), filter the original images first on their approximation and then on their detail appearance, in order to return the original image corresponding to a given query image (a copy). Extensive experiments on 8118 images from the Copydays and Holidays datasets, generated by INRIA, demonstrate the effectiveness and the search speed of our approach in CBICD.

Introduction

Nowadays, enormous amounts of images are being generated, stored, and shared by many groups all over the world, including radiologists, journalists, librarians, photographers, historians, and even the general public [1]. The illegal generation of images stems from several sources: innovative multimedia products such as evolving digital storage devices, the availability of affordable digital cameras, image processing applications and hardware, and the expansion and rapid growth of the Internet.

It is therefore necessary to develop efficient systems for image retrieval and to understand image seeking behavior. The field of Multimedia Information Retrieval aims to satisfy users' queries by finding the correct results. It can be divided into two approaches: text-based searching, which uses textual information (a word or a composition of words), and Content-Based Retrieval (CBR), which is based on the multimedia content itself [2], [3]. As a part of CBR, and more specifically of CBIR (Content-Based Image Retrieval), the field of image identification aims to establish the ownership of digital media and to certify whether an image has been modified from its source. The credibility of photographic identification evidence can be extremely important in a variety of situations (insurance companies, courts, news agencies, and so on). Therefore, many studies have addressed this area by detecting copies of digital media.

The techniques of image identification can be classified into two categories: active and passive methods. Active methods such as watermarking [4] depend on prior information (a mark) about the image. However, in many situations, copies of the original content can be disseminated before any mark is applied. Passive methods are then required to authenticate the image, and they are the focus of our work.

Recent content-based image copy detection (CBICD) methods do not depend on the presence of marks and are more robust in detecting image transformations. However, these methods have a higher computational cost. Our contribution in this paper aims to reduce the computational complexity of the algorithm.

Many works in the literature have addressed the problem of image copy detection. They can be grouped into three classes: spatial domain, frequency domain, and spatial–frequency domain.

For the spatial domain, two types of image copy detection approaches can be distinguished: techniques based on global descriptors and those based on the extraction of local descriptors. Several works are based on global descriptors, which extract a low-level description of images (e.g. color characteristics, textures, edges) built by quantization [5]. The merit of the global description is that it reduces the computational time and the storage footprint. However, global descriptors are not very accurate: they can generate many false positives, and they lose information about the original descriptor. Furthermore, they are not very efficient against common modifications such as affine transformations, cropping, and rotation. For instance, a global descriptor such as GIST is invariant to luminance transformations, blur, and resizing [6], but it is not invariant to translation, rotation, clutter, or cropping.

Local descriptors such as PBoF (Packed Bag-of-Features) [7] are more invariant and more efficient than GIST, in particular under occlusion and rotation.

As far as local descriptors are concerned, we can consider the works of Berrani et al. [8] and Min [9], which are based on the extraction of interest points. Other approaches are based on the BoW (Bag-of-Words) model [10], [11], [12], which treats the image as a collection of discrete regions. The drawback of this model is that it describes only the appearance of images and ignores their spatial structure, which is very important in image representation, thus losing all the geometric information of images. This model can give good results in many fields, such as image classification and object recognition, but it is still not efficient for copy detection because it cannot detect geometric transformations. Another approach, based on the search for closed patterns, was presented in [13]; it encodes the bag-of-words image representation into data mining transactions. This approach is similar to the BoW model, but to overcome the difficulty of the binary representation of the item (visual word), the researchers represent each image by a list of its top K Term Frequency-Inverse Document Frequency (TF-IDF) weighted visual words. This approach is robust to some transformations such as JPEG compression, scaling, rotation, slight crops, and illumination changes. Also, Ling et al. [14] proposed a fast copy detection method that uses local image fingerprints to define visual words. Other approaches are presented in [15], [16], [17].
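As a rough illustration of the top-K TF-IDF representation mentioned above, the following Python sketch ranks the visual words of each image by TF-IDF weight and keeps the K highest. It is not the implementation of [13]: the function name `top_k_tfidf`, the integer visual-word ids, and the particular TF-IDF variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency) are assumptions made for this example.

```python
import math
from collections import Counter


def top_k_tfidf(images, k):
    """Map each image id to its k visual words with the highest TF-IDF weight.

    `images` is a dict: image id -> list of visual-word ids (hypothetical input format).
    """
    n_images = len(images)

    # Document frequency: in how many images does each visual word occur?
    df = Counter()
    for words in images.values():
        df.update(set(words))

    result = {}
    for image_id, words in images.items():
        tf = Counter(words)
        scores = {
            w: (count / len(words)) * math.log(n_images / df[w])
            for w, count in tf.items()
        }
        # Highest score first; ties are broken by word id to keep the output deterministic.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[image_id] = ranked[:k]
    return result


images = {"a": [1, 1, 2, 3], "b": [2, 3, 3], "c": [3, 4]}
print(top_k_tfidf(images, 2))
```

A visual word that occurs in every image (word 3 here) receives a weight of zero, so it is ranked last even when it is frequent within one image.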

As for the frequency domain, we can cite the techniques defined by Kim [18] and Baaziz [19], which are based on the Discrete Cosine Transform (DCT) and use it to calculate the distance between the query image and the original one. Finally, for the spatial–frequency domain, the work of Chang et al. [20] can be cited; it uses the Daubechies wavelet and extracts wavelet coefficients as the feature vector.

The above approaches to image copy detection impose either a high computational time, resulting from the complexity of the algorithm, or a loss of information accepted in order to reduce that complexity. For this reason, we turned to the concept of learning, which enabled us to extract significant and optimized features. The neural network is one of the best machine learning techniques, and applying wavelets to it yields the wavelet neural network. So, in this paper, we used the Beta Wavelet Network (BWN) [21], based on the Fast Wavelet Transform (FWT) [22], [23], to propose a new multiresolution approach to image copy detection. Our choice was motivated by two reasons. First, the wavelet can properly represent the image in the spatial–frequency domain [24], [25]. Second, it can be used for learning. This network, which is a neural network whose hidden layer contains wavelet functions, considerably reduces computational costs, and it has been an effective tool in various fields such as image compression [26], image retrieval [27], [28], [29], face recognition [30], image classification [31], [32], speech recognition [33], in particular Arabic word recognition [34], and object tracking and recognition [35], [36], [37].
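To make the idea of a network whose hidden layer contains wavelet functions more concrete, the following Python sketch computes the output of a one-dimensional wavelet network as a weighted sum of dilated and translated copies of a mother wavelet. It is only an illustration under stated assumptions: the Mexican hat function is used as a stand-in mother wavelet (it is not the Beta wavelet of [21]), and the names `mexican_hat` and `wavelet_network_output` are hypothetical.

```python
import math


def mexican_hat(t):
    # Stand-in mother wavelet (assumption): NOT the Beta wavelet used in the paper.
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def wavelet_network_output(x, weights, scales, shifts, wavelet=mexican_hat):
    """Weighted sum over hidden units of wavelet((x - shift) / scale)."""
    return sum(
        w * wavelet((x - b) / a)
        for w, a, b in zip(weights, scales, shifts)
    )


# Three hidden units with different dilations (scales) and translations (shifts).
weights = [0.5, -1.0, 2.0]
scales = [1.0, 2.0, 0.5]
shifts = [-1.0, 0.0, 1.0]
print(wavelet_network_output(0.25, weights, scales, shifts))
```

In such a network the weights play the role of wavelet coefficients, which is why they can serve as a compact feature vector for the approximated signal.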

CBICD (Content-Based Image Copy Detection) aims to retrieve a multimedia document even when some transformations have been applied to it, and more specifically to detect whether or not an image is a copy of some known original. Generally, the transformations include changing contrast, resizing, changing gamma values, rotating, flipping, shifting, cropping, camcording, blurring, stretching, zooming, and text or pattern insertion. In this paper, we used the Copydays dataset, generated by INRIA, which contains 18 kinds of scale-and-JPEG and cropping transformations, and the Holidays dataset in order to test the robustness of the proposed system in the retrieval phase. The results obtained show the efficiency of our method against various transformations, especially scale and JPEG transformations.

The paper is organized as follows: Section 2 introduces the Beta wavelet theory, the FWT concept, and the proposed wavelet network architecture. In Section 3, we present an overview of the proposed method, which is divided into two stages: indexation and copy detection. Section 4 is devoted to presenting and discussing the experimental results. Finally, the paper ends with a conclusion that summarizes the findings of our approach.

Beta wavelet and wavelet transform

Before discussing the wavelet transform, it is necessary to define the wavelet used to construct the wavelet network.

Overview of the proposed approach

The image copy detection application is divided into two parts: the first trains the system (indexing, or off-line stage) and the second runs the image copy detection algorithm (on-line stage).
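The two stages, combined with the approximation and details screens described in the abstract, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' method: a single-level 2-D Haar decomposition stands in for the Fast Beta Wavelet Transform, Euclidean distance is assumed as the similarity measure, and the names `haar_level`, `build_index`, `detect_copy`, and the `keep` parameter are hypothetical.

```python
import math


def haar_level(image):
    """One level of a 2-D Haar decomposition (stand-in for the FBWT).

    `image` is a list of rows with even height and width.
    Returns (approximation, details) as flat lists of coefficients.
    """
    approx, details = [], []
    for r in range(0, len(image), 2):
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            approx.append((a + b + d + e) / 4.0)
            details.extend([
                (a - b + d - e) / 4.0,  # left-right difference
                (a + b - d - e) / 4.0,  # top-bottom difference
                (a - b - d + e) / 4.0,  # diagonal difference
            ])
    return approx, details


def distance(u, v):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


def build_index(originals):
    # Off-line stage: store the coefficients of every original image.
    return {name: haar_level(img) for name, img in originals.items()}


def detect_copy(query, index, keep=2):
    # On-line stage: filter by approximation first, then by details.
    q_approx, q_details = haar_level(query)
    shortlist = sorted(index, key=lambda n: distance(q_approx, index[n][0]))[:keep]
    return min(shortlist, key=lambda n: distance(q_details, index[n][1]))


originals = {
    "flat": [[10] * 4 for _ in range(4)],
    "stripes": [[0, 20, 0, 20] for _ in range(4)],
    "dark": [[0] * 4 for _ in range(4)],
}
index = build_index(originals)
query = [[1, 19, 1, 19] for _ in range(4)]  # a slightly altered copy of "stripes"
print(detect_copy(query, index))
```

In this toy example "flat" and "stripes" have identical approximation coefficients, so the first screen only discards "dark"; the details screen then separates the two remaining candidates. Filtering on the smaller approximation vector first means the detail comparison runs on a shortlist rather than on the whole database.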

Implementation overview

The system was implemented in MATLAB R2012b using its image processing toolboxes. Feature extraction and image copy detection were performed on an Intel(R) Core(TM) i3-3217U CPU at 1.80 GHz with 4 GB of RAM, running a 64-bit Microsoft Windows system.

Experimental results

To assess the quality of the proposed system, we followed three essential phases: learning, validation, and detection.

So, to achieve these phases, we performed our experiments on two datasets: Copydays

Conclusion

In this paper, we presented fast descriptors, based on Beta wavelet network, for image copy detection. To define the principle of the proposed approach, based essentially on two phases, the learning phase and the copy detection one, we introduced the steps of the learning algorithm. The training step reduced the computational cost perfectly since for each image, we extracted faster descriptors, containing only coefficients of wavelets and scaling functions having the best contributions to the

Acknowledgement

The authors would like to acknowledge the financial support of this work by grants from General Direction of Scientific Research (DGRST), Tunisia, under the ARUB program.

References (37)

  • S. Berrani, L. Amsaleg, P. Gros, Robust content-based image searches for copyright protection, in: Proceedings of the...
  • Y. Min, X. Li, Y. Zhao, A new SIFT keypoint descriptor for copy detection, in: The 4th International Congress on Image...
  • S. Zhang et al., Generating descriptive visual words and visual phrases for large-scale image applications, IEEE Trans. Image Process. (2011)
  • P. Turcot, D. Lowe, Better matching with fewer features: the selection of useful features in large database recognition...
  • W. Dong, Z. Wang, M. Charikar, K. Li, High-confidence near-duplicate image detection, in: ACM International Conference...
  • Y. Pang et al., Learning regularized LDA by clustering, IEEE Trans. Neural Netw. Learn. Syst. (2014)
  • Y. Pang et al., Distributed object detection with linear SVMs, IEEE Trans. Cybern. (2014)
  • Y. Pang et al., Ranking graph embedding for learning to rerank, IEEE Trans. Neural Netw. Learn. Syst. (2013)

    Asma Eladel was born in Gabes, Tunisia in 1984. She obtained her Baccalaureate degree in 2004 from Abou Loubaba School, Gabes. She obtained her diploma in computer sciences from the Higher Institute of Management of Gabes in 2007 and her master's degree in computer sciences from the High Institute of Computer Sciences and Multimedia in 2009. She is now pursuing her doctoral studies at the engineering school of Sfax, Tunisia (ENIS). She is a member of the Research Groups on Intelligent Machines (REGIM-Lab) and an IEEE student member. She is currently teaching as a contractual assistant at the National Engineering School of Gabes (ENIG).

    Mourad Zaied received the Ph.D. in Computer Engineering and the Master of Science (DEA: Diploma in Higher Applied Studies) from the National Engineering School of Sfax (ENIS) in 2008 and 2003, respectively. He obtained the degree of Computer Engineer from the National Engineering School of Monastir (ENIM) in 1995. Since 1997 he has served in several institutes and faculties of the University of Gabes as a teaching assistant. In 2007 he joined the National Engineering School of Gabes (ENIG), where he is currently an associate professor in the Department of Electrical Engineering. He has been a member of the REsearch Group on Intelligent Machines laboratory (REGIM) http://www.regim.org in the National Engineering School of Sfax (ENIS) since 2001. His research interests include computer vision and image and video analysis. These research activities are centered on wavelets and wavelet networks and their applications to data classification and approximation, pattern recognition, and image, audio, and video coding and indexing. He organized two Winter Schools, on Matlab toolkits (2004) and on wavelets and their applications (2005). He is an IEEE senior member. He was the chair of the Workshop on Intelligent Machines: Theories & Applications (WIMTA II 2009) and chair of the Neural Networks & Applications session of the International Conference on Systems, Man, and Cybernetics (SMC'2014, CA, USA), and he is a member of the scientific committee of the International Conference on Communications and Information Technology (ICCIT) and the International Conference on Machine Vision (ICMV).

    Chokri Ben Amar received the B.S. degree in Electrical Engineering from the National Engineering School of Sfax (ENIS) in 1989, and the M.S. and Ph.D. degrees in Computer Engineering from the National Institute of Applied Sciences in Lyon, France, in 1990 and 1994, respectively. He spent one year at the University of "Haute Savoie" (France) as a teaching assistant and researcher before joining the Higher School of Sciences and Techniques of Tunis as Assistant Professor in 1995. In 1999, he joined the Sfax University (USS), where he is currently a professor in the Department of Electrical Engineering of the National Engineering School of Sfax (ENIS) and the Vice director of the REsearch Group on Intelligent Machines (REGIM). His research interests include computer vision and image and video analysis. These research activities are centered on wavelets and wavelet networks and their applications to data classification and approximation, pattern recognition, and image and video coding, indexing, and watermarking. He is a senior member of IEEE and has been the chair of the IEEE SPS Tunisia Chapter since 2009. He was the Vice-chair of the International Conference on Image Processing, Applications and Systems (IPAS’2014), the Honorary chair of the International Conference on Information Assurance and Security (IAS’2013), the Program chair of the International Conference on Individual and Collective Behaviors in Robotics (ICBR’2013), the Doctoral Consortium Workshop Organizer on behalf of the International Conference on Advanced Logistics and Transport (ICALT’2013), the chair of the IEEE NGNS'2011 (IEEE Third International Conference on Next Generation Networks and Services) and of the Workshop on Intelligent Machines: Theories & Applications (WIMTA 2008), and the chairman of the organizing committees of the "Traitement et Analyse de l'Information : Methodes et Applications (TAIMA 2009)" conference, the International Conference on Machine Intelligence ACIDCA-ICMI'2005, and the International Conference on Signals, Circuits and Systems SCS'2004.
