Steganalysis classifier training via minimizing sensitivity for different imaging sources
Introduction
Steganography poses a potential security threat to society in general and to corporations in particular. Secret messages, called stego messages, are embedded in digital media such as images [1], [2], [3], audio [4] and video [5] for covert communication. Steganalysis is the technique of determining whether a piece of digital media carries a stego message. Current learning-based steganalysis methods consist of two major components: a feature extractor and a classifier. Steganalysis classifiers are trained on a set of images containing both clean and stego images. Among the different types of digital media, JPEG images are the most widely used on the Internet, which makes JPEG a favorable carrier for steganography. In most JPEG steganalysis work, the training and testing datasets both use JPEG images compressed with the same quantization table. When different quantization tables are used for the training and testing image sets, the performance of the steganalysis classifier degrades significantly [6], [7].
With the continuing proliferation of digital cameras and image editing software, JPEG images on the Internet are compressed with many different quantization tables [8]. Moreover, a growing number of digital camera manufacturers, such as Sony, Nikon and Pentax, adopt variable quantization tables that are computed dynamically from the image content. Re-compressing an image that was compressed with an unseen quantization table, using the quantization table of the training images, cannot directly recover the image. The extra quantization step changes the steganalysis features of the JPEG image [9]. It is therefore unreasonable and impractical to assume prior knowledge of the quantization table of an unseen image being examined for a stego message.
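The effect of the extra quantization step can be illustrated on a handful of coefficients. The sketch below is not taken from the paper: the coefficient values and the two flat tables `q_unseen` and `q_train` are hypothetical, and the pixel-domain rounding and clipping of a real JPEG decoder are ignored. It only shows that quantizing with one table and then re-quantizing with another need not give the same levels as quantizing with the second table directly.

```python
import numpy as np

def quantize(coeffs, table):
    """Quantize DCT coefficients: divide by the table and round to integers."""
    return np.round(coeffs / table).astype(int)

def dequantize(levels, table):
    """Reconstruct approximate DCT coefficients from quantized levels."""
    return levels * table

# Hypothetical values for four DCT coefficients of one block (illustrative only).
dct = np.array([52.0, -31.0, 17.0, 9.0])
q_unseen = np.array([7, 7, 7, 7])  # assumed step sizes of the unknown source
q_train = np.array([5, 5, 5, 5])   # assumed step sizes of the training set

# Levels the image would have had if compressed directly with q_train.
direct = quantize(dct, q_train)

# Levels actually obtained: compress with q_unseen, decode, re-compress with q_train.
recompressed = quantize(dequantize(quantize(dct, q_unseen), q_unseen), q_train)

# True wherever double quantization changed the resulting level.
mismatch = direct != recompressed
```

In this toy case the last coefficient ends up at level 1 instead of 2, so any feature computed from the quantized coefficients is perturbed even though the image content is unchanged.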
In real-world situations, both the images and the quantization tables can differ from those used to train the classifier. Besides differences in quantization tables, differences in image content also affect steganalysis performance [10]. As a result, the steganalysis feature values extracted from such images differ from those of the training images [7]. These differences can be treated as perturbations of the features and are unavoidable, so the robustness of a steganalysis classifier to feature perturbations is essential to its performance.
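One way to write this view down is sketched here. The notation is an assumption made for illustration and is not taken from the paper: $\mathbf{x}$ denotes the $n$-dimensional feature vector an image would have under training conditions, $\mathbf{x}'$ the feature vector actually extracted from the unseen image, and $f$ the classifier output.

```latex
\[
\mathbf{x}' = \mathbf{x} + \Delta\mathbf{x}, \qquad
\Delta\mathbf{x} = (\Delta x_1, \dots, \Delta x_n)^{\top}
\]
\[
\Delta y = f(\mathbf{x} + \Delta\mathbf{x}) - f(\mathbf{x})
\]
```

Under this reading, a classifier is robust when $\Delta y$ stays small for the perturbations $\Delta\mathbf{x}$ that quantization-table and content differences actually produce.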
Current steganalysis methods make use of off-the-shelf classification methods such as neural networks [11], [12], Support Vector Machines (SVM) [13], [14], dynamic evolving neural fuzzy inference systems [15] and ensembles of classifiers [16], [17]. However, none of them addresses the issue of feature perturbation between training and testing images.
In this work, we propose a Localized Generalization Error Steganalysis classifier (LG-Steganalyzer), which is robust to images compressed with quantization tables different from those of the training images. Such a situation is unavoidable in real-world applications. A Radial Basis Function Neural Network (RBFNN) is trained by minimizing both a training error and a stochastic sensitivity. The RBFNN is selected because it trains quickly on large datasets, which is important for dealing with network security problems. The stochastic sensitivity is proposed to capture the influence on the RBFNN classification of feature perturbations created by changes in the JPEG quantization table of testing images. With the proposed training method, the LG-Steganalyzer provides more robust steganalysis in real-world applications. The major contributions of the LG-Steganalyzer to steganalysis are:
1. RBFNN trained by the LG-Steganalyzer is robust to real-world situations, e.g. difference in quantization tables and difference in content between training and testing images.
2. The proposed LG-Steganalyzer could be used with any compression quantization table, and any steganalysis feature extraction technique.
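The idea of training an RBFNN against both a training error and a sensitivity term can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's two-phase algorithm: the sensitivity is estimated by Monte Carlo as the mean squared output change under perturbations drawn uniformly from $[-q, q]^n$, the two terms are combined by a plain sum, and the data, centers, candidate widths and all function names are hypothetical.

```python
import numpy as np

def rbf_hidden(X, centers, width):
    """Gaussian hidden-layer activations, shape (n_samples, n_centers)."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def rbfnn_output(X, centers, width, weights):
    """Network output: weighted sum of the hidden activations."""
    return rbf_hidden(X, centers, width) @ weights

def fit_output_weights(X, y, centers, width):
    """Least-squares fit of the output weights for fixed centers and width."""
    H = rbf_hidden(X, centers, width)
    weights, *_ = np.linalg.lstsq(H, y, rcond=None)
    return weights

def stochastic_sensitivity(X, centers, width, weights, q, n_draws=200, seed=0):
    """Monte Carlo estimate of E[(f(x + dx) - f(x))^2], dx uniform in [-q, q]^n."""
    rng = np.random.default_rng(seed)
    base = rbfnn_output(X, centers, width, weights)
    total = 0.0
    for _ in range(n_draws):
        dx = rng.uniform(-q, q, size=X.shape)
        diff = rbfnn_output(X + dx, centers, width, weights) - base
        total += np.mean(diff ** 2)
    return total / n_draws

def objective(X, y, centers, width, q):
    """Training MSE plus estimated sensitivity (plain sum is an assumption)."""
    weights = fit_output_weights(X, y, centers, width)
    mse = np.mean((rbfnn_output(X, centers, width, weights) - y) ** 2)
    return mse + stochastic_sensitivity(X, centers, width, weights, q)

# Toy data standing in for steganalysis features (labels: +1 stego, -1 clean).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(-1.0, 0.5, (40, 2)),
               data_rng.normal(1.0, 0.5, (40, 2))])
y = np.concatenate([-np.ones(40), np.ones(40)])
centers = np.array([[-1.0, -1.0], [1.0, 1.0]])

# Pick the width with the smallest combined objective among a few candidates.
widths = [0.3, 1.0, 3.0]
scores = {width: objective(X, y, centers, width, q=0.2) for width in widths}
best_width = min(scores, key=scores.get)
```

The design point is that a candidate network is scored not only by how well it fits the training images but also by how much its output moves when the features are perturbed, so a network with a slightly higher training error can still be preferred if it is less sensitive.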
This paper is organized as follows: Section 2 provides a brief introduction to steganalysis and JPEG quantization tables. The LG-Steganalyzer is described in Section 3. Experimental results are presented in Section 4. Section 5 concludes this work.
Section snippets
Steganalysis and quantization table
We first provide a brief introduction to current steganalysis methods in Section 2.1. Section 2.2 discusses the importance of quantization tables in steganalysis. Section 2.3 demonstrates input perturbations caused by changes in quantization tables and image contents.
LG-Steganalyzer
Fig. 6 shows the functional blocks of the LG-Steganalyzer. The two-phase training component runs only when the LG-Steganalyzer is first trained or whenever the RBFNN needs to be updated. The steganalysis feature extraction method is selected by the user and is transparent to the LG-Steganalyzer. The binary RBFNN classifier is trained via minimization of the Localized Generalization Error Model (L-GEM) [22] to learn the two-class classification of stego and clean images, with a given set of
Experimental results
As noted above, it is impossible to restrict the imaging source of the JPEG images examined by a trained steganalysis system. Therefore, the quantization tables used to compress unseen images are rarely the same as those used to compress the training images. In the experiments, we first compare testing accuracies when the training and testing images are compressed with the same quantization table. To simulate the diversity of possible quantization tables in the
Conclusion
In real-world applications of a steganalysis system, we cannot restrict the imaging source of the JPEG images transmitted over the Internet. This work focuses on the different quantization tables used by different software and cameras. The difference in steganalysis features created by different quantization tables is formulated as a feature perturbation. A sensitivity measure is defined to quantify the effect of the feature perturbation on a classifier. A two-phase training algorithm for
Acknowledgement
This work is supported by the National Natural Science Foundation of China (61272201) and the Program for New Century Excellent Talents in University of China (NCET-11-0162).
References (28)
- et al. Steganography for MP3 audio by exploiting the rule of window switching. Comput. Security (2012)
- et al. A new steganography algorithm based on color histograms for data embedding into raw video streams. Comput. Security (2009)
- et al. Image complexity and feature mining for steganalysis of least significant bit matching steganography. Inform. Sci. (2008)
- et al. Steganalysis and payload estimation of embedding in pixel differences using neural networks. Pattern Recogn. (2010)
- et al. Feature mining and pattern classification for steganalysis of LSB matching steganography in grayscale images. Pattern Recogn. (2008)
- et al. An improved approach to steganalysis of JPEG images. Inform. Sci. (2010)
- et al. Dynamic fusion method using localized generalization error model. Inform. Sci. (2012)
- et al. Image classification with the use of radial basis function neural networks and the minimization of the localized generalization error. Pattern Recogn. (2007)
- et al. Radial basis function network learning using localized generalization error bound. Inform. Sci. (2009)
- Model-based methods for steganography and steganalysis. Int. J. Image Graph. (2005)
- Statistically undetectable JPEG steganography: dead ends, challenges, and opportunities
- Model based steganography
- Steganalysis with mismatched covers: do simple classifiers help?
Cited by (19)
RCDD: Contrastive domain discrepancy with reliable steganalysis labeling for cover source mismatch
2024, Expert Systems with Applications
Sensitivity based robust learning for stacked autoencoder against evasion attack
2017, Neurocomputing
Citation Excerpt: The sensitivity measure is defined as the change of learner's outputs when the input has a small fluctuation. It has been shown that methods with sensitivity measure achieve good performances in many applications, for example, classifier training [25], dynamic fusion [26] and model selection [27]. The algorithm can be applied to the unsupervised representation learning and the supervised fine-tuning phase for the stacked autoencoder.
An adaptive secret image sharing with a new bitwise steganographic property
2016, Information Sciences
Citation Excerpt: Authors of [50,51] surveyed state of the art data-based methods to earn embedded data in text-type data sets. Similarly, efforts to extract concealed data in steganography are named steganalysis [33–35]. The aforesaid schemes and similar ones, such as [4,7,46], used Least Significant Bit (LSB), which cannot withstand even simple steganalysis algorithms.
New framework for unsupervised universal steganalysis via SRISP-aided outlier detection
2016, Signal Processing: Image Communication
Citation Excerpt: However, the performance of these approaches is still inferior to that of the matched scenario. Ng et al. [31] considered feature perturbations caused by the difference in JPEG quantization tables and proposed a steganalysis classifier based on radial basis function neural network by minimizing the defined sensitivity. This classifier is different from the aforementioned ones and has been verified to outperform other learning methods such as SVM and radial basis function neural network without considering feature perturbation.
Cover-Source Mismatch in Steganalysis: Systematic Review
2024, Research Square