Elsevier

Information Sciences

Volume 281, 10 October 2014, Pages 211-224
Steganalysis classifier training via minimizing sensitivity for different imaging sources

https://doi.org/10.1016/j.ins.2014.05.028

Abstract

Owing to the ever-growing proliferation of digital cameras and image editing software, a large variety of JPEG quantization tables are used to compress JPEG images. As a result, learning-based steganalysis methods that use a pre-selected quantization table for training images degrade significantly when the quantization table of the testing images differs from the one used for training. Recognizing that it would be undesirable and impractical to train a steganalysis classifier with all possible quantization tables, we propose an approach in which the differences in features extracted from images with different quantization tables are formulated as perturbations of those features. We then define a stochastic sensitivity as the expected square of the classifier output changes with respect to these feature perturbations, which measures the robustness of a classifier to such perturbations. A Radial Basis Function Neural Network based steganalysis classifier trained by minimizing the sensitivity is proposed. Experimental results show that the proposed method outperforms learning methods such as Support Vector Machine and Radial Basis Function Neural Network that do not consider feature perturbations.

Introduction

Steganography presents a potential security threat to society in general and to corporations in particular. Embedded messages, named stego messages, are hidden in digital media such as images [1], [2], [3], audio [4] and video [5] for secret communication. Steganalysis is a technique used to determine whether a digital medium contains a stego message or not. Current learning-based steganalysis methods consist of two major components: a feature extractor and a classifier. Steganalysis classifiers are trained on a set of images consisting of both clean and stego images. Among the different types of digital media, the JPEG image is the most widely used on the Internet. Therefore, JPEG is a favorable carrier for steganography. In most JPEG steganalysis work, however, both the training and testing datasets use JPEG images compressed by the same quantization table. When different compression quantization tables are used for the training and testing image sets, the performance of the steganalysis classifier degrades significantly [6], [7].

With the ever-growing proliferation of digital cameras and image editing software available today, JPEG images on the Internet are compressed by many different quantization tables [8]. Moreover, a growing number of digital camera manufacturers, such as Sony, Nikon and Pentax, adopt variable quantization tables that are computed dynamically from the image content. Re-compressing an image that was compressed with an unseen quantization table, using the quantization table from training, cannot directly recover the image, and the extra quantization step changes the steganalysis features of the JPEG image [9]. Therefore, it would be unreasonable and impractical to assume prior knowledge of the compression quantization table of an unseen image being examined for a stego message.
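The effect of the extra quantization step can be illustrated with a minimal numerical sketch. It is not taken from the paper: the flat tables with steps 7 and 10, the synthetic coefficient block, and the helper names `quantize`, `dequantize` and `requantize` are all assumptions made for illustration, and real JPEG tables vary by frequency.

```python
import numpy as np

def quantize(dct_block, table):
    """Quantize DCT coefficients: divide by the table entries and round."""
    return np.rint(dct_block / table).astype(int)

def dequantize(q_block, table):
    """Map quantized integers back to approximate DCT coefficient values."""
    return q_block * table

def requantize(dct_block, first_table, second_table):
    """Quantize with first_table, dequantize, then quantize with second_table."""
    decoded = dequantize(quantize(dct_block, first_table), first_table)
    return quantize(decoded, second_table)

# Hypothetical 8x8 block of unquantized DCT coefficients (values -32..31).
dct_block = np.arange(-32, 32, dtype=float).reshape(8, 8)

# Hypothetical flat quantization tables (assumption, for illustration only).
table_unseen = np.full((8, 8), 7.0)   # table the image was compressed with
table_train = np.full((8, 8), 10.0)   # table used for the training images

direct = quantize(dct_block, table_train)
detour = requantize(dct_block, table_unseen, table_train)

# Count coefficients that differ between the two paths. For example, the
# coefficient 16 quantizes to 2 directly, but to 1 after the detour
# (16 -> 2*7 = 14 -> round(1.4) = 1).
changed = int(np.count_nonzero(direct != detour))
print("coefficients changed by requantization:", changed)
```

Any feature computed from the quantized coefficients inherits these differences, which is why re-compression with the training table does not reproduce the training conditions.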

In real-world situations, both the images and the quantization tables could differ from those used for training the classifier. In addition to the quantization table difference, the difference in image content also affects the performance of steganalysis [10]. As a result, the extracted steganalysis feature values will differ from those of the training images [7]. These differences are unavoidable and can be treated as perturbations in the features. The robustness of a steganalysis classifier with respect to feature perturbations is therefore essential to its performance.

Current steganalysis methods make use of off-the-shelf classification methods such as neural networks [11], [12], Support Vector Machines (SVM) [13], [14], dynamic evolving neural fuzzy inference systems [15] and ensembles of classifiers [16], [17]. However, none of them addresses the issue of perturbation between training and testing images.

In this work, we propose a Localized Generalization Error Steganalysis classifier (LG-Steganalyzer), which is robust to images compressed by quantization tables different from that of the training images. Such a situation is unavoidable in real-world applications. A Radial Basis Function Neural Network (RBFNN) is trained by minimizing both a training error and a stochastic sensitivity. The RBFNN is selected because of its fast training speed on large data, which is important for dealing with network security problems. The stochastic sensitivity is proposed to capture the influence on the RBFNN classification of the feature perturbations created by changes in the JPEG quantization table of the testing images. With the proposed training method, the LG-Steganalyzer provides better robustness of steganalysis in real-world applications. Major contributions of the LG-Steganalyzer to steganalysis are:

  1. The RBFNN trained by the LG-Steganalyzer is robust to real-world situations, e.g. differences in quantization tables and differences in content between training and testing images.

  2. The proposed LG-Steganalyzer could be used with any compression quantization table and any steganalysis feature extraction technique.
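The stochastic sensitivity described above, the expected square of the classifier output change under feature perturbations, can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Gaussian hidden units, perturbations drawn uniformly from a hypercube of half-width `q` (a hypothetical parameter), and a Monte Carlo estimate in place of any closed-form expression. The function and variable names are also hypothetical.

```python
import numpy as np

def rbfnn_output(x, centers, widths, weights):
    """Output of a Gaussian RBF network for each row of x."""
    # Squared distances between samples and centers: (n_samples, n_centers).
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    hidden = np.exp(-sq_dist / (2.0 * widths ** 2))
    return hidden @ weights

def stochastic_sensitivity(x, centers, widths, weights, q, n_draws=200, seed=0):
    """Monte Carlo estimate of E[(f(x + dx) - f(x))^2].

    Assumption: each component of dx is uniform in [-q, q].
    """
    rng = np.random.default_rng(seed)
    base = rbfnn_output(x, centers, widths, weights)
    total = 0.0
    for _ in range(n_draws):
        dx = rng.uniform(-q, q, size=x.shape)
        perturbed = rbfnn_output(x + dx, centers, widths, weights)
        total += np.mean((perturbed - base) ** 2)
    return total / n_draws

# Synthetic stand-ins for steganalysis features and network parameters.
rng = np.random.default_rng(1)
features = rng.normal(size=(50, 4))
centers = rng.normal(size=(5, 4))
widths = np.ones(5)
weights = rng.normal(size=5)

sens_zero = stochastic_sensitivity(features, centers, widths, weights, q=0.0)
sens_small = stochastic_sensitivity(features, centers, widths, weights, q=0.1)
print("sensitivity at q=0.0:", sens_zero)
print("sensitivity at q=0.1:", sens_small)
```

With no perturbation the estimate is zero, and it becomes positive once the features are allowed to move. A training procedure in the spirit of the text would select network parameters that keep both the training error and this quantity small; how the two terms are combined is not specified here.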

This paper is organized as follows: Section 2 provides a brief introduction on steganalysis and JPEG quantization tables. The LG-Steganalyzer is described in Section 3. Experimental results are presented in Section 4. Section 5 concludes this work.

Section snippets

Steganalysis and quantization table

We first provide a brief introduction to current steganalysis methods in Section 2.1. Section 2.2 discusses the importance of quantization tables in steganalysis. Section 2.3 demonstrates input perturbations caused by changes in quantization tables and image contents.

LG-Steganalyzer

Fig. 6 shows the functional blocks of the LG-Steganalyzer. The two-phase training component is used only when the LG-Steganalyzer is first trained or whenever an update to the RBFNN is necessary. The steganalysis feature extraction method is selected by the user and is transparent to the LG-Steganalyzer. The binary RBFNN classifier is trained via a minimization of the Localized Generalization Error Model (L-GEM) [22] to learn the two-class classification of stego and clean images, with a given set of

Experimental results

As mentioned above, it is impossible to restrict the imaging source of the JPEG images investigated by a trained steganalysis system. Therefore, the quantization tables used in compressing the unseen images are rarely the same as the ones used for compressing the training images. In the experiments, we will first compare the testing accuracies for both training and testing images compressed by the same quantization table. To simulate the diversity of possible quantization tables in the

Conclusion

In real-world applications of a steganalysis system, we cannot restrict the imaging source of the JPEG images being transmitted over the Internet. The discussion in this work focuses on the different quantization tables used by different software and cameras. The steganalysis feature difference created by different quantization tables is formulated as a feature perturbation. Sensitivity is defined to measure the effect of the feature perturbation on a classifier. A two-phase training algorithm for

Acknowledgement

This work is supported by the National Natural Science Foundation of China (61272201) and the Program for New Century Excellent Talents in University of China (NCET-11-0162).

References (28)

  • J. Fridrich et al., Statistically undetectable JPEG steganography: dead ends, challenges, and opportunities

  • P. Sallee, Model based steganography

  • Z. Khan, A.B. Mansoor, An analysis of quality factor on image steganalysis, in: The 7th International Conference on...

  • I. Lubenko et al., Steganalysis with mismatched covers: do simple classifiers help?

Cited by (19)

    • Sensitivity based robust learning for stacked autoencoder against evasion attack

      2017, Neurocomputing
      Citation Excerpt :

      The sensitivity measure is defined as the change of learner’s outputs when the input has a small fluctuation. It has been shown that methods with sensitivity measure achieve good performances in many applications, for example, classifier training [25], dynamic fusion [26] and model selection [27]. The algorithm can be applied to the unsupervised representation learning and the supervised fine-tuning phase for the stacked autoencoder.

    • An adaptive secret image sharing with a new bitwise steganographic property

      2016, Information Sciences
      Citation Excerpt :

      Authors of [50,51] surveyed state of the art data-based methods to earn embedded data in text-type data sets. Similarly, efforts to extract concealed data in steganography are named steganalysis [33–35]. The aforesaid schemes and similar ones, such as [4,7,46], used Least Significant Bit (LSB), which cannot withstand even simple steganalysis algorithms.

    • New framework for unsupervised universal steganalysis via SRISP-aided outlier detection

      2016, Signal Processing: Image Communication
      Citation Excerpt :

      However, the performance of these approaches is still inferior to that of the matched scenario. Ng et al. [31] considered feature perturbations caused by the difference in JPEG quantization tables and proposed a steganalysis classifier based on radial basis function neural network by minimizing the defined sensitivity. This classifier is different from the aforementioned ones and has been verified to outperform other learning methods such as SVM and radial basis function neural network without considering feature perturbation.
