Elsevier

Signal Processing

Volume 93, Issue 7, July 2013, Pages 1724-1737

Application of hypothesis testing theory for optimal detection of LSB matching data hiding

https://doi.org/10.1016/j.sigpro.2013.01.014

Abstract

This paper addresses the problem of detecting the presence of data hidden in digital media by the Least Significant Bit (LSB) matching scheme. In a theoretical context of known digital medium parameters, two important results are presented. First, the use of hypothesis testing theory allows the design of the Most Powerful (MP) test. Second, a study of the MP test makes it possible to calculate its statistical properties analytically, in order to guarantee a prescribed false-alarm probability. In practice, when detecting LSB matching, the unknown medium parameters have to be estimated. Based on a local model of medium content, two different estimators, which lead to two different tests, are presented. A numerical comparison with state-of-the-art detectors shows the good performance of the proposed tests and highlights the relevance of the proposed methodology.

Highlights

► Steganalysis is addressed using hypothesis testing theory. ► The statistical performance of the proposed test is analytically calculated. ► The proposed test guarantees a prescribed false-alarm probability. ► This provides an upper bound on the detection performance of any detector. ► Because it relies on general statistical concepts, it can be applied to a wide range of media.

Section snippets

Introduction and contributions

Steganography concerns the reliable transmission of a secret message buried in a host digital medium, such as a digital image or an audio file. This data hiding technique has mainly been used in information security applications and has received increasing interest in the past decade. While ciphered messages can easily be detected, the detection of data hidden within innocuous-looking digital media remains a difficult problem. More generally, the goal of steganalysis is to obtain any information

Statistical model of media

This paper studies uncompressed digital media and, without loss of generality, focuses on natural images, i.e. images recorded with some imaging device. Hence, the column vector $C=(c_1,\ldots,c_N)^T$ represents a medium of $N=N_x\times N_y$ pixels for a grayscale image. The set of quantized levels is denoted $\mathcal{Z}=\{0;\ldots;2^B-1\}$, as sample values are usually unsigned integers encoded with $B$ bits. Each cover sample $c_n$ results from the quantization: $c_n=Q_\Delta(y_n)$, where $y_n\in\mathbb{R}^+$ denotes the analog sample value recorded by the
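The quantization step $c_n=Q_\Delta(y_n)$ can be illustrated with a minimal sketch. The snippet does not define $Q_\Delta$ explicitly, so the floor-based uniform quantizer with saturation used here, as well as the names `quantize`, `delta` and `bits`, are assumptions made for illustration only.

```python
import numpy as np

def quantize(y, delta=1.0, bits=8):
    """Assumed uniform quantizer: map non-negative analog values to {0, ..., 2^bits - 1}."""
    y = np.asarray(y, dtype=float)
    levels = np.floor(y / delta)                  # assumption: floor to the cell index
    return np.clip(levels, 0, 2**bits - 1).astype(np.int64)  # saturate at the largest level

# Four analog values; the last one exceeds the 8-bit range and saturates.
cover = quantize([0.2, 17.9, 254.5, 300.0])
print(cover)  # [  0  17 254 255]
```

With `bits=8` the output always lies in the set of levels $\{0;\ldots;255\}$, whatever the analog input.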

Optimal Likelihood Ratio Test for simple hypotheses

Let us start with the simplest case, when the embedding rate R and, for all n, the parameters θn are known. In this case, the hypothesis testing problem (9) is reduced to a test between two simple hypotheses.

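By virtue of the Neyman–Pearson lemma, see [26, Theorem 3.2.1], the most powerful (MP) test over the class $\mathcal{K}_{\alpha_0}$ (10) is the LRT given by the following decision rule: $$\delta_R(Z)=\begin{cases}H_0 & \text{if } \Lambda_R(Z)\le\tau_{\alpha_0},\\ H_1 & \text{if } \Lambda_R(Z)>\tau_{\alpha_0},\end{cases}$$ where $\tau_{\alpha_0}$ is the solution of $P_0[\Lambda_R(Z)>\tau_{\alpha_0}]=\alpha_0$, to ensure that $\delta_R\in\mathcal{K}_{\alpha_0}$, and the likelihood ratio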
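A minimal sketch of such a likelihood ratio test between two simple hypotheses is given below. It is not the authors' implementation. It assumes that every pixel follows the same floor-quantized Gaussian (the paper uses per-pixel parameters $\theta_n$), that LSB matching at rate $R$ changes a pixel by $+1$ or $-1$ with probability $R/4$ each, and that the threshold is set empirically from simulated covers rather than analytically. The function names are hypothetical, and the log-likelihood ratio is used in place of $\Lambda_R$, which gives an equivalent test.

```python
import numpy as np
from math import erf, sqrt

def cover_pmf(mu, sigma, bits=8):
    """Pmf of N(mu, sigma^2) after floor quantization with unit step (assumed cover model)."""
    edges = np.arange(2**bits + 1)
    cdf = np.array([0.5 * (1.0 + erf((e - mu) / (sigma * sqrt(2.0)))) for e in edges])
    p = np.diff(cdf)
    return p / p.sum()                    # fold the negligible tail mass back in

def stego_pmf(p, rate):
    """Pmf after LSB matching at embedding rate `rate` (+1 or -1 with probability rate/4 each)."""
    q = (1.0 - rate / 2.0) * p
    q[1:] += rate / 4.0 * p[:-1]          # pixels moved up by one
    q[:-1] += rate / 4.0 * p[1:]          # pixels moved down by one
    q[1] += rate / 4.0 * p[0]             # level 0 cannot decrease, so it increases
    q[-2] += rate / 4.0 * p[-1]           # the top level cannot increase, so it decreases
    return q

def log_lr_table(p, q, tiny=1e-300):
    """Per-level log-likelihood ratio log(q[k] / p[k]), guarded against log(0)."""
    return np.log(np.maximum(q, tiny)) - np.log(np.maximum(p, tiny))

rng = np.random.default_rng(0)
N, alpha0, rate = 20000, 0.05, 1.0
p = cover_pmf(128.0, 1.5)
q = stego_pmf(p, rate)
table = log_lr_table(p, q)

# Empirical threshold: the (1 - alpha0) quantile of the statistic under H0.
null_stats = np.array([table[rng.choice(p.size, N, p=p)].sum() for _ in range(200)])
tau = np.quantile(null_stats, 1.0 - alpha0)

def decide(z):
    """Return 'H1' when the log-likelihood ratio exceeds the threshold, else 'H0'."""
    return "H1" if table[z].sum() > tau else "H0"

cover = rng.choice(p.size, N, p=p)        # simulated cover image
stego = rng.choice(q.size, N, p=q)        # simulated stego image
print(decide(cover), decide(stego))
```

The empirical quantile stands in for the threshold $\tau_{\alpha_0}$; the interest of an analytical study is precisely to avoid this simulation step.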

Case of simple hypotheses, when R=2

This section first studies the statistical performance for the case of simple hypotheses, when $R=2$. The results are then extended to the general case of $R\in(0;1]$ in Section 4.2. To calculate the statistical performance of the LR test $\delta_R$ (11) easily, the asymptotic approach is of crucial interest. Indeed, even though Theorem 1 establishes that $\upsilon_n$ tends to follow a Gamma distribution, it is not easy to express the distribution of the sum $\sum_{n=1}^{N}\upsilon_n$ explicitly, see [33] for a

Practical design of LR test: dealing with nuisance parameters

In practice, the application of the test $\tilde{\delta}_2$ (22) is compromised because neither the expectation $\mu_n$ nor the variance $\sigma_n^2$ of the samples is known. In such a situation, a usual solution consists in replacing the unknown values by their Maximum Likelihood Estimates (MLE), denoted $\hat{\mu}_n$ and $\hat{\sigma}_n^2$ respectively, to design a Generalized Likelihood Ratio Test (GLRT).
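As a rough illustration of plugging in estimates, the sketch below computes block-wise Gaussian MLEs of the expectation and variance. The snippet does not show the paper's local model, so treating the parameters as constant within each non-overlapping block is a simplifying assumption, and `blockwise_mle` and `block` are hypothetical names.

```python
import numpy as np

def blockwise_mle(z, block=8):
    """Gaussian MLE of mean and variance per non-overlapping block (assumed local model)."""
    z = np.asarray(z, dtype=float)
    n_blocks = z.size // block
    segments = z[: n_blocks * block].reshape(n_blocks, block)  # trailing samples are dropped
    mu_hat = segments.mean(axis=1)
    var_hat = segments.var(axis=1)        # ddof=0, i.e. the maximum likelihood estimate
    # Repeat each estimate so that every sample n gets its own (mu_hat_n, var_hat_n).
    return np.repeat(mu_hat, block), np.repeat(var_hat, block)

mu_hat, var_hat = blockwise_mle([1, 3, 5, 7], block=2)
print(mu_hat, var_hat)  # [2. 2. 6. 6.] [1. 1. 1. 1.]
```

A constant-per-block model is crude for natural images, where content varies within a block, which is one reason the estimation step matters for detection performance.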

However, accurate estimation of the parameters $\mu_n$ and $\sigma_n$ is difficult, yet necessary to obtain high detection performance. In 5.1

Numerical simulations

One of the main motivations for this paper is to show that the hypothesis testing theory can be applied in practice to design a reliable LSB matching detector.

As previously discussed, the reliability of the proposed tests heavily depends on the ability to predict theoretically the parameters of the proposed test in practice. To verify that the proposed test performs as established by Theorems 2, 3 and 4, a numerical simulation was performed on simulated data. The Monte-Carlo
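A simulation on synthetic data needs a way to generate stego media. The sketch below simulates LSB matching on a quantized Gaussian cover and checks the empirical change rate; it is not the paper's experimental setup. It assumes that each pixel carries a message bit independently with probability $R$ (an approximation of embedding along a pseudo-random path) and that message bits are uniform; the name `lsb_matching` and the cover parameters are chosen for illustration.

```python
import numpy as np

def lsb_matching(cover, rate, rng, bits=8):
    """Simulate LSB matching: when the LSB differs from the message bit, add +1 or -1 at random."""
    cover = np.asarray(cover, dtype=np.int64)
    top = 2**bits - 1
    used = rng.random(cover.shape) < rate             # assumption: Bernoulli pixel selection
    message = rng.integers(0, 2, size=cover.shape)    # i.i.d. uniform message bits
    mismatch = used & ((cover & 1) != message)
    step = rng.choice(np.array([-1, 1]), size=cover.shape)
    step = np.where(cover == 0, 1, step)              # level 0 can only move up
    step = np.where(cover == top, -1, step)           # the top level can only move down
    stego = cover + mismatch * step
    return stego, used, message

rng = np.random.default_rng(1)
rate = 0.5
analog = rng.normal(128.0, 10.0, size=100_000)
cover = np.clip(np.floor(analog), 0, 255).astype(np.int64)
stego, used, message = lsb_matching(cover, rate, rng)

change_rate = np.mean(stego != cover)
print(f"empirical change rate: {change_rate:.3f}, expected about {rate / 2:.3f}")
```

Since a used pixel already carries the right LSB half of the time, the fraction of modified pixels should be close to $R/2$.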

Conclusion and future works

A first step toward bridging the gap between hypothesis testing theory and steganalysis was recently proposed in [12], [7], [48]. This paper extends this first step to the case of LSB matching. By casting the problem of LSB matching steganalysis in the framework of hypothesis testing theory, the most powerful Likelihood Ratio Test is designed. Then, a thorough statistical study permits the analytical calculation of its performance in terms of the false-alarm probability and detection power. To apply

References (48)

  • D.C. Wu et al., A steganographic method for images by pixel-value differencing, Pattern Recognition Letters (2003)
  • T. Zhang et al., Steganalysis of LSB matching based on statistical modeling of pixel difference distributions, Information Sciences (2010)
  • T. Zhang et al., A new approach to reliable detection of LSB steganography in natural images, Signal Processing (2003)
  • P. Bas, T. Filler, T. Pevný, Break our steganographic system—the ins and outs of organizing boss, in: Information...
  • R. Böhme, Advanced Statistical Steganalysis (2010)
  • G. Cancelli, G. Doerr, M. Barni, I. Cox, A comparative study of ±1 steganalyzers, in: IEEE Workshop on Multimedia...
  • G. Cancelli, G. Doerr, I. Cox, M. Barni, Detection of ±1 LSB steganography based on the amplitude of histogram local...
  • R. Cogranne, C. Zitzmann, L. Fillatre, I. Nikiforov, F. Retraint, P. Cornu, A cover image model for reliable...
  • R. Cogranne, C. Zitzmann, L. Fillatre, F. Retraint, I. Nikiforov, P. Cornu, Statistical decision by using quantized...
  • R. Cogranne, C. Zitzmann, I. Nikiforov, F. Retraint, L. Fillatre, P. Cornu, Statistical detection of LSB matching in...
  • R. Cogranne, C. Zitzmann, F. Retraint, I. Nikiforov, L. Fillatre, P. Cornu, Statistical detection of LSB Matching Using...
  • I. Cox et al., Digital Watermarking and Steganography (2007)
  • O. Dabeer et al., Detection of hiding in the least significant bit, IEEE Transactions on Signal Processing (2004)
  • R. Dubes et al., Random field models in image analysis, Journal of Applied Statistics (1989)
With the financial support from the Prevention of and Fight against Crime Programme of the European Union European Commission - Directorate-General Home Affairs (2centre.eu project). Research partially funded by Troyes University of Technology (UTT) strategic program COLUMBO.