Digital image integrity – a survey of protection and verification techniques
Introduction
The increasing ease of editing digital images has important implications for our trust in photography. Preparing convincing forgeries is within reach of mainstream consumers, and new image editing technologies harness state-of-the-art developments in artificial intelligence to further improve the quality and photorealism of photo manipulations. Some of the most noteworthy recent developments include fully (or nearly) automatic techniques for: (a) beautifying faces with Adobe Sensei [1]; (b) changing facial expression or age with FaceApp [2]; (c) replacing skies and matching scene lighting with Adobe Sky Replace [3]; (d) changing the visual style of photographs by example (e.g., for weather or time-of-day hallucination) with Deep Photo Style Transfer [4]. Many of these techniques are available not only as easy-to-use third-party applications, but also as default mechanisms in our smart devices (e.g., see face beautification features in Huawei or Xiaomi smartphones [5]). Given that humans are unreliable in identifying fake images [6], automatic and accurate forgery detection schemes are of critical importance in a number of applications.
Researchers in academia and industry have devised four general approaches to image authentication: (a) digital signatures; (b) authentication watermarks; (c) forensic analysis; (d) phylogeny reconstruction. Digital image signatures rely on standard cryptographic primitives and should be computed upon image acquisition, ideally by the digital camera itself [7], [8]. The signatures are typically attached to the image as meta-data and allow for verifying the authenticity of a binary image representation. Such an approach lacks flexibility, as the bit-stream is highly fragile and changes when the image is converted to a different format during subsequent transmission or storage. This problem is addressed by robust hash functions, which aim to be sensitive to important semantic changes in the content, while remaining robust to unintentional, global post-processing like brightness adjustments or lossy compression [9].
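The contrast between a fragile bit-stream signature and a robust hash can be illustrated with a minimal sketch. It is not taken from any of the surveyed schemes: an HMAC from the Python standard library stands in for a true digital signature, and the hypothetical `block_mean_hash` is a toy robust hash that thresholds block averages against the global average. Images are assumed to be lists of rows with 8-bit grayscale values and dimensions divisible by the grid size.

```python
import hashlib
import hmac


def strict_signature(pixels, key):
    """Keyed digest of the exact pixel bytes (stand-in for a digital signature)."""
    data = bytes(v for row in pixels for v in row)
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def block_mean_hash(pixels, grid=4):
    """Toy robust hash: one bit per grid cell, set when the cell is brighter
    than the image as a whole. Integer arithmetic keeps comparisons exact."""
    h, w = len(pixels), len(pixels[0])
    ch, cw = h // grid, w // grid
    sums = [
        sum(pixels[y][x]
            for y in range(gy * ch, (gy + 1) * ch)
            for x in range(gx * cw, (gx + 1) * cw))
        for gy in range(grid)
        for gx in range(grid)
    ]
    total = sum(sums)
    # Equal-sized cells: cell mean > global mean  <=>  cell_sum * n_cells > total.
    return [1 if s * len(sums) > total else 0 for s in sums]


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    key = b"demo-key"  # assumption: shared secret, for illustration only
    original = [[8 * x + 4 * y for x in range(16)] for y in range(16)]
    brighter = [[v + 10 for v in row] for row in original]  # global adjustment
    tampered = [row[:] for row in original]
    for y in range(4):                                      # local content change
        for x in range(4):
            tampered[y][x] = 255

    same = strict_signature(original, key) == strict_signature(brighter, key)
    print("strict signature survives brightening:", same)
    print("hash distance after brightening:",
          hamming(block_mean_hash(original), block_mean_hash(brighter)))
    print("hash distance after tampering:",
          hamming(block_mean_hash(original), block_mean_hash(tampered)))
```

In this toy setting the strict signature changes after a uniform brightness shift, while the block-mean hash is unaffected by the shift and changes only when a region is overwritten. Practical robust hashes must also tolerate lossy compression and resist deliberate collisions, which this sketch does not attempt.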
In contrast to the easily removable meta-data-located signatures, authentication watermarks are irreversibly embedded in the image by means of carefully designed, imperceptible changes in the image content. A dedicated decoder analyzes consistency of the watermark, and verifies the integrity of the investigated image or its individual regions. Authentication watermarks deliver advanced features like precise localization of a forgery [9] or even approximate restoration of the original appearance [10] – both based solely on the visual content of a potentially doctored image.
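The idea of embedding verification data in the content itself, and of localizing a forgery from the image alone, can be sketched as follows. This is a deliberately simple, hypothetical design and not one of the schemes cited above: for each 4×4 block, a keyed SHA-256 digest of the seven most significant bits of its pixels is written into the pixels' least significant bits, and the decoder flags every block whose stored bits no longer match. The names `embed`, `find_tampered_blocks` and the block size are assumptions made for illustration; images are lists of rows with 8-bit values and dimensions divisible by the block size.

```python
import hashlib

BLOCK = 4  # block side in pixels (illustrative choice)


def block_origins(pixels):
    """Top-left (row, column) coordinates of all non-overlapping blocks."""
    return [(by, bx)
            for by in range(0, len(pixels), BLOCK)
            for bx in range(0, len(pixels[0]), BLOCK)]


def expected_bits(pixels, by, bx, key):
    """One watermark bit per pixel, derived from the key, the block position,
    and the 7 most significant bits of the block (LSBs are ignored)."""
    msb = bytes(pixels[y][x] >> 1
                for y in range(by, by + BLOCK)
                for x in range(bx, bx + BLOCK))
    digest = hashlib.sha256(key + f"{by},{bx}".encode() + msb).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(BLOCK * BLOCK)]


def embed(pixels, key):
    """Encoder: return a copy whose LSBs carry the per-block watermark."""
    out = [row[:] for row in pixels]
    for by, bx in block_origins(out):
        bits = iter(expected_bits(out, by, bx, key))
        for y in range(by, by + BLOCK):
            for x in range(bx, bx + BLOCK):
                out[y][x] = (out[y][x] & 0xFE) | next(bits)
    return out


def find_tampered_blocks(pixels, key):
    """Decoder: list the origins of blocks whose LSBs fail verification."""
    bad = []
    for by, bx in block_origins(pixels):
        stored = [pixels[y][x] & 1
                  for y in range(by, by + BLOCK)
                  for x in range(bx, bx + BLOCK)]
        if stored != expected_bits(pixels, by, bx, key):
            bad.append((by, bx))
    return bad


if __name__ == "__main__":
    key = b"demo-key"  # assumption: secret shared by encoder and decoder
    image = [[8 * x + 4 * y for x in range(16)] for y in range(16)]
    protected = embed(image, key)
    forged = [row[:] for row in protected]
    forged[5][6] ^= 1  # smallest possible modification of one pixel
    print("untouched image:", find_tampered_blocks(protected, key))
    print("forged image:", find_tampered_blocks(forged, key))
```

Embedding changes each pixel by at most one intensity level, and the decoder needs only the key and the inspected image. With 16 bits per block, a content change inside a block would go unnoticed with a probability of roughly 2^-16. Block-wise independent designs of this kind are the target of the counterfeiting attacks listed among the references, so the sketch shows the principle only.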
The main limitation of the above techniques, collectively known as active protection techniques, is the need to protect the images upon their acquisition by a secure camera [11] or upon registration in a media repository. Such an approach is not feasible in many applications, and is best suited for strict document work-flows. In general, such side information is not available and image authentication can only be performed with passive forensic analysis. As a result, in recent years we have witnessed a massive shift of interest towards passive techniques from both the research community and prospective end users.
Despite the variety of existing techniques, the field of digital image forensics has yet to deliver fully automatic and reliable tampering detection and localization schemes. Even the most mature techniques work only in limited conditions, and remain vulnerable to relatively simple malicious attacks, referred to as anti-forensics. A recent evaluation of state-of-the-art methods on uncontrolled images harvested from the Internet has demonstrated their poor performance and unpredictable behavior when dealing with random images of unknown origin and processing history [12]. In such conditions, the trend seems to be moving towards image phylogeny, which aims to identify various versions of a given image within a large photo collection (e.g., available online) and to reconstruct their mutual relationships and processing history.
DARPA has recently launched the MediFor program in order to advance the development of media forensics and image phylogeny techniques [13]. One of the program's goals is to prepare a large realistic test corpus and organize a series of image forensics challenges. Early results from the most recent 2017 edition [14] seem to corroborate the above findings that existing forensic detectors perform poorly outside of laboratory conditions [15].
The goal of this paper is to provide a thorough survey of the most important developments in the field of digital image authentication with emphasis on image integrity protection and verification techniques. The distinction between image authenticity and integrity is not always clear in scientific communication, yet it bears particular importance in legal proceedings, as exemplified by the recommendations of the Scientific Working Group on Imaging Technology (SWGIT) [16], which used to be tasked with drafting recommendations for law enforcement agencies.
Image integrity involves ensuring that the content represented by the image is the same as at the time of its acquisition. Image authenticity is a more general term which refers to the truthfulness of the presented scene in a broader context. Hence, authenticity takes into account the possibility of using an unaltered image in an incorrect context (e.g., taken at a different time) or of staging the photographed scene. As a result, authentication encompasses not only content integrity verification, but also source attribution and context verification. Source attribution addresses problems of confirming the origin of the investigated images, and includes aspects like: (a) confirmation that a given photograph was taken with a specific camera; (b) confirmation that the image is a photograph and not a photo-realistic computer graphic image; (c) detection of whether the investigated photograph was re-captured from a print-out or a computer screen. Context verification may include analysis of similar images from social media or other photographers, as well as integration of external sources of knowledge, e.g., consistency of weather with the alleged time of capture. The discussed taxonomy is shown in Fig. 1. For the sake of consistency, in this paper I follow the usual convention in the research community and use the terms image integrity and image authenticity interchangeably. The intended meaning will be clear from the context.
In contrast to recent surveys of image forensics [17], [18], [19], [20], I will address both active and passive approaches to image authentication. My aim is not to review all reported variations of possible analysis techniques. Instead, I analyze and compare state-of-the-art approaches with respect to: (a) analysis capabilities; (b) fundamental limitations; (c) documented vulnerabilities; (d) maturity and availability of software tools. The paper draws on my experience in designing and implementing both active and passive image authentication schemes.
Compared to previous surveys, this paper is structured differently. Discussion of each authentication approach covers all relevant aspects, from analysis capabilities to documented attack vectors. For the sake of clarity, longer descriptions are divided into separate sub-sections. Selected techniques are illustrated with operation examples. I generated all of the figures myself using either my own or publicly available implementations. Tampering examples come from a recent dataset with realistic forgeries [21]. The dataset includes uncompressed and non-resized images from various cameras, which makes it possible to demonstrate the behavior of many acquisition traces.
Whenever possible, the discussion covers available software tools, be they commercial or academic. I believe such a survey gives a clear picture of the maturity, applicability and reliability of specific protection/verification schemes. I also believe that such a survey is of particular interest now, when we are on the verge of a revolution in imaging technology. The emerging multi-sensor and multi-lens cameras will render many existing traces useless, and will require another look at our approach to image authentication.
The remaining part of this paper is organized as follows (see Fig. 1 for a compact visualization of the problem taxonomy and the corresponding paper structure). First, I discuss active protection techniques based on digital signatures (Section 2) and authentication watermarking (Section 3). Then, I introduce a model of the image acquisition pipeline, and review various traces that can be used for blind forensic analysis (Section 4). Image phylogeny techniques are presented in Section 5. In Section 6, I briefly describe some recently proposed alternative approaches to image authentication that do not directly fall into any of the discussed classes. In Section 7, I review resources available in the research community, including publicly available datasets and software tools. I conclude and discuss open problems and future research perspectives in Section 8.
Section snippets
Signature-based verification techniques
This section discusses techniques based on generic cryptographic signatures (Section 2.1), robust image signatures (Section 2.2) and more advanced signatures based on distributed source coding (Section 2.3). In all of these techniques, the signature is separated from the image content. It is either stored as easily-removable meta-data, or delivered on demand during online content authentication.
Watermarking-based protection techniques
This section describes active protection methods based on authentication watermarking. The techniques discussed here bear resemblance to digital signatures (Section 2) but rely on a different carrier. Instead of easily removable meta-data, a digital watermark is used for embedding necessary side information directly into the image content.
A general work-flow for watermarking-based protection techniques is shown in Fig. 2. A typical system consists of two modules: (1) an encoder – responsible
Forensic verification techniques
The necessity to actively protect digital images is in many cases an excessively restrictive and impractical requirement. In order to address this issue, scholars have studied passive authentication techniques which exploit intrinsic fingerprints introduced into the photographs during their acquisition or manipulation. These forensic traces make it possible to reason about the origin, processing history and authenticity of the captured images. A schematic view of the digital photo acquisition pipeline is
General information & analysis capabilities
The problem of image phylogeny is an emerging research direction aiming to recover the relationships and processing history of various versions of an image [202], [203], [204]. In case of a forgery, the identified alternative copies may be used as a reference during authentication. Depending on the application at hand, it may be necessary to first identify candidate, near-duplicated images within a larger photo collection. An example application involves analysis of digital photographs that
Alternative experimental techniques
The techniques described in Sections 2–4 represent the most mature approaches to digital image authentication, both from the perspective of active protection and passive verification. However, researchers are still working on alternative ways of addressing the problem. Naveh and Tromer recently described a prototype of an image authentication system based on a Proof-Carrying Data paradigm [210]. Their approach involves definition of a set of allowed image transformations, which generate a proof
Tools & resources
This section summarizes some of the resources available within the multimedia security community. I briefly introduce publicly available datasets and comprehensive image forensics toolkits. Due to the lack of standardized datasets and publicly available tools for active protection techniques, I focus on passive image forensics.
Summary & discussion
Development of techniques for protection or verification of digital image integrity has been an active research topic for nearly two decades. Available solutions have evolved from general image-agnostic signatures, through sophisticated active protection mechanisms capable of tampering localization and content recovery, up to passive forensic methods that analyze imperceptible telltales left in the image during its acquisition or subsequent post-processing. An emerging field of image phylogeny
References (233)
- et al., Digital image forgery detection using passive techniques: a survey, Digit. Investig. (2013)
- et al., A bibliography of pixel-based blind image forgery detection techniques, Signal Process. Image Commun. (2015)
- et al., Alterable-capacity fragile watermarking scheme with restoration capability, Opt. Commun. (2012)
- et al., Four-scanning attack on hierarchical digital watermarking method for image tamper detection and recovery, Pattern Recognit. (2008)
- et al., A secure and improved self-embedding algorithm to combat digital document forgery, Signal Process. (2009)
- Adobe Sensei, http://www.adobe.com/sensei.html, visited 11 Apr....
- FaceApp, https://www.faceapp.com/, visited 11 Apr....
- Adobe Sky Replace
- et al., Deep photo style transfer
- Xiaomi Selfie Beautification, http://www.businessinsider.com/xiaomi-selfie-beautification-2015-2?IR=T, visited 11 Apr....
- Humans are easily fooled by digital images
- Nikon image authentication software
- Canon's original data security kit
- Digital Watermarking and Steganography
- Images with self-correcting capabilities
- Secure digital camera
- Large-scale evaluation of splicing localization algorithms for web images, Multimed. Tools Appl.
- DARPA media forensics program
- NIST Nimble Image Forensics Challenge 2017
- Spotting the difference: context retrieval and analysis for improved forgery detection and localization
- Information forensics: an overview of the first decade, IEEE Access
- An overview on image forensics
- Realistic Forgery Dataset
- The trustworthy digital camera: restoring credibility to the photographic image, IEEE Trans. Consum. Electron.
- OpenSSL
- Motorola MTP6750 Tetra Portable Radio
- Photo Proof by KeeeX
- Canon original data security system compromised: ElcomSoft discovers vulnerability
- Elcomsoft discovers vulnerability in Nikon's image authentication system
- Multi-scale difference map fusion for tamper localization using binary ranking hashing, IEEE Trans. Inf. Forensics Secur.
- Content-based image authentication: current status, issues, and challenges, Int. J. Inf. Secur.
- Robust and secure image hashing via non-negative matrix factorizations, IEEE Trans. Inf. Forensics Secur.
- A robust image authentication method distinguishing JPEG compression from malicious manipulation, IEEE Trans. Circuits Syst. Video Technol.
- Robust hash functions for digital watermarking
- Robust and secure image hashing, IEEE Trans. Inf. Forensics Secur.
- A set theoretic framework for watermarking and its application to semifragile tamper detection, IEEE Trans. Inf. Forensics Secur.
- Unicity distance of robust image hashing, IEEE Trans. Inf. Forensics Secur.
- pHash – the Open Source Perceptual Hash Library
- Image authentication using distributed source coding, IEEE Trans. Image Process.
- Fragile watermarking with error free restoration capability, IEEE Trans. Multimed.
- Statistical fragile watermarking capable of locating individual tampered pixels, IEEE Signal Process. Lett.
- Watermarking of raw digital images in camera firmware, IPSJ Trans. Comput. Vis. Appl.
- Towards practical self-embedding for JPEG-compressed digital images, IEEE Trans. Multimed.
- Iterative filtering for semi-fragile self-recovery
- Performance analysis of a block-neighborhood-based self-recovery fragile watermarking scheme, IEEE Trans. Inf. Forensics Secur.
- Counterfeiting attacks on oblivious block-wise independent invisible watermarking schemes, IEEE Trans. Image Process.
- Self-recovery fragile watermarking using block-neighborhood tampering characterization
- Cryptoanalysis on majority-voting based self-recovery watermarking scheme
- Efficient method for content reconstruction with self-embedding, IEEE Trans. Image Process.
Paweł Korus received his M.Sc. and Ph.D. degrees in telecommunications (both with honors) from the AGH University of Science and Technology in 2008 and 2013, respectively. From 2015 to 2017 he was a post-doctoral researcher with the College of Information Engineering, Shenzhen University, Shenzhen, China. He is currently an assistant professor with the Department of Telecommunications, AGH University of Science and Technology, Krakow, Poland.
His research interests include various aspects of multimedia security and image processing, with particular focus on digital image forensics, content authentication, digital watermarking and information hiding. In 2015 he received a scholarship for outstanding young scientists from the Polish Ministry of Science and Higher Education.