
Benchmarking Scientific Image Forgery Detectors

  • Original Research/Scholarship
  • Published in Science and Engineering Ethics

Abstract

The field of scientific image integrity presents a challenging research bottleneck given the lack of available datasets to design and evaluate forensic techniques. The sensitivity of the data also creates a legal hurdle that restricts the use of real-world cases to build any accessible forensic benchmark. In light of this, there is no comprehensive understanding of the limitations and capabilities of automatic image analysis tools for scientific images, which might create a false sense of data integrity. To mitigate this issue, we present an extendable open-source algorithm library that reproduces the most common image forgery operations reported by the research integrity community: duplication, retouching, and cleaning. Using this library and realistic scientific images, we create a large scientific image forgery benchmark (39,423 images) with enriched ground truth. All figures within the benchmark are synthetically doctored using images collected from Creative Commons sources; while collecting the source images, we ensured that they did not present any suspicious integrity problems. Because of the high number of papers retracted due to image duplication, this work evaluates state-of-the-art copy-move detection methods on the proposed dataset, using a new metric that asserts consistent match detection between the source and the copied region. All evaluated methods performed poorly on this dataset, indicating that scientific images might need a specialized copy-move detector. The dataset and source code are available at https://github.com/phillipecardenuto/rsiil.
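To make the duplication operation concrete, the following is a minimal sketch of how a copy-move forgery and its enriched ground truth could be generated; it is an assumption-level illustration, not the rsiil library's actual API, and every name in it is hypothetical.

```python
# Minimal sketch of a copy-move (duplication) forgery generator.
# NOTE: hypothetical illustration only -- this is not the rsiil API.
import numpy as np

def copy_move(image: np.ndarray, src_box, dst_xy, flip=False):
    """Copy a rectangular region of `image` and paste it elsewhere.

    src_box: (row, col, height, width) of the source region.
    dst_xy:  (row, col) top-left corner where the copy is pasted.
    Returns the forged image and a ground-truth map in which
    0 = pristine, 1 = source region, 2 = pasted copy.
    """
    forged = image.copy()
    gt = np.zeros(image.shape[:2], dtype=np.uint8)
    r, c, h, w = src_box
    patch = image[r:r + h, c:c + w].copy()
    if flip:  # horizontal flip, a common transform applied to the copy
        patch = patch[:, ::-1]
    dr, dc = dst_xy
    forged[dr:dr + h, dc:dc + w] = patch
    gt[r:r + h, c:c + w] = 1       # source of the duplication
    gt[dr:dr + h, dc:dc + w] = 2   # duplicated (copied) region
    return forged, gt
```

Keeping distinct IDs for the source and the copy, rather than a single binary mask, is what allows a metric to check that a detector matches the copied region back to its true source.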


Notes

  1. A biomedical experimental photograph resulting from a process to detect proteins in a cell or tissue.

  2. https://ori.hhs.gov/advanced-forensic-actions (Last access June 2022)

  3. https://imagetwin.ai (Last access June 2022)

  4. https://www.darpa.mil/program/media-forensics (Last access June 2022)

  5. http://celltracking.bio.nyu.edu (Last access June 2022)

  6. https://bbbc.broadinstitute.org (Last access June 2022)

  7. https://imagej.net/Adiposoft (Last access June 2022)

  8. https://idr.openmicroscopy.org (Last access June 2022)

  9. http://retractiondatabase.org (Last access June 2022)

  10. A non-profit organization affiliated with the Center for Scientific Integrity, dedicated to reporting and discussing cases of retracted papers and related issues.

  11. https://headt.eu/Image-Integrity-Database (Last access June 2022)

  12. https://www.stm-assoc.org/standards-technology/working-group-on-image-alterations-and-duplications (Last access June 2022)

  13. Code available at https://github.com/igorcmoura/inpaint-object-remover (Last access June 2022)

  14. https://creativecommons.org/publicdomain/zero/1.0 (Last access June 2022)

  15. https://creativecommons.org/licenses/by/4.0 (Last access June 2022)

  16. https://bbbc.broadinstitute.org (Last access June 2022)

  17. https://www.ncbi.nlm.nih.gov/pmc (Last access June 2022)

  18. https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist (Last access June 2022)

References

  • Al-Qershi, O. M., & Khoo, B. E. (2018). Evaluation of copy-move forgery detection: Datasets and evaluation metrics. Multimedia Tools and Applications, 77(24), 31807–31833. https://doi.org/10.1007/s11042-018-6201-4.

  • Amerini, I., Ballan, L., Caldelli, R., Bimbo, A. D., & Serra, G. (2011). A SIFT-based forensic method for copy–move attack detection and transformation recovery. IEEE Transactions on Information Forensics and Security, 6(3), 1099–1110. https://doi.org/10.1109/tifs.2011.2129512

  • Anderson, C. (1994). Easy-to-alter digital images raise fears of tampering. Science, 263(5145), 317–318. https://doi.org/10.1126/science.8278802

  • Andrade, R.d.O. (2021). Elisabeth Bik: On the trail of scientific fraud. https://revistapesquisa.fapesp.br/en/elisabeth-bik-on-the-trail-of-scientific-fraud/

  • Azoulay, P., Bonatti, A., & Krieger, J. L. (2017). The career effects of scandal: Evidence from scientific retractions. Research Policy, 46(9), 1552–1569.

  • Barnes, C., Shechtman, E., Finkelstein, A., & Goldman, D. B. (2009). PatchMatch: A randomized correspondence algorithm for structural image editing. In ACM Transactions on Graphics (TOG) (Vol. 28, p. 24).

  • Bik, E., Casadevall, A., & Fang, F. (2016). The prevalence of inappropriate image duplication in biomedical research publications. MBio, 7(3), e00809.

  • Bo, X., Junwen, W., Guangjie, L., & Yuewei, D. (2010). Image copy-move forgery detection based on SURF. In 2010 International conference on multimedia information networking and security. IEEE. https://doi.org/10.1109/mines.2010.189.

  • Bucci, E. (2018). Automatic detection of image manipulations in the biomedical literature. Nature Cell Death & Disease, 9(3), 400.

  • Christlein, V., Riess, C., Jordan, J., Riess, C., & Angelopoulou, E. (2012). An evaluation of popular copy-move forgery detection approaches. IEEE Transactions on Information Forensics and Security, 7(6), 1841–1854. https://doi.org/10.1109/tifs.2012.2218597

  • Christopher, J. (2018). Systematic fabrication of scientific images revealed. FEBS Letters, 592, 3027–3029.

  • Cozzolino, D., Poggi, G., & Verdoliva, L. (2015). Efficient dense-field copy-move forgery detection. IEEE Transactions on Information Forensics and Security, 10(11), 2284–2297.

  • Criminisi, A., Pérez, P., & Toyama, K. (2004). Region filling and object removal by exemplar-based image inpainting. IEEE Transactions on Image Processing, 13(9), 1200–1212.

  • Cromey, D. (2010). Avoiding twisted pixels: Ethical guidelines for the appropriate use and manipulation of scientific digital images. Science and Engineering Ethics, 16(4), 639–667.

  • Ehret, T. (2018). Automatic detection of internal copy-move forgeries in images. Image Processing On Line, 8, 167–191. https://doi.org/10.5201/ipol.2018.213

  • Guan, H., Kozak, M., Robertson, E., Lee, Y., Yates, A.N., Delgado, A., Zhou, D., Kheyrkhah, T., Smith, J., & Fiscus, J. (2019) MFC datasets: Large-scale benchmark datasets for media forensic challenge evaluation. In 2019 IEEE winter applications of computer vision workshops (WACVW) (pp. 63–72). https://doi.org/10.1109/WACVW.2019.00018

  • Koker, T.E., Chintapalli, S.S., Wang, S., Talbot, B.A., Wainstock, D., Cicconet, M., & Walsh, M.C. (2021). On identification and retrieval of near-duplicate biological images: A new dataset and protocol. In International conference on pattern recognition (ICPR). IEEE. https://ailb-web.ing.unimore.it/icpr/author/3517

  • Krueger, J. (2002). Forensic examination of questioned scientific images. Accountability in Research, 9(2), 105–125. https://doi.org/10.1080/08989620212970

  • Li, Y., & Zhou, J. (2019). Fast and effective image copy-move forgery detection via hierarchical feature point matching. IEEE Transactions on Information Forensics and Security, 14(5), 1307–1322. https://doi.org/10.1109/tifs.2018.2876837

  • Marcus, A. (2019). Pitt researchers sue journal for defamation following retraction. https://retractionwatch.com/2019/12/02/pitt-researchers-sue-journal-for-defamation-following-retraction/

  • Mongeon, P., & Larivière, V. (2013). The collective consequences of scientific fraud: An analysis of biomedical research. In Proceedings of ISSI 2013, proceedings of the international conference on scientometrics and informetrics (pp. 1897–1899). Austrian Institute of Technology.

  • Moreira, D., Bharati, A., Brogan, J., Pinto, A., Parowski, M., Bowyer, K., Flynn, P., Rocha, A., & Scheirer, W. (2018). Image provenance analysis at scale. IEEE Transactions on Image Processing, 27(12), 6109–6123.

  • Naylor, P., Lae, M., Reyal, F., & Walter, T. (2017). Nuclei segmentation in histopathology images using deep neural networks. In 2017 IEEE 14th international symposium on biomedical imaging (ISBI 2017). IEEE. https://doi.org/10.1109/isbi.2017.7950669.

  • Noorden, R. V. (2015). The image detective who roots out manuscript flaws. Nature. https://doi.org/10.1038/nature.2015.17749

  • Parrish, D., & Noonan, B. (2009). Image manipulation as research misconduct. Science and Engineering Ethics, 15(2), 161–167. https://doi.org/10.1007/s11948-008-9108-z

  • Pun, C. M., Yuan, X. C., & Bi, X. L. (2015). Image forgery detection using adaptive oversegmentation and feature point matching. IEEE Transactions on Information Forensics and Security, 10(8), 1705–1716. https://doi.org/10.1109/tifs.2015.2423261

  • Qi, C., Zhang, J., & Luo, P. (2020). Emerging concern of scientific fraud: Deep learning and image manipulation. bioRxiv.

  • Rossner, M. (2008). A false sense of security. Journal of Cell Biology, 183(4), 573–574. https://doi.org/10.1083/jcb.200810172

  • Rossner, M., & Yamada, K. (2004). What’s in a picture? The temptation of image manipulation. The Journal of Cell Biology, 166(1), 11–15.

  • Taubes, G. (1994). Technology for turning seeing into believing. Science, 263(5145), 318. https://doi.org/10.1126/science.8278803.

  • Wjst, M. (2021). Scientific integrity is threatened by image duplications. American Journal of Respiratory Cell and Molecular Biology, 64(2), 271–272. https://doi.org/10.1165/rcmb.2020-0419le

  • Wu, Y., AbdAlmageed, W., & Natarajan, P. (2018). Busternet: Detecting image copy-move forgery with source/target localization. In European conference on computer vision (ECCV). Springer.

  • Xiang, Z., & Acuna, D. (2020). Scientific image tampering detection based on noise inconsistencies: A method and datasets. arXiv preprint arXiv:2001.07799

  • Zhou, P., Han, X., Morariu, V.I., & Davis, L.S. (2018). Learning rich features for image manipulation detection. In 2018 IEEE/CVF conference on computer vision and pattern recognition. IEEE. https://doi.org/10.1109/cvpr.2018.00116.


Funding

This research was supported by the São Paulo Research Foundation (FAPESP) under the thematic project DéjàVu, Grants 2017/12646-3 and 2020/02211-2.

Author information

Corresponding author

Correspondence to João P. Cardenuto.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Computer Science and Digital Forensics Glossary

  1. Algorithm Library: A collection of algorithms and routines that software invokes to perform common tasks.

  2. Benchmark: The process of assessing the performance of a method using a standard dataset and metric.

  3. Brute-Force Algorithm: Solving a problem by exhaustively trying every possible solution.

  4. Color Histogram: The frequency representation of each pixel color intensity in an image.

  5. Computer Vision: An interdisciplinary area of Computer Science dedicated to inferring, understanding, and producing digital images and videos using computational intelligence.

  6. Dataset: In the scope of this work, a collection of images.

  7. Deep Fakes: Synthetic yet realistic images generated by artificial intelligence.

  8. Detection Map: A map informing, for each pixel of an input image, whether the pixel is doctored or not according to a tampering detection method.

  9. Detection Probability Map: A map, related to an image, that indicates the probability of each pixel being doctored.

  10. False Negative: In the tampering detection scope, the result of predicting forged data as pristine.

  11. False Positive: In the tampering detection scope, the result of predicting pristine data as doctored.

  12. Ground Truth: The annotated information from data. In this work, we also use this term for the annotated map that pinpoints the location of each doctored pixel in an image.

  13. True Positive: In the tampering detection scope, the result of correctly predicting a forgery.

  14. True Negative: In the tampering detection scope, the result of correctly predicting pristine data. (A scoring sketch built from these four outcomes follows this glossary.)

  15. JavaScript Object Notation (JSON): A file format, common in Computer Science, for sharing object data as human-readable text.

  16. Metadata: A description of data that provides information about its storage, content, and creation.

  17. Model Fine-Tuning: The process of making small adjustments to a pre-trained model to improve its performance on a specific task.

  18. Model Training: The phase of an Artificial Intelligence algorithm in which a model learns how to accomplish a task.

  19. Object Mask or Object Map: A mask/map that pinpoints the location of each foreground object within an image.
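As referenced in the True Negative entry, these four outcomes combine into the pixel-wise F1-scores reported in the evaluation. The sketch below, assuming boolean NumPy arrays for the detection map and the ground truth, shows a plain pixel-wise F1; it is not the paper's consistent-match \(\text{F1-score}_{CTP}\) variant, and the function name is hypothetical.

```python
# Hedged sketch: pixel-wise F1 from a binary detection map and a binary
# ground-truth map, using the TP/FP/FN definitions from the glossary.
import numpy as np

def pixel_f1(detection: np.ndarray, ground_truth: np.ndarray) -> float:
    """F1 = 2TP / (2TP + FP + FN), computed over pixels."""
    det = detection.astype(bool)
    gt = ground_truth.astype(bool)
    tp = np.count_nonzero(det & gt)    # forged pixels correctly flagged
    fp = np.count_nonzero(det & ~gt)   # pristine pixels flagged as forged
    fn = np.count_nonzero(~det & gt)   # forged pixels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # both maps empty: agreement
```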

Evaluation Plots and Output Results

This section presents all scores from the copy-move forgery detection benchmark, organized in radar plots. We also present output samples from all methods for each evaluated modality.

Fig. 16

a CMFD Simple Figure Evaluation. Busternet has the best average performance for Simple Figures because its polygon contains almost all the others (only OVERSEG is not contained by Busternet's polygon). The shrinking polygon areas from \(\text{F1-score}_{TP}\) to \(\text{F1-score}_{CTP}\) indicate that all methods show inconsistencies in their detection maps. All metrics in percentage. b CMFD Inter-Panel Figure Evaluation on different DVS. In this plot, each method fares differently in each modality. The polygons of SIFT-NN and SURF-NN have a larger area than those of the other methods, indicating robustness to a wider range of operations. The shrinking polygon areas from \(\text{F1-score}_{CTP}\) (DVS-1) to \(\text{F1-score}_{CTP}\) (DVS-3) indicate that the higher the degree of visual signs, the lower the method's effectiveness. All metrics in percentage. c CMFD Intra-Panel Figure Evaluation on different DVS. All methods show low performance and concentrate at the center of the radar, which indicates this is a challenging modality. The best method in this modality is OVERSEG, with scores lower than 3.0 for all evaluated forgery types. All metrics in percentage. Evaluation baseline results: within the parentheses of each copy-move modality is the transformation used during the copy-move forgery. All F1-scores in this figure are percentages. The best result for each duplication modality is indicated with the color of the respective detector (e.g., in the left plot of a, the value 16.83 for Copy-move (Random) is colored with Busternet's color, orange, indicating that Busternet is the best method in this modality). a Results for Simple Figure Evaluation using \(\text{F1-score}_{TP}\) and \(\text{F1-score}_{CTP}\). b Results for Inter-Panel Figure Evaluation using \(\text{F1-score}_{CTP}\) across all degrees of visual signs, indicated by the number in the subtitle (i.e., DVS1 for a degree of visual signs equal to one). c Results for Intra-Panel Figure Evaluation using \(\text{F1-score}_{CTP}\) across all degrees of visual signs

Figure 16 presents radar-graph visualizations based on the evaluation tables, in which the forgery modalities are arranged along the radial axes. Each CMFD method's result is represented by a different color in the radar chart. In this visualization, we insert the score of each method along the modality axis (e.g., copy-move with flip), which starts at the radar center (score zero) and ends at its border (highest score); thus, the farther a method's point is from the center, the better the method performs for that copy-move modality. After inserting all of a method's points for each copy-move modality, we connect those points, which results in a polygon. The larger the polygon's area, the more robust the method is across different forgery modalities. By comparing each detector's robustness to the operations, this visualization also helps identify possible complementary behaviors among different methods. As an example, consider the left panel of Fig. 19. In this case, five modalities are compared (e.g., Copy-Move with Flip, Cleaning with Brute-Force, and Copy-Move with Translation). The chart shows the results of seven methods, each represented by a polygon color (see the legend on the right). The best method in this figure is OVERSEG (in red), while the two worst methods (HFPM and SIFT-NN) are superimposed at the center (smallest areas).
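As a concrete illustration of this construction, the sketch below draws such a radar chart with matplotlib's polar axes. The modalities, method names, and scores are invented for this example; the paper's actual plots are derived from its evaluation tables.

```python
# Sketch of the radar-chart visualization described above.
# Method names and scores are hypothetical placeholders.
import numpy as np
import matplotlib.pyplot as plt

modalities = ["Translation", "Flip", "Rotation", "Retouching", "Cleaning"]
scores = {  # hypothetical F1-scores (percentage) per modality
    "Method A": [16.8, 9.1, 12.4, 7.7, 5.0],
    "Method B": [11.2, 3.5, 14.9, 10.2, 6.3],
}

# One angle per modality; repeat the first angle to close each polygon.
angles = np.linspace(0, 2 * np.pi, len(modalities), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for name, vals in scores.items():
    vals = vals + vals[:1]           # close the polygon
    ax.plot(angles, vals, label=name)
    ax.fill(angles, vals, alpha=0.15)  # polygon area = robustness cue
ax.set_xticks(angles[:-1])
ax.set_xticklabels(modalities)
ax.legend(loc="upper right")
plt.show()
```

Closing the polygon by repeating the first point lets `fill` shade the enclosed area, whose size gives the at-a-glance robustness comparison described above.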

Evaluation I: Simple Forgery Figures

Figure 16a shows the evaluation of Simple Figure Forgery Detection with \(\text{F1-score}_{TP}\) (left plot) and \(\text{F1-score}_{CTP}\) (right plot); all scores are percentages (i.e., in a range from zero to one hundred).

Figure 17 presents output detection samples for all methods applied to the evaluated Simple forgery modalities. The \(\text{F1-score}_{CTP}\) of each detection map appears next to each method's name in the figure.

Evaluation II: Inter-Forgery Compound Figures

Figure 16b shows the evaluation of Inter-Forgery Compound Figure Detection using \(\text{F1-score}_{CTP}\). In this modality, the radar visualization helped us observe some complementary performance among the chosen detectors. For instance, SURF-NN and Zernike-PM show complementary behavior for copy-move with rotation and retouching. The flipped copy-move, Splicing, and Overlap forgeries proved to be the most challenging. In addition, the visual signs have a noticeable impact in this scenario, reducing scores by up to seven points from degree 1 to degree 3 for some detectors.

Figure 18 presents output detection samples for all methods applied to the evaluated Inter-Panel forgery modalities with a degree of visual signs equal to one.

Evaluation III: Intra-Forgery Compound Figures

Figure 16c shows the evaluation of Intra-Forgery Compound Figure Detection using \(\text{F1-score}_{CTP}\). In this modality, the detectors scored lower than three percent on \(\text{F1-score}_{CTP}\) for all evaluated operations. In this figure, the polygon of OVERSEG contains all the others, indicating that OVERSEG had the best performance in this modality.

Figure 19 presents output detection samples for all methods applied to the evaluated Intra-Panel forgery modalities with a degree of visual signs equal to one.

Fig. 17

Comparative Simple Forgery duplication detection output per modality. Purple represents a pristine/non-suspect region, and every other color in the ground-truth and detection maps represents a different ID assigned to each object and its copies. The \(\text{F1-score}_{CTP}\) metric, in percentage, appears in parentheses next to each method's name

Fig. 18

Comparative Compound Inter-Panel Forgery duplication detection output per modality. Purple represents a pristine/non-suspect region, and every other color in the ground-truth and detection maps represents a different ID assigned to each object and its copies. The \(\text{F1-score}_{CTP}\) metric, in percentage, appears in parentheses next to each method's name

Fig. 19

Comparative Compound Intra-Panel Forgery duplication detection output per modality. Purple represents a pristine/non-suspect region, and every other color in the ground-truth and detection maps represents a different ID assigned to each object and its copies. The \(\text{F1-score}_{CTP}\) metric, in percentage, appears in parentheses next to each method's name


About this article


Cite this article

Cardenuto, J.P., Rocha, A. Benchmarking Scientific Image Forgery Detectors. Sci Eng Ethics 28, 35 (2022). https://doi.org/10.1007/s11948-022-00391-4

