Detection of document modification based on deep neural networks

Kim, Noo-ri; Choi, YunSeok; Lee, HyunSoo; Choi, Jae-Young; Kim, Suntae; Kim, Jeong-Ah; Cho, Youngwha; Lee, Jee-Hyong

doi:10.1007/s12652-017-0617-y

Original Research
Published: 15 November 2017

Volume 9, pages 1089–1096, (2018)

Journal of Ambient Intelligence and Humanized Computing

Noo-ri Kim¹,
YunSeok Choi¹,
HyunSoo Lee¹,
Jae-Young Choi¹,
Suntae Kim²,
Jeong-Ah Kim³,
Youngwha Cho¹ &
Jee-Hyong Lee¹

318 Accesses
4 Citations

Abstract

In this paper, we focus on detecting semantic and structural modifications in documents. We define six inter-document relations to represent document modification: Eliminate, Extend, Merge, Split, Rewrite, and Reorder. We also develop a detection model based on a deep neural network that identifies the relations between two given documents. We assume that several modifications can be applied to the same document; in that case the modifications can overlap one another, which makes the applied modifications very difficult to detect. We represent a document pair as a sentence-based similarity matrix and detect the inter-document relations by applying the deep neural network to that matrix. Our experiments show that the model performs well in detecting document modifications.
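
The sentence-based similarity matrix can be illustrated with a minimal Python sketch. The abstract does not say how sentences are split or how their similarity is measured, so the regex splitter and the bag-of-words cosine similarity below are assumptions made for illustration, and every function name is hypothetical. The deep neural network that classifies the matrix is not shown.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive split after '.', '!' or '?' (assumption: the source names no splitter)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bow(sentence):
    """Lowercased bag-of-words counts for one sentence."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity of two word-count vectors (assumed similarity measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(doc_a, doc_b):
    """Entry [i][j] compares sentence i of doc_a with sentence j of doc_b."""
    rows = [bow(s) for s in split_sentences(doc_a)]
    cols = [bow(s) for s in split_sentences(doc_b)]
    return [[cosine(r, c) for c in cols] for r in rows]


# Hypothetical document pair: the second is a reordering of the first.
original = "The cat sat on the mat. Dogs bark loudly. Birds fly south in winter."
reordered = "Birds fly south in winter. The cat sat on the mat. Dogs bark loudly."

matrix = similarity_matrix(original, reordered)
for row in matrix:
    print(" ".join(f"{v:.2f}" for v in row))
```

In this toy pair the high-similarity entries fall off the main diagonal, which is the kind of pattern a Reorder relation would be expected to leave in the matrix; an unmodified pair would place them on the diagonal instead.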

Notes

https://translate.google.com/.

Acknowledgements

This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2014M3C4A7030503). This research was also supported by the Ministry of Culture, Sports and Tourism (MCST) and the Korea Creative Content Agency (KOCCA) in the Culture Technology (CT) Research & Development Program 2016 (R2016030046).

Author information

Authors and Affiliations

College of Information and Communication Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jang an-gu, Suwon-si, Gyeonggi-do, 16419, Republic of Korea
Noo-ri Kim, YunSeok Choi, HyunSoo Lee, Jae-Young Choi, Youngwha Cho & Jee-Hyong Lee
Department of Software Engineering, Chonbuk National University, 567 Baekje-daero, Deokjin-gu, Jeonju-si, Jeollabuk-do, 54896, Republic of Korea
Suntae Kim
Department of Computer Education, Catholic Kwandong University, 24 Beomil-ro 579 beon-gil, Kangneung-si, Kangwon-do, 25601, Republic of Korea
Jeong-Ah Kim

Corresponding author

Correspondence to Jee-Hyong Lee.

About this article

Cite this article

Kim, Nr., Choi, Y., Lee, H. et al. Detection of document modification based on deep neural networks. J Ambient Intell Human Comput 9, 1089–1096 (2018). https://doi.org/10.1007/s12652-017-0617-y

Received: 16 March 2017
Accepted: 26 October 2017
Published: 15 November 2017
Issue Date: August 2018
DOI: https://doi.org/10.1007/s12652-017-0617-y
