Abstract
Text rewriting pattern mining is important for stylistic change detection and for machine writing and machine-aided writing. This paper combines monolingual sentence alignment with monolingual word alignment to mine text rewriting patterns. Edit distance is used to compute sentence similarity for sentence alignment, and a log-linear modification of IBM Model 2 is used for word alignment. We build a rewriting corpus from Jin Yong’s novels and carry out quantitative and qualitative experiments on it. Rewriting patterns are extracted and classified, including function word usages and some content word usages, which reflect the author's stylistic shift.
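The sentence alignment step can be illustrated with a minimal sketch. The abstract states only that edit distance is used to compute sentence similarity; the character-level distance, the normalization by the longer sentence, the greedy best-match strategy, the `threshold` value, and the function names below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def align_sentences(old: List[str], new: List[str],
                    threshold: float = 0.5) -> List[Tuple[int, int, float]]:
    """Greedy alignment (hypothetical strategy): pair each old sentence
    with its most similar new sentence if the similarity reaches the
    assumed threshold."""
    pairs = []
    for i, s in enumerate(old):
        best_j: Optional[int] = None
        best_sim = 0.0
        for j, t in enumerate(new):
            sim = similarity(s, t)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None and best_sim >= threshold:
            pairs.append((i, best_j, best_sim))
    return pairs
```

Aligned sentence pairs produced this way would then be passed to a word aligner; pairs whose similarity is below 1.0 are the ones that contain rewrites. Working at the character level avoids a word segmentation step for Chinese text, which is one possible reason to prefer it here.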
Acknowledgments
We thank Qian Wang and Yue Zhang for their helpful discussions on literary computing. We are grateful to the anonymous reviewers for their constructive advice on improving this paper. This work is partially supported by grants from the National Natural Science Foundation of China (No. 61402419) and the National Social Science Foundation of China (No. 14BYY096).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Jia, Y., Wang, L., Zan, H. (2018). Text Rewriting Pattern Mining Based on Monolingual Alignment. In: Hong, JF., Su, Q., Wu, JS. (eds) Chinese Lexical Semantics. CLSW 2018. Lecture Notes in Computer Science, vol 11173. Springer, Cham. https://doi.org/10.1007/978-3-030-04015-4_47
DOI: https://doi.org/10.1007/978-3-030-04015-4_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04014-7
Online ISBN: 978-3-030-04015-4
eBook Packages: Computer Science, Computer Science (R0)