Abstract
Paraphrase corpora annotated with the types of paraphrases they contain constitute an essential resource for understanding the phenomenon of paraphrasing and for improving paraphrase-related systems in natural language processing. In this article, a new annotation scheme for paraphrase-type annotation is set out, together with newly created measures for the computation of inter-annotator agreement. Three corpora, different in nature and in two languages, have been annotated using this infrastructure. The annotation results and the inter-annotator agreement scores for these corpora demonstrate the adequacy and robustness of our proposal.



Notes
See Madnani and Dorr (2010), Section 5 for a discussion on this topic.
http://research.microsoft.com/en-us/downloads/607d14d9-20cd-47e3-85bc-a2f65cd28042/. The readme of the corpus contains a discussion on when a pair of sentences should be considered a paraphrase and when it should not, according to their approach.
See Vila et al. (2013) for a more general state of the art on paraphrase corpora. See Vila et al. (2014) for a state of the art on paraphrase typologies: “paraphrase typology” does not equal “paraphrase-type annotation scheme”, but typologies are the linguistic knowledge on which annotation schemes may be based. In this section, and in this article in general, we focus on the latter.
http://www.cs.york.ac.uk/semeval-2012/task6/. Although Semeval organisers distinguish between semantic textual similarity and paraphrasing, the former being a sort of graded paraphrasing, this distinction is not relevant here.
Annotation guidelines are available at http://clic.ub.edu/corpus/en/paraphrases-en.
All the examples in this article are extracted from the three annotated corpora, namely P4P, MSRP-A, and WRPA-A. Typos in the original corpora have not been corrected.
It should be taken into account that the corpora we annotate consist of positive cases of paraphrasing; therefore, non-paraphrases or non-paraphrase fragments are a minority.
We refer to the tags with small capital letters and sometimes using short names, e.g., synthetic/analytic for synthetic/analytic substitutions.
We use the subscript \(w\) (words) instead of \(t\) (tokens) in order to avoid confusion with the superscript \(t\) (type) that will appear in what follows.
http://clt.mq.edu.au/research/projects/hoo/hoo2011/index.html. See also Dale and Kilgarriff (2011) and Dale and Narroway (2012).
The \(\pi \) and \(\kappa \) factors can be omitted from the calculation (i.e., they can be set to 1) if they are not relevant, as in Barrón-Cedeño et al. (2013).
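To illustrate the chance-correction idea behind such factors, the following is a minimal sketch of Cohen's (1960) \(\kappa \) for two annotators over nominal labels. It is not the article's own agreement measure (which handles multiple labels per fragment); the label strings and data are hypothetical, chosen only to echo the paraphrase-type tags discussed in the text.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's (1960) kappa for two annotators assigning nominal labels."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    # Kappa: observed agreement corrected for chance agreement.
    return (p_o - p_e) / (1 - p_e)

# Two annotators labelling six paraphrase fragments (hypothetical data):
a = ["same-polarity", "synthetic/analytic", "same-polarity",
     "addition", "addition", "same-polarity"]
b = ["same-polarity", "synthetic/analytic", "addition",
     "addition", "addition", "same-polarity"]
print(round(cohen_kappa(a, b), 3))  # → 0.739
```

Setting the chance-correction factor to 1, as the note describes, amounts to reporting the raw observed agreement \(p_o\) instead of the corrected value.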
Annotated corpora are available at http://clic.ub.edu/corpus/en/paraphrases-en as a downloadable package and as a search interface.
The translation is ours.
Strong punctuation marks are full stops, semi-colons, question marks, exclamation marks, and other punctuation marks that can divide autonomous text fragments (in general, sentences or clauses), such as parentheses, hyphens, or colons.
For reasons of space, we do not include the per-type scores of inter-annotator agreement. Instead, we point out the most relevant issues in this respect.
Dolan and Brockett’s (2005) agreement value and ours are not directly comparable, as they represent different measures for diverging tasks with different degrees of complexity. Nevertheless, we consider that obtaining a value in line with that of Dolan and Brockett’s (2005) simpler task shows that ours can be considered a satisfactory result.
References
Agirre, E., Cer, D., Diab, M., & Gonzalez-Agirre, A. (2012). Semeval-2012 task 6: A pilot on semantic textual similarity. In Proceedings of the 1st joint conference on lexical and computational semantics (*SEM 2012) (pp. 385–393). Montréal.
Amigó, E., Giménez, J., Gonzalo, J., & Màrquez, L. (2006). MT evaluation: Human-like vs. human acceptable. In Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association for computational linguistics (COLING/ACL 2006) (pp. 17–24). Sydney.
Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern information retrieval. Boston: Addison-Wesley Longman Publishing Co.
Barrón-Cedeño, A., Vila, M., Martí, M. A., & Rosso, P. (2013). Plagiarism meets paraphrasing: Insights for the next generation in automatic plagiarism detection. Computational Linguistics, 39(4), 917–947.
Barzilay, R., & McKeown, K. (2001). Extracting paraphrases from a parallel corpus. In Proceedings of the 39th annual meeting of the association for computational linguistics (ACL 2001) (pp. 50–57). Toulouse.
Bès, G. G., & Fuchs, C. (1988). Introduction. In Lexique et paraphrase (pp. 7–11). Presses Universitaires de Lille.
Bhagat, R. (2009). Learning paraphrases from Text, Ph.D. thesis. University of Southern California, Los Angeles.
Chen, D. L., & Dolan, W. B. (2011). Collecting highly parallel data for paraphrase evaluation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL/HLT 2011) (Vol 1, pp. 190–200). Portland.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.
Cohn, T., Callison-Burch, C., & Lapata, M. (2008). Constructing corpora for the development and evaluation of paraphrase systems. Computational Linguistics, 34(4), 597–614.
Dale, R., & Kilgarriff, A. (2011). Helping our own: The HOO 2011 pilot shared task. In Proceedings of the 13th European workshop on natural language generation (ENLG 2011) (pp. 242–249). Nancy.
Dale, R., & Narroway, G. (2011). The HOO pilot data set: Notes on release 2.0. Resource document. http://clt.mq.edu.au/research/projects/hoo/hoo2011/files/HOOReleaseNotes20110621.pdf. Accessed 8 February 2013.
Dale, R., & Narroway, G. (2012). A framework for evaluating text correction. In Proceedings of the 8th international conference on language resources and evaluation (LREC 2012) (pp. 3015–3018). Istanbul.
Dolan, W. B., & Brockett, C. (2005). Automatically constructing a corpus of sentential paraphrases. In Proceedings of the 3rd international workshop on paraphrasing (IWP 2005) (pp. 9–16). Jeju Island.
Dutrey, C., Bernhard, D., Bouamor, H., & Max, A. (2011). Local modifications and paraphrases in Wikipedia’s revision history. Procesamiento del Lenguaje Natural, 46, 51–58.
España-Bonet, C., Vila, M., Rodríguez, H., & Martí, M. A. (2009). CoCo, a web interface for corpora compilation. Procesamiento del Lenguaje Natural, 43, 367–368.
Fleiss, J. L. (1981). Statistical methods for rates and proportions. New York: Wiley.
Fuchs, C. (1988). Paraphrases prédicatives et contraintes énonciatives. In: Bès G., & Fuchs C. (Eds.), Lexique et Paraphrase, no. 6 in Lexique, Presses Universitaires de Lille, Villeneuve d’Ascq (pp. 157–171).
Hovy, E., Lin, C. Y., Zhou, L., & Fukumoto, J. (2006). Automated summarization evaluation with basic elements. In Proceedings of the 5th international conference on language resources and evaluation (LREC 2006) (pp. 899–902). Genoa.
Kupper, L. L., & Hafner, K. B. (1989). On assessing interrater agreement for multiple attribute responses. Biometrics, 45(3), 957–967.
Lin, C. Y., & Hovy, E. (2003). Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the 4th annual meeting of the North American chapter of the association for computational linguistics: Human language technologies (NAACL/HLT 2003) (Vol. 1, pp. 71–78). Edmonton.
Lin, C. Y., & Och, F. J. (2004). ORANGE: A method for evaluating automatic evaluation metrics for machine translation. In Proceedings of the 20th international conference on computational linguistics (COLING 2004). Geneva.
Liu, C., Dahlmeier, D., & Ng, H. T. (2010). PEM: A paraphrase evaluation metric exploiting parallel texts. In Proceedings of the 2010 conference on empirical methods in natural language processing (EMNLP 2010) (pp. 923–932). Cambridge.
Madnani, N., & Dorr, B. J. (2010). Generating phrasal and sentential paraphrases: A survey of data-driven methods. Computational Linguistics, 36(3), 341–387.
Max, A., & Wisniewski, G. (2010). Mining naturally-occurring corrections and paraphrases from Wikipedia’s revision history. In Proceedings of the 7th international conference on language resources and evaluation (LREC 2010) (pp. 3143–3148). Valletta.
Milićević, J. (2007). La paraphrase. Modélisation de la paraphrase langagière. Bern: Peter Lang.
Nenkova, A., & Passonneau, R. (2004). Evaluating content selection in summarization: The pyramid method. In Proceedings of the 5th annual meeting of the North American chapter of the association for computational linguistics: Human language technologies (NAACL/HLT 2004) (pp. 145–152). Boston.
Potthast, M., Stein, B., Barrón-Cedeño, A., & Rosso, P. (2010). An evaluation framework for plagiarism detection. In Proceedings of the 23rd international conference on computational linguistics (COLING 2010) (pp. 997–1005). Beijing.
Recasens, M., & Vila, M. (2010). On paraphrase and coreference. Computational Linguistics, 36(4), 639–647.
Romano, L., Kouylekov, M., Szpektor, I., Dagan, I., & Lavelli, A. (2006). Investigating a generic paraphrase-based approach for relations extraction. In Proceedings of the 11th conference of the European chapter of the association for computational linguistics (EACL 2006) (pp. 409–416). Trento.
Vila, M., & Dras, M. (2012). Tree edit distance as a baseline approach for paraphrase representation. Procesamiento del Lenguaje Natural, 48, 89–95.
Vila, M., Rodríguez, H., & Martí, M. A. (2013). Relational paraphrase acquisition from Wikipedia: The WRPA method and corpus. Natural Language Engineering. doi:10.1017/S1351324913000235.
Vila, M., Martí, M. A., & Rodríguez, H. (2014). Is this a paraphrase? What kind? Paraphrase boundaries and typology. Open Journal of Modern Linguistics, 4, 205–218.
Zaenen, A. (2006). Mark-up barking up the wrong tree. Computational Linguistics, 32(4), 577–580.
Acknowledgments
We are grateful to the people who participated in the annotation of the corpora: Rita Zaragoza, Montse Nofre, Patricia Fernández, and Oriol Borrega. We would also like to thank Alberto Barrón-Cedeño for his help in shaping the inter-annotator agreement formulae. This work is supported by the Spanish government through the projects DIANA (TIN2012-38603-C02-02) and SKATER (TIN2012-38584-C06-01) from Ministerio de Ciencia e Innovación, as well as an FPU Grant (AP2008-02185) from Ministerio de Educación, Cultura y Deporte.
Vila, M., Bertran, M., Martí, M.A. et al. Corpus annotation with paraphrase types: new annotation scheme and inter-annotator agreement measures. Lang Resources & Evaluation 49, 77–105 (2015). https://doi.org/10.1007/s10579-014-9272-5