Named Entity Recognition and Linking in Tweets Based on Linguistic Similarity

Pipitone, Arianna; Tirone, Giuseppe; Pirrone, Roberto

doi:10.1007/978-3-319-70169-1_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10640)

Included in the following conference series:

Conference of the Italian Association for Artificial Intelligence

Abstract

This work proposes a novel approach to Named Entity rEcognition and Linking (NEEL) in tweets, applying the strategy that the same authors previously presented for Question Answering (QA). That earlier work describes a rule-based, ontology-based system that attempts to retrieve the correct answer to a query from the DBpedia ontology by means of a similarity measure between the query and the ontology labels. In this paper, a tweet is interpreted as a query for the QA system: both the text and the thread of a tweet are treated as a sequence of statements that are linked to the ontology. Because tweets make extensive use of informal language, the similarity measure and the underlying processes have been devised differently from the previous approach; the particular structure of a tweet, namely the presence of mentions, hashtags, and partially structured statements, is also taken into account for linguistic insights. NEEL is thus achieved by annotating a tweet with the names of the ontological entities retrieved by the system. The strategy is explained in detail along with the architecture and implementation of the system; its performance is also reported and discussed in comparison with the systems presented at the NEEL Challenge of the #Microposts2016 workshop, co-located with the World Wide Web conference 2016 (WWW ’16).
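
The general idea of linking tweet text to ontology labels through a similarity measure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' system: the `LABELS` table is a hypothetical stand-in for DBpedia labels, the function names are invented, and the similarity is the generic `difflib.SequenceMatcher` ratio from the Python standard library rather than the linguistic similarity measure of the paper.

```python
import re
from difflib import SequenceMatcher

# Hypothetical label -> entity table standing in for DBpedia labels (assumption).
LABELS = {
    "Barack Obama": "dbr:Barack_Obama",
    "World Wide Web": "dbr:World_Wide_Web",
    "Palermo": "dbr:Palermo",
}


def tweet_tokens(tweet):
    """Split a tweet into words, unpacking @mentions and CamelCase #hashtags."""
    tokens = []
    for raw in tweet.split():
        word = raw.lstrip("@#").strip(".,!?:;\"'")
        # Break CamelCase (e.g. "BarackObama") into words; keep acronyms and digits.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", word))
    return tokens


def similarity(a, b):
    """Generic case-insensitive string similarity in [0, 1] (placeholder measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_entities(tweet, labels=LABELS, threshold=0.85, max_len=3):
    """Return (surface form, entity, score) for token n-grams close to a label."""
    tokens = tweet_tokens(tweet)
    found = []
    # Try longer n-grams first so multi-word labels are listed before single words.
    for n in range(max_len, 0, -1):
        for i in range(len(tokens) - n + 1):
            span = " ".join(tokens[i:i + n])
            best = max(labels, key=lambda lab: similarity(span, lab))
            score = similarity(span, best)
            if score >= threshold:
                found.append((span, labels[best], round(score, 2)))
    return found


if __name__ == "__main__":
    for match in link_entities("@BarackObama visiting Palermo #WWW2016"):
        print(match)
```

Unpacking mentions and hashtags before matching reflects the abstract's point that tweet structure carries usable information, and the threshold on an approximate similarity gives some tolerance to informal spellings. Both the threshold value and the n-gram length limit are arbitrary choices in this sketch.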

Author information

Authors and Affiliations

Dipartimento dell’Innovazione Industriale e Digitale (DIID), Università degli Studi di Palermo, Palermo, Italy
Arianna Pipitone, Giuseppe Tirone & Roberto Pirrone

Corresponding author

Correspondence to Roberto Pirrone.

Editor information

Editors and Affiliations

University of Bari, Bari, Italy
Floriana Esposito
University of Rome Tor Vergata, Rome, Italy
Roberto Basili
University of Bari, Bari, Italy
Stefano Ferilli
University of Bari, Bari, Italy
Francesca A. Lisi

About this paper

Cite this paper

Pipitone, A., Tirone, G., Pirrone, R. (2017). Named Entity Recognition and Linking in Tweets Based on Linguistic Similarity. In: Esposito, F., Basili, R., Ferilli, S., Lisi, F. (eds) AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017. Lecture Notes in Computer Science, vol 10640. Springer, Cham. https://doi.org/10.1007/978-3-319-70169-1_8

DOI: https://doi.org/10.1007/978-3-319-70169-1_8
Published: 07 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70168-4
Online ISBN: 978-3-319-70169-1
eBook Packages: Computer Science, Computer Science (R0)