
Named Entity Recognition and Relation Extraction: State-of-the-Art

Authors:

Syed Waqar Jaffry,

Muhammad Kamran Malik

ACM Computing Surveys (CSUR), Volume 54, Issue 1

Article No.: 20, Pages 1 - 39

https://doi.org/10.1145/3445965

Published: 11 February 2021

Abstract

With the advent of Web 2.0, many online platforms now produce massive amounts of textual data. As this data keeps growing, it is increasingly important to extract nuggets of information from it. One way to harness this unstructured text effectively is to transform it into structured text. This study therefore presents an overview of approaches for extracting key insights from textual data in a structured way. The review focuses on two tasks: Named Entity Recognition and Relation Extraction. The former deals with the identification of named entities, and the latter deals with the problem of extracting relations between sets of entities. The study covers early approaches as well as more recent developments based on machine learning models. The survey finds that deep-learning-based hybrid and joint models currently define the state of the art. It also observes that annotated benchmark datasets are not available for various sources of textual data, such as Twitter and other social forums. This scarcity of datasets has resulted in relatively little progress in these domains. Additionally, the majority of state-of-the-art techniques are offline and computationally expensive. Last, with the increasing focus on deep-learning frameworks, there is a need to understand and explain the processes underlying deep architectures.

References

[1]

I. Muslea et al. 1999. Extraction patterns for information extraction tasks: A survey. In Proceedings of the AAAI Workshop on Machine Learning for Information Extraction.

[2]

G. Simoes, H. Galhardas, and L. Coheur. 2009. Information extraction tasks: A survey. In Proceedings of the INForum.

[3]

Linguistic Data Consortium. 2017. MUC Data Sets. Retrieved from http://www-nlpir.nist.gov/related_projects/muc/muc_data/muc_data_index.html.

[4]

A. Rodriguez. 2017. MUC - Cohen Courses. Retrieved from http://curtis.ml.cmu.edu/w/courses/index.php/MUC.

[5]

Linguistic Data Consortium. 2002. Annotation Tasks and Specifications. Retrieved from https://www.ldc.upenn.edu/collaborations/past-projects/ace/annotation-tasks-and-specifications.

[6]

National Institute of Standards and Technology (NIST). 2017. TAC Knowledge Base Population (KBP). In Proceedings of the Text Analytic Conference.

[7]

I. Augenstein, M. Das, S. Riedel, L. Vikraman, and A. McCallum. 2017. SemEval 2017 Task 10: ScienceIE - extracting keyphrases and relations from scientific publications. ArXiv170402853 Cs Stat, Apr. 2017.

[8]

L. Neve. 2019. GENIA Corpus. The ORBIT Project. Retrieved from https://orbit.nlm.nih.gov/browse-repository/dataset/human-annotated/83-genia-corpus.

[9]

R. Merchant, M. E. Okurowski, and N. Chinchor. 1996. The multilingual entity task (MET) overview. In Proceedings of a Workshop on held at Vienna, Virginia: May 6--8, 1996. 445--447.

Digital Library

[10]

Asian Federation of Natural Language Processing. 2008. IJCNLP-08 Workshop on NER for South and South East Asian Languages. Retrieved from http://ltrc.iiit.ac.in/ner-ssea-08/index.cgi?topic=5.

[11]

M. K. Malik. 2017. Urdu named entity recognition and classification system using artificial neural network. ACM Trans Asian Low-Resour Lang. Inf. Proc. 17, 1 (2017), 2:1--2:13.

Digital Library

[12]

N. Peng and M. Dredze. 2015. Named entity recognition for Chinese social media with jointly trained embeddings. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 548--554.

[13]

W. Wang, F. Bao, and G. Gao. 2015. Mongolian named entity recognition using suffixes segmentation. In Proceedings of the International Conference on Asian Language Processing (IALP’15). 169--172.

[14]

D. Nadeau and S. Sekine. 2007. A survey of named entity recognition and classification. Lingvisticae Investig. 30, 1 (2007), 3--26.

[15]

N. Kanya and T. Ravi. 2012. Modelings and techniques in named entity recognition-an information extraction task. In Proceedings of the IET Chennai 3rd International on Sustainable Energy and Intelligent Systems (SEISCON’12).

[16]

G. K. Palshikar. 2013. Techniques for named entity recognition. Bioinforma. Concepts Methodol. Tools Appl. 400 (2013).

[17]

R. Sharnagat. 2014. Named entity recognition: A literature survey. Report 11305R013. Cent. Indian Lang. Technol.

[18]

N. Patil, A. S. Patil, and B. Pawar. 2016. Survey of named entity recognition systems with respect to Indian and foreign languages. Int. J. Comput. Appl. 134, 16 (2016).

[19]

L. Ratinov and D. Roth. 2019. Design challenges and misconceptions in named entity recognition. 147--155. Retrieved from http://dl.acm.org/citation.cfm?id=1596374.1596399.

[20]

N. Rizzolo and D. Roth. 2007. Modeling discriminative global inference. In Proceedings of the International Conference on Semantic Computing (ICSC’07). 597--604.

[21]

D. Klein, J. Smarr, H. Nguyen, and C. D. Manning. 2003. Named entity recognition with character-level models. In Proceedings of the 7th Conference on Natural Language Learning at HLT-NAACL, Volume 4. 180--183.

Digital Library

[22]

G. Luo, X. Huang, C.-Y. Lin, and Z. Nie. 2015. Joint named entity recognition and disambiguation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’15). 879--880.

[23]

D. B. Nguyen, M. Theobald, and G. Weikum. 2016. J-NERD: Joint named entity recognition and disambiguation with rich linguistic features. Trans. Assoc. Comput. Linguist. 4 (2016), 215--229.

[24]

W. Liao and S. Veeramachaneni. 2009. A simple semi-supervised algorithm for named entity recognition. In Proceedings of the NAACL HLT Workshop on Semi-Supervised Learning for Natural Language Processing. 58--65. Retrieved from http://dl.acm.org/citation.cfm?id=1621829.1621837.

[25]

O. Etzioni et al. 2005. Unsupervised named-entity extraction from the web: An experimental study. Artif. Intell. 165, 1 (2005), 91--134.

[26]

D. Nadeau, P. Turney, and S. Matwin. 2006. Unsupervised named-entity recognition: Generating gazetteers and resolving ambiguity. Adv. Artif. Intell. Lecture Notes in Computer Sciences, vol. 4013. Springer, 266--277.

Digital Library

[27]

I. Gallo, E. Binaghi, M. Carullo, and N. Lamberti. 2008. Named entity recognition by neural sliding window. In Proceedings of the 8th IAPR International Workshop on Document Analysis Systems. 567--573.

Digital Library

[28]

A. Passos, V. Kumar, and A. McCallum. 2017. Lexicon infused phrase embeddings for named entity resolution. ArXiv14045367 Cs, Apr. 2014.

[29]

M. Peters, W. Ammar, C. Bhagavatula, and R. Power. 2017. Semi-supervised sequence tagging with bidirectional language models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1756--1765.

[30]

M. Peters et al. 2018. Deep contextualized word representations. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2227--2237.

[31]

M. Rondeau and Y. Su. 2015. Full-rank linear-chain NeuroCRF for sequence labeling. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'15). 5281--5285.

[32]

M. A. Rondeau and Y. Su. 2015. Recent improvements to NeuroCRFs for named entity recognition. In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU’15). 390--396.

[33]

X. Ma and E. Hovy. 2016. End-to-end sequence labeling via bi-directional LSTM-CNNS-CRF. ArXiv Prepr. ArXiv160301354, 2016.

[34]

E. Strubell, P. Verga, D. Belanger, and A. McCallum. 2017. Fast and accurate sequence labeling with iterated dilated convolutions. ArXiv170202098 Cs, Feb. 2017.

[35]

F. Liu, T. Baldwin, and T. Cohn. 2017. Capturing long-range contextual dependencies with memory-enhanced conditional random fields. ArXiv Prepr. ArXiv170903637, 2017.

[36]

K. Riaz. 2010. Rule-based named entity recognition in Urdu. In Proceedings of the Named Entities Workshop. 126--135. Retrieved from http://dl.acm.org/citation.cfm?id=1870457.1870476.

[37]

R. Alfred, L. C. Leong, C. K. On, and P. Anthony. 2014. Malay named entity recognition based on rule-based approach. Int. J. Mach. Learn. Comput. 4, 3 (2014), 300.

[38]

D. M. Bikel, R. Schwartz, and R. M. Weischedel. 1999. An algorithm that learns what's in a name. Mach. Learn. 34, 1--3 (1999), 211--231.

Digital Library

[39]

R. Ageishi and T. Miura. 2008. Named entity recognition based on a hidden Markov model in part-of-speech tagging. In Proceedings of the 1st International Conference on the Applications of Digital Information and Web Technologies (ICADIWT’08). 397--402.

[40]

A. E. Borthwick. 1999. A Maximum Entropy Approach to Named Entity Recognition. Ph.D. Dissertation. New York University, New York, NY.

Digital Library

[41]

A. McCallum and W. Li. 2003. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In Proceedings of the 7th Conference on Natural Language Learning at HLT-NAACL, Volume 4. 188--191.

Digital Library

[42]

G. Fu and K.-K. Luke. 2005. Chinese named entity recognition using lexicalized HMMs. SIGKDD Explor. Newsl. 7, 1 (2005), 19--25.

Digital Library

[43]

X. Yu, S. Mayhew, M. Sammons, and D. Roth. 2019. On the strength of character language models for multilingual named entity recognition. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing Association for Computational Linguistics. 3073--3077. Retrieved from https://aclweb.org/anthology/papers/D/D18/D18-1345/.

[44]

D. Khashabi et al. 2018. Cogcompnlp: Your Swiss army knife for NLP. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC’18).

[45]

S. Strassel and J. Tracey. 2016. Lorelei language packs: Data, tools, and resources for technology development in low resource languages. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC’16). 3273--3280.

[46]

E. Brill. 1995. Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging. Comput. Linguist. 21, 4 (1995), 543--565.

Digital Library

[47]

Q. L. L. Buco, J. L. L. Capcap, J. C. A. Hermocilla, C. S. Yumul, R. A. Sagum, and A. G. Pastrana. 2013. The application of transformation-based learning in the development of a named entity recognition system for Filipino text. J. Ind. Intell. Inf. 1, 1 (2013).

[48]

S. Cucerzan and D. Yarowsky. 2002. Language independent NER using a unified model of internal and contextual evidence. In Proceedings of the 6th Conference on Natural Language Learning, Volume 20. 1--4.

Digital Library

[49]

R. A. Leonandya, B. Distiawan, and N. H. Praptono. 2015. A semi-supervised algorithm for Indonesian named entity recognition. In Proceedings of the 3rd International Symposium on Computational and Business Intelligence (ISCBI’15). 45--50.

Digital Library

[50]

J. Straková, M. Straka, and J. Hajič. 2016. Neural networks for featureless named entity recognition in Czech. In Proceedings of the International Conference on Text, Speech, and Dialogue. 173--181.

[51]

L. Liu et al. 2018. Empower sequence labeling with task-aware neural language model. In Proceedings of the AAAI Conference on Artificial Intelligence.

[52]

X.-D. Doan, T.-T. Dang, and M. L. Nguyen. 2019. Effectiveness of character language model for Vietnamese named entity recognition. In Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation. Retrieved from https://aclweb.org/anthology/papers/Y/Y18/Y18-1018/.

[53]

C. Lee. 2017. LSTM-CRF models for named entity recognition. IEICE Trans. Inf. Syst. 100, 4 (2017), 882--887.

[54]

G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, and C. Dyer. 2016. Neural architectures for named entity recognition. In Proceedings of NAACL-HLT, Association for Computational Linguistics. 260--270.

[55]

W. Wang, F. Bao, and G. Gao. 2016. Mongolian named entity recognition with bidirectional recurrent neural networks. In Proceedings of the IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI’16). 495--500.

[56]

O. Täckström. 2012. Nudging the envelope of direct transfer methods for multilingual named entity recognition. In Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure. Retrieved from http://dl.acm.org/citation.cfm?id=2390426.2390435.

Digital Library

[57]

O. Täckström, R. McDonald, and J. Uszkoreit. 2012. Cross-lingual word clusters for direct transfer of linguistic structure. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Retrieved from http://dl.acm.org/citation.cfm?id=2382029.2382096.

[58]

L. Qu, G. Ferraro, L. Zhou, W. Hou, and T. Baldwin. 2016. Named entity recognition for novel types by transfer learning. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics. 899--905.

[59]

L. Chen, A. Moschitti, G. Castellucci, A. Favalli, and R. Romagnoli. 2018. Transfer learning for industrial applications of named entity recognition. In Proceedings of the 2nd Workshop on Natural Language for Artificial Intelligence (NL4AI’18) co-located with 17th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2018). 129--140. Retrieved from http://ceur-ws.org/Vol-2244/paper_12.pdf.

[60]

S. Mayhew, C.-T. Tsai, and D. Roth. 2017. Cheap translation for cross-lingual named entity recognition. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics. 2536--2545.

[61]

R. Murthy, M. M. Khapra, and P. Bhattacharyya. 2018. Improving NER tagging performance in low-resource languages via multilingual learning. ACM Trans. Asian Low-Resour. Lang. Inf. Proc. 18, 2 (2018), 9:1--9:20.

Digital Library

[62]

X. Feng, X. Feng, B. Qin, Z. Feng, and T. Liu. 2018. Improving low resource named entity recognition using cross-lingual knowledge transfer. In Proceedings of the 27th International Joint Conference on Artificial Intelligence. 4071--4077. Retrieved from http://dl.acm.org/citation.cfm?id=3304222.3304336.

[63]

P. Cao, Y. Chen, K. Liu, J. Zhao, and S. Liu. 2018. Adversarial transfer learning for Chinese named entity recognition with self-attention mechanism. 182--192. Retrieved from https://aclweb.org/anthology/papers/D/D18/D18-1017/.

[64]

A. Rahimi, Y. Li, and T. Cohn. 2019. Massively multilingual transfer for NER. In Proceedings of the 57th Conference of the Association for Computational Linguistics. 151--164.

[65]

A. Johnson, P. Karanasou, J. Gaspers, and D. Klakow. 2019. Cross-lingual transfer learning for Japanese named entity recognition. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers). 182--189.

[66]

A. Bharadwaj, D. Mortensen, C. Dyer, and J. Carbonell. 2016. Phonologically aware neural model for named entity recognition in low resource transfer settings. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1462--1472.

[67]

J. Xie, Z. Yang, G. Neubig, N. A. Smith, and J. G. Carbonell. 2018. Neural cross-lingual named entity recognition with minimal resources. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 369--379.

[68]

Z. Nasar, S. W. Jaffry, and M. K. Malik. 2018. Information extraction from scientific articles: A survey. Scientometrics.

Digital Library

[69]

M. Abdelmagid, M. Himmat, and A. Ahmed. 2014. Survey on information extraction from chemical compound literatures: Techniques and challenges. J. Theor. Appl. Inf. Technol. 67, 2 (2014), 284--289.

[70]

G. Duck, G. Nenadic, M. Filannino, A. Brass, D. L. Robertson, and R. Stevens. 2016. A survey of bioinformatics database and software usage through mining the literature. PLoS One 11, 6 (2016), e0157989.

[71]

B. Shickel, P. Tighe, A. Bihorac, and P. Rashidi. 2017. Deep EHR: A survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. ArXiv Prepr. ArXiv170603446, 2017.

[72]

T. Eftimov, B. Koroušić Seljak, and P. Korošec. 2017. A rule-based named-entity recognition method for knowledge extraction of evidence-based dietary recommendations. PLoS One 12, 6 (2017), e0179488.

[73]

D. M. de Oliveira, A. H. F. Laender, A. Veloso, and A. S. da Silva. 2013. FS-NER: A lightweight filter-stream approach to named entity recognition on Twitter data. In Proceedings of the 22nd International Conference on World Wide Web. 597--604.

Digital Library

[74]

D. Bonadiman, A. Severyn, and A. Moschitti. 2015. Deep neural networks for named entity recognition in Italian. In Proceedings of the 2nd Italian Conference on Computational Linguistics (CLiC It’15).

[75]

J. Xu, H. He, X. Sun, X. Ren, and S. Li. 2018. Cross-domain and semisupervised named entity recognition in Chinese social media: A unified model. IEEE/ACM Trans. Audio Speech Lang. Proc. 26, 11 (2018), 2142--2152.

Digital Library

[76]

Z. Zhao et al. 2016. ML-CNN: A novel deep learning based disease named entity recognition architecture. In Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM’16). 794--794.

[77]

J. Li et al. 2016. BioCreative V CDR task corpus: A resource for chemical disease relation extraction. Datab.- J. Biol. Datab. Curat. May (2016).

[78]

R. I. Doğan, R. Leaman, and Z. Lu. 2014. NCBI disease corpus: A resource for disease name recognition and concept normalization. J. Biomed. Inform. 47 (2014), 1--10.

[79]

X. Dong, L. Qian, Y. Guan, L. Huang, Q. Yu, and J. Yang. 2016. A multiclass classification method based on deep learning for named entity recognition in electronic medical records. In Proceedings of the New York Scientific Data Summit (NYSDS’16). 1--10.

[80]

L. Wang, S. Li, Q. Yan, and G. Zhou. 2018. Domain-specific named entity recognition with document-level optimization. ACM Trans. Asian Low-Resour. Lang. Inf. Proc. 17, 4 (2018), 33:1--33:15.

Digital Library

[81]

J. R. Finkel and C. D. Manning. 2009. Nested named entity recognition. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Volume 1. 141--150. Retrieved from http://dl.acm.org/citation.cfm?id=1699510.1699529.

[82]

W. Lu and D. Roth. 2015. Joint mention extraction and classification with mention hypergraphs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 857--867.

[83]

A. Katiyar and C. Cardie. 2018. Nested named entity recognition revisited. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 861--871.

[84]

M. Ju, M. Miwa, and S. Ananiadou. 2018. A neural layered model for nested named entity recognition. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 1446--1459.

[85]

K. Mai et al. 2018. An empirical study on fine-grained named entity recognition. In Proceedings of the 27th International Conference on Computational Linguistics. 711--722.

[86]

N. Reimers and I. Gurevych. 2017. Optimal hyperparameters for deep LSTM-Networks for sequence labeling tasks. Retrieved from https://arxiv.org/abs/1707.06799.

[87]

S. Zhang and N. Elhadad. 2013. Unsupervised biomedical named entity recognition: Experiments with clinical and biological texts. J. Biomed. Inform. 46, 6 (2013).

Digital Library

[88]

F. Li, M. Zhang, G. Fu, and D. Ji. 2017. A neural joint model for entity and relation extraction from biomedical text. BMC Bioinf. 18 (2017).

[89]

N. Bach and S. Badaskar. 2007. A review of relation extraction. Unpublished Report. Retrieved from www.cs.cmu.edu/&sim;nbach/papers/A-survey-on-Relation-Extraction.pdf.

[90]

N. Konstantinova. 2014. Review of relation extraction methods: What is new out there? In Analysis of Images, Social Networks and Texts. 15--28.

[91]

N. Asghar. 2016. Automatic extraction of causal relations from natural language texts: A comprehensive survey. ArXiv Prepr. ArXiv160507895, 2016.

[92]

S. Brin. 1998. Extracting patterns and relations from the world wide web. In Proceedings of the International Workshop on the World Wide Web and Databases. 172--183.

Digital Library

[93]

E. Agichtein and L. Gravano. 2000. Snowball: Extracting relations from large plain-text collections. In Proceedings of the 5th ACM Conference on Digital Libraries. 85--94.

[94]

O. Etzioni et al. 2004. Web-scale information extraction in knowitall:(preliminary results). In Proceedings of the 13th International Conference on World Wide Web. 100--110.

Digital Library

[95]

M. Banko, M. J. Cafarella, S. Soderland, M. Broadhead, and O. Etzioni. 2007. Open information extraction from the web. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI’07). 2670--2676.

[96]

R. McDonald, F. Pereira, S. Kulick, S. Winters, Y. Jin, and P. White. 2005. Simple algorithms for complex relation extraction with applications to biomedical IE. In Proceedings of the 43rd Meeting on Association for Computational Linguistics. 491--498.

Digital Library

[97]

S. Kumar. 2017. A survey of deep learning methods for relation extraction. ArXiv170503645 Cs, May 2017.

[98]

A. Smirnova and P. Cudré-Mauroux. 2018. Relation extraction using distant supervision: A survey. ACM Comput. Surv. 51, 5 (2018), 106:1--106:35.

Digital Library

[99]

K. Fundel, R. Küffner, and R. Zimmer. 2006. RelEx—Relation extraction using dependency parse trees. Bioinformatics 23, 3 (2006), 365--371.

Digital Library

[100]

C. Nédellec. 2005. Learning language in logic-genic interaction extraction challenge. In Proceedings of the 4th Learning Language in Logic Workshop (LLL’05). 31--37.

[101]

Kamel Nebhi. 2013. A rule-based relation extraction system using DBpedia and syntactic parsing. In Proceedings of the 2013th International Conference on NLP 8 DBpedia (NLP-DBPEDIA'13), Vol. 1064. 74--79. Retrieved from http://dl.acm.org/citation.cfm?id=2874479.2874487.

[102]

S. Rosset, C. Grouin, K. Fort, O. Galibert, J. Kahn, and P. Zweigenbaum. 2012. Structured named entities in two distinct press corpora: Contemporary broadcast news and old newspapers. In Proceedings of the 6th Linguistic Annotation Workshop. 40--48.

[103]

G. Leroy and H. Chen. 2001. Filling preposition-based templates to capture information from medical. In Proceedings of the Pacific Symposium on Biocomputing.

[104]

C. Blaschke and A. Valencia. 2001. The potential use of SUISEKI as a protein interaction discovery tool. Genome Inform. 12 (2001), 123--134.

[105]

N. Kambhatla. 2004. Combining lexical, syntactic, and semantic features with maximum entropy models for extracting relations. In Proceedings of the ACL 2004 on Interactive Poster and Demonstration Sessions. 22--25.

Digital Library

[106]

A. Ratnaparkhi. 1999. Learning to parse natural language with maximum entropy models. Mach. Learn. 34, 1--3 151--175.

Digital Library

[107]

Z. GuoDong, S. Jian, Z. Jie, and Z. Min. 2005. Exploring various knowledge in relation extraction. In Proceedings of the 43rd Meeting on Association for Computational Linguistics. 427--434.

Digital Library

[108]

Y. S. Chan and D. Roth. 2011. Exploiting syntactico-semantic structures for relation extraction. In Proceedings of the 49th Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1. 551--560. Retrieved from http://dl.acm.org/citation.cfm?id=2002472.2002542.

[109]

D. Zelenko, C. Aone, and A. Richardella. 2002. Kernel methods for relation extraction. In Proceedings of the ACL Conference on Empirical Methods in Natural Language Processing, Volume 10. 71--78.

Digital Library

[110]

A. Culotta and J. Sorensen. 2004. Dependency tree kernels for relation extraction. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. 423--429.

Digital Library

[111]

Z. Min, Z. GuoDong, and A. Aiti. 2008. Exploring syntactic structured features over parse trees for relation extraction using kernel methods. Inf. Proc. Manag. 44, 2 (2008), 687--701.

Digital Library

[112]

K. Tymoshenko and C. Giuliano. 2017. FBK-IRST: Semantic relation extraction using cyc. In Proceedings of the 5th International Workshop on Semantic Evaluation. 214--217. Retrieved from http://dl.acm.org/citation.cfm?id=1859664.1859711.

[113]

Z. Zhang. 2004. Weakly-supervised relation classification for information extraction. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management. 581--588.

Digital Library

[114]

P. Pantel and M. Pennacchiotti. 2006. Espresso: Leveraging generic patterns for automatically harvesting semantic relations. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Meeting of the Association for Computational Linguistics. 113--120.

[115]

A. Culotta, A. McCallum, and J. Betz. 2006. Integrating probabilistic extraction models and data mining to discover relations and patterns in text. In Proceedings of the Main Conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics. 296--303.

Digital Library

[116]

A. Carlson, J. Betteridge, R. C. Wang, E. R. Hruschka Jr, and T. M. Mitchell. 2010. Coupled semi-supervised learning for information extraction. In Proceedings of the 3rd ACM International Conference on Web Search and Data Mining. 101--110.

Digital Library

[117]

S. De Saeger et al. 2011. Relation acquisition using word classes and partial patterns. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Retrieved from http://dl.acm.org/citation.cfm?id=2145432.2145524.

[118]

K.-W. Chang, S. W. Yih, B. Yang, and C. Meek. 2014. Typed tensor decomposition of knowledge bases for relation extraction. Retrieved from https://www.microsoft.com/en-us/research/publication/typed-tensor-decomposition-of-knowledge-bases-for-relation-extraction/.

[119]

S. Riedel, L. Yao, A. McCallum, and B. M. Marlin. 2013. Relation extraction with matrix factorization and universal schemas. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL’13). 74--84.

[120]

M. Mintz, S. Bills, R. Snow, and D. Jurafsky. 2009. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Volume 2. Retrieved from http://dl.acm.org/citation.cfm?id=1690219.1690287.

Digital Library

[121]

S. Riedel, L. Yao, and A. McCallum. 2010. Modeling relations and their mentions without labeled text. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 148--163.

Digital Library

[122]

M. Wick, K. Rohanimanesh, K. Bellare, A. Culotta, and A. McCallum. 2011. SampleRank: Training factor graphs with atomic gradients. In Proceedings of the 28th International Conference on Machine Learning. 777--784. Retrieved from http://dl.acm.org/citation.cfm?id=3104482.3104580.

[123]

S. Takamatsu, I. Sato, and H. Nakagawa. 2012. Reducing wrong labels in distant supervision for relation extraction. In Proceedings of the 50th Meeting of the Association for Computational Linguistics: Long Papers, Volume 1. 721--729. Retrieved from http://dl.acm.org/citation.cfm?id=2390524.2390626.

[124]

H. Zhang and Y. Zhao. 2013. Improving few occurrence feature performance in distant supervision for relation extraction. In Advanced Data Mining and Applications, H. Motoda, Z. Wu, L. Cao, O. Zaiane, M. Yao, and W. Wang (Eds). Springer Berlin, 414--422.

[125]

J. Chen, D. Ji, C. L. Tan, and Z. Niu. 2005. Unsupervised feature selection for relation extraction. In Companion Volume to the Proceedings of Second International Joint Conference on Natural Language Processing, [Online]. Retrieved from https://www.aclweb.org/anthology/I05-2045.

[126]

S. Sekine. 2005. Automatic paraphrase discovery based on context and keywords between NE pairs. In Proceedings of the International Workshop on Paraphrasing (IWP’05). 4--6.

[127]

D. Downey, S. Schoenmackers, and O. Etzioni. 2017. Sparse information extraction: Unsupervised language models to the rescue. In Proceedings of the Meeting of the Association for Computational Linguistics. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.130.8780.

[128]

B. Min, S. Shi, R. Grishman, and C.-Y. Lin. 2012. Towards large-scale unsupervised relation extraction from the web. Int. J. Seman. Web Inf. Syst. 8, 3 (2012), 1--23.

Digital Library

[129]

D. Roth and W. Yih. 2007. Global inference for entity and relation identification via a linear programming formulation. In Introduction to Statistical Relational Learning. The MIT Press, Cambridge, MA, 553--580.

[130]

R. J. Kate and R. J. Mooney. 2010. Joint entity and relation extraction using card-pyramid parsing. In Proceedings of the 14th Conference on Computational Natural Language Learning. Retrieved from http://dl.acm.org/citation.cfm?id=1870568.1870592.

[131]

X. Yu and W. Lam. 2010. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters. 1399--1407. Retrieved from http://dl.acm.org/citation.cfm?id=1944566.1944726.

[132]

M. Miwa and Y. Sasaki. 2014. Modeling joint entity and relation extraction with table representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1858--1869.

[133]

J. Duchi, E. Hazan, and Y. Singer. 2011. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12 (2011), 2121--2159.

Digital Library

[134]

M. Collins. 2002. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Volume 10. 1--8.

Digital Library

[135]

A. Mejer and K. Crammer. 2010. Confidence in structured-prediction using confidence-weighted models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 971--981.

[136]

K. Crammer, A. Kulesza, and M. Dredze. 2009. Adaptive regularization of weight vectors. In Proceedings of the Conference on Advances in Neural Information Processing Systems. 414--422.

[137]

M.-W. Chang and W. Yih. 2013. Dual coordinate descent algorithms for efficient large margin structured prediction. Trans. Assoc. Comput. Linguist. 1 (2013), 207--218.

[138]

R. Socher, B. Huval, C. D. Manning, and A. Y. Ng. 2012. Semantic compositionality through recursive matrix-vector spaces. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. 1201--1211.

[139]

D. Zeng, K. Liu, S. Lai, G. Zhou, J. Zhao, et al. 2014. Relation classification via convolutional deep neural network. In Proceedings of the International Conference on Computational Linguistics (COLING’14). 2335--2344.

[140]

L. Wang, Z. Cao, G. de Melo, and Z. Liu. 2016. Relation classification via multi-level attention CNNs. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1298--1307.

[141]

S. Takase, N. Okazaki, and K. Inui. 2016. Modeling semantic compositionality of relational patterns. Eng. Appl. Artif. Intell. 50, C (2016), 256--264.


[142]

J. Liu, H. Ren, M. Wu, J. Wang, and H.-J. Kim. 2018. Multiple relations extraction among multiple entities in unstructured text. Soft Comput. 22, 13 (2018), 4295--4305.


[143]

X. Zhang, P. Li, W. Jia, and H. Zhao. 2019. Multi-labeled relation extraction with attentive capsule network. In Proceedings of the AAAI Conference on Artificial Intelligence. 33 (2019), 7484--7491.

[144]

D. He, H. Zhang, W. Hao, R. Zhang, and K. Cheng. 2017. A customized attention-based long short-term memory network for distant supervised relation extraction. Neural Comput. 29, 7 (2017), 1964--1985.

[145]

C. Ru, J. Tang, S. Li, S. Xie, and T. Wang. 2018. Using semantic similarity to reduce wrong labels in distant supervision for relation extraction. Inf. Proc. Manag. 54, 4 (2018), 593--608.


[146]

J. Qu, D. Ouyang, W. Hua, Y. Ye, and X. Li. 2018. Distant supervision for neural relation extraction integrated with word attention and property features. Neural Netw. 100, C (2018), 59--69.


[147]

Y. Li, Z. Zhong, and N. Jing. 2018. Multi-path convolutional neural network for distant supervised relation extraction. In Proceedings of the 2nd International Conference on Computer Science and Application Engineering. 119:1--119:7.


[148]

Q. Li and H. Ji. 2014. Incremental joint extraction of entity mentions and relations. In Proceedings of the Meeting of the Association for Computational Linguistics. 402--412.

[149]

M. Miwa and M. Bansal. 2016. End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016).

[150]

S. Di, Y. Shen, and L. Chen. 2019. Relation extraction via domain-aware transfer learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1348--1357.


[151]

X. Ling and D. S. Weld. 2012. Fine-grained entity recognition. In Proceedings of the AAAI Conference on Artificial Intelligence. 94--100.

[152]

T. Liu, X. Zhang, W. Zhou, and W. Jia. 2018. Neural relation extraction via inner-sentence noise reduction and transfer learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2195--2204.

[153]

S. Yang, W. Lu, D. Yang, X. Li, C. Wu, and B. Wei. 2017. KeyphraseDS: Automatic generation of survey by exploiting keyphrase information. Neurocomputing 224 (2017), 58--70.


[154]

T. Hasegawa, S. Sekine, and R. Grishman. 2004. Discovering relations among named entities from large corpora. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. 415--422.


[155]

L. Yao, S. Riedel, and A. McCallum. 2010. Collective cross-document relation extraction without labelled data. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1013--1023. Retrieved from http://dl.acm.org/citation.cfm?id=1870658.1870757.

[156]

Y. Cao, D. Chen, H. Li, and P. Luo. 2019. Nested relation extraction with iterative neural network. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 1001--1010.


[157]

V. Sze, Y.-H. Chen, T.-J. Yang, and J. Emer. 2017. Efficient processing of deep neural networks: A tutorial and survey. arXiv preprint arXiv:1703.09039 (2017).

Cited By

Sághy E, Elsharkawy M, Moriarty F, Kovács S, Wittmann I, Zemplényi A (2025). A novel machine learning methodology for the systematic extraction of chronic kidney disease comorbidities from abstracts. Frontiers in Digital Health 7. Online publication date: 4-Feb-2025.
https://doi.org/10.3389/fdgth.2025.1495879
Liu Y, Zhang K, Tong R, Cai C, Chen D, Wu X (2025). A Flat-Span Contrastive Learning Method for Nested Named Entity Recognition. International Journal of Asian Language Processing 35:01. Online publication date: 27-Jan-2025.
https://doi.org/10.1142/S2717554524500139
Yuan L, Cai Y, Xu J, Li Q, Wang T (2025). A Fine-Grained Network for Joint Multimodal Entity-Relation Extraction. IEEE Transactions on Knowledge and Data Engineering 37:1 (1-14). Online publication date: 1-Jan-2025.
https://dl.acm.org/doi/10.1109/TKDE.2024.3485107

Index Terms

Named Entity Recognition and Relation Extraction: State-of-the-Art
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction



Information & Contributors

Information

Published In


ACM Computing Surveys Volume 54, Issue 1

January 2022

844 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/3446641

Editor:
Albert Zomaya
University of Sydney, Australia


Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 February 2021

Accepted: 01 October 2020

Revised: 01 August 2020

Received: 01 February 2019

Published in CSUR Volume 54, Issue 1



Qualifiers

Research-article
Research
Refereed


Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 137
Total Downloads: 5,507
Downloads (Last 12 months): 1,260
Downloads (Last 6 weeks): 164

Reflects downloads up to 03 Mar 2025


Citations

Cited By

Alhassan A, Schlegel V, Aloud M, Batista-Navarro R, Nenadic G (2025). Discontinuous named entities in clinical text: A systematic literature review. Journal of Biomedical Informatics 162 (104783). Online publication date: Feb-2025.
https://doi.org/10.1016/j.jbi.2025.104783
Zhang L, Zhao W, Cheng Z, Jiang Y, Tian K, Shi J, Jiang Z, Hua Y (2025). Osteosarcoma KGQA system: deep learning-based knowledge graph and large language model fusion. Intelligent Medicine. Online publication date: Feb-2025.
https://doi.org/10.1016/j.imed.2024.12.001
Xu Q, Xu X, Zhou C, Liu Z, Huang F, Li S, Zhu L, Bai Z, Xu Y, Hu W (2025). Towards normalized clinical information extraction in Chinese radiology report with large language models. Expert Systems with Applications 271 (126585). Online publication date: May-2025.
https://doi.org/10.1016/j.eswa.2025.126585
Yang W, Qin Y, Huang R, Chen Y (2025). Adaptive feature extraction for entity relation extraction. Computer Speech and Language 89:C. Online publication date: 1-Jan-2025.
https://dl.acm.org/doi/10.1016/j.csl.2024.101712
Shan S (2025). Towards Reflexive AI: A Comprehensive Exploration of Enhancing Social Science Research Through NLP. Advances in Information and Communication (765-792). Online publication date: 7-Mar-2025.
https://doi.org/10.1007/978-3-031-84460-7_49
Dalal E, Singh P (2024). TextRefine: A Novel approach to improve the accuracy of LLM Models. Data and Metadata 3 (331). Online publication date: 20-May-2024.
https://doi.org/10.56294/dm2024331
Gohourou D, Kuwabara K (2024). Knowledge Graph Extraction of Business Interactions from News Text for Business Networking Analysis. Machine Learning and Knowledge Extraction 6:1 (126-142). Online publication date: 7-Jan-2024.
https://doi.org/10.3390/make6010007
