Abstract
Fine-grained entity typing (FGET) classifies entity mentions into hierarchical fine-grained semantic types. Existing FGET approaches have two main issues. First, training corpora for FGET are normally labeled automatically, which inevitably introduces noise. Existing approaches either tweak the noisy labels in the corpora directly by heuristics or algorithmically retreat to parent types, and both lead to coarse-grained type labels instead of fine-grained ones. Second, existing approaches usually use recurrent neural networks to generate feature representations of mention phrases and their contexts, but these networks perform relatively poorly on long contexts and out-of-vocabulary (OOV) words. In this paper, we propose a transfer learning-based approach to extract more efficient feature representations and offset label noise. More precisely, we adopt three transfer learning schemes: (i) transferring sub-word embeddings to generate more efficient OOV embeddings; (ii) using a pre-trained language model to generate more efficient context features; (iii) using a pre-trained topic model to transfer the topic-type relatedness through topic anchors and select confusing fine-grained types at inference time. The pre-trained topic model can offset the label noise without retreating to coarse-grained types. The experimental results demonstrate the effectiveness of our transfer learning approach for FGET.
Notes
In this paper, we use capitalized italic type to denote level 1 type labels (e.g., /Person) and non-capitalized italic type to denote level 2 and level 3 types (e.g., /Person/artist). The symbol “/” represents the hierarchical relationship.
Some other datasets are not publicly available.
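As a minimal sketch of the type-label notation described in the first note, the snippet below splits a label such as /Person/artist on "/" and lists the type together with its ancestors, coarsest first. The function names `type_levels` and `parent_type` are hypothetical and chosen only for illustration; they are not taken from the paper or from any FGET library.

```python
from typing import List, Optional


def type_levels(type_path: str) -> List[str]:
    """Return the given type and all of its ancestors, coarsest first.

    Assumes labels are written with "/" as the hierarchy separator,
    e.g. "/Person/artist" (level 1: /Person, level 2: /Person/artist).
    """
    # Drop the empty string produced by the leading "/".
    parts = [p for p in type_path.split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def parent_type(type_path: str) -> Optional[str]:
    """Return the immediate parent type, or None for a level 1 type."""
    levels = type_levels(type_path)
    return levels[-2] if len(levels) > 1 else None


if __name__ == "__main__":
    print(type_levels("/Person/artist"))
    print(parent_type("/Person/artist"))
    print(parent_type("/Person"))
```

Moving from a label to its `parent_type` is the "retreat to parent types" that the abstract mentions: the label becomes coarser with each step up the hierarchy.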
Acknowledgements
We thank the anonymous reviewers for their thoughtful comments and suggestions.
Cite this article
Hou, F., Wang, R. & Zhou, Y. Transfer learning for fine-grained entity typing. Knowl Inf Syst 63, 845–866 (2021). https://doi.org/10.1007/s10115-021-01549-5
DOI: https://doi.org/10.1007/s10115-021-01549-5