Abstract
The identification and extraction of proper names from Internet-based sources currently lack methods for verifying that the extracted names are valid. This paper presents a language-independent method that assigns probabilities to extracted proper names using frequency data harvested from the Internet. Verification mechanisms built on top of this technique automatically exclude misidentified proper names.
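The abstract does not give the probability formula, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that for each candidate string two hit counts are available: how often it occurs in a name-like context and how often it occurs overall. It treats the smoothed ratio of the two as a probability and drops candidates below a cut-off. The names `FrequencyCounts`, `name_probability`, and `verify`, the smoothing constant, the threshold of 0.8, and the counts in the usage example are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrequencyCounts:
    """Hypothetical frequency data for one candidate string."""
    name_hits: int   # occurrences in a name-like context (assumed to be given)
    total_hits: int  # all occurrences of the string


def name_probability(counts: FrequencyCounts, smoothing: float = 1.0) -> float:
    """Smoothed share of occurrences that are name-like.

    This ratio is an assumed stand-in for the paper's probability
    estimate, not the published formula. With no data it returns 0.5.
    """
    return (counts.name_hits + smoothing) / (counts.total_hits + 2.0 * smoothing)


def verify(candidates: dict[str, FrequencyCounts], threshold: float = 0.8) -> list[str]:
    """Keep candidates whose estimated probability reaches the threshold."""
    return [
        name
        for name, counts in candidates.items()
        if name_probability(counts) >= threshold
    ]


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    extracted = {
        "Dalli": FrequencyCounts(name_hits=980, total_hits=1000),
        "Monday morning": FrequencyCounts(name_hits=50, total_hits=1000),
    }
    print(verify(extracted))
```

How the name-like occurrences are counted is left open here on purpose. That choice is what would determine whether such a scheme is language-independent, and the abstract does not describe it.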
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dalli, A. (2004). An Internet-Based Method for Verification of Extracted Proper Names. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2004. Lecture Notes in Computer Science, vol 2945. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24630-5_20
DOI: https://doi.org/10.1007/978-3-540-24630-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21006-1
Online ISBN: 978-3-540-24630-5
eBook Packages: Springer Book Archive