Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search

  • Conference paper
Conceptual Modeling – ER 2011 (ER 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6998)

Abstract

Valuable local information is often available on the web, but encoded in a foreign language that non-local users do not understand. Can we create a system that allows a user to query in language L₁ for facts in a web page written in language L₂? We propose a suite of multilingual extraction ontologies as a solution to this problem. We ground extraction ontologies in each language of interest, and we map both the data and the metadata among the language-specific extraction ontologies. The mappings pass through a central, language-agnostic ontology, so adding a new language requires only one mapping rather than one for each language pair. Results from an implemented early prototype demonstrate the feasibility of cross-language information extraction and semantic search. Further, results from an experimental evaluation of ontology-based query translation and extraction accuracy are remarkably good given the complexity of the problem and the complications of its implementation.
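
The role of the central ontology can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `PivotOntology`, the concept identifiers such as `C:Price`, and the tiny English/Japanese vocabularies are all invented for illustration. It only shows why routing through a shared hub needs one mapping per language, whereas direct mappings need one per language pair.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class PivotOntology:
    """Hypothetical hub: terms in each language map to shared concept IDs."""

    # language code -> {term in that language -> language-agnostic concept ID}
    to_concept: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def add_language(self, lang: str, term_to_concept: Dict[str, str]) -> None:
        # A new language supplies a single mapping to the hub,
        # no matter how many languages are already registered.
        self.to_concept[lang] = dict(term_to_concept)

    def translate(self, term: str, source: str, target: str) -> Optional[str]:
        # Step 1: source-language term -> hub concept.
        concept = self.to_concept[source].get(term)
        if concept is None:
            return None
        # Step 2: hub concept -> first matching target-language term.
        for target_term, target_concept in self.to_concept[target].items():
            if target_concept == concept:
                return target_term
        return None


def mappings_needed(n_languages: int) -> Tuple[int, int]:
    """Return (mappings via a hub, mappings for every unordered language pair)."""
    return n_languages, n_languages * (n_languages - 1) // 2


# Illustrative vocabularies only (assumed, not taken from the paper).
hub = PivotOntology()
hub.add_language("en", {"restaurant": "C:Restaurant", "price": "C:Price"})
hub.add_language("ja", {"レストラン": "C:Restaurant", "値段": "C:Price"})

ja_term = hub.translate("price", source="en", target="ja")
hub_count, pair_count = mappings_needed(5)
```

With five languages, the hub design needs five mappings while direct pairwise mappings need ten, and the gap widens quadratically as languages are added.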

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Embley, D.W., Liddle, S.W., Lonsdale, D.W., Tijerino, Y. (2011). Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search. In: Jeusfeld, M., Delcambre, L., Ling, TW. (eds) Conceptual Modeling – ER 2011. ER 2011. Lecture Notes in Computer Science, vol 6998. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24606-7_12

  • DOI: https://doi.org/10.1007/978-3-642-24606-7_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24605-0

  • Online ISBN: 978-3-642-24606-7

  • eBook Packages: Computer Science, Computer Science (R0)
