Abstract
The objective of Eurogene is to collect a critical mass of educational content in the field of human genetics in nine European languages and to build a platform that supports retrieval, sharing and navigation of the learning content. The Eurogene platform is already operational and is being used by the genetics community. This paper describes the part of the Eurogene platform concerned with the retrieval and machine translation of domain-specific content. Our contribution lies in an approach to the domain-specific adaptation of cross-language information retrieval (CLIR) and machine translation (MT). The CLIR system is based on a multilingual domain ontology, which also serves as a synchronization component between CLIR and MT. The MT system is adapted to the target domain using the terminology represented in the ontology and statistical training performed on a collection of parallel texts. During statistical training, new translations of a term can be discovered and used to update the ontology. The paper is organized as follows. First, we describe the motivation for our approach and the multilingual domain ontology. We then discuss the CLIR and MT components, their domain adaptation and their synchronization.
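The role of the multilingual ontology described in the abstract can be illustrated with a minimal sketch. It is not the Eurogene implementation: the class and function names (`MultilingualOntology`, `add_label`, `expand_query`), the concept identifier and the example labels are all hypothetical. The sketch assumes only that each ontology concept carries labels in several languages, so that a query term can be mapped to a concept and then to its labels in other languages, and that a translation discovered later (for example during MT training) can be added as a new label.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology concept with its labels per language (hypothetical schema)."""
    concept_id: str
    labels: dict[str, set[str]] = field(default_factory=dict)


class MultilingualOntology:
    """Toy multilingual ontology: concepts plus a (language, term) lookup index."""

    def __init__(self) -> None:
        self._concepts: dict[str, Concept] = {}
        # Maps (language code, lowercased term) to a concept identifier.
        self._index: dict[tuple[str, str], str] = {}

    def add_label(self, concept_id: str, lang: str, term: str) -> None:
        """Attach a label to a concept, e.g. a translation found in parallel texts."""
        concept = self._concepts.setdefault(concept_id, Concept(concept_id))
        concept.labels.setdefault(lang, set()).add(term)
        self._index[(lang, term.lower())] = concept_id

    def translate(self, term: str, source_lang: str, target_lang: str) -> set[str]:
        """Return the target-language labels of the concept that `term` names."""
        concept_id = self._index.get((source_lang, term.lower()))
        if concept_id is None:
            return set()  # term is not covered by the ontology
        return set(self._concepts[concept_id].labels.get(target_lang, set()))


def expand_query(
    ontology: MultilingualOntology,
    terms: list[str],
    source_lang: str,
    target_langs: list[str],
) -> dict[str, list[str]]:
    """Build one term list per language from the ontology's labels."""
    expanded = {source_lang: list(terms)}
    for lang in target_langs:
        translated: list[str] = []
        for term in terms:
            translated.extend(sorted(ontology.translate(term, source_lang, lang)))
        expanded[lang] = translated
    return expanded


# Illustrative data only; the concept id and labels are assumptions.
ontology = MultilingualOntology()
ontology.add_label("C1", "en", "chromosome")
ontology.add_label("C1", "it", "cromosoma")
query = expand_query(ontology, ["chromosome"], "en", ["it", "de"])
```

In this sketch the same label store serves both retrieval (query expansion across languages) and terminology lookup for translation, which is one simple way to picture the ontology acting as a shared component between CLIR and MT.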
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Knoth, P., Collins, T., Sklavounou, E., Zdrahal, Z. (2010). EUROGENE: Multilingual Retrieval and Machine Translation Applied to Human Genetics. In: Gurrin, C., et al. Advances in Information Retrieval. ECIR 2010. Lecture Notes in Computer Science, vol 5993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12275-0_74
DOI: https://doi.org/10.1007/978-3-642-12275-0_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12274-3
Online ISBN: 978-3-642-12275-0
eBook Packages: Computer Science, Computer Science (R0)