Ontology-Based Information Extraction from Tourism Websites
The growing volume of semistructured and unstructured data on heterogeneously designed tourism websites creates a need for information extraction (IE) mechanisms that support semiautomatic data acquisition for building tourism recommender systems or tourism Web portals. In this article, we analyze heterogeneity aspects of individually maintained accommodation websites and discuss the applicability of different IE types and techniques for this domain. We then develop a rule/ontology-based IE approach and describe the components of our prototype crawler. Finally, we discuss relevant issues that emerged during the development and evaluation of the prototype.
Keywords: E-TOURISM; GATE (GENERAL ARCHITECTURE FOR TEXT ENGINEERING); INFORMATION EXTRACTION
Document Type: Research Article
Publication date: 01 August 2009
- Information Technology & Tourism is the first scientific journal dealing with the relationship between information technology and tourism. Information and communication systems embedded in a global net have a profound influence on the tourism and travel industry. Reservation systems, distributed multimedia systems, highly mobile working places, electronic markets, and the dominant position of tourism applications on the Internet are noticeable results of this development. In turn, the tourism industry poses several challenges to the IT field and its methodologies.