Abstract
Tourism product descriptions rely heavily on natural language. Whether tourists select an offer that matches their needs depends largely on how those descriptions are communicated. Because no human interaction is available when tourism products are presented online, the way they are presented, even when only textual information is used, is a key factor in whether a tourism web site achieves a purchase. Given the large number of tourism offers and the highly dynamic nature of the sector, manual data management is neither reliable nor scalable. This paper presents a prototype for the automatic extraction of relevant knowledge from tourism-related natural language texts. The captured knowledge is represented in a normalized format, and new textual descriptions are produced according to the available marketing channels. At this stage, the prototype focuses on hotel descriptions and already uses real operational data retrieved from the KEY for Travel tourism platform.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miranda, N., Raminhos, R., Seabra, P., Gonçalves, T., Saias, J., Quaresma, P. (2011). Information Extraction for Standardization of Tourism Products. In: Lozano, J.A., Gámez, J.A., Moreno, J.A. (eds) Advances in Artificial Intelligence. CAEPIA 2011. Lecture Notes in Computer Science, vol 7023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25274-7_46
DOI: https://doi.org/10.1007/978-3-642-25274-7_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25273-0
Online ISBN: 978-3-642-25274-7
eBook Packages: Computer Science, Computer Science (R0)