Information Extraction for Standardization of Tourism Products

  • Conference paper
Advances in Artificial Intelligence (CAEPIA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7023)

Abstract

Tourism product descriptions rely heavily on natural language. Selecting an appropriate offer for a tourist's needs depends largely on how those descriptions are communicated. Because no human interaction is available when tourism products are presented online, the way they are presented, even when only textual information is used, is a key success factor for tourism web sites in achieving a purchase. Given the large number of tourism offers and the highly dynamic nature of the sector, manual data management is neither reliable nor scalable. This paper presents a prototype for the automatic extraction of relevant knowledge from tourism-related natural language texts. The captured knowledge is represented in a normalized format, and new textual descriptions are produced to suit the available marketing channels. At this stage, the prototype focuses on hotel descriptions and already uses real operational data retrieved from the KEY for Travel tourism platform.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Miranda, N., Raminhos, R., Seabra, P., Gonçalves, T., Saias, J., Quaresma, P. (2011). Information Extraction for Standardization of Tourism Products. In: Lozano, J.A., Gámez, J.A., Moreno, J.A. (eds) Advances in Artificial Intelligence. CAEPIA 2011. Lecture Notes in Computer Science, vol 7023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25274-7_46

  • DOI: https://doi.org/10.1007/978-3-642-25274-7_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25273-0

  • Online ISBN: 978-3-642-25274-7

  • eBook Packages: Computer Science, Computer Science (R0)
