Mining the Web to Add Semantics to Retail Data Mining

  • Conference paper
Web Mining: From Web to Semantic Web (EWMF 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3209)

Abstract

Research on the Semantic Web has mostly focused on the basic technologies needed to make it a reality; little work has aimed at showing its effectiveness and impact on business problems. This paper presents a case study in which Web and text mining techniques were used to add semantics to data stored in retailers' transactional databases. In many domains, semantic information is implicitly available and can be extracted automatically to improve data mining systems. The system studied here is trained to extract semantic features for apparel products and to populate a knowledge base with these products and features. We show that semantic features of these items can be successfully extracted by applying text learning techniques to the descriptions obtained from retailers' websites. We also describe several applications of the resulting knowledge base of product semantics, including recommender systems and competitive intelligence tools. Finally, we provide evidence that our approach can build a knowledge base with accurate facts, which can then be used to create profiles of individual customers, groups of customers, or entire retail stores.
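
As a rough illustration of the idea in the abstract, the sketch below trains a small text classifier on labeled product descriptions and uses it to fill a knowledge base that maps each product to an inferred semantic feature. It is not the paper's system: the abstract names only "text learning techniques", so the choice of multinomial naive Bayes with add-one smoothing is an assumption, as are the "casual"/"formal" attribute values, the sample descriptions, and every function name.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately naive tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_naive_bayes(examples):
    """Count documents per label and words per label.

    `examples` is a list of (description, label) pairs.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior for `text`."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed word likelihood.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled descriptions (not taken from the paper).
TRAINING = [
    ("relaxed cotton weekend tee", "casual"),
    ("faded denim shorts for the beach", "casual"),
    ("tailored wool suit jacket", "formal"),
    ("silk evening gown with satin trim", "formal"),
]

model = train_naive_bayes(TRAINING)

# Populate a toy knowledge base: product description -> inferred feature.
knowledge_base = {
    desc: predict(model, desc)
    for desc in ["cotton beach tee", "wool evening jacket"]
}
print(knowledge_base)
```

In a setting like the one the abstract describes, the descriptions would come from retailer websites and each product would carry several such features; a single attribute keeps the sketch short.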

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ghani, R. (2004). Mining the Web to Add Semantics to Retail Data Mining. In: Berendt, B., Hotho, A., Mladenič, D., van Someren, M., Spiliopoulou, M., Stumme, G. (eds) Web Mining: From Web to Semantic Web. EWMF 2003. Lecture Notes in Computer Science, vol 3209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30123-3_3

  • DOI: https://doi.org/10.1007/978-3-540-30123-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23258-2

  • Online ISBN: 978-3-540-30123-3

  • eBook Packages: Springer Book Archive
