A Novel Web Page Analysis Method for Efficient Reasoning of User Preference

  • Conference paper
Computer-Human Interaction (APCHI 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5068)

Abstract

The amount of information on the Web is increasing rapidly. Recommender systems can help users filter this information according to their preferences. One way to obtain user preferences is to analyze the characteristics of the content that the user accesses. Unfortunately, web pages may contain elements irrelevant to user interests (e.g., navigation bars, advertisements, and links), so existing analysis approaches based on the TF-IDF method may not be suitable. This paper proposes a novel user preference analysis system that eliminates elements that appear repeatedly across web pages and extracts user interest keywords from the identified primary content. The system also collects anchor tags and tracks the user's search route in order to identify keywords that are of core interest to the user. The paper compares the proposed system with a pure TF-IDF analysis method, and the analysis confirms its effectiveness in terms of the accuracy of the analyzed user profiles.
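
The general idea in the abstract (discard page elements that repeat across pages, then weight the remaining terms) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not specify. It assumes each page has already been split into text blocks, and the names `strip_repeated_blocks`, `tf_idf`, and the threshold `max_share` are hypothetical, chosen only for this illustration. Anchor-tag collection and search-route tracking are not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def strip_repeated_blocks(pages, max_share=0.5):
    """Drop blocks that occur on more than max_share of the pages.

    pages: list of pages, each a list of text blocks (assumed pre-segmented).
    max_share: hypothetical threshold, not taken from the paper.
    """
    pages_with_block = Counter()
    for blocks in pages:
        pages_with_block.update(set(blocks))  # count each block once per page
    limit = max_share * len(pages)
    return [[b for b in blocks if pages_with_block[b] <= limit]
            for blocks in pages]


def tf_idf(pages):
    """Return one {term: weight} dict per page, with weight = tf * log(N / df)."""
    tokenized = [tokenize(" ".join(blocks)) for blocks in pages]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency of each term
    n = len(pages)
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty pages
        weights.append({t: (c / total) * math.log(n / df[t])
                        for t, c in counts.items()})
    return weights


nav = "Home News Sports Login"  # stands in for a repeated navigation bar
pages = [
    [nav, "Local team wins the football final"],
    [nav, "New camera lens review and photography tips"],
    [nav, "Football transfer rumours this summer"],
    ["Printer friendly view", "Camera prices fall again"],
]

raw_profile = tf_idf(pages)                           # plain TF-IDF
clean_profile = tf_idf(strip_repeated_blocks(pages))  # repeated blocks removed
```

In this toy input the navigation bar appears on three of four pages, so plain TF-IDF still gives its terms a small positive weight, while the filtered version drops them entirely before weighting.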

Editor information

Seongil Lee, Hyunseung Choo, Sungdo Ha, In Chul Shin

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, S., Jung, M., Lee, E. (2008). A Novel Web Page Analysis Method for Efficient Reasoning of User Preference. In: Lee, S., Choo, H., Ha, S., Shin, I.C. (eds) Computer-Human Interaction. APCHI 2008. Lecture Notes in Computer Science, vol 5068. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70585-7_10

  • DOI: https://doi.org/10.1007/978-3-540-70585-7_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70584-0

  • Online ISBN: 978-3-540-70585-7

  • eBook Packages: Computer Science, Computer Science (R0)
