Adaptive Classification of Web Documents to Users Interests

  • Conference paper
Advances in Informatics (PCI 2001)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2563))

Abstract

Current Web search engines are not able to adapt their operations to the evolving needs, interests and preferences of their users. To cope with this problem, we developed a system that classifies HTML (or XML) documents into categories of interest pre-specified by the user. The system processes the user profile, together with a set of representative documents for each category of interest, and produces a classification schema, presented as a set of representative category vectors. The classification schema is then used to classify new incoming Web documents into one or more of the pre-specified categories of interest. The system lets users modify and enrich their profiles according to their current search needs and interests. In this way, adaptive and personalized delivery of Web-based information is achieved. Experimental results on an indicative collection of Web pages show the reliability and effectiveness of our approach.
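
The pipeline outlined in the abstract (representative documents per category, a schema of category vectors, assignment of new documents to one or more categories) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the term weighting, similarity measure or decision rule, so the raw term frequencies, averaged category vectors, cosine similarity and the `threshold` parameter below are all assumptions, and every function name and the sample profile are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs (a stand-in for real HTML/XML parsing)."""
    return re.findall(r"[a-z]+", text.lower())


def category_vector(documents):
    """Assumed representation: mean term frequency over a category's documents."""
    total = Counter()
    for doc in documents:
        total.update(tokenize(doc))
    return {term: count / len(documents) for term, count in total.items()}


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_schema(profile):
    """Turn a profile {category: [representative documents]} into category vectors."""
    return {cat: category_vector(docs) for cat, docs in profile.items()}


def classify(schema, document, threshold=0.2):
    """Return every category whose similarity reaches the (assumed) threshold,
    best match first, so a document may land in one or more categories."""
    doc_vec = Counter(tokenize(document))
    scores = {cat: cosine(doc_vec, vec) for cat, vec in schema.items()}
    hits = [cat for cat, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda cat: -scores[cat])


# Hypothetical user profile with two categories of interest.
profile = {
    "sports": ["football match score goal", "tennis match player serve"],
    "finance": ["stock market price shares", "bank interest rate market"],
}
schema = build_schema(profile)
print(classify(schema, "the football player scored a goal"))  # ['sports']
```

In this sketch, adapting to a changed profile amounts to editing `profile` (adding a category or more representative documents) and calling `build_schema` again; the paper itself may update the schema differently.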

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Potamias, G. (2003). Adaptive Classification of Web Documents to Users Interests. In: Manolopoulos, Y., Evripidou, S., Kakas, A.C. (eds) Advances in Informatics. PCI 2001. Lecture Notes in Computer Science, vol 2563. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-38076-0_10

  • DOI: https://doi.org/10.1007/3-540-38076-0_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-07544-8

  • Online ISBN: 978-3-540-38076-4

  • eBook Packages: Springer Book Archive
