Abstract
Our web user profiles consist of Page Interest Estimators (PIEs) and Web Access Graphs (WAGs). We discuss a non-invasive approach to estimating a user's interest in a web page without asking the user directly. We propose a time- and space-efficient method for locating multi-word phrases, which enrich the common bag-of-words representation of text documents. PIEs are then learned to predict the user's interest in any web page. A WAG summarizes a user's web page access patterns. We describe how a user profile can be used to analyze search results and to recommend new and interesting pages. Our empirical results on PIEs are encouraging.
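As an illustration of what enriching a bag-of-words representation with multi-word phrases can look like, here is a minimal sketch. It is not the phrase-finding method proposed in the paper, which the abstract does not specify: it simply treats adjacent word pairs that recur across documents as phrases. The function names `frequent_bigrams` and `enriched_bag`, the `min_count` threshold, and the sample documents are all assumptions made for this example.

```python
from collections import Counter


def frequent_bigrams(docs, min_count=2):
    """Return adjacent word pairs seen at least min_count times over all docs.

    Assumption: recurring bigrams stand in for "multi-word phrases" here.
    """
    counts = Counter()
    for tokens in docs:
        counts.update(zip(tokens, tokens[1:]))
    return {pair for pair, n in counts.items() if n >= min_count}


def enriched_bag(tokens, phrases):
    """Build a bag of words, then add one feature per phrase occurrence."""
    bag = Counter(tokens)
    for pair in zip(tokens, tokens[1:]):
        if pair in phrases:
            bag[" ".join(pair)] += 1
    return bag


# Hypothetical toy documents, already tokenized on whitespace.
docs = [
    "machine learning for web user profiles".split(),
    "web user interest and machine learning".split(),
]
phrases = frequent_bigrams(docs)
bag = enriched_bag(docs[0], phrases)
```

With these toy documents, `bag` holds the single-word counts for the first document plus the phrase features `"machine learning"` and `"web user"`, which a learner could then use alongside the ordinary word features.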
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chan, P.K. (2000). Constructing Web User Profiles: A Non-invasive Learning Approach. In: Masand, B., Spiliopoulou, M. (eds) Web Usage Analysis and User Profiling. WebKDD 1999. Lecture Notes in Computer Science(), vol 1836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44934-5_3
DOI: https://doi.org/10.1007/3-540-44934-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67818-2
Online ISBN: 978-3-540-44934-8
eBook Packages: Springer Book Archive