Abstract
The volume of information on the web is enormous, and finding the information relevant to a user's information need is a major challenge. The information scent of pages clicked in past query sessions has been used in the literature to generate web page recommendations that satisfy the information need of the current user. High scent information retrieval rests on the keyword vectors of query sessions, which are clustered using information scent. The dimensionality of the keyword vector is very high, which degrades classification accuracy and the computational efficiency of processing input queries, and ultimately reduces the precision of information retrieval. Not all keywords in the keyword vector are equally important for distinguishing the varied information needs represented by the clusters. In the presented work, Fuzzy Rough Set Attribute Reduction (FRSAR) is applied to reduce the high dimensionality of the keyword vector to a smaller set of relevant keywords, improving space and time complexity. The effectiveness of the fuzzy rough approach for high scent web page recommendations in information retrieval is verified by an experimental study conducted on data extracted from the web history of the Google search engine.
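The attribute-reduction idea in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes keyword weights normalised to [0, 1], crisp cluster labels, a per-keyword similarity of 1 - |a(x) - a(y)|, and the Łukasiewicz implicator, which is one common similarity-based formulation of fuzzy-rough dependency. The function names (`similarity`, `dependency`, `quick_reduct`) and the toy data are hypothetical. It greedily adds the keyword that most increases the dependency of the cluster label on the selected keywords, in the style of QuickReduct.

```python
def similarity(a, b):
    """Fuzzy similarity of two keyword weights in [0, 1] (assumed form)."""
    return 1.0 - abs(a - b)


def dependency(rows, labels, attrs):
    """Fuzzy-rough dependency of the cluster label on the keywords in attrs.

    For crisp labels and the Lukasiewicz implicator, the positive-region
    membership of a session x is the minimum of 1 - R(x, y) over all
    sessions y in a different cluster, where R is the minimum of the
    per-keyword similarities.
    """
    if not attrs:
        return 0.0
    total = 0.0
    for i, x in enumerate(rows):
        pos = 1.0
        for j, y in enumerate(rows):
            if labels[i] != labels[j]:
                rel = min(similarity(x[a], y[a]) for a in attrs)
                pos = min(pos, 1.0 - rel)
        total += pos
    return total / len(rows)


def quick_reduct(rows, labels):
    """Greedy forward selection until the full-set dependency is reached."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct, best = [], 0.0
    while best < target - 1e-12:
        candidates = [a for a in all_attrs if a not in reduct]
        scored = [(dependency(rows, labels, reduct + [a]), a) for a in candidates]
        gamma, attr = max(scored, key=lambda s: s[0])
        if gamma <= best:  # no keyword improves the dependency any further
            break
        reduct.append(attr)
        best = gamma
    return reduct, best


# Toy data: 4 query sessions x 3 keyword weights, 2 clusters (made up).
rows = [
    [1.0, 0.5, 0.2],
    [0.9, 0.1, 0.2],
    [0.1, 0.5, 0.2],
    [0.0, 0.1, 0.2],
]
labels = [0, 0, 1, 1]

reduct, gamma = quick_reduct(rows, labels)
print(reduct, round(gamma, 2))  # [0] 0.85
```

In this toy example only keyword 0 separates the two clusters, so the reduct keeps that single keyword and discards the other two. The sketch recomputes pairwise similarities on every call and is quadratic in the number of sessions, so it is meant to show the selection criterion, not to scale to real query logs.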
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bedi, P., Chawla, S. (2011). High Scent Web Page Recommendations Using Fuzzy Rough Set Attribute Reduction. In: Peters, J.F., et al. Transactions on Rough Sets XIV. Lecture Notes in Computer Science, vol 6600. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21563-6_2
DOI: https://doi.org/10.1007/978-3-642-21563-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21562-9
Online ISBN: 978-3-642-21563-6
eBook Packages: Computer Science, Computer Science (R0)