Improved Web Search Engine by New Similarity Measures

Kakulapati, Vijayalaxmi; Kolikipogu, Ramakrishna; Revathy, P.; Karunanithi, D.

doi:10.1007/978-3-642-22726-4_30

Part of the book series: Communications in Computer and Information Science (CCIS, volume 193)

Included in the following conference series:

International Conference on Advances in Computing and Communications

Abstract

Information retrieval (IR) is the process of managing the information a user needs. An IR system dynamically crawls items, which are then stored and indexed in repositories; this dynamic process supports retrieval of the needed information through search and its customized presentation in the visualization space. Search engines play a major role in finding relevant items in huge repositories, and they use different methods to select the items to be retrieved. Surveys of search engines show that naive users are not satisfied with current search results; one reason for this problem is the machine's failure to capture the user's intention. Artificial intelligence is an emerging area that addresses these problems by training the search engine on a training data set so that it learns the user's interests. In this paper we approach the problem in a novel way, using new similarity measures. The learning function we use maximizes the amount of user-preferred information returned by the search process. The proposed function exploits the query log by considering the similarity between the ranked item set and the user's preferred ranking. The similarity measure supports risk minimization and is also feasible for large sets of queries. We demonstrate the framework by comparing algorithm performance, particularly in the identification of clusters using a replicated clustering approach. In addition, we analyze how clustering performance is affected by different sequence representations, different distance measures, the number of actual web user clusters, the number of web pages, the similarity between clusters, the minimum session length, the number of user sessions, and the number of clusters to form.
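The abstract does not give the formula for its similarity measures, so the following is only an illustrative sketch of the general idea of comparing a search engine's ranking with a user's preferred ranking. It uses the standard Kendall tau rank correlation as a stand-in, not the authors' measure; the function name kendall_tau and the document identifiers are hypothetical.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall tau rank correlation between two rankings of the same items.

    Stand-in similarity measure (an assumption, not the paper's own):
    returns 1.0 for identical order and -1.0 for fully reversed order.
    """
    if len(set(ranking_a)) != len(ranking_a) or len(set(ranking_b)) != len(ranking_b):
        raise ValueError("rankings must not contain duplicate items")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("both rankings must contain the same items")

    n = len(ranking_a)
    if n < 2:
        return 1.0  # a single item (or none) is trivially in agreement

    # Position of each item in the second ranking.
    pos_b = {item: i for i, item in enumerate(ranking_b)}

    concordant = discordant = 0
    # combinations() yields pairs (x, y) where x precedes y in ranking_a.
    for x, y in combinations(ranking_a, 2):
        if pos_b[x] < pos_b[y]:
            concordant += 1  # ranking_b agrees on the order of this pair
        else:
            discordant += 1  # ranking_b reverses this pair

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical example: the engine's order versus the order a user prefers.
engine_ranking = ["d1", "d2", "d3", "d4"]
user_preferred = ["d1", "d3", "d2", "d4"]
print(round(kendall_tau(engine_ranking, user_preferred), 3))  # prints 0.667
```

A pairwise measure of this kind is one common way to score agreement between rankings derived from a query log; a learning function could then favor rankings that score higher against the user's preferences.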



Author information

Authors and Affiliations

Computer Science Department, JNT University, Hyderabad, India
Vijayalaxmi Kakulapati & Ramakrishna Kolikipogu
Education & Research, Infosys Technologies Limited, Mysore, India
P. Revathy
Information Technology Department, Hindustan University, Chennai, India
D. Karunanithi

Editor information

Editors and Affiliations

Machine Intelligence Research Labs (MIR Labs), Auburn, 98071-2259, Washington, USA
Ajith Abraham
Departamento de Comunicaciones, Universidad Politécnica de Valencia, 46071, Valencia, Spain
Jaime Lloret Mauri
Avaya Labs Research, Basking Ridge, NJ, USA
John F. Buford
University of Massachusetts, 100 Morrissey Blvd., 02125-3393, Boston, MA, USA
Junichi Suzuki
Rajagiri School of Engineering and Technology, Rajagiri Valley Kakkanad, 682 039, Kochi, India
Sabu M. Thampi

Cite this paper

Kakulapati, V., Kolikipogu, R., Revathy, P., Karunanithi, D. (2011). Improved Web Search Engine by New Similarity Measures. In: Abraham, A., Mauri, J.L., Buford, J.F., Suzuki, J., Thampi, S.M. (eds) Advances in Computing and Communications. ACC 2011. Communications in Computer and Information Science, vol 193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22726-4_30

DOI: https://doi.org/10.1007/978-3-642-22726-4_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22725-7
Online ISBN: 978-3-642-22726-4
eBook Packages: Computer Science, Computer Science (R0)
