research-article

A language for manipulating clustered web documents results

Published: 26 October 2008

Abstract

We propose a novel language for exploring the results retrieved by several Internet search services (such as search engines) that cluster the retrieved documents. The goal is to offer users a tool for discovering relevant hidden relationships between clustered documents.
The proposal is motivated by the observation that visualization paradigms, based on either the ranked list or clustered results, do not allow users to fully exploit the combined use of several search services to answer a request.
When the same query is submitted to distinct search services, they may produce partially overlapping clustered results, in which clusters identified by distinct labels share some documents. Conversely, clusters with similar labels but distinct documents may be produced as well. In such a situation, it may be useful to compare, combine and rank the cluster contents in order to single out relevant documents. In the proposed language, we define several operators (inspired by relational algebra) that work on groups of clusters. New clusters (and groups) can be generated by combining (i.e., overlapping, refining and intersecting) clusters (and groups) in a set-oriented fashion. Furthermore, several ranking functions are proposed to model distinct semantics of the combination.
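The set-oriented combination described above can be illustrated with a minimal Python sketch. The abstract does not give the operator definitions, so everything here is an assumption made for illustration: a cluster is modeled as a label plus a set of document identifiers, a group is a list of clusters, `intersect_groups` is a hypothetical pairwise intersection operator, and Jaccard similarity stands in for one possible ranking function. It is not the language defined in the paper.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Cluster:
    """A labeled set of document identifiers (assumed here to be URL-like strings)."""
    label: str
    docs: frozenset


def jaccard(a: Cluster, b: Cluster) -> float:
    """Illustrative ranking function: shared documents over all documents."""
    union = a.docs | b.docs
    return len(a.docs & b.docs) / len(union) if union else 0.0


def intersect_groups(group_a, group_b):
    """Hypothetical group operator: intersect every pair of clusters.

    Returns (score, cluster) pairs for non-empty intersections,
    sorted by descending score and then by label.
    """
    scored = []
    for a, b in product(group_a, group_b):
        common = a.docs & b.docs
        if common:  # drop pairs that share no document
            new_cluster = Cluster(f"{a.label} & {b.label}", common)
            scored.append((jaccard(a, b), new_cluster))
    scored.sort(key=lambda pair: (-pair[0], pair[1].label))
    return scored


# Made-up results of one query sent to two clustering search services.
service_a = [
    Cluster("jaguar car", frozenset({"u1", "u2", "u3"})),
    Cluster("jaguar animal", frozenset({"u4", "u5"})),
]
service_b = [
    Cluster("automobiles", frozenset({"u2", "u3", "u6"})),
    Cluster("wildlife", frozenset({"u5", "u7"})),
]

for score, cluster in intersect_groups(service_a, service_b):
    print(f"{score:.2f}  {cluster.label}: {sorted(cluster.docs)}")
```

Swapping `jaccard` for a different scoring function (for example, the raw size of the intersection) changes the order of the generated clusters without changing their contents, which is one way to read the abstract's remark that distinct ranking functions model distinct semantics of the combination.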


Published In

CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management
October 2008
1562 pages
ISBN:9781595939913
DOI:10.1145/1458082
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. exploratory search
  2. query languages
  3. search service
  4. web documents

Qualifiers

  • Research-article

Conference

CIKM08
CIKM08: Conference on Information and Knowledge Management
October 26 - 30, 2008
Napa Valley, California, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


Cited By

  • (2021) Towards Flexible Retrieval, Integration and Analysis of JSON Data Sets through Fuzzy Sets: A Case Study. Information 12:7 (258). DOI: 10.3390/info12070258. Online publication date: 22-Jun-2021
  • (2019) A Fuzzy Technique for On-Line Aggregation of POIs from Social Media: Definition and Comparison with Off-Line Random-Forest Classifiers. Information 10:12 (388). DOI: 10.3390/info10120388. Online publication date: 7-Dec-2019
  • (2019) On-line aggregation of POIs from Google and Facebook. Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing (1081-1089). DOI: 10.1145/3297280.3297576. Online publication date: 8-Apr-2019
  • (2017) A flexible framework to cross-analyze heterogeneous multi-source geo-referenced information. Proceedings of the International Conference on Web Intelligence (499-508). DOI: 10.1145/3106426.3106537. Online publication date: 23-Aug-2017
  • (2016) Accurate and efficient query clustering via top ranked search results. Web Intelligence 14:2 (119-138). DOI: 10.3233/WEB-160335. Online publication date: 25-Apr-2016
  • (2012) Web Search Results Discovery by Multi-granular Graphs. Quantitative Semantics and Soft Computing Methods for the Web (118-136). DOI: 10.4018/978-1-60960-881-1.ch006. Online publication date: 2012
  • (2011) Discovering and analyzing multi-granular web search results. Proceedings of the 9th international conference on Flexible Query Answering Systems (221-233). DOI: 10.1007/978-3-642-24764-4_20. Online publication date: 26-Oct-2011
  • (2010) A Flexible Language for Exploring Clustered Search Results. Scalable Fuzzy Algorithms for Data Management and Analysis (179-213). DOI: 10.4018/978-1-60566-858-1.ch007. Online publication date: 2010
  • (2009) Query Disambiguation Based on Novelty and Similarity User's Feedback. Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03 (125-128). DOI: 10.1109/WI-IAT.2009.246. Online publication date: 15-Sep-2009
  • (2009) The Role of Clustering in Search Computing. Proceedings of the 2009 20th International Workshop on Database and Expert Systems Application (432-436). DOI: 10.1109/DEXA.2009.89. Online publication date: 31-Aug-2009
