research-article

Beyond bags of words: effectively modeling dependence and features in information retrieval

Author:
Donald Metzler

University of Massachusetts, Amherst, MA

ACM SIGIR Forum, Volume 42, Issue 1, June 2008, pp 77, https://doi.org/10.1145/1394251.1394271

Published: 01 June 2008

Abstract

Current state-of-the-art information retrieval models treat documents and queries as bags of words. There have been many attempts to go beyond this simple representation, but few have shown consistent improvements in retrieval effectiveness across a wide range of tasks and data sets. Here, we propose a new statistical model for information retrieval based on Markov random fields. The proposed model goes beyond the bag-of-words assumption by allowing dependencies between terms to be incorporated into the model. It also allows a variety of textual and non-textual features to be easily combined under the umbrella of a single model. Within this framework, we explore the theoretical issues involved, parameter estimation, feature selection, and query expansion. We give experimental results from a number of information retrieval tasks, such as ad hoc retrieval and web search.
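
To illustrate what incorporating term dependencies can mean in practice, the following is a minimal sketch, not the model from the article. It assumes a log-linear score that combines one feature per query term with one feature per adjacent query-term pair. The function name `score`, the weights `w_unigram` and `w_bigram`, and the additive smoothing constant `eps` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def score(query_terms, doc_terms, w_unigram=0.8, w_bigram=0.2, eps=0.5):
    """Illustrative log-linear score over term and adjacent-pair features.

    All weights and the smoothing scheme are assumptions for this sketch.
    """
    doc_len = len(doc_terms)
    unigram_counts = Counter(doc_terms)
    # Count adjacent, ordered term pairs in the document.
    bigram_counts = Counter(zip(doc_terms, doc_terms[1:]))

    total = 0.0
    # Bag-of-words part: one feature per query term.
    for term in query_terms:
        total += w_unigram * math.log((unigram_counts[term] + eps) / (doc_len + eps))
    # Dependence part: one feature per adjacent query-term pair.
    for pair in zip(query_terms, query_terms[1:]):
        total += w_bigram * math.log((bigram_counts[pair] + eps) / (doc_len + eps))
    return total


query = ["markov", "random", "field"]
in_order = "markov random field model".split()
shuffled = "field markov model random".split()

# Both documents contain the same words, so only the pair features differ.
print(score(query, in_order) > score(query, shuffled))
```

The two example documents are identical as bags of words, so a pure bag-of-words scorer cannot separate them. Setting `w_bigram=0.0` reproduces that behavior in this sketch, while a positive `w_bigram` favors the document that preserves the query's word order.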

Index Terms

1. Information systems
  1. Information retrieval
    1. Document representation

Published in

ACM SIGIR Forum Volume 42, Issue 1
June 2008
76 pages
ISSN:0163-5840
DOI:10.1145/1394251

Copyright © 2008 Author
Publisher
Association for Computing Machinery
New York, NY, United States

Article Metrics
- Total Citations: 15
- Total Downloads: 209
- Downloads (Last 12 months): 0
- Downloads (Last 6 weeks): 0