WIKI-CMR: A web cross modality dataset for studying and evaluation of cross modality retrieval models | IEEE Conference Publication | IEEE Xplore


Abstract:

With the growing popularity of Web multimedia data, cross-modality retrieval has become an urgent and challenging problem. Its main challenges are bridging the semantic gap between different modalities and handling large volumes of data. A well-designed dataset can provide a platform for developing state-of-the-art cross-modality retrieval algorithms. However, existing Web cross-modality datasets are either small or omit some information, such as the hyperlink structure. In this paper, we introduce a new Web cross-modality dataset called “WIKI-CMR”, built by selecting Wikipedia as a reliable and information-rich data source and collecting data with a smart crawling strategy. The dataset comprises 74,961 documents with textual paragraphs, images, and hyperlinks. All documents are categorized into 11 semantic topics. We point out several challenges posed by this dataset and use it to evaluate some well-known cross-modality retrieval models.
Date of Conference: 15-19 July 2013
Date Added to IEEE Xplore: 26 September 2013
Electronic ISBN: 978-1-4799-0015-2

Conference Location: San Jose, CA
