Measuring similarity of semi-structured documents with context weights

SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval

Pages 719 - 720

https://doi.org/10.1145/1148170.1148334

Published: 06 August 2006

Abstract

In this work, we study similarity measures for text-centric XML documents based on an extended vector space model that considers both document content and structure. Experimental results on a benchmark show that the proposed measure outperforms a baseline that ignores the structural knowledge of XML documents.
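
The abstract does not give the formula behind the proposed measure, so the following is only a minimal sketch of one way a context-weighted vector space similarity could look, not the authors' method. It assumes a document is a list of (context, term) pairs, where the context is the name of the enclosing XML element, and that each context has a hand-picked weight. The names `weighted_vector`, `content_only_vector`, `cosine`, and the weight values are all hypothetical.

```python
import math
from collections import Counter


def weighted_vector(doc, context_weights, default_weight=1.0):
    """Build a sparse vector keyed by (context, term).

    Each component is the term frequency in that context multiplied by
    the (assumed, hand-picked) weight of the context.
    """
    counts = Counter(doc)
    return {
        (ctx, term): tf * context_weights.get(ctx, default_weight)
        for (ctx, term), tf in counts.items()
    }


def content_only_vector(doc):
    """Baseline vector that ignores structure: plain term frequencies."""
    return dict(Counter(term for _ctx, term in doc))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical documents: (element name, term) pairs.
doc_a = [("title", "xml"), ("title", "retrieval"), ("body", "index"), ("body", "xml")]
doc_b = [("title", "xml"), ("body", "retrieval"), ("body", "index")]

# Assumed weights: terms in titles count twice as much as terms in the body.
weights = {"title": 2.0, "body": 1.0}

structured = cosine(weighted_vector(doc_a, weights), weighted_vector(doc_b, weights))
baseline = cosine(content_only_vector(doc_a), content_only_vector(doc_b))
print(f"context-weighted: {structured:.3f}, content-only: {baseline:.3f}")
```

In this sketch a term only matches when it occurs in the same context in both documents, so "retrieval" in a title and "retrieval" in a body contribute nothing to the structured score, while the content-only baseline treats them as the same term.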

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Information & Contributors

Information

Published In

SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval

August 2006

768 pages

ISBN:1595933697

DOI:10.1145/1148170

General Chair: Efthimis N. Efthimiadis, University of Washington
Program Chairs: Susan Dumais, Microsoft Research, Redmond; David Hawking, CSIRO ICT Centre, Canberra, Australia; Kalervo Järvelin, University of Tampere, Finland

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

SIGIR06: The 29th Annual International SIGIR Conference

August 6 - 11, 2006

Seattle, Washington, USA

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 5
Total Downloads: 457
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 2

Reflects downloads up to 17 Feb 2025

Cited By

Wang Y, Chen Z, Huang X (2010). Element Retrieval Using Namespace Based on Keyword Search over XML Documents. Journal of Software Engineering and Applications 03:01, 65-72. Online publication date: 2010.
https://doi.org/10.4236/jsea.2010.31008

Ling S, Shengen L, Qiang L, Wei H, Tongjiang Y (2009). An approach for measuring similarity between XML documents. Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7, 410-414. Online publication date: 14-Aug-2009.
https://dl.acm.org/doi/10.5555/1802134.1802225

Song L, Li S, Cui W, Zhang D, Niu X (2009). An approach to XML Path retrieval. 2009 IEEE International Conference on Granular Computing, 513-516. Online publication date: Aug-2009.
https://doi.org/10.1109/GRC.2009.5255069

Song L, Ma J, Lei J, Zhang D, Wang Z (2009). Semantic Structural Similarity Measure for Clustering XML Documents. Proceedings of the International Conference on Web Information Systems and Mining, 232-241. Online publication date: 22-Nov-2009.
https://dl.acm.org/doi/10.1007/978-3-642-05250-7_25

Wan X (2007). A novel document similarity measure based on earth mover's distance. Information Sciences: an International Journal 177:18, 3718-3730. Online publication date: 1-Sep-2007.
https://dl.acm.org/doi/10.1016/j.ins.2007.02.045
