research-article

Report on the XML mining track at INEX 2007: categorization and clustering of XML documents

Authors: Ludovic Denoyer (University of Paris), Patrick Gallinari (University of Paris)

ACM SIGIR Forum, Volume 42, Issue 1, June 2008, pp. 22–28. https://doi.org/10.1145/1394251.1394255

Published: 01 June 2008

Abstract

This report covers the latest edition of the XML Mining Track, held at INEX 2007. An earlier report covered the two preceding editions of the track. Here we present the new corpus used for this third edition and briefly describe the models submitted by the participants and the results they obtained.

References

Denoyer, L., Gallinari, P.: Report on the XML mining track at INEX 2005 and INEX 2006: categorization and clustering of XML documents. SIGIR Forum 41(1) (2007) 79–90
Denoyer, L., Gallinari, P.: The Wikipedia XML Corpus. SIGIR Forum (2006)
de Campos, L. M., Fernandez-Luna, J. M., Huete, J. F., Romero, A. E.: Probabilistic methods for structured document classification at INEX '07. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
Murugeshan, M. S., Krishnamurthy, L., Mukherjee, S.: An NCD based approach for Wikipedia categorization task. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
Yang, J., Zhang, F.: XML document classification using extended VSM. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
Hagenbuchner, M., Tsoi, A. C., Sperduti, A., Kc, M.: Efficient clustering of structured documents using graph self-organizing maps. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
Tran, T., Nayak, R., Bruza, P.: Document clustering using incremental and pairwise approaches. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
Yao, J., Zerida, N.: Rare patterns to improve path-based clustering of Wikipedia articles. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
Kutty, S., Tran, T., Nayak, R., Li, Y.: Clustering XML documents using closed frequent subtrees: a structure-only based approach. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)

Index Terms

1. General and reference
  1. Document types

Published in

ACM SIGIR Forum Volume 42, Issue 1
June 2008
76 pages
ISSN:0163-5840
DOI:10.1145/1394251

Copyright © 2008 Authors
Publisher
Association for Computing Machinery
New York, NY, United States