Abstract
This report concerns the latest edition of the XML Mining Track, held at INEX 2007. An earlier report has already been published on the two preceding editions of the track. We present the new corpus used for this third edition and briefly describe the models and results obtained by the participants.
- Denoyer, L., Gallinari, P.: Report on the xml mining track at inex 2005 and inex 2006: categorization and clustering of xml documents. SIGIR Forum 41(1) (2007) 79--90
- Denoyer, L., Gallinari, P.: The Wikipedia XML Corpus. SIGIR Forum (2006)
- de Campos, L. M., Fernandez-Luna, J. M., Huete, J. F., Romero, A. E.: Probabilistic methods for structured document classification at inex '07. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
- Murugeshan, M. S., Krishnamurthy, L., Mukherjee, S.: An ncd based approach for wikipedia categorization task. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
- Yang, J., Zhang, F.: Xml document classification using extended vsm. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
- Hagenbuchner, M., Tsoi, A. C., Sperduti, A., Kc, M.: Efficient clustering of structured documents using graph self-organizing maps. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
- Tran, T., Nayak, R., Bruza, P.: Document clustering using incremental and pairwise approaches. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
- Yao, J., Zerida, N.: Rare patterns to improve path-based clustering of wikipedia articles. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)
- Kutty, S., Tran, T., Nayak, R., Li, Y.: Clustering xml documents using closed frequent subtrees - a structure-only based approach. In: Workshop of the INitiative for the Evaluation of XML Retrieval. (2007)