Information Extraction by XLM

Okada, Masashi; Ishii, Naohiro; Kato, Nariaki

doi:10.1007/978-3-540-74827-4_131

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4693)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

XML (eXtensible Markup Language) is used as a description format for documents and data exchanged on the Web. Its use is now extending beyond the exchange of XML documents to XML databases for classification and retrieval. This paper develops a method for retrieving the relevant target data from the tree structure of an XML document for classification. An order-preserving relation is defined as the cue for retrieving the target pattern, so the problem becomes one of finding the order-preserving relations in the document. Both ordered and unordered subtree structures are important for retrieval and classification. A retrieval value is then calculated over the tree structure. The method developed here was applied to a practical XML document.
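
The difference between ordered and unordered subtree structure can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes that a pattern matches at a node when the tags agree and every child of the pattern matches a distinct child of that node, either in the same left-to-right order (ordered) or in any order (unordered). The function names, the sample document, and the pattern are hypothetical, and the retrieval value is not computed because the abstract does not define it.

```python
import xml.etree.ElementTree as ET


def matches_at(node, pattern, ordered=True):
    """Return True if `pattern` matches the subtree rooted at `node`.

    Assumed matching rule (not taken from the paper): tags must be equal and
    each pattern child must match a distinct child of `node`.
    """
    if node.tag != pattern.tag:
        return False
    if ordered:
        return _match_in_order(list(node), list(pattern))
    return _match_any_order(list(node), list(pattern), frozenset())


def _match_in_order(n_children, p_children):
    # Greedy left-to-right scan: each pattern child takes the earliest
    # remaining document child that it matches, so sibling order is preserved.
    pos = 0
    for p in p_children:
        while pos < len(n_children) and not matches_at(n_children[pos], p, True):
            pos += 1
        if pos == len(n_children):
            return False
        pos += 1
    return True


def _match_any_order(n_children, p_children, used):
    # Backtracking assignment of pattern children to distinct document
    # children; sibling order is ignored.
    if not p_children:
        return True
    first, rest = p_children[0], p_children[1:]
    for i, n in enumerate(n_children):
        if i not in used and matches_at(n, first, False):
            if _match_any_order(n_children, rest, used | {i}):
                return True
    return False


def find_matches(root, pattern, ordered=True):
    """Collect every node of the document at which the pattern matches."""
    return [n for n in root.iter() if matches_at(n, pattern, ordered)]


# Hypothetical sample data, for illustration only.
doc = ET.fromstring(
    "<library>"
    "<book><title/><author/><year/></book>"
    "<book><author/><title/></book>"
    "</library>"
)
pattern = ET.fromstring("<book><title/><author/></book>")

ordered_hits = find_matches(doc, pattern, ordered=True)
unordered_hits = find_matches(doc, pattern, ordered=False)
print(len(ordered_hits), len(unordered_hits))
```

In this sample, only the first `book` element lists `title` before `author`, so the ordered search finds one match while the unordered search finds both `book` elements.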

Author information

Authors and Affiliations

Aichi Institute of Technology, Yachigusa, Yakusacho, Toyota 470-0392, Japan
Masashi Okada, Naohiro Ishii & Nariaki Kato

Editor information

Editors and Affiliations

Dipartimento di Scienze dell’Informazione, Università degli Studi di Milano, Via Comelico 39/41, 20135, Milano, Italy
Bruno Apolloni
Centre for SMART Systems, School of Engineering, University of Brighton, BN2 4GJ, Brighton, UK
Robert J. Howlett
Knowledge-Based Intelligent Engineering Systems Centre, University of South Australia, Mawson Lakes, SA 5095, Adelaide, Australia
Lakhmi Jain

About this paper

Cite this paper

Okada, M., Ishii, N., Kato, N. (2007). Information Extraction by XLM. In: Apolloni, B., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2007. Lecture Notes in Computer Science, vol 4693. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74827-4_131

DOI: https://doi.org/10.1007/978-3-540-74827-4_131
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74826-7
Online ISBN: 978-3-540-74827-4
eBook Packages: Computer Science, Computer Science (R0)
