Mining Frequent Closed Unordered Trees Through Natural Representations

Balcázar, José L.; Bifet, Albert; Lozano, Antoni

doi:10.1007/978-3-540-73681-3_26

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4604)

Included in the following conference series:

International Conference on Conceptual Structures

Abstract

Many knowledge representation mechanisms consist of link-based structures, which can be studied formally by means of unordered trees. Here we consider the case where labels on the nodes are nonexistent or unreliable, and propose data mining processes that focus on the link structure alone. We propose a representation of ordered trees, describe a combinatorial characterization and some of its properties, and use them to give an efficient algorithm for mining frequent closed subtrees from a set of input trees. We then turn to unordered trees and show that intrinsic characterizations of our representation yield a way to avoid exploring the same unordered tree repeatedly, which leads to an efficient algorithm for mining frequent closed unordered trees.
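The abstract does not spell out the representation or how repeated exploration is avoided. As a rough illustration only, the sketch below assumes that an ordered tree is encoded as the sequence of node depths in preorder, and that one ordered tree per unordered tree is picked as a canonical representative by sorting sibling subtrees. The encoding, the nested-tuple tree type, and the names `depth_sequence` and `canonical` are assumptions made for this example, not definitions taken from the chapter.

```python
def depth_sequence(tree, depth=0):
    """Return the preorder list of node depths of an ordered tree.

    A tree is modelled (hypothetically) as a tuple of child subtrees,
    so a single leaf is the empty tuple ().
    """
    seq = [depth]
    for child in tree:
        seq.extend(depth_sequence(child, depth + 1))
    return seq


def canonical(tree):
    """Return one fixed ordered representative of an unordered tree.

    Children are canonicalised recursively and then sorted by their
    depth sequences in decreasing lexicographic order, so any two
    orderings of the same unordered tree map to the same result.
    """
    children = [canonical(child) for child in tree]
    children.sort(key=depth_sequence, reverse=True)
    return tuple(children)


# Two ordered trees that differ only in the order of the root's children.
t1 = ((), ((),))   # leaf first, then a child that has one leaf
t2 = (((),), ())   # the same children in the opposite order

print(depth_sequence(t1))             # [0, 1, 1, 2]
print(depth_sequence(t2))             # [0, 1, 2, 1]
print(canonical(t1) == canonical(t2)) # True
```

Under these assumptions, a miner that only ever extends canonical representatives visits each unordered tree once instead of once per sibling ordering, which is the kind of saving the abstract alludes to.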

Partially supported by the 6th Framework Program of EU through the integrated project DELIS (#001907), by the EU PASCAL Network of Excellence, IST-2002-506778, by the MEC TIN2005-08832-C03-03 (MOISES-BAR), MCYT TIN2004-07925-C03-02 (TRANGRAM), and CICYT TIN2004-04343 (iDEAS) projects.

Author information

Authors and Affiliations

Universitat Politècnica de Catalunya: José L. Balcázar, Albert Bifet & Antoni Lozano

Editor information

Uta Priss, Simon Polovina, Richard Hill

About this paper

Cite this paper

Balcázar, J.L., Bifet, A., Lozano, A. (2007). Mining Frequent Closed Unordered Trees Through Natural Representations. In: Priss, U., Polovina, S., Hill, R. (eds) Conceptual Structures: Knowledge Architectures for Smart Applications. ICCS 2007. Lecture Notes in Computer Science, vol 4604. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73681-3_26

DOI: https://doi.org/10.1007/978-3-540-73681-3_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73680-6
Online ISBN: 978-3-540-73681-3
eBook Packages: Computer Science, Computer Science (R0)
