Mining Frequent Trees with Node-Inclusion Constraints

Nakamura, Atsuyoshi; Kudo, Mineichi

doi:10.1007/11430919_101

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3518)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

In this paper, we propose an efficient algorithm that enumerates all frequent subtrees containing every special node, where the special nodes are guaranteed to appear in every tree of the given data set. Our algorithm modifies the TreeMiner algorithm [10] so that it efficiently generates only candidate subtrees satisfying our constraints. We report mining results obtained by applying our algorithm to the problem of finding frequent structures that contain the name and reputation of given restaurants in Web pages collected by a search engine.
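
The problem statement in the abstract can be illustrated with a small brute-force sketch. It is not the authors' algorithm and does not reproduce TreeMiner's candidate generation; it only checks, for hand-written candidate patterns, the two conditions the abstract names: frequency and inclusion of all special nodes. Several things here are assumptions made for illustration: special nodes are modelled as the labels "NAME" and "REP", trees are nested (label, children) tuples, inclusion is tested as ordered embedded-subtree matching (ancestor-descendant and left-to-right order preserved), and all function names are hypothetical.

```python
def flatten(tree):
    """Preorder labels and, for each node, the index of its last descendant."""
    labels, ends = [], []

    def visit(node):
        label, children = node
        i = len(labels)
        labels.append(label)
        ends.append(i)
        for child in children:
            visit(child)
        ends[i] = len(labels) - 1

    visit(tree)
    return labels, ends


def flatten_pattern(tree):
    """Preorder labels with parent and previous-sibling indices."""
    labels, parent, prev = [], [], []

    def visit(node, par, prv):
        label, children = node
        i = len(labels)
        labels.append(label)
        parent.append(par)
        prev.append(prv)
        last = None
        for child in children:
            last = visit(child, i, last)
        return i

    visit(tree, None, None)
    return labels, parent, prev


def occurs(pattern, tree):
    """True if pattern is an ordered embedded subtree of tree (backtracking)."""
    plab, ppar, pprev = flatten_pattern(pattern)
    tlab, tend = flatten(tree)
    image = [None] * len(plab)

    def place(i):
        if i == len(plab):
            return True
        if ppar[i] is None:
            lo, hi = 0, len(tlab) - 1
        else:
            # A child must map to a proper descendant of its parent's image.
            lo, hi = image[ppar[i]] + 1, tend[image[ppar[i]]]
        if pprev[i] is not None:
            # A later sibling must map strictly after the earlier sibling's scope.
            lo = max(lo, tend[image[pprev[i]]] + 1)
        for j in range(lo, hi + 1):
            if tlab[j] == plab[i]:
                image[i] = j
                if place(i + 1):
                    return True
        return False

    return place(0)


def support(pattern, data):
    """Number of data trees that contain the pattern."""
    return sum(occurs(pattern, tree) for tree in data)


def satisfies_constraint(pattern, required):
    """True if the pattern contains every required (special) label."""
    return set(required) <= set(flatten_pattern(pattern)[0])


def constrained_frequent(candidates, data, required, minsup):
    """Keep candidates that are frequent and include all special labels."""
    return [p for p in candidates
            if satisfies_constraint(p, required) and support(p, data) >= minsup]


# Toy "Web pages" (assumed data); every tree contains NAME and REP.
DATA = [
    ("html", [("body", [("div", [("NAME", []), ("REP", [])])])]),
    ("html", [("body", [("table", [("tr", [("NAME", []),
                                           ("td", [("REP", [])])])])])]),
    ("html", [("body", [("p", [("REP", [])]), ("NAME", [])])]),
]
SPECIAL = {"NAME", "REP"}
name_then_rep = ("body", [("NAME", []), ("REP", [])])
name_only = ("body", [("NAME", [])])

if __name__ == "__main__":
    print(support(name_then_rep, DATA), support(name_only, DATA))
    print(constrained_frequent([name_then_rep, name_only], DATA, SPECIAL, 2))
```

In this toy data the pattern with only NAME is more frequent but is rejected by the inclusion constraint. Filtering after enumeration, as done here, is the naive baseline; the abstract's point is that the proposed algorithm avoids generating such non-qualifying candidates in the first place.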

References

1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. 20th Int’l Conf. on VLDB, pp. 487–499 (1994)
2. Agrawal, R., Srikant, R.: Mining sequential patterns. In: Proc. 11th Int’l Conf. on Data Eng., pp. 3–14 (1995)
3. Asai, T., Abe, K., Kawasoe, S., Arimura, H., Sakamoto, H., Arikawa, S.: Efficient substructure discovery from large semi-structured data. In: Proc. 2nd SIAM Int’l Conf. on Data Mining, pp. 158–174 (2002)
4. Cohen, W.W., Hurst, M., Jensen, L.S.: A flexible learning system for wrapping tables and lists in HTML documents. In: Proc. 11th Int’l World Wide Web Conf., pp. 232–241 (2002)
5. Garofalakis, M., Rastogi, R., Shim, K.: Mining sequential patterns with regular expression constraints. IEEE Transactions on Knowledge and Data Engineering 14(3), 530–552 (2002)
6. Inokuchi, A., Washio, T., Motoda, H.: An apriori-based algorithm for mining frequent substructures from graph data. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, pp. 13–23. Springer, Heidelberg (2000)
7. Hasegawa, H., Kudo, M., Nakamura, A.: Reputation extraction using both structural and content information. Technical Report TCS-TR-A-05-2 (2005), http://www-alg.ist.hokudai.ac.jp/tra.html
8. Kushmerick, N.: Wrapper induction: efficiency and expressiveness. Artificial Intelligence 118, 15–68 (2000)
9. Srikant, R., Vu, Q., Agrawal, R.: Mining association rules with item constraints. In: Proc. 3rd Int’l Conf. on Knowledge Discovery and Data Mining, pp. 67–73 (1997)
10. Zaki, M.J.: Efficiently mining frequent trees in a forest. In: Proc. SIGKDD 2002, pp. 71–80 (2002)

Author information

Authors and Affiliations

Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo, 060-0814, Japan
Atsuyoshi Nakamura & Mineichi Kudo

Editor information

Editors and Affiliations

Japan Advanced Institute of Science and Technology, Asahidai 1-1, 923-1292, Nomi, Japan
Tu Bao Ho
University of Hong Kong, Pokfulam Road, Hong Kong, China
David Cheung
Department of Computer Science and Engineering, Arizona State University, Tempe, Arizona, USA
Huan Liu

About this paper

Cite this paper

Nakamura, A., Kudo, M. (2005). Mining Frequent Trees with Node-Inclusion Constraints. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_101

DOI: https://doi.org/10.1007/11430919_101
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26076-9
Online ISBN: 978-3-540-31935-1
eBook Packages: Computer Science, Computer Science (R0)
