Abstract
Some text mining tasks need more precise information than word frequencies, such as the relationships among words. For these tasks, it is necessary to extract frequent patterns of words together with the dependency structure that links them within a sentence. This paper proposes a semi-structure mining method for extracting such patterns from a text corpus. First, it describes the data structure used to represent the dependency structure: a tree in which each node holds multiple items. It then describes a mining algorithm for this data structure. The method can extract frequent patterns that conventional methods cannot.
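To make the data structure concrete, the following is a minimal sketch of a tree whose nodes each hold a set of items, together with a deliberately naive support count over single dependency edges. It is not the mining algorithm of the paper, which the abstract does not specify. The names `ChunkNode`, `edge_patterns`, and `frequent_edge_patterns` are hypothetical, and the choice of items (a surface word plus a part-of-speech tag per chunk) is an assumption made only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import product


@dataclass
class ChunkNode:
    """Hypothetical node type: one chunk holding several items."""
    items: frozenset
    children: list = field(default_factory=list)  # dependent chunks


def edge_patterns(root):
    """Collect every (head item, dependent item) pair along dependency edges."""
    found = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            # Each item of the head can pair with each item of the dependent.
            found.update(product(node.items, child.items))
            stack.append(child)
    return found


def frequent_edge_patterns(trees, min_support):
    """Count each pattern at most once per tree; keep those meeting min_support."""
    support = Counter()
    for tree in trees:
        support.update(edge_patterns(tree))
    return {p: n for p, n in support.items() if n >= min_support}


# Two toy sentences; the items (word, part-of-speech tag) are assumed.
s1 = ChunkNode(frozenset({"bought", "VERB"}), [
    ChunkNode(frozenset({"she", "PRON"})),
    ChunkNode(frozenset({"book", "NOUN"})),
])
s2 = ChunkNode(frozenset({"bought", "VERB"}), [
    ChunkNode(frozenset({"he", "PRON"})),
    ChunkNode(frozenset({"car", "NOUN"})),
])
result = frequent_edge_patterns([s1, s2], min_support=2)
```

With `min_support=2`, the retained patterns include mixed ones such as `("bought", "PRON")`, where a word in the head chunk is paired with a tag in the dependent chunk. A pattern like this cannot be expressed when each node carries only a single label, which is the motivation for letting nodes hold multiple items.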
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Sato, I., Nakagawa, H. (2007). Semi-structure Mining Method for Text Mining with a Chunk-Based Dependency Structure. In: Zhou, Z.-H., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_85
DOI: https://doi.org/10.1007/978-3-540-71701-0_85
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)