Semi-structure Mining Method for Text Mining with a Chunk-Based Dependency Structure

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Abstract

In text mining, word frequencies alone are sometimes not enough. When more precise information is needed, such as the relationships among words, it becomes necessary to extract frequent patterns of words with a dependency structure in a sentence. This paper proposes a semi-structure mining method for extracting such patterns from a text corpus. It first describes the data structure that represents the dependency structure: a tree in which each node holds multiple items. It then describes a mining algorithm for this data structure. The method can extract frequent patterns that conventional methods cannot.
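
As a rough illustration of the kind of data structure the abstract describes, the Python sketch below models a dependency tree whose nodes each carry several items, and counts how many sentences contain a simple parent-child pattern. The names `ChunkNode`, `iter_nodes`, `contains_pattern`, and `support`, the toy sentences, and the brute-force counting are all assumptions made for illustration. The paper's actual mining algorithm is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterator, List


@dataclass
class ChunkNode:
    """One chunk of a sentence: a tree node holding several items (hypothetical name)."""
    items: FrozenSet[str]
    children: List["ChunkNode"] = field(default_factory=list)


def iter_nodes(root: ChunkNode) -> Iterator[ChunkNode]:
    """Yield every node of the tree, depth first."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.children)


def contains_pattern(root: ChunkNode,
                     parent_items: FrozenSet[str],
                     child_items: FrozenSet[str]) -> bool:
    """True if a node containing parent_items has a direct child containing child_items."""
    for node in iter_nodes(root):
        if parent_items <= node.items:
            if any(child_items <= child.items for child in node.children):
                return True
    return False


def support(trees: List[ChunkNode],
            parent_items: FrozenSet[str],
            child_items: FrozenSet[str]) -> int:
    """Number of trees (sentences) that contain the parent-child pattern."""
    return sum(contains_pattern(t, parent_items, child_items) for t in trees)


# Toy corpus (assumed): the head chunk is the root, dependent chunks are its children.
trees = [
    # "the dog chased a cat"
    ChunkNode(frozenset({"chased"}), [
        ChunkNode(frozenset({"the", "dog"})),
        ChunkNode(frozenset({"a", "cat"})),
    ]),
    # "a dog chased the ball"
    ChunkNode(frozenset({"chased"}), [
        ChunkNode(frozenset({"a", "dog"})),
        ChunkNode(frozenset({"the", "ball"})),
    ]),
    # "the cat slept"
    ChunkNode(frozenset({"slept"}), [
        ChunkNode(frozenset({"the", "cat"})),
    ]),
]

# Pattern: a chunk containing "chased" with a dependent chunk containing "dog".
print(support(trees, frozenset({"chased"}), frozenset({"dog"})))  # 2
```

With a minimum support of 2, the pattern "chased" with a dependent containing "dog" would count as frequent in this toy corpus, even though the two dependent chunks differ in their other items ("the" versus "a"). A count of whole-chunk matches would not capture that partial overlap.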

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Sato, I., Nakagawa, H. (2007). Semi-structure Mining Method for Text Mining with a Chunk-Based Dependency Structure. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_85

  • DOI: https://doi.org/10.1007/978-3-540-71701-0_85

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71700-3

  • Online ISBN: 978-3-540-71701-0

  • eBook Packages: Computer Science, Computer Science (R0)