Mining from Literary Texts: Pattern Discovery and Similarity Computation

Takeda, Masayuki; Fukuda, Tomoko; Nanri, Ichirō

doi:10.1007/3-540-45884-0_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2281)

Abstract

This paper surveys our recent studies of text mining from literary works, especially classical Japanese poems, Waka. We present methods for finding characteristic patterns in anthologies of Waka poems, as well as those for finding similar poem pairs. Our aim is to obtain good results that are of interest to Waka researchers, not just to develop efficient algorithms. We report successful results in finding patterns and similar poem pairs, some of which led to new discoveries.




Author information

Authors and Affiliations

Department of Informatics, Kyushu University 33, 812-8581, Fukuoka, Japan
Masayuki Takeda
PRESTO, Japan Science and Technology Corporation (JST), 815-0036, Fukuoka, Japan
Masayuki Takeda
Junshin Women’s Junior College, 815-0036, Fukuoka, Japan
Tomoko Fukuda & Ichirō Nanri


Editor information

Editors and Affiliations

Department of Informatics, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, 812-8581, Fukuoka, Japan
Setsuo Arikawa & Ayumi Shinohara


About this chapter

Cite this chapter

Takeda, M., Fukuda, T., Nanri, I. (2002). Mining from Literary Texts: Pattern Discovery and Similarity Computation. In: Arikawa, S., Shinohara, A. (eds) Progress in Discovery Science. Lecture Notes in Computer Science, vol 2281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45884-0_39


DOI: https://doi.org/10.1007/3-540-45884-0_39
Published: 14 March 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43338-5
Online ISBN: 978-3-540-45884-5
eBook Packages: Springer Book Archive
