Discovery of hierarchical thematic structure in text collections with adaptive resonance theory

Massey, Louis

doi:10.1007/s00521-008-0178-2

Original Article
Published: 29 February 2008

Volume 18, pages 261–273, (2009)

Neural Computing and Applications

Louis Massey¹


Abstract

This paper investigates the ability of adaptive resonance theory (ART) neural networks to mine hierarchical thematic structure from text collections. We present experimental results with binary ART1 on the benchmark Reuters-21578 corpus. Using both quantitative evaluation with the standard F₁ measure and qualitative visualization of the hierarchy obtained with ART, we discuss how useful ART-built hierarchies would be to a user who intends to use them to find and access textual information. Our F₁ results show that ART1 produces hierarchical clusterings whose quality exceeds that of k-means and of a hierarchical clustering algorithm. However, we identify several critical problem areas that would make it rather impractical to use such a hierarchy in a real-life environment. These problems point to the importance of semantic feature selection. Our main contribution is to test in detail the applicability of ART to the important domain of hierarchical document clustering, an application of adaptive resonance that has received little attention until now.
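The abstract does not spell out how ART1 groups binary document vectors, so the following is only a textbook-style sketch of single-pass, fast-learning ART1, not the configuration used in the paper. The function names, the `beta` tie-breaking constant, and the toy vectors are all assumptions made for illustration. It shows how the vigilance parameter controls cluster granularity, which is one plausible way to obtain coarser and finer levels of a hierarchy.

```python
def overlap(x, w):
    """Number of features active in both binary vectors."""
    return sum(a & b for a, b in zip(x, w))


def art1_cluster(docs, vigilance, beta=0.5):
    """Simplified single-pass ART1 with fast learning (illustrative only).

    docs: list of equal-length binary lists (1 = term present).
    vigilance: match threshold in (0, 1]; higher values give more clusters.
    beta: small positive constant in the choice function (assumed value).
    """
    prototypes = []  # one binary prototype per cluster
    labels = []
    for x in docs:
        size_x = sum(x)
        if size_x == 0:
            raise ValueError("each input needs at least one active feature")
        # Rank existing clusters by the choice function |x AND w| / (beta + |w|).
        order = sorted(
            range(len(prototypes)),
            key=lambda j: overlap(x, prototypes[j]) / (beta + sum(prototypes[j])),
            reverse=True,
        )
        for j in order:
            # Vigilance test: fraction of the input covered by the prototype.
            if overlap(x, prototypes[j]) / size_x >= vigilance:
                # Fast learning: the prototype shrinks to the shared features.
                prototypes[j] = [a & b for a, b in zip(x, prototypes[j])]
                labels.append(j)
                break
        else:
            # No existing cluster passed the test: commit a new one.
            prototypes.append(list(x))
            labels.append(len(prototypes) - 1)
    return labels, prototypes


toy_docs = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
coarse, _ = art1_cluster(toy_docs, vigilance=0.5)
fine, _ = art1_cluster(toy_docs, vigilance=0.9)
```

With the toy vectors, the lower vigilance merges the first two documents into one cluster while the higher vigilance keeps all three apart. A full ART1 run would normally repeat the presentation of inputs until the assignments stabilise; this sketch makes one pass only.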


Notes

F₁ in this case is a well-known clustering quality measure, to be distinguished from the F₁ of Section 2, which is the name of the input layer of ART networks.
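As a reminder of what the quality measure computes, F₁ is the harmonic mean of precision and recall. The excerpt does not say how precision and recall are counted for clusters, so the sketch below assumes one common variant, pairwise counting over document pairs; the function name and the toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_f1(predicted, reference):
    """F1 over document pairs (an assumed formulation, not necessarily the paper's).

    A pair counts as a true positive when both the clustering and the
    reference labelling place the two documents together.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(predicted)), 2):
        same_pred = predicted[i] == predicted[j]
        same_ref = reference[i] == reference[j]
        if same_pred and same_ref:
            tp += 1
        elif same_pred:
            fp += 1  # grouped by the clustering but not by the reference
        elif same_ref:
            fn += 1  # grouped by the reference but split by the clustering
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


perfect = pairwise_f1([0, 0, 1, 1], [0, 0, 1, 1])
one_big_cluster = pairwise_f1([0, 0, 0, 0], [0, 0, 1, 1])
```

Putting every document in one cluster yields perfect recall but poor precision, which is why the harmonic mean penalises it relative to the exact match.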


Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Royal Military College, Kingston, ON, Canada, K7K 7B4
Louis Massey


Corresponding author

Correspondence to Louis Massey.

Additional information

This research was supported in part by the National Defense Academic Research Program (ARP) under grant 743321.


About this article

Cite this article

Massey, L. Discovery of hierarchical thematic structure in text collections with adaptive resonance theory. Neural Comput & Applic 18, 261–273 (2009). https://doi.org/10.1007/s00521-008-0178-2


Received: 28 May 2007
Accepted: 08 February 2008
Published: 29 February 2008
Issue Date: April 2009
DOI: https://doi.org/10.1007/s00521-008-0178-2
