Abstract
There are two important strategies in computer-assisted reading and analysis of text (CARAT). The first relates to the classification process, and the second pertains to the categorisation process. These two often-interrelated operations have been regularly recognised as essential components of text analysis. However, the two operations are highly time-consuming. A possible solution to this problem calls upon more inductive or bottom-up strategies that are numerical and statistical in nature. In our own research, we have been exploring a few of these techniques and their combination. We now know, through our own past research and others' work, that the classification methods allow a good empirical thematic exploration of a corpus. More specifically, in this paper we shall concentrate on the problem of assisting the automatic categorisation of small segments of a philosophical text into a set of thematic categories.
Cite this article
Pasquale, JF.d., Meunier, JG. Categorisation Techniques in Computer-Assisted Reading and Analysis of Texts (CARAT) in the Humanities. Computers and the Humanities 37, 111–118 (2003). https://doi.org/10.1023/A:1021855607270
DOI: https://doi.org/10.1023/A:1021855607270