Abstract
This paper presents SuPor, an environment for extractive Automatic Summarization (AS) of texts written in Brazilian Portuguese. An AS specialist can use it to select promising features for extraction. Because any number of features can be combined, SuPor makes it possible to investigate the performance of distinct AS systems and to identify which groups of features are best suited to Brazilian Portuguese. One of its systems outperformed six other extractive summarizers, which points to a significant grouping of features, as shown in this paper.
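The idea of treating each grouping of features as a distinct extractive summarizer can be illustrated with a minimal sketch. It is not SuPor's implementation: the three features (`position`, `frequency`, `length`), the unweighted sum used to score sentences, and the regex-based sentence splitter are all assumptions chosen for brevity, and every function name below is hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    # Naive splitter on ., ! or ? followed by whitespace (an assumption,
    # not the segmenter used by SuPor).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"\w+", sentence.lower())


# Hypothetical features: each maps (index, sentence, all sentences)
# to a score between 0 and 1.
def position_feature(i, sentence, sentences):
    # Earlier sentences score higher.
    return 1.0 - i / len(sentences)


def frequency_feature(i, sentence, sentences):
    # Average document frequency of the sentence's words, normalized
    # by the count of the most frequent word.
    counts = Counter(w for s in sentences for w in tokenize(s))
    words = tokenize(sentence)
    if not words:
        return 0.0
    return sum(counts[w] for w in words) / (len(words) * max(counts.values()))


def length_feature(i, sentence, sentences):
    # Longer sentences score higher, relative to the longest one.
    longest = max(len(tokenize(s)) for s in sentences)
    return len(tokenize(sentence)) / longest if longest else 0.0


FEATURES = {
    "position": position_feature,
    "frequency": frequency_feature,
    "length": length_feature,
}


def feature_groupings(names):
    # Every non-empty subset of features defines one candidate summarizer.
    for k in range(1, len(names) + 1):
        yield from combinations(names, k)


def summarize(text, feature_names, n_sentences=1):
    # Score each sentence by the unweighted sum of the selected features,
    # keep the top n, and return them in their original order.
    sentences = split_sentences(text)
    scored = [
        (sum(FEATURES[name](i, s, sentences) for name in feature_names), i)
        for i, s in enumerate(sentences)
    ]
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:n_sentences]
    return [sentences[i] for i in sorted(i for _, i in top)]


if __name__ == "__main__":
    text = "A cat sat. The dog ran far away. Birds sing."
    for group in feature_groupings(sorted(FEATURES)):
        print(group, summarize(text, group))
```

Running the script prints one extract per feature grouping, so the extracts can be compared against reference summaries to see which grouping performs best on a given corpus.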
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rino, L.H.M., Módolo, M. (2004). SuPor: An Environment for AS of Texts in Brazilian Portuguese. In: Vicedo, J.L., Martínez-Barco, P., Muñoz, R., Saiz Noeda, M. (eds) Advances in Natural Language Processing. EsTAL 2004. Lecture Notes in Computer Science, vol 3230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30228-5_37
DOI: https://doi.org/10.1007/978-3-540-30228-5_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23498-2
Online ISBN: 978-3-540-30228-5
eBook Packages: Springer Book Archive