DOI: 10.1145/1809980.1810066
WebMedia conference proceedings, research article

An unsupervised approach to concept extraction from texts (Abordagem não supervisionada para extração de conceitos a partir de textos)

Published: 26 October 2008

ABSTRACT

This paper presents an investigation into concept extraction from texts using clustering algorithms. We applied a hybrid approach to select feature candidates and used the CLUTO tool to support the clustering of terms. The identified concepts were analyzed manually. The details and preliminary results of this approach for Portuguese texts are discussed.

