DOI: 10.1145/2637748.2638409
research-article

Domain-independent sentence type classification: examining the scenarios of scientific abstracts and scrum protocols

Published: 16 September 2014

Abstract

The amount of textual information available in everyday life is increasing steadily. To satisfy an information need, a user has to examine numerous documents until the required information is found. Moreover, the relevant information is often contained in only short sections of those documents. As a result, the user has to read a large amount of irrelevant text, a problem that could be addressed by automatically filtering the relevant information within textual documents. In this article we present our findings on classifying sentences according to the type of information they contain. Our evaluation was conducted on two kinds of documents: abstracts of scientific publications and protocols of Scrum retrospective meetings. The results show the feasibility of our approach for finding a higher percentage of relevant information within textual documents and hence reducing information overload for users.
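
To make the idea of sentence type classification concrete, the following is a minimal sketch and not the authors' method. The abstract does not state which features, classifier, or label set were used, so everything here is an assumption made for illustration: a multinomial Naive Bayes model over bag-of-words features, the sentence types `background`, `method`, and `result`, the toy training sentences, and the helper names `tokenize`, `NaiveBayesSentenceClassifier`, and `filter_relevant`.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


class NaiveBayesSentenceClassifier:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing.

    Hypothetical illustration; the paper's actual classifier is not given
    in the abstract.
    """

    def fit(self, sentences, labels):
        self.label_counts = Counter(labels)       # sentences per type
        self.word_counts = defaultdict(Counter)   # token counts per type
        self.vocabulary = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, sentence):
        tokens = tokenize(sentence)
        total_sentences = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(count / total_sentences)
            label_total = sum(self.word_counts[label].values())
            for token in tokens:
                seen = self.word_counts[label][token]
                score += math.log((seen + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def filter_relevant(classifier, sentences, wanted_types):
    """Keep only the sentences whose predicted type the reader asked for."""
    return [s for s in sentences if classifier.predict(s) in wanted_types]


# Toy training data invented for this sketch (not taken from the paper).
TRAINING_DATA = [
    ("The amount of textual information is increasing steadily", "background"),
    ("Users have to read many documents to find information", "background"),
    ("We present an approach for classifying sentences", "method"),
    ("We propose a classifier trained on labelled sentences", "method"),
    ("The results show the feasibility of our approach", "result"),
    ("Our evaluation shows a higher percentage of relevant information", "result"),
]

clf = NaiveBayesSentenceClassifier().fit(
    [sentence for sentence, _ in TRAINING_DATA],
    [label for _, label in TRAINING_DATA],
)
```

Calling `filter_relevant(clf, sentences, {"result"})` on a list of unseen sentences would then return only those predicted to report results, which is the kind of filtering the abstract motivates. A realistic system would need far more training data and a proper evaluation on held-out documents.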

Published In

i-KNOW '14: Proceedings of the 14th International Conference on Knowledge Technologies and Data-driven Business
September 2014
262 pages
ISBN: 9781450327695
DOI: 10.1145/2637748

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. information extraction
  2. knowledge discovery
  3. text classification

Qualifiers

  • Research-article

Conference

i-KNOW '14

Acceptance Rates

i-KNOW '14 paper acceptance rate: 25 of 73 submissions (34%)
Overall acceptance rate: 77 of 238 submissions (32%)

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 96
  • Downloads (last 12 months): 2
  • Downloads (last 6 weeks): 0

Reflects downloads up to 25 Feb 2025