DOI: 10.1145/2756406.2756943
short-paper

Big Data Text Summarization for Events: A Problem Based Learning Course

Published: 21 June 2015

ABSTRACT

Problem/Project-Based Learning (PBL) is a highly effective student-centered teaching method in which student teams learn by solving problems. This paper describes an instance of PBL applied to digital library education. We present the design, implementation, results, and partial evaluation of a Computational Linguistics course that gives students an opportunity to engage in active learning about adding value to digital libraries with large collections of text, i.e., one aspect of "big data." Students engage in PBL through the semester-long challenge of generating good English summaries of an event, given a large collection from our webpage archives. Six teams, each working with a different type of event and applying three different summarization methods, learned how to generate good summaries; these have fair precision relative to the Wikipedia page that describes their event.

Published in

JCDL '15: Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries
June 2015, 324 pages
ISBN: 9781450335942
DOI: 10.1145/2756406
General Chairs: Paul Logasa Bogen, Suzie Allard, Holly Mercer, Micah Beck
Program Chairs: Sally Jo Cunningham, Dion Goh, Geneva Henry

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

JCDL '15 Paper Acceptance Rate: 18 of 60 submissions, 30%
Overall Acceptance Rate: 415 of 1,482 submissions, 28%
