skip to main content
10.1145/2232817.2232854acmconferencesArticle/Chapter ViewAbstractPublication PagesjcdlConference Proceedingsconference-collections
research-article

Learning topics and related passages in books

Published:10 June 2012Publication History

ABSTRACT

The number of books available online is increasing, but user interfaces may not be taking full advantage of advances in machine learning techniques that could help users navigate, explore, discover and understand interesting and useful content in books. Using a group of ten students and over one thousand crowdsourced judgments, we conducted multiple user studies to evaluate topics and related passages in books, all learned by topic modeling. Using ten books, selected from humanities (e.g. Plato's Republic), social sciences (e.g. Marx's Capital) and sciences (e.g. Einstein's Relativity), and four different evaluation experiments, we show that users agree that the learned topics are coherent and important to the book, and related to the automatically generated passages. We show how crowdsourced evaluations are useful, and can complement more focused evaluations using students who have studied the texts. This work provides a framework for (1) learning topics and related passages in books, and (2) evaluating those learned topics and passages, and moves one step toward automatic annotation to support topic navigation of books.

References

  1. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. The Journal of Machine Learning Research, 3, 993-1022. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Hearst, M. A. (1997). TextTiling: Segmenting text into multiparagraph subtopic passages. Computational linguistics, 23(1), 33-6. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Learning topics and related passages in books

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
      June 2012
      458 pages
      ISBN:9781450311540
      DOI:10.1145/2232817

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 10 June 2012

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Author Tags

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate415of1,482submissions,28%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader