DOI: 10.1145/3180308.3180318
demonstration

VisPod: Content-Based Audio Visual Navigation

Published: 5 March 2018

ABSTRACT

Current audio player interfaces generally provide only brief information, such as title and duration, and support basic playback controls. These features alone are not sufficient for certain user tasks, such as quickly finding a previously visited location or browsing the main topics covered in the audio content. We present VisPod, a visual audio player that displays the main topics and keywords extracted from the transcript. VisPod supports (1) audio content browsing, (2) topic-based and keyword-based navigation, (3) communication of transcript and speaker information in real time, and (4) content-based query. VisPod encodes audio as a donut chart composed of topic segments. It uses text processing algorithms to segment the transcript into independent topics and a deep learning model to generate human-readable topic names. An informal study suggests that users prefer VisPod over traditional audio playback approaches, specifically with regard to its benefits for audio browsing and navigation.
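The donut-chart encoding can be illustrated with a minimal sketch. The abstract does not describe the actual layout, so this assumes a linear mapping from playback time to angle, measured in degrees from the start of the ring. The names `TopicSegment`, `donut_arcs`, `angle_to_time`, and `segment_at`, as well as the sample episode, are hypothetical and are not taken from VisPod.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TopicSegment:
    """One topic span of the audio (hypothetical structure, not VisPod's)."""
    name: str     # human-readable topic label
    start: float  # start time in seconds
    end: float    # end time in seconds


def donut_arcs(segments: List[TopicSegment],
               total_duration: float) -> List[Tuple[str, float, float]]:
    """Map each segment to (name, start_angle, end_angle) in degrees.

    Assumes time maps linearly onto the full 360-degree ring.
    """
    return [
        (s.name,
         360.0 * s.start / total_duration,
         360.0 * s.end / total_duration)
        for s in segments
    ]


def angle_to_time(angle_deg: float, total_duration: float) -> float:
    """Convert a clicked angle on the ring back to a playback time in seconds."""
    return (angle_deg % 360.0) / 360.0 * total_duration


def segment_at(segments: List[TopicSegment], t: float) -> Optional[TopicSegment]:
    """Return the segment containing time t, or None if t falls outside all segments."""
    for s in segments:
        if s.start <= t < s.end:
            return s
    return None


# Made-up example data: a four-minute episode with three topics.
EPISODE = [
    TopicSegment("intro", 0.0, 60.0),
    TopicSegment("interview", 60.0, 180.0),
    TopicSegment("outro", 180.0, 240.0),
]
TOTAL = 240.0

if __name__ == "__main__":
    for name, a0, a1 in donut_arcs(EPISODE, TOTAL):
        print(f"{name}: {a0:.1f} to {a1:.1f} degrees")
    t = angle_to_time(180.0, TOTAL)
    print(f"click at 180 degrees seeks to {t:.1f} s in", segment_at(EPISODE, t).name)
```

Under this assumption, topic-based navigation reduces to two lookups: a click angle becomes a timestamp, and the timestamp selects the topic segment to seek into.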


Published in

IUI '18 Companion: Companion Proceedings of the 23rd International Conference on Intelligent User Interfaces
March 2018
141 pages
ISBN: 9781450355711
DOI: 10.1145/3180308

Copyright © 2018 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• demonstration
• Research
• Refereed limited

Acceptance Rates

IUI '18 Companion Paper Acceptance Rate: 63 of 127 submissions, 50%
Overall Acceptance Rate: 746 of 2,811 submissions, 27%