Time is of the essence: an evaluation of temporal compression algorithms

Published: 22 April 2006

ABSTRACT

Although speech is a potentially rich information source, a major barrier to exploiting speech archives is the lack of useful tools for efficiently accessing lengthy speech recordings. This paper develops and evaluates techniques for temporal compression - reducing the time people take to listen to a recording while still extracting critical information. We first describe an exploratory study that identifies novel excision techniques that remove unimportant words or utterances from the recording. We then develop a new method for evaluating how well temporal compression supports users in forming a general understanding of a recording. Applying this method, we demonstrate that excision techniques are generally more effective than standard compression techniques that simply speed up the entire recording.
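The contrast the abstract draws between speeding up the entire recording and excising unimportant words can be illustrated with a minimal sketch. It assumes a transcript in which each word already carries a duration and an importance score. The `Word` type, the `importance` field, and the `speed_up` and `excise` functions are hypothetical names chosen for illustration; the abstract does not specify how the paper's techniques score or remove words, so this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    duration: float    # seconds of audio occupied by the word
    importance: float  # hypothetical score; higher means more important


def speed_up(words: List[Word], factor: float) -> List[Word]:
    """Uniform compression: keep every word, play each `factor` times faster."""
    return [Word(w.text, w.duration / factor, w.importance) for w in words]


def excise(words: List[Word], target_duration: float) -> List[Word]:
    """Excision: drop the least important words until the total fits the target."""
    kept = list(range(len(words)))
    total = sum(w.duration for w in words)
    # Visit words from least to most important, removing until short enough.
    for i in sorted(range(len(words)), key=lambda i: words[i].importance):
        if total <= target_duration:
            break
        total -= words[i].duration
        kept.remove(i)
    # `kept` is still in ascending index order, so word order is preserved.
    return [words[i] for i in kept]


# Illustrative data only (2.5 seconds in total); not taken from the paper.
words = [
    Word("um", 0.5, 0.0),
    Word("budget", 0.5, 0.9),
    Word("is", 0.25, 0.2),
    Word("basically", 0.75, 0.1),
    Word("approved", 0.5, 0.8),
]

# Both calls halve the listening time to 1.25 seconds.
faster = speed_up(words, 2.0)   # all five words, each at double speed
shorter = excise(words, 1.25)   # fewer words, each at normal speed
print([w.text for w in shorter])  # ['budget', 'is', 'approved']
```

Both outputs take the same time to hear, but the first keeps every word at reduced intelligibility while the second keeps only the highest-scoring words at their natural rate, which is the trade-off the evaluation in the paper addresses.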

      • Published in

        CHI '06: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
        April 2006
        1353 pages
        ISBN: 1595933727
        DOI: 10.1145/1124772

        Copyright © 2006 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
