DOI: 10.1145/2854946.2854987

Vapor Engine: Demonstrating an Early Prototype of a Language-Independent Search Engine for Speech

Published: 13 March 2016

Abstract

Typical search engines for spoken content begin with some form of language-specific audio processing such as phonetic word recognition. Many languages, however, lack the language-tuned preprocessing tools needed to create indexing terms for speech. One approach in such cases is to rely on repetition, detected using acoustic features, to find terms that might be worth indexing. Experiments in both spoken term detection and ranked retrieval have shown that this approach yields term sets that may be sufficient for some applications. Such approaches currently work only with spoken queries, however, and only when the searcher is able to speak in a manner similar to that of the speakers in the collection. This demonstration paper proposes Vapor Engine, a new tool for selectively transcribing repeated terms that can be automatically detected from spoken content in any language. These transcribed terms could then be matched to queries formulated using written terms. Vapor Engine is early in development: it currently supports only single-term queries and has not yet been formally evaluated. This paper introduces the interface and summarizes the challenges it seeks to address.
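To make the intended interaction concrete, the following is a minimal sketch, not drawn from Vapor Engine itself, of how transcriptions attached to automatically discovered repetition clusters could be matched against a written single-term query. Every class and function name here is a hypothetical illustration of the idea the abstract describes, not the system's actual design.

# Hypothetical sketch: repeated audio regions have been grouped into clusters by an
# acoustic term-discovery step; a human selectively transcribes some clusters; a written
# single-term query is then matched against those transcriptions to locate occurrences.
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Occurrence:
    recording_id: str   # audio file in which the repeated region was found
    start_sec: float    # start time of the region, in seconds
    end_sec: float      # end time of the region, in seconds


@dataclass
class TermCluster:
    cluster_id: int
    occurrences: list[Occurrence] = field(default_factory=list)
    transcription: str | None = None  # filled in only if a human has transcribed this cluster


def search(clusters: list[TermCluster], query: str) -> list[Occurrence]:
    """Return audio regions whose cluster transcription matches a written query term."""
    query = query.strip().lower()
    hits: list[Occurrence] = []
    for cluster in clusters:
        if cluster.transcription and cluster.transcription.lower() == query:
            hits.extend(cluster.occurrences)
    return hits


if __name__ == "__main__":
    clusters = [
        TermCluster(1, [Occurrence("rec01.wav", 12.4, 13.1),
                        Occurrence("rec07.wav", 88.0, 88.6)], transcription="water"),
        TermCluster(2, [Occurrence("rec02.wav", 5.0, 5.7)]),  # discovered but not yet transcribed
    ]
    for hit in search(clusters, "water"):
        print(f"{hit.recording_id}: {hit.start_sec:.1f}-{hit.end_sec:.1f} s")

The real system discovers the repeated regions acoustically and supports interactive, selective transcription; the sketch only illustrates why even a small number of transcribed clusters could make written single-term queries answerable over otherwise unindexed speech.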


Cited By

  • (2017) Simulating Zero-Resource Spoken Term Discovery. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2371-2374. https://doi.org/10.1145/3132847.3133160. Online publication date: 6 November 2017.



    Published In

    CHIIR '16: Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval
    March 2016
    400 pages
    ISBN:9781450337519
    DOI:10.1145/2854946
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 March 2016


    Author Tags

    1. demonstration
    2. speech retrieval

    Qualifiers

    • Short-paper

    Funding Sources

    • National Science Foundation

    Conference

    CHIIR '16: Conference on Human Information Interaction and Retrieval
    March 13-17, 2016
    Carrboro, North Carolina, USA

    Acceptance Rates

    CHIIR '16 paper acceptance rate: 23 of 58 submissions (40%)
    Overall acceptance rate: 55 of 163 submissions (34%)

