Vocal Access to a Newspaper Archive: Assessing the Limitations of Current Voice Information Access Technology

Journal of Intelligent Information Systems

Abstract

This paper presents the design and the current prototype implementation of an interactive vocal Information Retrieval system that can be used to access articles of a large newspaper archive using a telephone. The implementation of the system highlights the limitations of current voice information retrieval technology, in particular of speech recognition and synthesis. We present our evaluation of these limitations and address the feasibility of intelligent interactive vocal information access systems.

Cite this article

Crestani, F. Vocal Access to a Newspaper Archive: Assessing the Limitations of Current Voice Information Access Technology. Journal of Intelligent Information Systems 20, 161–180 (2003). https://doi.org/10.1023/A:1021824019028