
Asking Questions Framework for Oral History Archives

  • Conference paper
  • Advances in Information Retrieval (ECIR 2024)

Abstract

The importance of oral history archives for preserving and understanding past experiences is counterbalanced by the challenges of accessing and searching them, caused primarily by their extensive size and the diverse demographics of the speakers. This paper presents an approach that combines automatic speech recognition (ASR) and Transformer-based neural networks into the Asking Questions framework. Its primary function is to generate questions, accompanied by concise answers, that relate to the topics discussed in each interview segment. In addition, we introduce a semantic continuity model that filters the generated questions so that only the most relevant ones are retained. This enables real-time semantic search through thousands of hours of recordings, with the crucial benefit that the speakers’ original words remain unaltered while still semantically aligning with the query. Although the method is demonstrated on a specific publicly available archive, it applies equally to other datasets of a similar nature.

This research was supported by the Czech Science Foundation (GA CR), project No. GA22-27800S, and by the grant of the University of West Bohemia, project No. SGS-2022-017.
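
For orientation, the sketch below illustrates the two-stage idea summarised in the abstract: candidate questions are generated for an ASR-transcribed interview segment, a semantic-continuity check discards questions that drift away from the segment, and at query time the user query is matched against the retained questions by embedding similarity. It is a minimal Python illustration assuming an off-the-shelf Hugging Face question-generation checkpoint and a Sentence-BERT encoder (cf. the t5s and sbert.net links in the Notes); the model names, the prompt format, and the similarity threshold are illustrative assumptions, not the configuration used in the paper, and answer generation is omitted.

# Minimal sketch (not the authors' implementation): question generation for one
# ASR segment, a semantic-continuity filter, and query-time matching.
# Model names, prompt format and the similarity threshold are assumptions.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# Stage 1: generate candidate questions for a transcribed interview segment.
# "valhalla/t5-base-qg-hl" is an off-the-shelf T5 question-generation
# checkpoint used here purely as a stand-in for the paper's model.
qg = pipeline("text2text-generation", model="valhalla/t5-base-qg-hl")

segment = ("We arrived at the camp in the winter of 1944 and were "
           "immediately separated from our families.")
outputs = qg("generate question: " + segment,
             num_beams=5, num_return_sequences=3)
candidates = [o["generated_text"] for o in outputs]

# Stage 2: semantic-continuity filter -- keep only questions whose embedding
# stays close to the source segment, so the speaker's unaltered words still
# answer every retained question.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
seg_emb = embedder.encode(segment, convert_to_tensor=True)
cand_embs = embedder.encode(candidates, convert_to_tensor=True)
similarities = util.cos_sim(cand_embs, seg_emb).squeeze(-1)

THRESHOLD = 0.55  # illustrative cut-off, not the paper's value
kept = [q for q, s in zip(candidates, similarities) if s.item() >= THRESHOLD]

# Query time: embed the user query and retrieve the closest retained question,
# which in turn points back to its audio segment.
if kept:
    kept_embs = embedder.encode(kept, convert_to_tensor=True)
    query_emb = embedder.encode("separation of families on arrival",
                                convert_to_tensor=True)
    hit = util.semantic_search(query_emb, kept_embs, top_k=1)[0][0]
    print(kept[hit["corpus_id"]], hit["score"])

In a deployment matching the abstract's description, question generation and filtering would presumably run offline over all segments, leaving only the query embedding and nearest-neighbour lookup for query time, which is what makes real-time search over thousands of hours of recordings feasible.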


Notes

  1. https://sfi.usc.edu/what-we-do/collections.
  2. https://www.ushmm.org/.
  3. https://github.com/honzas83/t5s.
  4. Demonstrator available at https://malach-aq.kky.zcu.cz.
  5. https://www.youtube.com/uscshoahfoundation.
  6. https://www.sbert.net.


Author information

Corresponding author

Correspondence to Martin Bulín.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Švec, J., Bulín, M., Frémund, A., Polák, F. (2024). Asking Questions Framework for Oral History Archives. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14610. Springer, Cham. https://doi.org/10.1007/978-3-031-56063-7_11

  • DOI: https://doi.org/10.1007/978-3-031-56063-7_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56062-0

  • Online ISBN: 978-3-031-56063-7

  • eBook Packages: Computer Science, Computer Science (R0)
