
Asking Questions Framework for Oral History Archives

  • Conference paper
  • Advances in Information Retrieval (ECIR 2024)

Abstract

The importance of oral history archives for preserving and understanding past experiences is counterbalanced by the challenges of accessing and searching them, caused primarily by their extensive size and the diverse demographics of the speakers. This paper presents an approach that combines automatic speech recognition (ASR) and Transformer-based neural networks into the Asking Questions framework. Its primary function is to generate questions, accompanied by concise answers, that relate to the topics discussed in each interview segment. In addition, we introduce a semantic continuity model that filters the generated questions so that only the most relevant ones are retained. This enables real-time semantic search through thousands of hours of recordings, with the crucial benefit that the speakers’ original words remain unaltered while still semantically aligning with the query. Although the method is demonstrated on a specific publicly available archive, it applies equally to other datasets of a similar nature.

This research was supported by the Czech Science Foundation (GA CR), project No. GA22-27800S, and by the grant of the University of West Bohemia, project No. SGS-2022-017.
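
For orientation, the sketch below illustrates the two-stage idea summarised in the abstract: candidate questions are generated for an ASR-transcribed interview segment, a semantic-continuity check discards questions that drift away from the segment, and at query time the user query is matched against the retained questions by embedding similarity. It is a minimal Python illustration assuming an off-the-shelf Hugging Face question-generation checkpoint and a Sentence-BERT encoder (cf. the t5s and sbert.net links in the Notes); the model names, the prompt format, and the similarity threshold are illustrative assumptions, not the configuration used in the paper, and answer generation is omitted.

# Minimal sketch (not the authors' implementation): question generation for one
# ASR segment, a semantic-continuity filter, and query-time matching.
# Model names, prompt format and the similarity threshold are assumptions.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# Stage 1: generate candidate questions for a transcribed interview segment.
# "valhalla/t5-base-qg-hl" is an off-the-shelf T5 question-generation
# checkpoint used here purely as a stand-in for the paper's model.
qg = pipeline("text2text-generation", model="valhalla/t5-base-qg-hl")

segment = ("We arrived at the camp in the winter of 1944 and were "
           "immediately separated from our families.")
outputs = qg("generate question: " + segment,
             num_beams=5, num_return_sequences=3)
candidates = [o["generated_text"] for o in outputs]

# Stage 2: semantic-continuity filter -- keep only questions whose embedding
# stays close to the source segment, so the speaker's unaltered words still
# answer every retained question.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
seg_emb = embedder.encode(segment, convert_to_tensor=True)
cand_embs = embedder.encode(candidates, convert_to_tensor=True)
similarities = util.cos_sim(cand_embs, seg_emb).squeeze(-1)

THRESHOLD = 0.55  # illustrative cut-off, not the paper's value
kept = [q for q, s in zip(candidates, similarities) if s.item() >= THRESHOLD]

# Query time: embed the user query and retrieve the closest retained question,
# which in turn points back to its audio segment.
if kept:
    kept_embs = embedder.encode(kept, convert_to_tensor=True)
    query_emb = embedder.encode("separation of families on arrival",
                                convert_to_tensor=True)
    hit = util.semantic_search(query_emb, kept_embs, top_k=1)[0][0]
    print(kept[hit["corpus_id"]], hit["score"])

In a deployment matching the abstract's description, question generation and filtering would presumably run offline over all segments, leaving only the query embedding and nearest-neighbour lookup for query time, which is what makes real-time search over thousands of hours of recordings feasible.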


Notes

  1. https://sfi.usc.edu/what-we-do/collections.
  2. https://www.ushmm.org/.
  3. https://github.com/honzas83/t5s.
  4. Demonstrator available at https://malach-aq.kky.zcu.cz.
  5. https://www.youtube.com/uscshoahfoundation.
  6. https://www.sbert.net.


Author information

Corresponding author

Correspondence to Martin Bulín.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Švec, J., Bulín, M., Frémund, A., Polák, F. (2024). Asking Questions Framework for Oral History Archives. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14610. Springer, Cham. https://doi.org/10.1007/978-3-031-56063-7_11

  • DOI: https://doi.org/10.1007/978-3-031-56063-7_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56062-0

  • Online ISBN: 978-3-031-56063-7

  • eBook Packages: Computer Science, Computer Science (R0)
