Question-Answering Dialog System for Large Audiovisual Archives

Chýlek, Adam; Šmídl, Luboš; Švec, Jan

doi:10.1007/978-3-030-27947-9_33

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11697)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

In this paper, we present our spoken dialog system that serves as a search interface to the MALACH archive. The voice interface and natural language input allow users to retrieve information contained in large audiovisual archives more comfortably. In particular, finding answers to more structured questions should be easier than with typical search input options. The dialog system is built on top of a system that automatically annotates and indexes the archive using automatic speech recognition. Until now, these indexes were searchable only through a full-text search for arbitrary text queries. Our proposed approach improves this system and leverages named entity recognition to create a knowledge base of semantic information contained in the recognized utterances. We describe the design of the dialog system, the automatic knowledge base generation, and the approach to creating queries from spoken natural language input.
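
The idea of turning named entities from recognized utterances into a queryable knowledge base can be illustrated with a minimal sketch. This is not the authors' implementation: the `Mention` schema, the `KnowledgeBase` class, its method names, and the sample data are all hypothetical and made up for illustration, and the sketch assumes that entity recognition has already produced typed, normalized entities with a recording identifier and a timestamp.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One named entity found in a recognized utterance (hypothetical schema)."""
    entity_type: str   # e.g. "person" or "place"
    value: str         # normalized entity text
    recording_id: str  # recording that contains the utterance
    start_s: float     # utterance start time in seconds


class KnowledgeBase:
    """Toy in-memory index from (entity type, value) to its mentions."""

    def __init__(self):
        self._index = defaultdict(list)

    def add(self, mention):
        # Lower-case the value so lookups are case-insensitive.
        key = (mention.entity_type, mention.value.lower())
        self._index[key].append(mention)

    def find(self, entity_type, value):
        # .get() avoids creating empty entries for unknown keys.
        return list(self._index.get((entity_type, value.lower()), []))

    def co_occurring(self, first, second):
        """Return sorted IDs of recordings that mention both entities."""
        a = {m.recording_id for m in self.find(*first)}
        b = {m.recording_id for m in self.find(*second)}
        return sorted(a & b)


# Made-up sample data standing in for entity recognition output.
kb = KnowledgeBase()
kb.add(Mention("person", "Anna", "rec-001", 12.5))
kb.add(Mention("place", "Prague", "rec-001", 40.0))
kb.add(Mention("place", "Prague", "rec-002", 3.2))

# A structured question such as "where does Anna talk about Prague?"
print(kb.co_occurring(("person", "Anna"), ("place", "Prague")))
# prints ['rec-001']
```

A structured question then reduces to a lookup over typed entities rather than a full-text match, which is the kind of query a plain text index does not express directly.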

Notes

1. https://malach.umiacs.umd.edu/
2. https://sfi.usc.edu/

Acknowledgement

This work was supported by the European Regional Development Fund under the project Robotics for Industry 4.0 (reg. no. CZ.02.1.01/0.0/0.0/15_003/0000470), by the Technology Agency of the Czech Republic, project No. TE01020197, and by the grant of the University of West Bohemia, project No. SGS-2019-027.

Author information

Authors and Affiliations

NTIS - New Technologies for Information Society, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic
Adam Chýlek & Jan Švec
Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic
Luboš Šmídl

Corresponding author

Correspondence to Adam Chýlek.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Kamil Ekštein

About this paper

Cite this paper

Chýlek, A., Šmídl, L., Švec, J. (2019). Question-Answering Dialog System for Large Audiovisual Archives. In: Ekštein, K. (eds) Text, Speech, and Dialogue. TSD 2019. Lecture Notes in Computer Science, vol 11697. Springer, Cham. https://doi.org/10.1007/978-3-030-27947-9_33

DOI: https://doi.org/10.1007/978-3-030-27947-9_33
Published: 06 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27946-2
Online ISBN: 978-3-030-27947-9
eBook Packages: Computer Science, Computer Science (R0)
