A Practical Approach to Intelligent Spoken Dialogue for Third-Party Applications on Home Devices with Linear Bandits

Advances in Knowledge Discovery and Management

Abstract

Third-party applications deployed on voice-controlled home devices (Google Home, Amazon Echo, etc.) are usually rule-based and follow a hard-coded dialogue graph. In this paper, we describe how we integrated artificial intelligence into our voice conversational agent, which currently runs in production on Amazon Echo and will soon run on Google Home. The approach is based on contextual bandits, a special case of reinforcement learning, and makes it possible to steer the dialogue inside a fuzzy dialogue graph while taking advantage of the features available in the home devices' frameworks.
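To make the idea of steering a dialogue with a linear contextual bandit concrete, here is a minimal sketch. The abstract does not say which algorithm is used, so this assumes a LinUCB-style rule with one ridge-regression model per action. The action names, the two-dimensional context, and the simulated reward are all hypothetical and chosen only for illustration.

```python
import math


class LinUCBArm:
    """One candidate dialogue action with its own ridge-regression model."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # exploration strength (assumed value)
        # Inverse of A = I, the ridge prior on the design matrix.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * context

    def _matvec(self, x):
        return [sum(r * v for r, v in zip(row, x)) for row in self.A_inv]

    def score(self, x):
        """Upper confidence bound: theta . x + alpha * sqrt(x^T A^-1 x)."""
        a_inv_x = self._matvec(x)
        theta = self._matvec(self.b)  # theta = A^-1 b
        mean = sum(t * v for t, v in zip(theta, x))
        width = math.sqrt(max(sum(v * w for v, w in zip(x, a_inv_x)), 0.0))
        return mean + self.alpha * width

    def update(self, x, reward):
        """Sherman-Morrison update of A^-1 after adding x x^T to A."""
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(v * w for v, w in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.A_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


def choose_action(arms, allowed, x):
    """Pick the highest-scoring action among those the dialogue graph allows."""
    return max(allowed, key=lambda name: arms[name].score(x))


# Hypothetical actions available from the current node of the dialogue graph.
ACTIONS = ["ask_preference", "suggest_item"]
arms = {name: LinUCBArm(dim=2) for name in ACTIONS}

# Hypothetical one-hot contexts (e.g. returning vs. new user) and the action
# that a simulated user rewards in each of them.
CONTEXTS = [[1.0, 0.0], [0.0, 1.0]]
BEST = {0: "suggest_item", 1: "ask_preference"}

for t in range(200):
    kind = t % 2                      # alternate the two contexts
    x = CONTEXTS[kind]
    action = choose_action(arms, ACTIONS, x)
    reward = 1.0 if action == BEST[kind] else 0.0  # simulated feedback
    arms[action].update(x, reward)

print(choose_action(arms, ACTIONS, CONTEXTS[0]))
print(choose_action(arms, ACTIONS, CONTEXTS[1]))
```

The `allowed` argument is where a dialogue graph would constrain the bandit: only the actions reachable from the current node are scored, so the learner adapts the path through the graph without leaving it.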

Corresponding author

Correspondence to Robin Allesiardo.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Allesiardo, R., Sauldubois, C., Depaulis, F., Bulteau, N., Chantrel, F., Pigneul, E. (2022). A Practical Approach to Intelligent Spoken Dialogue for Third-Party Applications on Home Devices with Linear Bandits. In: Jaziri, R., Martin, A., Rousset, MC., Boudjeloud-Assala, L., Guillet, F. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 1004. Springer, Cham. https://doi.org/10.1007/978-3-030-90287-2_4
