How We Achieved a Production Ready Slot Filling Deep Neural Network Without Initial Natural Language Data

  • Conference paper
  • Neural Information Processing (ICONIP 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1142)

Abstract

Training deep networks requires large volumes of data. However, for many companies developing new products, such data may not be available, and public datasets may not fit their particular use-case. In this paper, we explain how we achieved a production-ready slot filling deep neural network for our new single-field search engine without initial natural language data. First, we implemented a baseline using recurrent neural networks trained on expert-defined templates whose parameters were extracted from our knowledge databases. Then, we collected actual natural language data by deploying this baseline in production on a small part of our traffic. Finally, we improved our algorithm by adding a knowledge vector as an input to the deep learning model and training it on pseudo-labeled production data. We provide detailed experimental reports and show the impact of hyper-parameters and algorithm modifications in our use-case.
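To make the approach concrete, here is a minimal sketch, assuming a PyTorch-style implementation with a bidirectional LSTM (consistent with the tooling the paper builds on). The template expander and the per-token knowledge vector shown below are illustrative assumptions: TEMPLATES, KNOWLEDGE_DB, SlotFiller, and the exact encoding of the knowledge vector are hypothetical names and choices, not the authors' implementation.

import random
import torch
import torch.nn as nn

# Step 1 (hypothetical): expand expert-defined templates with values drawn
# from a knowledge database to produce (tokens, BIO labels) training pairs.
TEMPLATES = [("restaurants in {CITY}", ["O", "O", "B-CITY"])]
KNOWLEDGE_DB = {"CITY": ["paris", "rennes", "lyon"]}

def sample_example():
    template, labels = random.choice(TEMPLATES)
    city = random.choice(KNOWLEDGE_DB["CITY"])
    return template.format(CITY=city).split(), labels

# Step 2 (hypothetical): a bidirectional LSTM tagger; each token's input is
# its word embedding concatenated with a knowledge vector, e.g. 1.0 where
# the token matches a knowledge-database entry and 0.0 elsewhere.
class SlotFiller(nn.Module):
    def __init__(self, vocab_size, n_labels, emb_dim=100, know_dim=1, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim + know_dim, hidden,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_labels)

    def forward(self, token_ids, knowledge_vec):
        # token_ids: (batch, seq_len); knowledge_vec: (batch, seq_len, know_dim)
        x = torch.cat([self.emb(token_ids), knowledge_vec], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)  # per-token logits over BIO slot labels

Once such a baseline serves a slice of live traffic, its high-confidence predictions on real queries can be retained as pseudo-labels and folded back into training, which is the final improvement step the abstract describes.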

F. Torregrossa, N. Kooli, and R. Allesiardo made equal contributions; the authors are listed in random order.

François Torregrossa is also a member of the Research Institute on Informatics and Random Systems (IRISA) and of the Inria Rennes - Bretagne Atlantique research center.

Notes

  1. Google DialogFlow, https://dialogflow.com/.

  2. Microsoft Luis, https://www.luis.ai/.

  3. Accuracies reported in Table 2 are obtained on our most recent testing set. At the beginning of the project, we challenged both solutions on a smaller set containing queries made by experts.


Author information

Correspondence to Robin Allesiardo.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Torregrossa, F., Kooli, N., Allesiardo, R., Pigneul, E. (2019). How We Achieved a Production Ready Slot Filling Deep Neural Network Without Initial Natural Language Data. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-030-36808-1_27

  • DOI: https://doi.org/10.1007/978-3-030-36808-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36807-4

  • Online ISBN: 978-3-030-36808-1

  • eBook Packages: Computer Science, Computer Science (R0)
