Abstract
Training deep networks requires large volumes of data. However, for many companies developing new products, such data may not be available, and public datasets may not suit their particular use case. In this paper, we explain how we built a production-ready slot-filling deep neural network for our new single-field search engine without any initial natural language data. First, we implemented a baseline using recurrent neural networks trained on expert-defined templates whose parameters were extracted from our knowledge databases. Then, we collected actual natural language data by deploying this baseline in production on a small fraction of our traffic. Finally, we improved our algorithm by adding a knowledge vector as an input to the deep learning model and training it on pseudo-labeled production data. We provide detailed experimental reports and show the impact of hyper-parameter and algorithm changes in our use case.
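The bootstrapping step described in the abstract (expert-defined templates filled with values from knowledge databases) can be sketched as follows. The template strings, slot names, and knowledge-base values below are invented for illustration; the paper's actual templates and databases are not public. Labels follow the common BIO tagging scheme for slot filling:

```python
import random

# Hypothetical templates and knowledge-base values, for illustration only.
# Slots are written as {slot_name} placeholders inside each template.
TEMPLATES = [
    "flights from {origin} to {destination}",
    "hotels in {city}",
]
KNOWLEDGE_DB = {
    "origin": ["paris", "rennes"],
    "destination": ["london", "new york"],
    "city": ["berlin"],
}

def generate_example(template, db, rng=random):
    """Fill one template with values drawn from the knowledge base and
    emit (token, BIO-label) pairs usable as slot-filling training data."""
    tokens, labels = [], []
    for part in template.split():
        if part.startswith("{") and part.endswith("}"):
            slot = part[1:-1]
            value_tokens = rng.choice(db[slot]).split()
            tokens.extend(value_tokens)
            # First token of a slot value gets B-, the rest get I-.
            labels.extend([f"B-{slot}"] + [f"I-{slot}"] * (len(value_tokens) - 1))
        else:
            tokens.append(part)
            labels.append("O")  # token outside any slot
    return tokens, labels

tokens, labels = generate_example(TEMPLATES[0], KNOWLEDGE_DB)
print(list(zip(tokens, labels)))
```

Pairs produced this way can then be fed to a sequence tagger (e.g. a bidirectional recurrent network) as a stand-in for real user queries until production traffic provides genuine ones.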
F. Torregrossa, N. Kooli and R. Allesiardo: these authors made equal contributions and are listed in random order.
François Torregrossa is also a member of the Research Institute on Informatics and Random Systems (IRISA) and of the Inria Rennes - Bretagne Atlantique research center.
Notes
1. Google DialogFlow, https://dialogflow.com/.
2. Microsoft Luis, https://www.luis.ai/.
3. Accuracies reported in Table 2 are obtained on our most recent testing set. At the beginning of the project, we compared both solutions on a smaller set of queries written by experts.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Torregrossa, F., Kooli, N., Allesiardo, R., Pigneul, E. (2019). How We Achieved a Production Ready Slot Filling Deep Neural Network Without Initial Natural Language Data. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-030-36808-1_27
Print ISBN: 978-3-030-36807-4
Online ISBN: 978-3-030-36808-1