
Deep Multi-task Learning with Cross Connected Layer for Slot Filling

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Slot filling is a critical subtask of spoken language understanding (SLU) in task-oriented dialogue systems. It is a common scenario that different slot filling tasks from different but similar domains have overlapping sets of slots (shared slots). In this paper, we propose an effective deep multi-task learning model with a Cross Connected Layer (CCL) to capture this information. Experiments show that our proposed model outperforms several mainstream baselines on Chinese e-commerce datasets. The significant improvement in the F1 score of the shared slots shows that the CCL captures more information about shared slots.
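The abstract only sketches the idea: two slot-filling tasks from similar domains share an encoder, and a cross connected layer lets each task-specific branch see the other branch's features, so shared slots benefit from both domains' training signal. The following is a loose numpy illustration of that pattern, not the authors' implementation: all names, dimensions, and the feed-forward shared encoder (standing in for a BiLSTM) are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_h = 8, 16          # token feature size, hidden size (arbitrary)
n_labels_a, n_labels_b = 5, 6  # slot label counts for the two domains

# Shared encoder parameters (a plain tanh layer standing in for a BiLSTM).
W_s, b_s = rng.normal(size=(d_in, d_h)), np.zeros(d_h)
# Task-specific branch parameters.
W_a, b_a = rng.normal(size=(d_h, d_h)), np.zeros(d_h)
W_b, b_b = rng.normal(size=(d_h, d_h)), np.zeros(d_h)
# Cross connected layer: mixes a branch's features with the other branch's.
W_cross = rng.normal(size=(2 * d_h, d_h))
# Per-task output projections onto slot-label logits.
W_out_a = rng.normal(size=(d_h, n_labels_a))
W_out_b = rng.normal(size=(d_h, n_labels_b))

def forward(x):
    """x: (tokens, d_in) -> per-task slot logits for every token."""
    h = np.tanh(x @ W_s + b_s)        # shared representation
    h_a = np.tanh(h @ W_a + b_a)      # task-A branch
    h_b = np.tanh(h @ W_b + b_b)      # task-B branch
    # Cross connection: each branch is re-encoded together with the other
    # branch's features, so shared slots receive signal from both tasks.
    h_a2 = np.tanh(np.concatenate([h_a, h_b], axis=-1) @ W_cross)
    h_b2 = np.tanh(np.concatenate([h_b, h_a], axis=-1) @ W_cross)
    return h_a2 @ W_out_a, h_b2 @ W_out_b

x = rng.normal(size=(10, d_in))       # a 10-token utterance
logits_a, logits_b = forward(x)
print(logits_a.shape, logits_b.shape)  # (10, 5) (10, 6)
```

In a real tagger each task's logits would feed a CRF decoder (as in several of the BiLSTM-CRF baselines the paper builds on) rather than being used directly.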


Notes

  1. https://github.com/JansonKong/Deep-Multi-task-Learning-with-Cross-Connected-Layer-for-Slot-Filling


Acknowledgment

The work presented in this paper is partially supported by the Fundamental Research Funds for the Central Universities, SCUT (Nos. 2017ZD048, D2182480), the Tiptop Scientific and Technical Innovative Youth Talents of Guangdong special support program (No. 2015TQ01X633), the Science and Technology Planning Project of Guangdong Province (No. 2017B050506004), and the Science and Technology Program of Guangzhou (Nos. 201704030076, 201802010027).

Author information

Correspondence to Yi Cai.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Kong, J., Cai, Y., Ren, D., Li, Z. (2019). Deep Multi-task Learning with Cross Connected Layer for Slot Filling. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_27


  • DOI: https://doi.org/10.1007/978-3-030-32236-6_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science; Computer Science (R0)
