
Deep Multi-task Learning with Cross Connected Layer for Slot Filling

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Slot filling is a critical subtask of spoken language understanding (SLU) in task-oriented dialogue systems. It is a common scenario that different slot filling tasks from different but similar domains have overlapping sets of slots (shared slots). In this paper, we propose an effective deep multi-task learning model with a Cross Connected Layer (CCL) to capture this information. Experiments show that our proposed model outperforms several mainstream baselines on Chinese e-commerce datasets. The significant improvement in the F1 score of the shared slots shows that the CCL captures more information about shared slots.
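The abstract only sketches the idea: two slot-filling tasks from similar domains share an encoder, and a cross connected layer lets each task-specific branch see the other branch's features, so shared slots benefit from both domains' training signal. The following is a loose numpy illustration of that pattern, not the authors' implementation: all names, dimensions, and the feed-forward shared encoder (standing in for a BiLSTM) are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_h = 8, 16          # token feature size, hidden size (arbitrary)
n_labels_a, n_labels_b = 5, 6  # slot label counts for the two domains

# Shared encoder parameters (a plain tanh layer standing in for a BiLSTM).
W_s, b_s = rng.normal(size=(d_in, d_h)), np.zeros(d_h)
# Task-specific branch parameters.
W_a, b_a = rng.normal(size=(d_h, d_h)), np.zeros(d_h)
W_b, b_b = rng.normal(size=(d_h, d_h)), np.zeros(d_h)
# Cross connected layer: mixes a branch's features with the other branch's.
W_cross = rng.normal(size=(2 * d_h, d_h))
# Per-task output projections onto slot-label logits.
W_out_a = rng.normal(size=(d_h, n_labels_a))
W_out_b = rng.normal(size=(d_h, n_labels_b))

def forward(x):
    """x: (tokens, d_in) -> per-task slot logits for every token."""
    h = np.tanh(x @ W_s + b_s)        # shared representation
    h_a = np.tanh(h @ W_a + b_a)      # task-A branch
    h_b = np.tanh(h @ W_b + b_b)      # task-B branch
    # Cross connection: each branch is re-encoded together with the other
    # branch's features, so shared slots receive signal from both tasks.
    h_a2 = np.tanh(np.concatenate([h_a, h_b], axis=-1) @ W_cross)
    h_b2 = np.tanh(np.concatenate([h_b, h_a], axis=-1) @ W_cross)
    return h_a2 @ W_out_a, h_b2 @ W_out_b

x = rng.normal(size=(10, d_in))       # a 10-token utterance
logits_a, logits_b = forward(x)
print(logits_a.shape, logits_b.shape)  # (10, 5) (10, 6)
```

In a real tagger each task's logits would feed a CRF decoder (as in several of the BiLSTM-CRF baselines the paper builds on) rather than being used directly.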


Notes

  1. https://github.com/JansonKong/Deep-Multi-task-Learning-with-Cross-Connected-Layer-for-Slot-Filling


Acknowledgment

The work presented in this paper is partially supported by the Fundamental Research Funds for the Central Universities, SCUT (Nos. 2017ZD048, D2182480), the Tiptop Scientific and Technical Innovative Youth Talents of Guangdong special support program (No. 2015TQ01X633), the Science and Technology Planning Project of Guangdong Province (No. 2017B050506004), and the Science and Technology Program of Guangzhou (Nos. 201704030076, 201802010027).

Author information

Correspondence to Yi Cai.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Kong, J., Cai, Y., Ren, D., Li, Z. (2019). Deep Multi-task Learning with Cross Connected Layer for Slot Filling. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_27


  • DOI: https://doi.org/10.1007/978-3-030-32236-6_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science; Computer Science (R0)
