Abstract
Slot filling is a fundamental task in spoken language understanding that is usually formulated as a sequence labeling problem and solved with discriminative models such as conditional random fields and recurrent neural networks. One weakness of this discriminative approach is its lack of robustness to incomplete annotations. To obtain a more robust method, this paper leverages an overlooked property of slot filling tasks: the non-slot parts of an utterance follow a specific pattern that depends on the user's intent. To this end, we propose a generative model that estimates the underlying pattern of utterances based on a segmentation-based formulation of the slot filling task. The proposed method adopts nonparametric Bayesian models, which enjoy the flexibility in phrase distribution modeling brought by the new formulation. Experimental results demonstrate that the proposed method outperforms BiLSTM-CRF and HMM baselines when the training data contains incomplete annotations.
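To make the segmentation-based formulation concrete, the following minimal sketch (illustrative only: the utterance, slot names, and data structures are our own assumptions, not taken from the paper) contrasts the usual per-token sequence labeling view with a segmentation view in which an utterance is divided into phrases, each being either a slot filler or part of the intent-specific non-slot pattern.

```python
# Illustrative only: the utterance, slot names, and structures are hypothetical.
utterance = "book a flight from boston to denver"

# Sequence-labeling view: one BIO tag per token.
tokens = utterance.split()
bio_tags = ["O", "O", "O", "O", "B-from_city", "O", "B-to_city"]

# Segmentation view: a sequence of phrases, each labeled as a slot filler
# or as part of the non-slot pattern that depends on the user's intent.
segments = [
    ("book a flight from ", "PATTERN"),    # non-slot, intent-specific pattern
    ("boston",              "from_city"),  # slot filler
    (" to ",                "PATTERN"),    # non-slot pattern
    ("denver",              "to_city"),    # slot filler
]

# A generative model can then assign probabilities to whole phrases,
# including the non-slot pattern phrases, rather than to per-token tags.
assert "".join(phrase for phrase, _ in segments) == utterance
assert len(tokens) == len(bio_tags)
```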
Notes
- 1.
In the experiment, we used words as tokens for English and characters as tokens for Japanese.
- 2.
We can formulate the language models for phrases based on a token-sequence representation, but we prefer character-sequence modeling because it gives the model more flexibility. This choice does not affect the overall framework of the proposed method.
- 3.
In contrast to the common usage of the CRP, in which it constitutes an infinite mixture model [25], \(\phi _{a_i}\) is not a parameter of another distribution but an observable phrase (\(s_i = \phi _{a_i}\)); see the sketch after these notes.
- 4.
The index \(\backslash i\) denotes the set of all variables except the ith variable.
- 5.
As described in [28], the effect of this approximation, which ignores the local counts, is sufficiently small when there are many short sentences, which is the case in the slot filling task.
- 6.
We can substitute the variables with their expected values because, for a Dirichlet-categorical model with prior \(p_{dir}(\theta | \alpha )\) and likelihood \(p_{cat}(x | \theta )\), the predictive distribution satisfies \(p(x_N = k | x_{1:N-1}) = \int p(x_N = k | \theta ) p(\theta | x_{1:N-1}) d\theta = \frac{\alpha _k + \sum _{i=1}^{N-1}\delta (x_i = k)}{\sum _{k'}\alpha _{k'} + N - 1} = p_{cat}(x_N = k | \theta = E_{p(\theta |x_{1:N-1})}[\theta ])\); see the numerical check after these notes.
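For footnote 3, here is a minimal sketch of this non-standard CRP usage (the base sampler, hyperparameter value, and function names are our own assumptions): a draw either reuses a previously generated phrase or samples a new one from the base distribution, and the value attached to a "table" is the observable phrase itself rather than a parameter of another distribution.

```python
import random
from collections import Counter

def crp_draw_phrase(history: Counter, alpha: float, base_sampler):
    """Draw a phrase from a Chinese restaurant process where each table's
    value is the observable phrase itself (s_i = phi_{a_i}), not a
    parameter for a further likelihood."""
    n = sum(history.values())
    # With probability alpha / (alpha + n), open a new table: sample a
    # brand-new phrase from the base distribution over character strings.
    if random.random() < alpha / (alpha + n):
        return base_sampler()
    # Otherwise reuse an existing phrase proportionally to its count.
    phrases, counts = zip(*history.items())
    return random.choices(phrases, weights=counts)[0]

# Hypothetical base distribution standing in for a character-level
# phrase language model.
base = lambda: random.choice(["boston", "denver", "new york"])
history = Counter()
for _ in range(10):
    phrase = crp_draw_phrase(history, alpha=1.0, base_sampler=base)
    history[phrase] += 1
```

The rich-get-richer reuse of phrases is what lets frequent non-slot pattern phrases dominate without fixing a phrase vocabulary in advance.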
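For footnote 6, a quick numerical check of the stated identity (a sketch with assumed toy hyperparameters and observations): integrating the categorical likelihood over the Dirichlet posterior by Monte Carlo recovers the closed-form predictive, which is exactly a categorical evaluated at the posterior mean of \(\theta\).

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([0.5, 1.0, 2.0])    # assumed Dirichlet hyperparameters
x = [0, 2, 2, 1, 2]                  # assumed observations x_1 .. x_{N-1}
counts = np.bincount(x, minlength=len(alpha))

# Closed form: p(x_N = k | x_{1:N-1}) = (alpha_k + n_k) / (sum_k' alpha_k' + N - 1)
closed_form = (alpha + counts) / (alpha.sum() + len(x))

# Monte Carlo version of the integral: average p(x_N = k | theta) over
# the posterior p(theta | x_{1:N-1}) = Dirichlet(alpha + counts).
theta_samples = rng.dirichlet(alpha + counts, size=200_000)
monte_carlo = theta_samples.mean(axis=0)

# The posterior mean of theta yields the same categorical distribution,
# which justifies substituting the expected values.
posterior_mean = (alpha + counts) / (alpha.sum() + len(x))

assert np.allclose(closed_form, monte_carlo, atol=1e-2)
assert np.allclose(closed_form, posterior_mean)
```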
References
Bishop C (2006) Pattern recognition and machine learning. Springer
Chib S (1996) Calculating posterior distributions and modal estimates in Markov mixture models. J Econom 75:79–97
Fukubayashi Y, Komatani K, Nakano M, Funakoshi K, Tsujino H, Ogata T, Okuno HG (2008) Rapid prototyping of robust language understanding modules for spoken dialogue systems. In: Proceedings of IJCNLP, pp 210–216
Goldwater S, Griffiths TL, Johnson M (2011) Producing power-law distributions and damping word frequencies with two-stage language models. J Mach Learn Res 12:2335–2382
Henderson M (2015) Machine learning for dialog state tracking: a review. In: Proceedings of international workshop on machine learning in spoken language processing
Henderson MS (2015) Discriminative methods for statistical spoken dialogue systems. PhD thesis, University of Cambridge
Jie Z, Xie P, Lu W, Ding R, Li L (2019) Better modeling of incomplete annotations for named entity recognition. In: Proceedings of NAACL-HLT, pp 729–734
Jin L, Schwartz L, Doshi-Velez F, Miller T, Schuler W (2021) Depth-bounded statistical PCFG induction as a model of human grammar acquisition. Comput Linguist 47(1):181–216
Komatani K, Katsumaru M, Nakano M, Funakoshi K, Ogata T, Okuno HG (2010) Automatic allocation of training data for rapid prototyping. In: Proceedings of COLING
Lample G, Ballesteros M, Subramanian S, Kawakami K, Dyer C (2016) Neural architectures for named entity recognition. arXiv:1603.01360 [cs.CL]
Lim KW, Buntine W, Chen C, Du L (2016) Nonparametric Bayesian topic modelling with the hierarchical Pitman-Yor processes. Int J Approx Reason 78(C):172–191
Macherey K, Och FJ, Ney H (2001) Natural language understanding using statistical machine translation. In: Proceedings of EUROSPEECH, pp 2205–2208
Mesnil G, Dauphin Y, Yao K, Bengio Y, Deng L, Hakkani-Tur D, He X, Heck L, Tur G, Yu D, Zweig G (2015) Using recurrent neural networks for slot filling in spoken language understanding. IEEE/ACM Trans Audio, Speech, Lang Process 23(3):530–539
Nguyen AT, Wallace BC, Li JJ, Nenkova A, Lease M (2017) Aggregating and predicting sequence labels from crowd annotations. In: Proceedings of ACL, pp 299–309
Niu J, Penn G (2019) Rationally reappraising ATIS-based dialogue systems. In: Proceedings of ACL, pp 5503–5507
Ponvert E, Baldridge J, Erk K (2011) Simple unsupervised grammar induction from raw text with cascaded finite state models. In: Proceedings of ACL, pp 1077–1086
Qin L, Liu T, Che W, Kang B, Zhao S, Liu T (2021) A co-interactive transformer for joint slot filling and intent detection. In: Proceedings of ICASSP, pp 8193–8197
Raymond C, Riccardi G (2007) Generative and discriminative algorithms for spoken language understanding. In: Proceedings of Interspeech
Rodrigues F, Pereira F, Ribeiro B (2014) Sequence labeling with multiple annotators. Mach Learn 95(2):165–181
Sato I, Nakagawa H (2010) Topic models with power-law using Pitman-Yor process. In: Proceedings of KDD
Scott SL (2002) Bayesian methods for hidden Markov models: recursive computing in the 21st century. J Am Stat Assoc 97:337–351
Seneff S (1992) TINA: a natural language system for spoken language applications. Comput Linguist 18(1):61–86
Simpson ED, Gurevych I (2019) A Bayesian approach for sequence tagging with crowds. In: Proceedings of EMNLP, pp 1093–1104
Snow R, O’Connor B, Jurafsky D, Ng AY (2008) Cheap and fast—but is it good? Evaluating non-expert annotations for natural language tasks. In: Proceedings of EMNLP, pp 254–263
Teh YW, Jordan MI, Beal MJ, Blei DM (2005) Hierarchical Dirichlet processes. J Am Stat Assoc 101:1566–1581
Uchiumi K, Tsukahara H, Mochihashi D (2015) Inducing word and part-of-speech with Pitman-Yor hidden semi-Markov models. In: Proceedings of ACL-IJCNLP
Wakabayashi K, Takeuchi J, Funakoshi K, Nakano M (2016) Nonparametric Bayesian models for spoken language understanding. In: Proceedings of EMNLP
Wang P, Blunsom P (2013) Collapsed variational Bayesian inference for hidden Markov models. In: Proceedings of AISTATS, pp 599–607
Xu P, Sarikaya R (2013) Convolutional neural network based triangular CRF for joint intent detection and slot filling. In: Proceedings of IEEE workshop on automatic speech recognition and understanding
Yadav V, Bethard S (2018) A survey on recent advances in named entity recognition from deep learning models. In: Proceedings of COLING
Zhai K, Boyd-Graber J (2013) Online latent Dirichlet allocation with infinite vocabulary. In: Proceedings of ICML
Acknowledgements
This work was partially supported by JSPS KAKENHI Grant Number 19K20333.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wakabayashi, K., Takeuchi, J., Nakano, M. (2022). Segmentation-Based Formulation of Slot Filling Task for Better Generative Modeling. In: Stoyanchev, S., Ultes, S., Li, H. (eds) Conversational AI for Natural Human-Centric Interaction. Lecture Notes in Electrical Engineering, vol 943. Springer, Singapore. https://doi.org/10.1007/978-981-19-5538-9_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-5537-2
Online ISBN: 978-981-19-5538-9