The hidden Markov model (HMM) is a popular technique for story segmentation, in which the hidden Markov states represent topics and the emission distributions, given by n-gram language models (LMs), depend on the states. Given a text document, a Viterbi decoder finds the hidden story sequence, with a change of topic indicating a story boundary. In this paper, we propose a discriminative approach to story boundary detection. Within the HMM framework, we use a deep neural network (DNN) to estimate the posterior probability of topics given the bag-of-words in the local context. We call this the DNN-HMM approach. We regard the topic-dependent LM as a generative modeling technique and the DNN-HMM as the discriminative solution. Experiments on the topic detection and tracking (TDT2) task show that the DNN-HMM significantly outperforms the traditional n-gram LM approach and achieves state-of-the-art performance.
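The decoding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `viterbi_segment`, the `stay_prob` parameter, the uniform-switch transition matrix, and the uniform initial topic distribution are all assumptions made for illustration. The sketch also uses the topic posteriors directly as emission scores, which corresponds to assuming a uniform topic prior; the abstract does not say how the posteriors are converted to emission scores.

```python
import numpy as np

def viterbi_segment(posteriors, stay_prob=0.8):
    """Decode a topic sequence and story boundaries (illustrative sketch).

    posteriors: (T, K) array-like with K >= 2; row t is a hypothetical
        DNN output P(topic | bag-of-words around sentence t).
    stay_prob: assumed probability of staying in the same topic.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    T, K = posteriors.shape

    # Assumed transitions: stay with stay_prob, else switch uniformly.
    switch_prob = (1.0 - stay_prob) / (K - 1)
    log_trans = np.log(np.full((K, K), switch_prob))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    # Posteriors used as emission scores (uniform-prior assumption).
    log_emit = np.log(posteriors + 1e-12)

    score = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    score[0] = -np.log(K) + log_emit[0]  # uniform initial distribution

    for t in range(1, T):
        # cand[i, j]: best score ending in topic i at t-1, then moving to j.
        cand = score[t - 1][:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(K)] + log_emit[t]

    # Backtrace the best topic path.
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]

    # A story boundary is placed wherever the decoded topic changes.
    boundaries = [t for t in range(1, T) if path[t] != path[t - 1]]
    return path.tolist(), boundaries

# Toy posteriors for five sentences and two topics (made-up numbers).
toy = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.2, 0.8], [0.1, 0.9]]
topics, boundaries = viterbi_segment(toy)
print(topics, boundaries)
```

In this toy setting, a higher `stay_prob` makes the decoder more reluctant to switch topics, so a single noisy posterior is less likely to produce a spurious boundary.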
Cite as: Yu, J., Xiao, X., Xie, L., Chng, E.S., Li, H. (2016) A DNN-HMM Approach to Story Segmentation. Proc. Interspeech 2016, 1527-1531, doi: 10.21437/Interspeech.2016-873
@inproceedings{yu16b_interspeech,
  author={Jia Yu and Xiong Xiao and Lei Xie and Eng Siong Chng and Haizhou Li},
  title={{A DNN-HMM Approach to Story Segmentation}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={1527--1531},
  doi={10.21437/Interspeech.2016-873}
}