short-paper

Dialogue Topic Segmentation via Parallel Extraction Network with Neighbor Smoothing

Authors:

Houfeng Wang

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 2126 - 2131

https://doi.org/10.1145/3477495.3531817

Published: 07 July 2022

Abstract

Dialogue topic segmentation is a challenging task in which dialogues are split into segments with pre-defined topics. Existing work on topic segmentation adopts a two-stage paradigm consisting of text segmentation and segment labeling. However, such methods tend to focus on the local context during segmentation and do not capture inter-segment dependency well. In addition, the ambiguity and labeling noise in dialogue segment bounds pose further challenges to existing models. In this work, we propose the Parallel Extraction Network with Neighbor Smoothing (PEN-NS) to address these issues. Specifically, the parallel extraction network performs segment extraction and optimizes the bipartite matching cost of segments to capture inter-segment dependency. Furthermore, we propose neighbor smoothing to handle noise and ambiguity in segment bounds. Experiments on a dialogue-based and a document-based topic segmentation dataset show that PEN-NS significantly outperforms state-of-the-art models.
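
The abstract names two ideas without giving formulas: a bipartite matching cost between extracted and annotated segments, and neighbor smoothing of segment bounds. The sketch below is only an illustration of those general ideas, not the PEN-NS formulation. All names (`neighbor_smooth`, `segment_cost`, `best_matching`), the cost definition, and the parameters `eps`, `window`, and `label_penalty` are assumptions introduced here. A brute-force search over assignments stands in for an efficient solver such as the Hungarian method.

```python
from itertools import permutations


def neighbor_smooth(length, boundary, eps=0.2, window=1):
    """Soft target for one segment bound (hypothetical scheme).

    Moves `eps` of the probability mass from the annotated position
    to its in-range neighbors within `window` positions.
    """
    neighbors = [i for i in range(boundary - window, boundary + window + 1)
                 if i != boundary and 0 <= i < length]
    target = [0.0] * length
    if not neighbors:
        target[boundary] = 1.0
        return target
    target[boundary] = 1.0 - eps
    for i in neighbors:
        target[i] = eps / len(neighbors)
    return target


def segment_cost(pred, gold, label_penalty=1.0):
    """Assumed pairwise cost: bound distance plus a topic-mismatch penalty.

    Each segment is a (start, end, topic) tuple.
    """
    p_start, p_end, p_topic = pred
    g_start, g_end, g_topic = gold
    mismatch = 0.0 if p_topic == g_topic else label_penalty
    return abs(p_start - g_start) + abs(p_end - g_end) + mismatch


def best_matching(preds, golds):
    """Minimum-cost one-to-one assignment of predictions to gold segments.

    Brute force over all assignments, so only suitable for tiny inputs.
    Returns (total_cost, assignment) where assignment[g] is the index of
    the prediction matched to gold segment g.
    """
    if len(preds) < len(golds):
        raise ValueError("need at least as many predictions as gold segments")
    best_cost, best_perm = float("inf"), ()
    for perm in permutations(range(len(preds)), len(golds)):
        cost = sum(segment_cost(preds[p], golds[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, list(best_perm)


if __name__ == "__main__":
    preds = [(5, 9, "billing"), (0, 4, "greeting"), (0, 0, "none")]
    golds = [(0, 4, "greeting"), (5, 8, "billing")]
    print(best_matching(preds, golds))
    print(neighbor_smooth(5, 2))
```

Because the matching is computed over the whole set of segments at once, the order in which predictions are emitted does not matter, which is one way a set-level cost can reflect dependencies between segments. The smoothed target keeps most of the mass on the annotated bound while tolerating off-by-one annotations.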


Cited By

Zeng Z, Liu S, Sha L, Li Z, Yang K, Liu S, Gašević D, Chen G, Larson K (2024). Detecting AI-generated sentences in human-AI collaborative hybrid texts. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 7545-7553. DOI: 10.24963/ijcai.2024/835. Online publication date: 3-Aug-2024
https://dl.acm.org/doi/10.24963/ijcai.2024/835
Ghazimatin A, Garmash E, Penha G, Sheets K, Achenbach M, Semerci O, Galvez R, Tannenberg M, Mantravadi S, Narayanan D, Kalaydzhyan O, Cole D, Carterette B, Clifton A, Bennett P, Hauff C, Lalmas M, Serra E, Spezzano F (2024). PODTILE: Facilitating Podcast Episode Browsing with Auto-generated Chapters. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 4487-4495. DOI: 10.1145/3627673.3680081. Online publication date: 21-Oct-2024
https://dl.acm.org/doi/10.1145/3627673.3680081
Sun W, Tran H, González-Gallardo C, Coustaty M, Doucet A (2024). Global-SEG: Text Semantic Segmentation Based on Global Semantic Pair Relations. Document Analysis and Recognition - ICDAR 2024, pp. 253-269. DOI: 10.1007/978-3-031-70546-5_15. Online publication date: 11-Sep-2024
https://doi.org/10.1007/978-3-031-70546-5_15

Index Terms

Dialogue Topic Segmentation via Parallel Extraction Network with Neighbor Smoothing
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Discourse, dialogue and pragmatics
      2. Information extraction

Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2022

3569 pages

ISBN:9781450387323

DOI:10.1145/3477495

General Chairs: Enrique Amigo (UNED), Pablo Castells (UAM and Amazon), Julio Gonzalo (UNED)

Program Chairs: Ben Carterette (Spotify), J. Shane Culpepper (RMIT University), Gabriella Kazai (Waseda University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

PKU-Baidu Fund
National Natural Science Foundation of China

Conference

SIGIR '22

Sponsor:

SIGIR

SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 11 - 15, 2022

Madrid, Spain

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

Total Citations: 5
Total Downloads: 458
Downloads (last 12 months): 68
Downloads (last 6 weeks): 2

Reflects downloads up to 28 Feb 2025

Citations

Cited By

Gao H, Wang R, Lin T, Wu Y, Yang M, Huang F, Li Y, Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B (2023). Unsupervised Dialogue Topic Segmentation with Topic-aware Contrastive Learning. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2481-2485. DOI: 10.1145/3539618.3592081. Online publication date: 19-Jul-2023
https://dl.acm.org/doi/10.1145/3539618.3592081
Cai C, Zhao Q, Xu R, Qin B (2023). Multimodal Dialogue Understanding via Holistic Modeling and Sequence Labeling. Natural Language Processing and Chinese Computing, pp. 399-411. DOI: 10.1007/978-3-031-44699-3_36. Online publication date: 12-Oct-2023
https://dl.acm.org/doi/10.1007/978-3-031-44699-3_36
