DOI: 10.1145/2009916.2009976
research-article

Learning online discussion structures by conditional random fields

Published: 24 July 2011

Abstract

Online forum discussions are emerging as a valuable information repository, where knowledge accumulates through interaction among users, producing multiple threads with structure. The replying structure of each thread conveys important information about the discussion content. Unfortunately, not all online forum sites explicitly record these replying relationships, making it hard for both users and computers to digest the information buried in a thread discussion.
In this paper, we propose a probabilistic model in the Conditional Random Fields framework to predict the replying structure of a threaded online discussion. Unlike previous thread reconstruction methods, most of which fail to consider dependencies between posts, we cast the problem as a supervised structure learning problem, which lets us incorporate features describing the structural dependencies among the discussion content and learn their relationships. Experimental results on three different online forums show that the proposed method captures the replying structures in online discussion threads well, and that multiple tasks such as forum search and question answering can benefit from the reconstructed replying structures.
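
To make the task concrete, the following is a minimal sketch of reply-structure prediction. It is not the paper's CRF model: it scores each candidate parent independently for every post, which is exactly the kind of dependency-free approach the abstract contrasts itself with. The function names, the two features (bag-of-words cosine similarity and positional distance), and the weights `w_sim` and `w_dist` are all assumptions chosen for illustration; a supervised model would learn such weights from labeled threads.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict_parents(posts, w_sim=1.0, w_dist=0.1):
    """Return parents[i], the index of the post that post i is predicted to reply to.

    posts: list of strings in chronological order; post 0 is the thread root.
    w_sim, w_dist: hypothetical hand-set weights (illustrative, not learned).
    """
    bags = [Counter(p.lower().split()) for p in posts]
    parents = [None]  # the root post replies to nothing
    for i in range(1, len(posts)):
        # Score every earlier post j as a candidate parent of post i:
        # reward lexical overlap, penalize distance in the thread.
        def score(j):
            return w_sim * cosine(bags[i], bags[j]) - w_dist * (i - j)

        parents.append(max(range(i), key=score))
    return parents


thread = [
    "how do I install python on windows",
    "use the official python installer for windows",
    "what about install on linux",
    "on linux use the package manager",
]
print(predict_parents(thread))
```

Because each post picks its parent on its own, this baseline cannot express that one reply decision should influence another; modeling those dependencies jointly is what the structured approach described in the abstract is meant to add.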

Published In

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN:9781450307574
DOI:10.1145/2009916
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. replying relation reconstruction
  2. structure learning
  3. threaded discussion

Qualifiers

  • Research-article

Conference

SIGIR '11
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (2024) Aspect-based sentiment analysis: approaches, applications, challenges and trends. Knowledge and Information Systems 66:12 (7261-7303). DOI: 10.1007/s10115-024-02200-9. Online publication date: 14-Aug-2024
  • (2022) POSLAN. Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing (1756-1763). DOI: 10.1145/3477314.3507028. Online publication date: 25-Apr-2022
  • (2022) Issues and Challenges of Aspect-based Sentiment Analysis: A Comprehensive Survey. IEEE Transactions on Affective Computing 13:2 (845-863). DOI: 10.1109/TAFFC.2020.2970399. Online publication date: 1-Apr-2022
  • (2020) Member Behavior in Dynamic Online Communities: Role Affiliation Frequency Model. IEEE Transactions on Knowledge and Data Engineering 32:9 (1773-1784). DOI: 10.1109/TKDE.2019.2911067. Online publication date: 1-Sep-2020
  • (2020) Cross-domain recommender system using generalized canonical correlation analysis. Knowledge and Information Systems 62:12 (4625-4651). DOI: 10.1007/s10115-020-01499-4. Online publication date: 14-Sep-2020
  • (2019) Towards the detection of inconsistencies in public security vulnerability reports. Proceedings of the 28th USENIX Conference on Security Symposium (869-885). DOI: 10.5555/3361338.3361399. Online publication date: 14-Aug-2019
  • (2019) Thread Structure Learning on Online Health Forums With Partially Labeled Data. IEEE Transactions on Computational Social Systems 6:6 (1273-1282). DOI: 10.1109/TCSS.2019.2946498. Online publication date: Dec-2019
  • (2019) CAPS: a supervised technique for classifying Stack Overflow posts concerning API issues. Empirical Software Engineering. DOI: 10.1007/s10664-019-09743-4. Online publication date: 19-Jul-2019
  • (2019) Personalized Thread Recommendation for MOOC Discussion Forums. Machine Learning and Knowledge Discovery in Databases (725-740). DOI: 10.1007/978-3-030-10928-8_43. Online publication date: 23-Jan-2019
  • (2018) Web Forum Retrieval and Text Analytics. Foundations and Trends in Information Retrieval 12:1 (1-163). DOI: 10.1561/1500000062. Online publication date: 3-Jan-2018
