Identifying intention posts in discussion forums using multi-instance learning and multiple sources transfer learning

Song, Hyun-Je; Park, Seong-Bae

doi:10.1007/s00500-017-2755-8

Methodologies and Application
Published: 09 August 2017

Volume 22, pages 8107–8118, (2018)
Soft Computing

Hyun-Je Song¹ & Seong-Bae Park¹

Abstract

This paper proposes a novel method for identifying intention posts in discussion forums. The main difficulty is that even a post expressing an intention contains only a few intention sentences. That is, an intention post consists of a few intention sentences and a number of non-intention sentences, while a non-intention post has only non-intention sentences. Multi-instance learning, which regards a post as a bag and the sentences in the post as instances of the bag, is therefore adopted as a solution to this problem. One distinct characteristic of the posts is that the ways of expressing an intention are similar across domains. Thus, we incorporate multiple sources transfer learning into the multi-instance learning, so that the multi-instance learning is enhanced by knowledge of how intentions are expressed in multiple source domains. A set of experiments shows that the proposed method is effective at identifying intention posts in discussion forums.
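
The bag-and-instance view described in the abstract can be illustrated with a small sketch. It is not the authors' model: the cue-phrase scorer, the domain names, the fixed equal weights, and the helper names (`keyword_scorer`, `combine_sources`, `classify_post`) are all hypothetical, chosen only to show how a post is labelled from its sentences under the standard multi-instance assumption, and how scorers from several source domains might be combined.

```python
from typing import Callable, Dict, List

Sentence = str
Bag = List[Sentence]                      # a post is a bag of sentences
Scorer = Callable[[Sentence], float]      # maps one sentence to a score in [0, 1]


def keyword_scorer(cues: List[str]) -> Scorer:
    """Hypothetical instance scorer: 1.0 if any cue phrase occurs, else 0.0."""
    lowered = [c.lower() for c in cues]

    def score(sentence: Sentence) -> float:
        text = sentence.lower()
        return 1.0 if any(c in text for c in lowered) else 0.0

    return score


def combine_sources(scorers: Dict[str, Scorer], weights: Dict[str, float]) -> Scorer:
    """Weighted average of per-source scorers (weights are assumed, not learned)."""
    total = sum(weights[name] for name in scorers)

    def score(sentence: Sentence) -> float:
        return sum(weights[n] * scorers[n](sentence) for n in scorers) / total

    return score


def classify_post(post: Bag, scorer: Scorer, threshold: float = 0.5) -> bool:
    """Standard multi-instance assumption: a bag is positive if its
    highest-scoring instance reaches the threshold."""
    return max((scorer(s) for s in post), default=0.0) >= threshold


# Illustrative source domains and cue phrases (assumptions, not from the paper).
sources = {
    "electronics": keyword_scorer(["plan to buy", "looking to buy"]),
    "camera": keyword_scorer(["want to purchase", "plan to buy"]),
}
weights = {"electronics": 0.5, "camera": 0.5}
scorer = combine_sources(sources, weights)

intention_post = [
    "My old laptop died last week.",
    "I plan to buy a new one soon.",
    "The screen was cracked anyway.",
]
non_intention_post = [
    "The battery lasts about six hours.",
    "The keyboard feels fine.",
]

print(classify_post(intention_post, scorer))      # True
print(classify_post(non_intention_post, scorer))  # False
```

Only one of the three sentences in `intention_post` carries the intention, yet the whole post is labelled positive, which is the situation the abstract describes. In a learned model the sentence scorer and the source weights would be estimated from data rather than fixed by hand.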

Notes

Chen et al. (2013) collected 1000 posts for each domain. However, one post in Electronics has no content.
http://nlp.stanford.edu/software/corenlp.shtml.

References

Alguliev RM, Aliguliyev RM, Mehdiyev CA (2011) Sentence selection for generic document summarization using an adaptive differential evolution algorithm. Swarm Evol Comput 1(4):213–222
Andrews S, Tsochantaridis I, Hofmann T (2003) Support vector machines for multiple-instance learning. In: Advances in neural information processing systems 15, pp 561–568
Aue A, Gamon M (2005) Customizing sentiment classifiers to new domains: a case study. In: Proceedings of recent advances in natural language processing
Banerjee N, Chakraborty D, Joshi A, Mittal S, Rai A, Ravindran B (2012) Towards analyzing micro-blogs for detection and classification of real-time intentions. In: Proceedings of the sixth international AAAI conference on weblogs and social media, pp 391–394
Bin G, Sheng VS (2017) A robust regularization path algorithm for $\nu$-support vector classification. IEEE Trans Neural Netw Learn Syst 28(5):1241–1248
Blum A, Mitchell T (1998) Combining labeled and unlabeled data with co-training. In: Proceedings of the 11th annual conference on computational learning theory, pp 92–100
Bunescu RC, Mooney RJ (2007) Multiple instance learning for sparse positive bags. In: Proceedings of the 24th international conference on Machine learning, pp 105–112
Cauwenberghs G, Poggio T (2000) Incremental and decremental support vector machine learning. In: Advances in neural information processing systems, vol 13, pp 409–415
Chen M, Weinberger KQ, Blitzer J (2011) Co-training for domain adaptation. In: Advances in Neural Information Processing Systems 24. Curran Associates, Inc., pp 2456–2464
Chen Z, Liu B, Hsu M, Castellanos M, Ghosh R (2013) Identifying intention posts in discussion forums. In: Proceedings of the 2013 conference of the North American chapter of the association for computational linguistics: human language technologies, pp 1041–1050
Clarke J, Lapata M (2007) Modelling compression with discourse constraints. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning, pp 1–11
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
Dai W, Yang Q, Xue GR, Yu Y (2007) Boosting for transfer learning. In: Proceedings of the 24th international conference on Machine learning, ACM, pp 193–200
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc 39(1):1–38
Dietterich TG, Lathrop RH, Lozano-Pérez T (1997) Solving the multiple-instance problem with axis-parallel rectangles. Artif Intell 89(1–2):31–71
Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9(Aug):1871–1874
Ganu G, Marian A (2013) One size does not fit all: multi-granularity search of web forums. In: Proceedings of the 22nd ACM international conference on information and knowledge management, pp 9–18
Gärtner T, Flach PA, Kowalczyk A, Smola AJ (2002) Multi-instance kernels. In: Proceedings of the 19th international conference on machine learning, pp 179–186
Gu B, Sheng VS, Li S (2015a) Bi-parameter space partition for cost-sensitive SVM. In: Proceedings of the 24th international joint conference on artificial intelligence, pp 3532–3539
Gu B, Sheng VS, Tay KY, Romano W, Li S (2015b) Incremental support vector learning for ordinal regression. IEEE Trans Neural Netw Learn Syst 26(7):1403–1416
Gu B, Sheng VS, Wang Z, Ho D, Osman S, Li S (2015c) Incremental learning for $\nu$-support vector regression. Neural Netw 67:140–150
Gu B, Sun X, Sheng VS (2016) Structural minimax probability machine. IEEE Trans Neural Netw Learn Syst PP(99):1–11
Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46(1):389–422
Joachims T (1998) Text categorization with support vector machines: learning with many relevant features. Mach Learn ECML-98: 137–142
Joachims T (2006) Training linear SVMs in linear time. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp 217–226
Katsumata S, Takeda A (2015) Robust cost sensitive support vector machine. In: Proceedings of the 18th international conference on artificial intelligence and statistics, pp 434–443
Kim HD, Zhai C (2009) Generating comparative summaries of contradictory opinions in text. In: Proceedings of the 18th ACM international conference on information and knowledge management, pp 385–394
Knight K, Marcu D (2002) Summarization beyond sentence extraction: a probabilistic approach to sentence compression. Artif Intell 139(1):91–107
Kong Y, Zhang M, Ye D (2017) A belief propagation-based method for task allocation in open and dynamic cloud environments. Knowl Based Syst 115:123–132
Leopold E, Kindermann J (2002) Text categorization with support vector machines. How to represent texts in input space? Mach Learn 46(1–3):423–444
Li C, Du Y, Liu J, Zheng H, Wang S (2016) A novel approach of identifying user intents in microblog. In: Proceedings of the 12th international conference on intelligent computing methodologies, pp 391–400
Lin H, Bilmes J (2011) A class of submodular functions for document summarization. In: Proceedings of the 49th annual meeting of the association for computational linguistics, pp 510–520
Luo P, Zhuang F, Xiong H, Xiong Y, He Q (2008) Transfer learning from multiple source domains via consensus regularization. In: Proceedings of the 17th ACM conference on information and knowledge management, pp 103–112
Maron O, Lozano-Pérez T (1998) A framework for multiple-instance learning. In: Advances in neural information processing systems, pp 570–576
Maron O, Ratan AL (1998) Multiple-instance learning for natural scene classification. In: Proceedings of the 15th international conference on machine learning, pp 341–349
Masnadi-Shirazi H, Vasconcelos N (2010) Risk minimization, probability elicitation, and cost-sensitive SVMs. In: Proceedings of the 27th international conference on machine learning, pp 759–766
McClosky D, Charniak E, Johnson M (2006) Reranking and self-training for parser adaptation. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association for computational linguistics, pp 337–344
McDonald R (2007) A study of global inference algorithms in multi-document summarization. In: Proceedings of the 29th European conference on IR research, pp 557–564
Mihalcea R, Tarau P (2004) TextRank: bringing order into texts. In: Proceedings of the 2004 conference on empirical methods in natural language processing, pp 404–411
Nishikawa H, Hasegawa T, Matsuo Y, Kikui G (2010) Opinion summarization with integer linear programming formulation for sentence extraction and ordering. In: Proceedings of the 23rd international conference on computational linguistics, pp 910–918
Pan Z, Zhang Y, Kwong S (2015) Efficient motion and disparity estimation optimization for low complexity multiview video coding. IEEE Trans Broadcast 61(2):166–176
Papadimitriou D, Koutrika G, Velegrakis Y, Mylopoulos J (2017) Finding related forum posts through content similarity over intention-based segmentation. IEEE Trans Knowl Data Eng PP(99):1–1
Qazvinian V, Radev DR (2010) Identifying non-explicit citing sentences for citation-based summarization. In: Proceedings of the 48th annual meeting of the association for computational linguistics, pp 555–564
Ren Z, Ma J, Wang S, Liu Y (2011) Summarizing web forum threads based on a latent topic propagation process. In: Proceedings of the 20th ACM international conference on information and knowledge management, pp 879–884
Schölkopf B, Smola AJ (2001) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge
Schölkopf B, Smola AJ, Williamson RC, Bartlett PL (2000) New support vector algorithms. Neural Comput 12(5):1207–1245
Sondhi P, Gupta M, Zhai C, Hockenmaier J (2010) Shallow information extraction from medical forum data. In: Proceedings of the 23rd international conference on computational linguistics, pp 1158–1166
Surdeanu M, Tibshirani J, Nallapati R, Manning CD (2012) Multi-instance multi-label learning for relation extraction. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, pp 455–465
Tan B, Zhong E, Xiang EW, Yang Q (2013) Multi-transfer: transfer learning with multiple views and multiple sources. In: Proceedings of the 13th SIAM international conference on data mining, pp 243–251
Tian Q, Chen S (2017) Cross-heterogeneous-database age estimation through correlation representation learning. Neurocomputing 238:286–295
Wang J, Zucker JD (2000) Solving the multiple-instance problem: a lazy learning approach. In: Proceedings of the 17th international conference on machine learning, pp 1119–1125
Wang Q, Ruan L, Si L (2014) Adaptive knowledge transfer for multiple instance learning in image classification. In: Proceedings of 28th AAAI conference on artificial intelligence, pp 1334–1340
Xue H, Chen S, Yang Q (2011) Structural regularized support vector machine: a framework for structural large margin classifier. IEEE Trans Neural Netw 22(4):573–587
Xue Y, Jiang J, Zhao B, Ma T (2017) A self-adaptive artificial bee colony algorithm based on global best for global optimization. Soft Comput 1–18
Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: Proceedings of the 14th international conference on machine learning, pp 412–420
Yao Y, Doretto G (2010) Boosting for transfer learning with multiple sources. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1855–1862
Yeung DS, Wang D, Ng WW, Tsang EC, Wang X (2007) Structured large margin machines: sensitive to data distributions. Mach Learn 68(2):171–200
Zhang WJ, Zhou ZH (2014) Multi-instance learning with distribution change. In: Proceedings of 28th AAAI conference on artificial intelligence, pp 2184–2190
Zhang Y, Sun X, Wang B (2016) Efficient algorithm for k-barrier coverage based on integer linear programming. China Commun 13(7):16–23
Zhou ZH, Xu JM (2007) On the relation between multi-instance learning and semi-supervised learning. In: Proceedings of the 24th international conference on machine learning, pp 1167–1174

Acknowledgements

This research was supported by Basic Science Research Program through the National Research Foundation (NRF) of Korea funded by the Ministry of Education (No. 2016R1D1A1B04935678).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Kyungpook National University, 80, Daehakro, Bukgu, Daegu, 702-701, Korea
Hyun-Je Song & Seong-Bae Park

Corresponding author

Correspondence to Seong-Bae Park.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Song, HJ., Park, SB. Identifying intention posts in discussion forums using multi-instance learning and multiple sources transfer learning. Soft Comput 22, 8107–8118 (2018). https://doi.org/10.1007/s00500-017-2755-8

Published: 09 August 2017
Issue Date: December 2018
DOI: https://doi.org/10.1007/s00500-017-2755-8
