Hierarchical Attention Network with XGBoost for Recognizing Insufficiently Supported Argument

Suhartono, Derwin; Gema, Aryo Pradipta; Winton, Suhendro; David, Theodorus; Fanany, Mohamad Ivan; Arymurthy, Aniati Murni

doi:10.1007/978-3-319-69456-6_15

Conference paper
First Online: 19 October 2017

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10607)

Abstract

In this paper, we present an empirical analysis of the Hierarchical Attention Network (HAN) as a feature extractor that works jointly with eXtreme Gradient Boosting (XGBoost) as the classifier to recognize insufficiently supported arguments using a publicly available dataset. Besides HAN + XGBoost, we performed experiments with several other deep learning models, such as Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), bidirectional LSTM, and bidirectional GRU. All results with the best hyper-parameters are presented. We report three key findings: (1) Shallow models work significantly better than deep models when only a small dataset is available. (2) An attention mechanism can improve the deep model's result. On average, it improves the Area Under the Receiver Operating Characteristic Curve (ROC-AUC) score of the Recurrent Neural Network (RNN) by a margin of 18.94%. The hierarchical attention network gave a ROC-AUC score 2.25% higher than the non-hierarchical one. (3) Using XGBoost as a replacement for the last fully connected layer improved the macro F1 score by 5.26%. Overall, our best setting achieves a 1.88% improvement over the state-of-the-art result.
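The findings above are stated in terms of two metrics, ROC-AUC and macro F1. As a reading aid, the following is a minimal from-scratch sketch of how these two scores can be computed for a binary task. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = insufficiently supported, 0 = sufficiently supported.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
print(f"ROC-AUC  = {roc_auc(y_true, y_score):.4f}")
```

ROC-AUC is computed from the ranking of the raw scores and so does not depend on the threshold, whereas macro F1 is computed from hard predictions and weights both classes equally regardless of their frequency.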

Acknowledgments

This research was fully funded by “Penelitian Disertasi Doktor” from the Ministry of Research, Technology and Higher Education of Indonesia under contract number 039A/VR.RTT/VI/2017.

Author information

Authors and Affiliations

Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia
Derwin Suhartono, Aryo Pradipta Gema, Suhendro Winton & Theodorus David
Machine Learning and Computer Vision (MLCV) Laboratory, Universitas Indonesia, Depok, Indonesia
Derwin Suhartono, Mohamad Ivan Fanany & Aniati Murni Arymurthy

Corresponding authors

Correspondence to Derwin Suhartono or Mohamad Ivan Fanany.

Editor information

Editors and Affiliations

Universiti Teknologi Brunei, Gadong, Brunei Darussalam
Somnuk Phon-Amnuaisuk
Universiti Teknologi Brunei, Gadong, Brunei Darussalam
Swee-Peng Ang
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Soo-Young Lee

About this paper

Cite this paper

Suhartono, D., Gema, A.P., Winton, S., David, T., Fanany, M.I., Arymurthy, A.M. (2017). Hierarchical Attention Network with XGBoost for Recognizing Insufficiently Supported Argument. In: Phon-Amnuaisuk, S., Ang, SP., Lee, SY. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2017. Lecture Notes in Computer Science, vol 10607. Springer, Cham. https://doi.org/10.1007/978-3-319-69456-6_15

DOI: https://doi.org/10.1007/978-3-319-69456-6_15
Published: 19 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69455-9
Online ISBN: 978-3-319-69456-6
eBook Packages: Computer Science, Computer Science (R0)
