Abstract
In this paper, we present an empirical analysis of the Hierarchical Attention Network (HAN) as a feature extractor working jointly with eXtreme Gradient Boosting (XGBoost) as the classifier to recognize insufficiently supported arguments on a publicly available dataset. Besides HAN + XGBoost, we performed experiments with several other deep learning models: Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), bidirectional LSTM, and bidirectional GRU. All results are reported with the best hyper-parameters. We present three key findings: (1) Shallow models work significantly better than deep models when only a small dataset is available. (2) An attention mechanism can improve the results of deep models. On average, it improves the Area Under the Receiver Operating Characteristic Curve (ROC-AUC) score of the Recurrent Neural Network (RNN) by a margin of 18.94%. The hierarchical attention network achieved a ROC-AUC score 2.25% higher than the non-hierarchical one. (3) Using XGBoost as a replacement for the last fully connected layer improved the F1 macro score by 5.26%. Overall, our best setting achieves a 1.88% improvement over the state-of-the-art result.
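The findings above are stated in terms of ROC-AUC and F1 macro. As a rough illustration of what these two scores measure, the following minimal sketch computes both from scratch in plain Python. The function names, the toy labels and scores, and the 0.5 decision threshold are assumptions made for this example only; they are not taken from the paper and do not reproduce its evaluation code or its data.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_macro(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Toy data, invented for illustration (not from the paper's dataset).
    y_true = [1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(f"ROC-AUC:  {roc_auc(y_true, scores):.4f}")
    print(f"F1 macro: {f1_macro(y_true, y_pred):.4f}")
```

ROC-AUC is computed from the raw classifier scores and is independent of any threshold, whereas F1 macro depends on the hard predictions and weights both classes equally, which matters when the classes are imbalanced.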
Acknowledgments
This research was fully funded by “Penelitian Disertasi Doktor” from the Ministry of Research, Technology and Higher Education of Indonesia under contract number 039A/VR.RTT/VI/2017.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Suhartono, D., Gema, A.P., Winton, S., David, T., Fanany, M.I., Arymurthy, A.M. (2017). Hierarchical Attention Network with XGBoost for Recognizing Insufficiently Supported Argument. In: Phon-Amnuaisuk, S., Ang, SP., Lee, SY. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2017. Lecture Notes in Computer Science, vol 10607. Springer, Cham. https://doi.org/10.1007/978-3-319-69456-6_15
DOI: https://doi.org/10.1007/978-3-319-69456-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69455-9
Online ISBN: 978-3-319-69456-6
eBook Packages: Computer Science (R0)