Hierarchical Attention Network with XGBoost for Recognizing Insufficiently Supported Argument

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10607)

Abstract

In this paper, we present an empirical analysis of the Hierarchical Attention Network (HAN) as a feature extractor that works jointly with eXtreme Gradient Boosting (XGBoost) as the classifier to recognize insufficiently supported arguments, using a publicly available dataset. Besides HAN + XGBoost, we performed experiments with several other deep learning models, such as Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), bidirectional LSTM, and bidirectional GRU. All results are reported with the best hyper-parameters. We present three key findings: (1) Shallow models work significantly better than deep models when only a small dataset is available. (2) An attention mechanism can improve the deep models' results. On average, it improves the Area Under the Receiver Operating Characteristic Curve (ROC-AUC) score of the Recurrent Neural Network (RNN) by a margin of 18.94%. The hierarchical attention network gave a ROC-AUC score 2.25% higher than the non-hierarchical one. (3) Using XGBoost as a replacement for the last fully connected layer improved the macro F1 score by 5.26%. Overall, our best setting achieves a 1.88% improvement over the state-of-the-art result.
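
To make the "hierarchical attention as feature extractor" idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes word vectors are already given (a real HAN would produce them with learned recurrent encoders and a learned projection, which are omitted here), and the names `attention_pool`, `document_vector`, `word_context`, and `sentence_context` are hypothetical. Attention weights are a softmax over dot products with a context vector; words are pooled into sentence vectors, and sentence vectors are pooled into one document vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Pool a list of equal-length vectors into one vector.

    Each vector is scored by its dot product with `context` (an assumed,
    normally learned, context vector); the scores are turned into weights
    with softmax, and the result is the weighted sum of the vectors.
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def document_vector(doc, word_context, sentence_context):
    """Two-level (hierarchical) pooling: words -> sentences -> document.

    `doc` is a list of sentences; each sentence is a list of word vectors.
    """
    sentence_vectors = [attention_pool(sentence, word_context)[0] for sentence in doc]
    return attention_pool(sentence_vectors, sentence_context)


# Toy document: two sentences with 2-dimensional word vectors.
toy_doc = [
    [[1.0, 0.0], [0.0, 1.0]],  # sentence 1: two words
    [[1.0, 1.0]],              # sentence 2: one word
]

# With all-zero context vectors every weight is uniform, so the result is
# a plain average at both levels.
doc_vec, sentence_weights = document_vector(toy_doc, [0.0, 0.0], [0.0, 0.0])
```

In the arrangement the abstract describes, a document vector of this kind would serve as one feature row for a gradient-boosted tree classifier in place of a final fully connected layer; that classifier step is left out of the sketch.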

Acknowledgments

This research was fully funded by “Penelitian Disertasi Doktor” from the Ministry of Research, Technology and Higher Education of Indonesia with contract number 039A/VR.RTT/VI/2017.

Author information

Corresponding authors

Correspondence to Derwin Suhartono or Mohamad Ivan Fanany.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Suhartono, D., Gema, A.P., Winton, S., David, T., Fanany, M.I., Arymurthy, A.M. (2017). Hierarchical Attention Network with XGBoost for Recognizing Insufficiently Supported Argument. In: Phon-Amnuaisuk, S., Ang, SP., Lee, SY. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2017. Lecture Notes in Computer Science, vol 10607. Springer, Cham. https://doi.org/10.1007/978-3-319-69456-6_15

  • DOI: https://doi.org/10.1007/978-3-319-69456-6_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69455-9

  • Online ISBN: 978-3-319-69456-6

  • eBook Packages: Computer Science, Computer Science (R0)
