An In-Depth Experimental Comparison of RNTNs and CNNs for Sentence Modeling

Ahmadi, Zahra; Skowron, Marcin; Stier, Aleksandrs; Kramer, Stefan

doi:10.1007/978-3-319-67786-6_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10558)

Included in the following conference series:

International Conference on Discovery Science

Abstract

The goal of modeling sentences is to accurately represent their meaning for different tasks. A variety of deep learning architectures have been proposed to model sentences; however, little is known about their comparative performance on common ground, across a variety of datasets, and at the same level of optimization. In this paper, we provide such a comparison for two popular architectures, Recursive Neural Tensor Networks (RNTNs) and Convolutional Neural Networks (CNNs). Although RNTNs have been shown to work well in many cases, they require intensive manual labeling due to the vanishing gradient problem. To enable an extensive comparison of the two architectures, this paper employs two methods to automatically label the internal nodes: a rule-based method and (this time as part of the RNTN method) a convolutional neural network. This enables us to compare these RNTN models to a relatively simple CNN architecture. Experiments conducted on a set of benchmark datasets demonstrate that the CNN outperforms the RNTNs based on automatic phrase labeling, whereas the RNTN based on manual labeling outperforms the CNN. The results corroborate that CNNs already offer good predictive performance and that, at the same time, more research on RNTNs is needed to further exploit sentence structure.
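
To make the role of internal nodes concrete, the following is a minimal sketch of the composition step commonly used to define an RNTN: two child vectors are concatenated and passed through a bilinear tensor term plus a linear term, followed by tanh. It is not the authors' implementation. The function names, the toy dimension `d = 3`, the random untrained parameters, and the example parse tree are all assumptions chosen for illustration.

```python
import math
import random

def rntn_compose(left, right, V, W, b):
    """Combine two d-dim child vectors into one d-dim parent vector."""
    c = left + right  # concatenation [left; right], length 2d
    n = len(c)
    parent = []
    for k in range(len(left)):
        # Bilinear tensor term for output unit k: c^T V[k] c
        bilinear = sum(c[i] * V[k][i][j] * c[j]
                       for i in range(n) for j in range(n))
        # Linear term for output unit k: (W c)[k] + b[k]
        linear = sum(W[k][j] * c[j] for j in range(n)) + b[k]
        parent.append(math.tanh(bilinear + linear))
    return parent

def encode(tree, embed, V, W, b):
    """Encode a binary parse tree given as nested 2-tuples of words."""
    if isinstance(tree, str):
        return embed[tree]  # leaf: look up the word vector
    left, right = tree
    return rntn_compose(encode(left, embed, V, W, b),
                        encode(right, embed, V, W, b), V, W, b)

def rand(rng, *shape):
    """Nested lists of small random values (hypothetical helper)."""
    if len(shape) == 1:
        return [rng.uniform(-0.1, 0.1) for _ in range(shape[0])]
    return [rand(rng, *shape[1:]) for _ in range(shape[0])]

# Toy setup with untrained parameters (assumed values, illustration only).
d = 3
rng = random.Random(0)
V = rand(rng, d, 2 * d, 2 * d)   # one (2d x 2d) slice per output unit
W = rand(rng, d, 2 * d)
b = [0.0] * d
embed = {w: rand(rng, d) for w in ["not", "very", "good"]}

# Every internal node, e.g. ("very", "good"), gets its own vector.
phrase_vec = encode(("very", "good"), embed, V, W, b)
sentence_vec = encode(("not", ("very", "good")), embed, V, W, b)
```

Because each internal node produces a vector that can be classified on its own, a training signal can be attached at the phrase level, which is where the manual or automatic labels of internal nodes mentioned in the abstract would enter.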

Acknowledgements

The authors thank PRIME Research for supporting the first author during her research time. The second author is supported by the Austrian Science Fund (FWF): P27530-N15.

Author information

Authors and Affiliations

Institut für Informatik, Johannes Gutenberg-Universität, Mainz, Germany
Zahra Ahmadi, Aleksandrs Stier & Stefan Kramer
Austrian Research Institute for Artificial Intelligence, Vienna, Austria
Marcin Skowron

Corresponding author

Correspondence to Zahra Ahmadi.

Editor information

Editors and Affiliations

Kyoto University, Kyoto, Japan
Akihiro Yamamoto
Hokkaido University, Sapporo, Japan
Takuya Kida
National Institute of Informatics, Tokyo, Japan
Takeaki Uno
Gakushuin University, Tokyo, Japan
Tetsuji Kuboyama

Cite this paper

Ahmadi, Z., Skowron, M., Stier, A., Kramer, S. (2017). An In-Depth Experimental Comparison of RNTNs and CNNs for Sentence Modeling. In: Yamamoto, A., Kida, T., Uno, T., Kuboyama, T. (eds) Discovery Science. DS 2017. Lecture Notes in Computer Science, vol 10558. Springer, Cham. https://doi.org/10.1007/978-3-319-67786-6_11

DOI: https://doi.org/10.1007/978-3-319-67786-6_11
Published: 16 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67785-9
Online ISBN: 978-3-319-67786-6
eBook Packages: Computer Science, Computer Science (R0)
