A Hierarchical Model with Recurrent Convolutional Neural Networks for Sequential Sentence Classification

Jiang, Xinyu; Zhang, Bowen; Ye, Yunming; Liu, Zhenhua

doi:10.1007/978-3-030-32236-6_7

Xinyu Jiang¹³,
Bowen Zhang¹⁴,
Yunming Ye¹³ &
Zhenhua Liu ORCID: orcid.org/0000-0003-2760-3621¹⁵

Conference paper
First Online: 30 September 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Hierarchical neural network approaches have achieved outstanding results in recent work on sequential sentence classification. However, it remains challenging for a model to capture both the local invariant features and the word-dependent information of a sentence. In this work, we concentrate on the sentence representation and context modeling components, which determine the effectiveness of the hierarchical architecture. We present a new approach called SR-RCNN that generates more precise sentence encodings by leveraging the complementary strengths of a bi-directional recurrent neural network and a text convolutional neural network to capture contextual and literal relevance information. The sentence-level encoding vectors are then modeled to capture the intrinsic relations among surrounding sentences. In addition, we explore the applicability of attention mechanisms and conditional random fields to the task. Our model achieves new state-of-the-art performance on sequential sentence classification in medical abstracts.
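The sentence-encoding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the recurrent and convolutional parts are wired together, so stacking a convolution with max-pooling on top of bi-directional hidden states is an assumption here, as are the plain tanh recurrence (in place of a gated unit), the random untrained weights, and every function and parameter name.

```python
import math
import random

def rnn_pass(vectors, w_in, w_rec):
    """Run a plain tanh RNN over a list of vectors; return one hidden state per step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in vectors:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states

def bi_rnn(vectors, fwd, bwd):
    """Concatenate forward and backward hidden states at every word position."""
    forward = rnn_pass(vectors, *fwd)
    backward = rnn_pass(vectors[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]

def conv_max_pool(states, filters, width):
    """Apply each filter to every window of `width` states, then ReLU and max over time.

    Assumes the sentence has at least `width` words.
    """
    windows = [
        [v for state in states[i:i + width] for v in state]
        for i in range(len(states) - width + 1)
    ]
    return [
        max(max(0.0, sum(w * v for w, v in zip(f, win))) for win in windows)
        for f in filters
    ]

def encode_sentence(word_vectors, params):
    """Hypothetical encoder: contextual states from the Bi-RNN, local features from the CNN."""
    states = bi_rnn(word_vectors, params["fwd"], params["bwd"])
    return conv_max_pool(states, params["filters"], params["width"])

def random_matrix(rng, rows, cols):
    """Toy weight initialisation; a real model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
EMB, HID, N_FILTERS, WIDTH = 4, 3, 5, 2  # illustrative sizes only
params = {
    "fwd": (random_matrix(rng, HID, EMB), random_matrix(rng, HID, HID)),
    "bwd": (random_matrix(rng, HID, EMB), random_matrix(rng, HID, HID)),
    "filters": random_matrix(rng, N_FILTERS, WIDTH * 2 * HID),
    "width": WIDTH,
}
sentence = random_matrix(rng, 6, EMB)  # six toy word vectors standing in for embeddings
encoding = encode_sentence(sentence, params)
print(len(encoding))  # one pooled feature per convolution filter
```

In a hierarchical model of the kind the abstract describes, one such fixed-length vector would be produced per sentence, and the sequence of vectors for an abstract would then be passed to a context-modeling layer and a label decoder, which this sketch leaves out.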

Notes

1. The dataset is downloaded from: https://github.com/Franck-Dernoncourt/pubmed-rct.
2. https://www.kaggle.com/c/alta-nicta-challenge2.
3. The word vectors are downloaded from: http://evexdb.org/pmresources/vec-space-models/.

Acknowledgment

This research was supported in part by NSFC under Grant No. U1836107 and No. 61572158.

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Xinyu Jiang & Yunming Ye
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Bowen Zhang
NLP Group, Gridsum, Beijing, China
Zhenhua Liu

Corresponding author

Correspondence to Yunming Ye.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Jie Tang
National University of Singapore, Singapore, Singapore
Min-Yen Kan
Peking University, Beijing, China
Dongyan Zhao
Peking University, Beijing, China
Sujian Li
Zhengzhou University, Zhengzhou, China
Hongying Zan

About this paper

Cite this paper

Jiang, X., Zhang, B., Ye, Y., Liu, Z. (2019). A Hierarchical Model with Recurrent Convolutional Neural Networks for Sequential Sentence Classification. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_7

DOI: https://doi.org/10.1007/978-3-030-32236-6_7
Published: 30 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32235-9
Online ISBN: 978-3-030-32236-6
eBook Packages: Computer Science, Computer Science (R0)