LSTM$^{2}$: Multi-Label Ranking for Document Classification

Yan, Yan; Wang, Ying; Gao, Wen-Chao; Zhang, Bo-Wen; Yang, Chun; Yin, Xu-Cheng

doi:10.1007/s11063-017-9636-0

Published: 22 May 2017

Neural Processing Letters, Volume 47, pages 117–138 (2018)

Yan Yan ORCID: orcid.org/0000-0002-0187-7010¹,
Ying Wang¹,
Wen-Chao Gao¹,
Bo-Wen Zhang²,
Chun Yang² &
Xu-Cheng Yin²

Abstract

Multi-label document classification is a typical challenge in many real-world applications. Multi-label ranking is a common approach, but existing studies usually disregard the effects of context and the relationships among labels during the scoring process. In this paper, we propose a Long Short-Term Memory (LSTM)-based multi-label ranking model for document classification, namely LSTM$^2$, consisting of repLSTM, an adaptive data representation process, and rankLSTM, a unified learning-ranking process. In repLSTM, a supervised LSTM learns the document representation by incorporating the document labels. In rankLSTM, the order of each document's labels is rearranged in accordance with a semantic tree, so that the label semantics are compatible with the sequential learning of LSTM. The model can be trained as a whole by sequentially predicting labels. Connectionist Temporal Classification is performed in rankLSTM to address the error propagation for a variable number of labels in each document. Moreover, a variety of document classification experiments conducted on three typical datasets reveal the impressive performance of our proposed approach.
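
As a rough illustration of how Connectionist Temporal Classification can turn a fixed-length sequence of per-step predictions into a variable number of labels, the sketch below applies the standard CTC collapsing rule: merge consecutive repeats, then drop blanks. It is not the authors' rankLSTM implementation. The blank symbol, the label names, and the function name `ctc_collapse` are assumptions introduced here for illustration only.

```python
from typing import List, Sequence

BLANK = "<blank>"  # assumed blank symbol; not taken from the paper


def ctc_collapse(step_labels: Sequence[str], blank: str = BLANK) -> List[str]:
    """Apply the standard CTC collapsing rule to a per-step label sequence."""
    collapsed: List[str] = []
    previous = None
    for label in step_labels:
        # Emit a label only when it differs from the previous step
        # and is not the blank symbol.
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


# Hypothetical per-step predictions for one document (label names are made up).
steps = ["sports", "sports", BLANK, "football", "football", BLANK, "europe", BLANK]
print(ctc_collapse(steps))
```

Eight prediction steps collapse here to three labels, so the number of output labels need not be fixed in advance. A blank between two identical labels keeps them as separate emissions, which is how CTC distinguishes a repeated label from a label held over several steps.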

Author information

Authors and Affiliations

Department of Computer Science and Technology, School of Mechanical Electronic and Information Engineering, China University of Mining and Technology Beijing, Beijing, 100083, China
Yan Yan, Ying Wang & Wen-Chao Gao
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
Bo-Wen Zhang, Chun Yang & Xu-Cheng Yin

Corresponding author

Correspondence to Yan Yan.

About this article

Cite this article

Yan, Y., Wang, Y., Gao, WC. et al. LSTM$^{2}$: Multi-Label Ranking for Document Classification. Neural Process Lett 47, 117–138 (2018). https://doi.org/10.1007/s11063-017-9636-0

Issue Date: February 2018
