HGAT: Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification

Authors:
Tianchi Yang (Beijing University of Posts and Telecommunications, Beijing, China)
Linmei Hu (Beijing University of Posts and Telecommunications, Beijing, China)
Chuan Shi (Beijing University of Posts and Telecommunications, Beijing, China)
Houye Ji (Beijing University of Posts and Telecommunications, Beijing, China)
Xiaoli Li (Institute for Infocomm Research, Singapore)
Liqiang Nie (Shan Dong University, Shandong Province, China)

ACM Transactions on Information Systems, Volume 39, Issue 3, Article No. 32, pp. 1–29. https://doi.org/10.1145/3450352

Published: 05 May 2021

Abstract

Short text classification has been widely explored in news tagging to provide more efficient search strategies and more effective search results for information retrieval. However, most existing studies concentrate on long text classification and deliver unsatisfactory performance on short texts because of the sparsity issue and insufficient labeled data. In this article, we propose a novel heterogeneous graph neural network-based method for semi-supervised short text classification that takes full advantage of limited labeled data and large unlabeled data through information propagation along the graph. Specifically, we first present a flexible heterogeneous information network (HIN) framework for modeling short texts, which can integrate any type of additional information and capture the relations among them to address semantic sparsity. Then, we propose Heterogeneous Graph Attention networks (HGAT) to embed the HIN for short text classification based on a dual-level attention mechanism comprising node-level and type-level attention. To efficiently classify newly arriving texts that do not already exist in the HIN, we extend HGAT to inductive learning, which avoids re-training the model on the evolving HIN. Extensive experiments on single-/multi-label classification demonstrate that HGAT significantly outperforms state-of-the-art methods across the benchmark datasets under both transductive and inductive learning.
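
The abstract names a dual-level attention mechanism but does not give its equations, so the following is only a minimal NumPy sketch of one plausible form, not the authors' implementation. It assumes that type-level attention scores each neighbor type from the mean of that type's neighbor embeddings, and that node-level attention scores each individual neighbor with its type weight as a scaling factor. The function name `dual_level_attention`, the parameters `mu` and `nu`, the LeakyReLU scoring, and the node type names "topic" and "entity" are all illustrative assumptions.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def dual_level_attention(h_v, neighbors_by_type, mu, nu):
    """Aggregate the neighbors of one node with two levels of attention.

    h_v               : (d,) embedding of the target node (e.g. a short text)
    neighbors_by_type : dict, type name -> (n_t, d) neighbor embeddings
    mu                : dict, type name -> (2d,) type-level attention vector (assumed)
    nu                : (2d,) node-level attention vector (assumed)
    """
    types = sorted(neighbors_by_type)

    # Type-level attention: one score per neighbor type, computed from the
    # target embedding and the mean embedding of that type's neighbors.
    type_scores = np.array([
        float(leaky_relu(
            mu[t] @ np.concatenate([h_v, neighbors_by_type[t].mean(axis=0)])
        ))
        for t in types
    ])
    type_w = softmax(type_scores)

    # Node-level attention: one score per neighbor, scaled by the weight of
    # the neighbor's type, then normalized over all neighbors.
    neighbors, node_scores = [], []
    for w, t in zip(type_w, types):
        for h_u in neighbors_by_type[t]:
            score = w * (nu @ np.concatenate([h_v, h_u]))
            node_scores.append(float(leaky_relu(score)))
            neighbors.append(h_u)
    node_w = softmax(np.array(node_scores))

    # The new embedding is the attention-weighted sum of neighbor embeddings.
    aggregated = node_w @ np.stack(neighbors)
    return dict(zip(types, type_w)), node_w, aggregated


# Toy usage with random embeddings; all values are illustrative only.
rng = np.random.default_rng(0)
d = 4
h_text = rng.normal(size=d)
neighbors = {"topic": rng.normal(size=(2, d)), "entity": rng.normal(size=(3, d))}
mu = {t: rng.normal(size=2 * d) for t in neighbors}
nu = rng.normal(size=2 * d)
type_w, node_w, h_new = dual_level_attention(h_text, neighbors, mu, nu)
print(type_w, node_w.round(3), h_new.shape)
```

Both sets of weights are softmax-normalized, so the aggregated embedding is a convex combination of the neighbor embeddings. In a trained model `mu` and `nu` would be learned parameters rather than random draws.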

Published in
ACM Transactions on Information Systems Volume 39, Issue 3
July 2021
432 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/3450607
Editor:
Min Zhang
Tsinghua University, China
Copyright © 2021 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 5 May 2021
- Accepted: 1 February 2021
- Revised: 1 January 2021
- Received: 1 May 2020
Author Tags
Short texts
graph neural networks
semi-supervised learning
heterogeneous information network
inductive learning
Qualifiers
- research-article
- Refereed