Knowledge-Based Systems

Volume 218, 22 April 2021, 106876

Cognitive structure learning model for hierarchical multi-label text classification

https://doi.org/10.1016/j.knosys.2021.106876

Abstract

The human mind grows by learning new knowledge, which is eventually organized into a basic mental pattern called a cognitive structure. Hierarchical multi-label text classification (HMLTC), a fundamental but challenging task in many real-world applications, aims to classify documents with hierarchical labels, a process that resembles cognitive structure learning. From a cognitive view, existing approaches for HMLTC mainly focus either on partial new knowledge learning or on utilizing the global, cognitive-structure-like label structure. However, complete cognitive structure learning is a unity indispensably constructed from both global label structure utilization and partial knowledge learning, which those HMLTC approaches ignore. To address this problem, we imitate the cognitive structure learning process in HMLTC and propose a unified framework called the Hierarchical Cognitive Structure Learning Model (HCSM). HCSM is composed of the Attentional Ordered Recurrent Neural Network (AORNN) submodule and the Hierarchical Bi-Directional Capsule (HBiCaps) submodule. Both submodules comprehensively utilize partial new knowledge and the global hierarchical label structure for the HMLTC task. On the one hand, AORNN extracts semantic vectors as partial new knowledge from the original text at the word-level and hierarchy-level embedding granularities. On the other hand, AORNN builds a hierarchical text representation corresponding to the global label structure through document-level neuron ordering. HBiCaps employs an iterative process to form a unified label categorization similar to cognitive structure learning: first, it computes probabilities over local hierarchical relationships to maintain partial knowledge learning; second, it modifies the global hierarchical label structure through the dynamic routing mechanism between capsules. Moreover, experimental results on four benchmark datasets demonstrate that HCSM outperforms or matches state-of-the-art text classification methods.

Introduction

In the field of psychology, the cognitive structure is a basic mental pattern for organizing a person’s full knowledge. The cognitive structure provides meaning and guidance to practice, and supervises the processing of new knowledge and the retrieval of stored knowledge. The earliest studies of cognitive structure theory date back to the 1960s and were provided by educational psychologists [1]. Since then, subsequent research [2], [3], [4] has extensively studied how the cognitive structure works in the human mind. With the rapid development of cognitive computing, existing works [5], [6] have verified the significant effect of the cognitive structure on machine learning.

Multiple labels for documents mostly come from human annotators, and the semantics embedded in the labels reflect each annotator’s cognitive structure. Compared with traditional flat multi-label text classification [7], [8], HMLTC more closely resembles the process of cognitive structure learning, and the hierarchical label structure more closely resembles the cognitive structure of the human mind. The task of HMLTC is to assign a document to multiple hierarchical categories, in which semantic labels are typically organized in a tree-like or Directed Acyclic Graph (DAG) structured hierarchy [9]. Fig. 1 illustrates such an example. Nowadays, the growth in web text volume, such as social messengers, microblogs, and web forum threads, has made it urgent to develop HMLTC methods that facilitate understanding and organizing such text information. Facing this demand, both industry and academia have successfully utilized applied frameworks [10], [11] to promote the growth of HMLTC in areas such as question answering [12], online advertising [13], and scientific literature organization [14].
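
To make the label setting concrete, the following minimal Python sketch encodes a tree-structured label hierarchy as a parent-to-children map and recovers the ancestors that a hierarchy-consistent HMLTC prediction must include; the label names are illustrative placeholders, not taken from the paper’s datasets.

    # A minimal sketch of a tree-structured label hierarchy for HMLTC.
    # Label names are illustrative placeholders, not the paper's datasets.
    hierarchy = {
        "ROOT": ["Science", "Sports"],
        "Science": ["Physics", "Biology"],
        "Sports": ["Football", "Tennis"],
    }

    # Invert the child lists so each label knows its parent.
    parent_of = {c: p for p, children in hierarchy.items() for c in children}

    def ancestors(label):
        """Walk up the hierarchy and collect all ancestors of a label."""
        chain = []
        while label in parent_of:
            label = parent_of[label]
            chain.append(label)
        return chain

    # A hierarchy-consistent prediction of "Physics" implies its ancestors:
    print(ancestors("Physics"))  # ['Science', 'ROOT']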

According to Jean Piaget and William Perry [15], a human’s existing cognitive structure serves as a frame of reference that guarantees and facilitates the process of learning new knowledge. In turn, learning new knowledge provides practice for and develops the existing cognitive structure. Based on the cognitive structure learning process, we group existing HMLTC approaches into two major types from a human-mind view, i.e., the partial knowledge learning approach and the global label structure utilization approach. The partial knowledge learning approach focuses on local semantic relationships within the limits of the innate hierarchical structure or a predefined setting. It resembles how people, when facing a new field, make sense of new knowledge by assimilating basic concepts into their existing cognitive structure, e.g., relationships between nodes for ensemble learning [16], parent–child categories for transfer learning [17], and subgraphs for recurrent neural networks [18]. These partial knowledge frameworks generalize and develop the hierarchical label structure, preserving the independence of each knowledge framework but suffering from the error-propagation problem [19]. In contrast, the global label structure utilization approach focuses on the existing holistic hierarchical label structure and has the advantage of requiring considerably fewer parameters than the partial knowledge learning approach. It resembles how people retrieve their old cognitive structure and modify it to accommodate new knowledge. Several strategies for HMLTC can be regarded as global label structure utilization approaches, such as reinforcement learning [20], meta-learning [21], and graph convolutional networks [22]. However, these global methods mainly capture partial knowledge from the entire structure, eventually causing an underfitting problem.

In fact, partial knowledge learning and global label structure utilization are integral mental processes in cognitive structure learning. So far, no HMLTC approach has devised a unified model that imitates cognitive structure learning as developed by the interrelated influence between partial knowledge and the global label structure. To capture the relationship between the text and the label structure, we divide our unified model into two submodules for text representation and hierarchical multi-label prediction. Both submodules merge partial knowledge learning into the global hierarchical label structure so that the two reinforce each other for the HMLTC task. For text representation, each document has an innate hierarchical structure (words form sentences, sentences form paragraphs, and paragraphs form documents). Some hierarchical attention approaches [23], [24] construct a document representation at the word level, sentence level, or sentiment level of the innate document structure. Those attention mechanisms act as partial knowledge extractors but ignore the corresponding global hierarchical label structure. It is challenging to embed the global hierarchical label structure into those attention mechanisms to form a complete document representation, i.e., to integrate new knowledge into the global label structure. For hierarchical multi-label prediction, deep learning approaches [25], [26] have achieved significant improvements in HMLTC. Among them, the capsule neural network approach [27] with the dynamic routing mechanism [28] shows superiority over traditional HMLTC methods in automatically learning part-whole relationships. Extending the dynamic routing process between capsules into a form of cognitive structure learning that develops the cognitive-structure-like label structure is also challenging.

To address the above challenges, we propose a comprehensive hierarchical cognitive structure learning model called HCSM, which interrelates partial knowledge learning and the global label structure as a unified cognitive structure process for the HMLTC task. HCSM is composed of two submodules, i.e., the AORNN submodule for text representation and the HBiCaps submodule for hierarchical multi-label prediction. AORNN employs two basic embedding levels (the word-embedding and hierarchy-embedding levels) for partial knowledge learning, applying an attention mechanism over the relevant contexts. After that, the AORNN submodule integrates those partial embedding vectors into the document-embedding level using hierarchy-ordered neurons, thus building the relationship between the text and the global hierarchical label structure. The hierarchy-ordered neurons are a set of neurons in a long short-term memory (LSTM) network whose architecture is modified to track the global hierarchical label structure. High-ranking neurons store the high-level hierarchy representation as long-term information preserved over many steps, while low-ranking neurons store the low-level hierarchy representation as short-term information forgotten within a few steps. For these hierarchy-ordered neurons, erasing (or updating) a high-ranking neuron requires erasing (or updating) all lower-ranking neurons first. HBiCaps employs an iteration composed of hierarchical top-down and bottom-up traversal fashions. The iteration develops the global hierarchical label structure using partial knowledge from the text and local hierarchical relationships. The hierarchical top-down traversal exploits the local hierarchical relationships between labels as probabilities, thereby learning the partial knowledge embedded in the local modularity of labels. The hierarchical bottom-up traversal extends the dynamic routing mechanism between capsules into a form of cognitive structure learning, thereby merging the hierarchy-embedding representation with the local modularity of labels to form a cognitive structure learning classifier. Together, the two submodules transform the complete cognitive structure learning process into HMLTC within the unified proposed model.
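
The ordering behavior described above (lower-ranking neurons must be erased before higher-ranking ones) matches the cumulative-softmax master gates of ordered-neuron LSTMs. Since this text does not spell out AORNN’s exact gating equations, the NumPy sketch below shows only that standard mechanism, under the assumption that the hierarchy-ordered neurons follow the same formulation.

    import numpy as np

    def cumax(x):
        """Cumulative softmax: a monotonically non-decreasing gate in (0, 1).

        Monotonicity enforces the ordering property described above: a
        high-ranking (high-index) neuron can only be erased (gate near 0)
        if every lower-ranking neuron is erased as well.
        """
        e = np.exp(x - x.max())
        return np.cumsum(e / e.sum())

    # Toy master gates over 8 ordered neurons; the random logits stand in
    # for the learned projections of the input and previous hidden state.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=8)
    master_forget = cumax(logits)        # increasing: low ranks forgotten first
    master_input = 1.0 - cumax(logits)   # decreasing: low ranks updated first
    print(np.round(master_forget, 2), np.round(master_input, 2))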

Comparative studies on four text datasets have been conducted to demonstrate the effectiveness of the proposed approach. The main contributions of our paper are as follows:

(1) We propose to deploy cognitive structure learning in HMLTC as a unified model (HCSM). The imitation of cognitive structure advances HMLTC by integrating partial new knowledge learning with global label structure modeling, capturing semantic text representations, and providing meaningful multi-label text categories.

(2) AORNN, the text representation submodule of HCSM, exploits the hierarchy-ordered neurons of a modified LSTM to represent and organize the document-level text representation based on partial knowledge learning of word-level and hierarchy-level semantics. We use the AORNN submodule to build the relationship between partial new knowledge and the global hierarchical label structure, forming a comprehensive text representation.

(3) The HBiCaps submodule of HCSM employs an iteration that utilizes the local hierarchical relationships between labels in a top-down traversal fashion. In the iteration’s bottom-up traversal fashion, HBiCaps merges the relevant hierarchy-embedding text representation and local hierarchical relationships as partial new knowledge into the dynamic routing mechanism between capsules (see the routing sketch after this list). Across the two traversal fashions, the global hierarchical label structure develops toward better multi-label prediction for HMLTC.
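
HBiCaps builds on dynamic routing between capsules. As a reference point, here is a minimal NumPy sketch of the standard routing-by-agreement step [28] that the bottom-up traversal extends; the hierarchical traversals themselves are not shown, and the shapes are toy values.

    import numpy as np

    def squash(s, eps=1e-8):
        """Shrink a vector so its norm lies in [0, 1) while keeping its direction."""
        n2 = np.sum(s * s, axis=-1, keepdims=True)
        return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

    def dynamic_routing(u_hat, iters=3):
        """Standard routing-by-agreement over prediction vectors.

        u_hat: (num_in, num_out, dim) votes from lower- to higher-level capsules.
        Returns the (num_out, dim) output capsules.
        """
        b = np.zeros(u_hat.shape[:2])                             # routing logits
        for _ in range(iters):
            c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
            v = squash(np.einsum("ij,ijd->jd", c, u_hat))         # candidate output capsules
            b = b + np.einsum("ijd,jd->ij", u_hat, v)             # agreement update
        return v

    rng = np.random.default_rng(1)
    votes = rng.normal(size=(6, 3, 4))   # 6 child-label capsules voting for 3 parents
    print(dynamic_routing(votes).shape)  # (3, 4)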

Section snippets

Hierarchical multi-label classification

Silla and Freitas [19] grouped existing hierarchical multi-label classification approaches into three major categories, i.e., the flat, local, and global approaches. The flat approach [29] is the simplest, handling hierarchical multi-label classification as traditional flat multi-label classification and ignoring hierarchical relationships. Compared with the traditional flat multi-label framework, the hierarchical structure preserves a rich source of relationships in a tree or DAG structure…

Problem formalization

In this section, we introduce two basic definitions and then formulate the problem of HMLTC.

Definition 1 (Hierarchical Structure)

Suppose there is H={V,E} in a tree or DAG structure, where V is a set of nodes vi representing multi-level labels, and E is a set of edges representing the direct links between labels. The hierarchical set of nodes V is distributed over the label set Y={Y1,Y2,…,Ym}, where Yi is the set of possible categories in the ith hierarchy and m is the total number of hierarchical levels. We define the partial…
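
A minimal Python sketch of Definition 1: the hierarchy H = {V, E} is stored as node and edge sets, and the per-level label sets Y1, …, Ym are recovered by a breadth-first traversal from the root; the node names are illustrative placeholders.

    # Definition 1 as data: H = {V, E} with illustrative placeholder labels.
    V = {"root", "A", "B", "A1", "A2", "B1"}
    E = {("root", "A"), ("root", "B"), ("A", "A1"), ("A", "A2"), ("B", "B1")}

    def level_sets(E, root="root"):
        """Group labels by hierarchy level via breadth-first traversal."""
        children = {}
        for parent, child in E:
            children.setdefault(parent, []).append(child)
        Y, frontier = [], children.get(root, [])
        while frontier:
            Y.append(set(frontier))
            frontier = [c for p in frontier for c in children.get(p, [])]
        return Y

    Y = level_sets(E)   # [{'A', 'B'}, {'A1', 'A2', 'B1'}]
    m = len(Y)          # total number of hierarchical levels (here m = 2)
    print(m, Y)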

Proposed approach

Based on our problem formalization, the proposed HCSM has two major submodules: (1) the AORNN submodule G(·), which aims to extract a hierarchical text representation from the contexts, and (2) the HBiCaps submodule F(·), which predicts the hierarchical multi-label structure over the label space. The AORNN submodule modifies and applies various LSTM models at different embedding granularity levels, using partial embedding vectors and globally ranked neurons corresponding to the cognitive structure…
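
The composition of the two submodules can be pictured as a two-stage pipeline, roughly F(G(document)). In the sketch below the bodies of G and F are toy stand-ins for AORNN and HBiCaps, not the paper’s implementation.

    import numpy as np

    def G(token_ids, embedding):
        """Toy stand-in for AORNN: mean-pool token embeddings into one vector."""
        return embedding[token_ids].mean(axis=0)

    def F(representation, label_weights):
        """Toy stand-in for HBiCaps: independent per-label sigmoid scores."""
        return 1.0 / (1.0 + np.exp(-label_weights @ representation))

    rng = np.random.default_rng(2)
    embedding = rng.normal(size=(100, 16))    # toy vocabulary of 100 words
    label_weights = rng.normal(size=(5, 16))  # toy label space of 5 labels
    scores = F(G([3, 17, 42], embedding), label_weights)
    print(np.round(scores, 2))                # per-label probabilities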

Evaluation measures and experimental datasets

To fairly compare our algorithm with the baselines, we evaluate the experimental results with metrics [54], [55] widely used in the field of multi-label text classification, i.e., the macro-precision, macro-recall, and macro-F1 (MaP, MaR, MaF1), and the micro-precision, micro-recall, and micro-F1 (MiP, MiR, MiF1). MaP, MaR, and MaF1 are defined in Formula (13), and MiP, MiR, and MiF1 are defined in Formula (14) as follows:

$\mathrm{MaP}=\frac{1}{\gamma}\sum_i \frac{N_i^c}{N_i^p},\quad \mathrm{MaR}=\frac{1}{\gamma}\sum_i \frac{N_i^c}{N_i^g},\quad \mathrm{MaF1}=\frac{2\times \mathrm{MaP}\times \mathrm{MaR}}{\mathrm{MaP}+\mathrm{MaR}} \tag{13}$

$\mathrm{MiP}=\frac{\sum_i N_i^c}{\sum_i N_i^p},\quad \mathrm{MiR}=\frac{\sum_i N_i^c}{\sum_i N_i^g},\quad \mathrm{MiF1}=\frac{2\times \mathrm{MiP}\times \mathrm{MiR}}{\mathrm{MiP}+\mathrm{MiR}} \tag{14}$
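
A worked sketch of Formulas (13) and (14) follows, reading γ as the number of labels and N_i^c, N_i^p, N_i^g as the counts of correctly predicted, predicted, and ground-truth documents for label i (a standard reading of these symbols, since the snippet cuts off before defining them); the counts are toy values.

    def macro_micro(Nc, Np, Ng):
        """Macro- and micro-averaged precision, recall, and F1 from per-label counts."""
        gamma = len(Nc)
        MaP = sum(c / p for c, p in zip(Nc, Np)) / gamma
        MaR = sum(c / g for c, g in zip(Nc, Ng)) / gamma
        MaF1 = 2 * MaP * MaR / (MaP + MaR)
        MiP = sum(Nc) / sum(Np)
        MiR = sum(Nc) / sum(Ng)
        MiF1 = 2 * MiP * MiR / (MiP + MiR)
        return MaP, MaR, MaF1, MiP, MiR, MiF1

    # Toy per-label counts: correct, predicted, ground-truth for three labels.
    print(macro_micro(Nc=[8, 3, 5], Np=[10, 6, 5], Ng=[9, 4, 8]))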

Conclusions

In this paper, we deploy a unified method, HCSM, composed of the AORNN and HBiCaps submodules, to import cognitive structure learning into HMLTC. The AORNN submodule constructs the word-embedding, hierarchy-embedding, and document-embedding levels to integrate the partial word-to-hierarchy representation into a global hierarchical text representation based on hierarchy-ordered neurons. The HBiCaps submodule applies the partial text-hierarchy representation, local hierarchical relationships, and the…

CRediT authorship contribution statement

Boyan Wang: Conceptualization, Methodology, Software, Investigation, Writing - original draft, Visualization, Funding acquisition. Xuegang Hu: Conceptualization, Validation, Formal analysis, Writing - review & editing, Funding acquisition. Peipei Li: Methodology, Resources, Writing - review & editing, Funding acquisition. Philip S. Yu: Conceptualization, Methodology, Resources, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported in part by the National Key Research and Development Program of China under grant 2016YFB1000901, the National Natural Science Foundation of China under grants 61976077, 62076085, and 91746209, the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT) of the Ministry of Education, China under grant IRT17R32, and the China Scholarship Council.

References (61)

  • Gargiulo, F., et al., Deep neural network for hierarchical extreme multi-label text classification, Appl. Soft Comput. (2019)
  • Borges, H.B., et al., An evaluation of global-model hierarchical classification algorithms for hierarchical classification problems with single path of labels, Comput. Math. Appl. (2013)
  • Ausubel, D.P., et al., Educational Psychology: A Cognitive View (1968)
  • Ortony, A., et al., The Cognitive Structure of Emotions (1990)
  • Dunlosky, J., et al., Metacognition (2008)
  • Cushman, F., et al., Finding faults: How moral dilemmas illuminate cognitive structure, Soc. Neurosci. (2012)
  • Liu, Q., et al., Exploiting cognitive structure for adaptive learning, SIGKDD (2019)
  • Aggarwal, C.C., et al., A survey of text classification algorithms
  • Ren, Z., et al., Hierarchical multi-label classification of social text streams
  • Liu, L., et al., NeuralClassifier: An open-source neural hierarchical multi-label text classification toolkit, ACL (2019)
  • Qu, B., et al., An evaluation of classification models for question topic categorization, J. Am. Soc. Inf. Sci. Technol. (2012)
  • Agrawal, R., et al., Multi-label learning with millions of labels: Recommending advertiser bid phrases for web pages, WWW (2013)
  • Navaneedhan, C.G., et al., What is meant by cognitive structures? How does it influence teaching–learning of psychology, IRA Int. J. Edu. Multidiscip. Stud. (2017)
  • Banerjee, S., et al., Hierarchical transfer learning for multi-label text classification, ACL (2019)
  • Peng, H., et al., Hierarchical taxonomy-aware and attentional graph capsule RCNNs for large-scale multi-label text classification, IEEE Trans. Knowl. Data Eng. (2019)
  • Silla, C.N., et al., A survey of hierarchical classification across different application domains, Data Min. Knowl. Discov. (2011)
  • Mao, Y., et al., Hierarchical text classification with reinforced label assignment (2019)
  • Wu, J., et al., Learning to learn and predict: A meta-learning approach for multi-label classification (2019)
  • Zhou, J., et al., Hierarchy-aware global model for hierarchical text classification, ACL (2020)
  • Yang, Z., et al., Hierarchical attention networks for document classification, NAACL (2016)