Three-way multi-granularity learning towards open topic classification
Introduction
Traditional supervised learning relies on deterministic conditions and the closed-world assumption [32], [1], which means that the classes appearing in the test set must already be known from the training set [12], [3]. However, this assumption is often violated in real-world applications. For instance, in open-world object recognition, new objects may appear constantly, and a classifier built from old objects may incorrectly classify a new object as one of the old ones [33]. This situation calls for models that are more robust and can adapt to an open dynamic environment, including changes in values or attributes and even the appearance of new categories. It therefore poses significant challenges, and offers broad application prospects, for current machine learning research and development. Open dynamic situations call for on-the-job learning, as opposed to learning in a traditional closed-world environment. On-the-job learning, proposed by Liu [26], refers to learning after the model has been deployed in an application or during model application. According to Liu [26], an open dynamic system should (1) discover unknowns and create new learning tasks from them, (2) collect training or ground-truth data through interactions with users and the environment, by imitating humans or other agents, and (3) incrementally learn the new tasks. The whole process also needs to be carried out on the fly in a self-motivated and self-supervised manner.
Classifying text by topic is helpful for searching, data mining, and text analysis. However, topic classification is time-consuming and error-prone, especially in open dynamic tasks such as dialog systems and real-time news reporting. Most existing state-of-the-art methods rely on supervised algorithms with fixed training data and view tasks in isolation rather than as a whole. In open topic classification tasks the data changes constantly, which creates uncertainty in both the open classes and the learned knowledge. As shown in Fig. 1, four tags (known classes) are available for specific topics: exchange charge, cancel a transfer, pending top-up, and verify identity. However, there are also texts with open/unknown topics. From the perspective of multi-granularity learning, an open topic classification task can be divided into multiple granularity levels. At the coarser granularity level, texts with unknown topics must be distinguished from the known topics as far as possible, and all unknown topics are treated as a whole. At the finer granularity level, known and unknown topics are further processed according to the current feature space. The processing of open topic classification tasks can thus be viewed as a granularity construction process that discovers knowledge from coarser to finer granularity. This paper connects three-way multi-granularity learning with open dynamic learning and handles the resulting uncertainty to enable continual learning in open topic classification.
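The coarse-to-fine processing described above can be illustrated with a minimal Python sketch. It is not the TWMG-Open model: the function names, the per-topic confidence scores, and the threshold of 0.5 are all assumptions made for illustration, and the sketch refines only known topics, leaving every unknown text in a single pool for later processing.

```python
from typing import Dict

# The four known topics named in the text (Fig. 1).
KNOWN_TOPICS = ["exchange charge", "cancel a transfer", "pending top-up", "verify identity"]


def coarse_level(scores: Dict[str, float], threshold: float = 0.5) -> str:
    """Coarser granularity: separate known-topic texts from unknown ones.

    `scores` maps each known topic to a hypothetical classifier confidence.
    All unknown topics are treated as one group at this level.
    """
    return "known" if max(scores.values()) >= threshold else "unknown"


def fine_level(scores: Dict[str, float]) -> str:
    """Finer granularity: choose the specific known topic with the highest score."""
    return max(scores, key=scores.get)


def classify(scores: Dict[str, float], threshold: float = 0.5) -> str:
    """Decide at the coarse level first, then refine only the known texts."""
    if coarse_level(scores, threshold) == "unknown":
        return "unknown"
    return fine_level(scores)


# Illustrative (made-up) confidence scores for two texts.
confident = dict(zip(KNOWN_TOPICS, [0.85, 0.05, 0.05, 0.05]))
uncertain = dict(zip(KNOWN_TOPICS, [0.25, 0.25, 0.25, 0.25]))

print(classify(confident))
print(classify(uncertain))
```

Deciding known versus unknown before choosing a specific topic mirrors the order in which the granularity levels are described; a fuller treatment would also refine the pool of unknown texts at the finer level.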
For open topic classification tasks, three underlying challenges remain to be addressed. First, under the open-world assumption, the uncertainty in an open dynamic environment needs further study. Second, most advanced topic classification systems rely on complex structures to capture information, which take a long time to converge during training. Third, a desirable open dynamic model should capture knowledge at different granularity levels to reflect the granularity changes of dynamic data. To address these issues, we propose a three-way multi-granularity learning model towards open topic classification (TWMG-Open). In this work, we follow the open-world assumption and adopt the framework of three-way multi-granularity learning, which has been demonstrated to be an effective method for open problems. In addition, searching for an appropriate granularity level for decision or classification is a crucial problem [19]. This paper studies open topic classification within the three-way multi-granularity framework and constructs decision-making processes according to the different granularity levels of dynamic data.
In this work, the open topic classification task is discussed in light of three-way multi-granularity learning. The work covers the entire process from open detection/discovery to open classification and considers different granularities for different kinds of problems. By constructing a built-in knowledge base, the model acquires the ability to learn continually, and three-way multi-granularity learning is used to enhance the accuracy and validity of the learned knowledge through knowledge accumulation.
The remainder of this paper is organized as follows. Section 2 reviews three-way multi-granularity learning and open topic classification. Section 3 constructs a framework of three-way multi-granularity learning towards open topic classification and introduces each part of the proposed model. Section 4 designs a series of experiments and provides the experimental analysis. Finally, the conclusion is given in Section 5.
Section snippets
Three-way decision and three-way multi-granularity learning
Three-way decision (3WD), proposed by Yao [38], was initially introduced to describe the three regions of decision-theoretic rough sets and has since been widely investigated as a philosophy of thinking in three, a methodology of working with three, and a mechanism of processing through three [42], [40]. In traditional two-way classification models, an object is assigned to one of only two regions: the positive region for positive instances and the negative region for negative instances. However, either region
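As a minimal sketch of the three-region idea, the Python function below applies the standard threshold rule of probabilistic three-way decision: accept when the estimated probability of the positive class is at least alpha, reject when it is at most beta, and defer to a boundary region otherwise. The function name and the threshold values 0.7 and 0.3 are illustrative assumptions, not values taken from this paper.

```python
def three_way_decide(prob_positive: float, alpha: float = 0.7, beta: float = 0.3) -> str:
    """Assign an object to the positive, negative, or boundary region.

    `prob_positive` is an estimated probability that the object is a positive
    instance; `alpha` and `beta` are illustrative acceptance and rejection
    thresholds with beta < alpha.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if prob_positive >= alpha:
        return "positive"   # accept
    if prob_positive <= beta:
        return "negative"   # reject
    return "boundary"       # defer the decision until more information is available


regions = [three_way_decide(p) for p in (0.9, 0.5, 0.1)]
print(regions)  # ['positive', 'boundary', 'negative']
```

The boundary region is what distinguishes this rule from a two-way classifier: uncertain objects are neither accepted nor rejected but held back for further processing.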
Proposed model
Inspired by the open-world learning paradigm, we combine multi-granularity learning with open topic classification to construct a dynamic three-way multi-granularity enhanced open topic classification model (TWMG-Open). Given that unknown topic classes arise constantly, TWMG-Open separates unknown topics from known topics at the coarser granularity and then figures out the space feature of unknowns and how new document collections with unknown topics affect the knowledge base in
Experiments
A series of experiments was conducted to demonstrate the effectiveness of the proposed cost-sensitive three-way multi-granular open topic classification method. All experiments were performed on a computer with an Intel Xeon E5-2678 v3 and an NVIDIA GeForce RTX 3090, using Python 3.7 on 64-bit Windows.
Conclusions
For an open dynamic task, we are interested in detecting and managing uncertainty and in applying the three-way multi-granularity structure to knowledge accumulation. On the basis of three-way multi-granularity learning, open topic classification was investigated at different levels of granularity to achieve a more efficient approach to learning knowledge continually. Compared with traditional static open systems, we used three datasets to conduct a series of experiments, which showed
CRediT authorship contribution statement
Xin Yang: Conceptualization, Methodology, Writing – original draft. Yujie Li: Software, Writing – original draft. Dan Meng: Writing – review & editing. Yuxuan Yang: Software, Writing – review & editing. Dun Liu: Writing – review & editing. Tianrui Li: Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Nos. 61773324, 61876157), the Humanity and Social Science Youth Foundation of Ministry of Education of China (No. 20YJC630191), the Fintech Innovation Center of Southwestern University of Finance and Economics, and the Financial Intelligence & Financial Engineering Key Laboratory of Sichuan Province.
References (50)
- et al. Machine learning for email spam filtering: review, approaches and open research problems. Heliyon (2019)
- et al. Sequential three-way classifier with justifiable granularity. Knowl.-Based Syst. (2019)
- et al. Sequential three-way decision and granulation for cost-sensitive face recognition. Knowl.-Based Syst. (2016)
- et al. Cost-sensitive sequential three-way decision modeling using a deep neural network. Int. J. Approximate Reasoning (2017)
- et al. A post-processing method for detecting unknown intent of dialogue system via pre-trained deep neural network classifier. Knowl.-Based Syst. (2019)
- et al. Multi-label classification with a reject option. Pattern Recogn. (2013)
- et al. Sequential three-way decisions via multi-granularity. Inf. Sci. (2020)
- et al. A temporal-spatial composite sequential approach of three-way granular computing. Inf. Sci. (2019)
- et al. Local temporal-spatial multi-granularity learning for sequential three-way granular computing. Inf. Sci. (2020)
- Three-way decisions with probabilistic rough sets. Inf. Sci. (2010)
- Three-way decision and granular computing. Int. J. Approximate Reasoning
- Three-way granular computing, rough sets, and formal concept analysis. Int. J. Approximate Reasoning
- A three-way clustering method based on an improved DBSCAN algorithm. Phys. A
- A tree-based incremental overlapping clustering method using the three-way decision theory. Knowl.-Based Syst.
- Multi-granularity three-way decisions with adjustable hesitant fuzzy linguistic multigranulation decision-theoretic rough sets over two universes. Inf. Sci.
- Sequential three-way decision based on multi-granular autoencoder features. Inf. Sci.
- A cost-sensitive three-way combination technique for ensemble learning in sentiment classification. Int. J. Approximate Reasoning
- Three-way enhanced convolutional neural networks for sentence-level sentiment classification. Inf. Sci.
- Towards open world recognition
- Efficient intent detection with dual sentence encoders
- Lifelong machine learning. Synthesis Lectures Artif. Intell. Mach. Learn.
- A review of hand gesture and sign language recognition techniques. Int. J. Mach. Learn. Cybern.
- On optimum recognition error and reject tradeoff. IEEE Trans. Inf. Theory
- Structural scaffolds for citation intent classification in scientific publications, in
- Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell.
Cited by (12)
- Granular computing-based deep learning for text classification
  2024, Information Sciences
- Text characterization based on recurrence networks
  2023, Information Sciences
- Mining multigranularity decision rules of concept cognition for knowledge graphs based on three-way decision
  2023, Information Processing and Management
- A review of sequential three-way decision and multi-granularity learning
  2023, International Journal of Approximate Reasoning
  Citation excerpt: Therefore, three-way multi-granularity continual learning in open dynamic environment may a potential research topic in future. In fact, there has been a small amount of researches on this topic, Yang et al. [121] studied the three-way multi-granularity learning towards open topic classification and Li et al. [49] proposed a sequential three-way decision model based on continual learning to consider a situation where a system needs to learn new categories after a change of environment. There are three types of continual learning [92], namely class incremental continual learning, task incremental continual learning and domain incremental continual learning.