Elsevier

Information Sciences

Volume 220, 20 January 2013, Pages 34-45

Evolving fuzzy pattern trees for binary classification on data streams

https://doi.org/10.1016/j.ins.2012.02.034

Abstract

Fuzzy pattern trees (FPTs) have recently been introduced as a novel model class for machine learning. In this paper, we consider the problem of learning fuzzy pattern trees for binary classification from data streams. Apart from its practical relevance, this problem is also interesting from a methodological point of view. First, the aspect of efficiency plays an important role in the context of data streams, since learning has to be accomplished under hard time (and memory) constraints. Moreover, a learning algorithm should be adaptive in the sense that an up-to-date model is offered at any time, taking new data items into consideration as soon as they arrive and perhaps forgetting old ones that have become obsolete due to a change of the underlying data generating process. To meet these requirements, we develop an evolving version of fuzzy pattern tree learning, in which model adaptation is realized by anticipating possible local changes of the current model, and confirming these changes through statistical hypothesis testing. In experimental studies, we compare our method to a state-of-the-art tree-based classifier for learning from data streams, showing that evolving pattern trees are competitive in terms of performance while typically producing smaller and more compact models.

Introduction

Fuzzy pattern tree induction was recently introduced as a novel machine learning method for classification by Huang et al. [11]. Independently, the same type of model structure was proposed in [23] under the name “fuzzy operator tree”. An alternative to the original algorithm for learning pattern trees, as proposed in [11], was developed by Senge and Hüllermeier in [20]. Besides, an FPT variant for regression was introduced in [19].

Roughly speaking, a fuzzy pattern tree is a hierarchical, tree-like structure, whose inner nodes are marked with generalized (fuzzy) logical and arithmetic operators. It implements a recursive function that maps a combination of attribute values, entered in the leaf nodes, to a number in the unit interval, produced as an output by the root of the tree. The model class of fuzzy pattern trees is interesting for several reasons. Apart from some properties that make it appealing from a learning point of view (like a built-in feature selection mechanism and the possibility to guarantee monotonicity in certain attributes), FPTs are arguably attractive from an interpretation point of view. Generally, each tree can be considered as a kind of (generalized) logical description of a class. In this regard, pattern trees can be considered as a viable alternative to classical fuzzy rule models. Compared to such models, the hierarchical structure of pattern trees further allows for a more compact representation and for trading off accuracy against model simplicity in a seamless manner.
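To make the recursive evaluation concrete, the following is a minimal sketch rather than the authors' implementation. The class names `Leaf` and `Inner`, the ramp-shaped membership functions, the attribute names, and the choice of minimum and average as operators are all illustrative assumptions; the text above only states that leaves receive attribute values and inner nodes apply fuzzy logical or arithmetic operators.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Leaf:
    attribute: str                        # name of the input attribute (hypothetical)
    membership: Callable[[float], float]  # maps a raw value to [0, 1]


@dataclass
class Inner:
    operator: Callable[[float, float], float]  # fuzzy logical or arithmetic operator
    left: "Node"
    right: "Node"


Node = Union[Leaf, Inner]


def evaluate(node: Node, x: Dict[str, float]) -> float:
    """Propagate values from the leaves to the root; the result lies in [0, 1]."""
    if isinstance(node, Leaf):
        return node.membership(x[node.attribute])
    return node.operator(evaluate(node.left, x), evaluate(node.right, x))


def ramp_up(lo: float, hi: float) -> Callable[[float], float]:
    """Assumed membership function: 0 below lo, 1 above hi, linear in between."""
    def mu(v: float) -> float:
        return min(1.0, max(0.0, (v - lo) / (hi - lo)))
    return mu


def ramp_down(lo: float, hi: float) -> Callable[[float], float]:
    """Complement of ramp_up on the same interval."""
    up = ramp_up(lo, hi)
    return lambda v: 1.0 - up(v)


def average(a: float, b: float) -> float:
    """Arithmetic mean, used here as an example of an averaging operator."""
    return (a + b) / 2.0


# A small hand-built tree: AVG( MIN(temperature is high, humidity is high),
#                               pressure is low )
tree = Inner(
    average,
    Inner(min, Leaf("temperature", ramp_up(20.0, 40.0)),
               Leaf("humidity", ramp_up(0.0, 100.0))),
    Leaf("pressure", ramp_down(900.0, 1100.0)),
)

example = {"temperature": 30.0, "humidity": 80.0, "pressure": 1000.0}
score = evaluate(tree, example)
```

For binary classification, the root output could then be compared against a threshold such as 0.5, which is again an assumption of this sketch.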

In recent years, the idea of adaptive learning in dynamical environments has received considerable attention, especially under the slogan of “learning from data streams” [8]. Closely related to this, a special branch of data-driven fuzzy systems modeling has emerged under the notion of “evolving fuzzy systems” [2], [15], [1], [16]. Despite small differences regarding the basic assumptions and the technical setting, the emphasis of goals and performance criteria, or the focus on specific types of applications, the key motivation of these and related fields is the idea of a system that learns incrementally, and maybe even in real-time, on a continuous stream of data, and which is able to properly adapt itself to changes of environmental conditions or properties of the data-generating process.

Motivated by these developments, we propose an extended version of fuzzy pattern trees suitable for learning from data streams. More specifically, building on the (batch learning) algorithm for pattern tree induction as proposed in [20], we develop an evolving variant for the problem of binary classification. The rest of the paper is organized as follows. In Section 2, we start with a brief description of the data stream scenario and recall the special requirements it involves for learning. Fuzzy pattern trees are explained in Section 3, in which we also recall the basic algorithm for learning such trees in batch mode. An extension of this algorithm for learning from data streams is then proposed in Section 4. Finally, an empirical evaluation of this method is presented in Section 5, where evolving fuzzy pattern trees are compared with so-called Hoeffding trees [13] on different types of data streams, both in terms of performance and readability.

Section snippets

Learning from data streams

In recent years, so-called data streams have attracted considerable attention in different fields of computer science, including database systems, data mining, and distributed systems. As the notion suggests, a data stream can roughly be thought of as an ordered sequence of data items, where the input arrives more or less continuously as time progresses [10], [9], [8]. There are various applications in which streams of this type are produced, such as network monitoring, telecommunication

Fuzzy pattern trees

As mentioned earlier, a fuzzy pattern tree is a hierarchical, tree-like structure. The inner nodes of an FPT are marked with generalized (fuzzy) operators, either logical or arithmetic, whereas the leaf nodes are associated with fuzzy predicates on input attributes. A pattern tree propagates information from the leaf to the root node: a node takes the values of its descendants as input, combines them using the respective operator, and submits the output to its predecessor. Thus, a

Evolving fuzzy pattern trees

The basic idea of our evolving version of fuzzy pattern tree learning (eFPT) is to maintain an ensemble of pattern trees, consisting of a current (active) model and a set of neighbor models. The current model is used to make predictions, while the neighbor models can be seen as anticipated adaptations: they are kept ready to replace the current model in case of a drop in performance, caused, for example, by a drift of the concept to be learned. More generally, the current model is replaced or,
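The idea of keeping neighbor models ready and replacing the current model only when a change is statistically confirmed could be sketched as follows. This is a hedged illustration, not the eFPT procedure itself: the class `EvolvingModel`, the helper `significantly_better`, and the one-sided two-proportion z-test are assumptions chosen for simplicity, and the sketch swaps models without generating new neighbors. Reading n as a window length and α as a significance level is also an assumption; the excerpt only quotes n = 100 and α = 0.01 as default parameters.

```python
from collections import deque
from math import sqrt
from statistics import NormalDist


def significantly_better(err_current: int, err_neighbor: int,
                         n: int, alpha: float = 0.01) -> bool:
    """One-sided two-proportion z-test (an assumed choice of test):
    is the neighbor's error rate on n examples lower than the current one's?"""
    pooled = (err_current + err_neighbor) / (2.0 * n)
    if pooled <= 0.0 or pooled >= 1.0:
        return False  # no variance, hence no evidence either way
    std_err = sqrt(pooled * (1.0 - pooled) * (2.0 / n))
    z = (err_current - err_neighbor) / n / std_err
    return z > NormalDist().inv_cdf(1.0 - alpha)


class EvolvingModel:
    """Current model plus neighbor models; models are callables x -> label."""

    def __init__(self, current, neighbors, n: int = 100, alpha: float = 0.01):
        self.current = current
        self.neighbors = list(neighbors)
        self.n, self.alpha = n, alpha
        # Error indicators (0/1) over the n most recent examples.
        self.cur_errors = deque(maxlen=n)
        self.nb_errors = [deque(maxlen=n) for _ in self.neighbors]

    def predict(self, x):
        return self.current(x)

    def update(self, x, y) -> None:
        """Record errors on the new example, then test for a replacement."""
        self.cur_errors.append(int(self.current(x) != y))
        for errors, model in zip(self.nb_errors, self.neighbors):
            errors.append(int(model(x) != y))
        if not self.neighbors or len(self.cur_errors) < self.n:
            return  # window not yet full, or nothing to compare against
        best = min(range(len(self.neighbors)),
                   key=lambda i: sum(self.nb_errors[i]))
        if significantly_better(sum(self.cur_errors),
                                sum(self.nb_errors[best]),
                                self.n, self.alpha):
            # Swap roles: the best neighbor becomes the current model.
            self.current, self.neighbors[best] = self.neighbors[best], self.current
            self.cur_errors, self.nb_errors[best] = self.nb_errors[best], self.cur_errors
```

Because both models are scored on the same examples, a paired test would arguably be more appropriate; the unpaired z-test is used here only to keep the sketch short.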

Empirical evaluation

In this section, we compare our evolving fuzzy pattern trees (eFPTs) with Hoeffding trees [13], a state-of-the-art approach for classification on data streams, in terms of performance, stability, and handling of concept drift. We use eFPT in its default setting (i.e., using default parameters n = 100, α = 0.01, p = 3). Experiments are not only conducted with real data sets, but also with synthetic data. As an important advantage of synthetic data, let us note that it allows for conducting experiments

Summary and conclusions

We have proposed an evolving version of the fuzzy pattern tree classifier that meets the increased requirements of incremental learning on data streams. The key idea of eFPT is to maintain, in addition to the current model, a set of neighbor trees that can replace the current model if the performance of the latter is no longer optimal. Thus, a modification of the current model is realized implicitly in the form of a replacement by an alternative tree. A replacement decision is made on the basis

References (23)

  • P.P. Angelov et al., Evolving fuzzy classifiers using different model architectures, Fuzzy Sets and Systems (2008)
  • P.P. Angelov et al., Evolving Intelligent Systems (2010)
  • S. Ben-David, J. Gehrke, D. Kifer, Detecting change in data streams, in: Proceedings of the 30th International...
  • A. Bifet, R. Kirkby, Massive Online Analysis Manual, 2009....
  • P. Domingos, G. Hulten, Catching up with the data: research issues in mining data streams, in: 2001 ACM SIGMOD Workshop...
  • A. Frank, A. Asuncion, UCI Machine Learning Repository, 2010....
  • M.M. Gaber et al., Mining data streams: a review, ACM SIGMOD Record (2005)
  • J. Gama et al., Learning from Data Streams (2007)
  • M. Garofalakis, J. Gehrke, R. Rastogi, Querying and mining data streams: you only get one look, in: Proceedings of the...
  • L. Golab et al., Issues in data stream management, SIGMOD Record (2003)
  • Z. Huang et al., Pattern trees induction: a new machine learning method, IEEE Transactions on Fuzzy Systems (2008)