Elsevier

Ad Hoc Networks

Volume 112, 1 March 2021, 102361

M-GBDT2NN: A more generalized framework of GBDT2NN for online update

https://doi.org/10.1016/j.adhoc.2020.102361

Abstract

The large scale, scalability, and flexibility of the Internet of Things (IoT) make research on online update increasingly important. Gradient Boosting Decision Tree (GBDT) is commonly used to handle numerical data, but it cannot be updated online. To better solve such prediction problems, we propose M-GBDT2NN, a more generalized approach based on GBDT2NN. Compared with GBDT2NN, the new method applies to multi-classification problems in addition to binary classification and regression. The new framework takes one iteration of GBDT as the smallest unit of distillation, which ensures an additive relation among the distilled models, and it predicts a probability vector rather than a single numerical value. This paper analyzes the generalization and the online update ability of M-GBDT2NN. The experimental results demonstrate that the proposed method performs better than the other methods in both multi-classification problems and online update.

Introduction

With the development of the fifth-generation cellular network (5G), modern communication makes it possible to deploy an enormous number of wireless sensors within the IoT framework. In IoT networks, wireless-enabled devices range from smartphones to sensors, drones, connected vehicles, wearables, and even virtual reality devices [1]. Meanwhile, both 5G and wireless technologies are widely used in many applications, such as smart grid, smart city, agriculture, environment, and energy [2], [3]. For example, advanced 5G and wireless technologies assist the smart grid in data transmission and grid management, where the smart grid deals with small distributed generation sources rather than large centralized generation [4]. IoT networks offer advantages such as large scale, scalability, and flexibility [5], [6]. However, these characteristics also mean that the data collected from IoT networks are changeable and time-sensitive, which greatly increases the difficulty of data processing and mining. Therefore, studies of wireless networks are significant for both theoretical research and practical applications.

Many machine learning methods, such as logistic regression (LR) [7], support vector machine (SVM) [8], k-nearest neighbors (k-NN) [9], and GBDT [10], have been used in the field of IoT [11], [12]. Although these methods have achieved a great deal, two problems remain: (1) the online update of models; and (2) multi-classification. In IoT applications, much of the collected data is numerical, such as voltage, current, temperature, sound, pollution levels, humidity, and wind [13]. Many classical algorithms, such as GBDT and SVM, handle such dense numerical data very well, but they cannot be updated online, so they are not applicable to large-scale, changeable data. Additionally, there are many multi-classification tasks, such as the intrusion detection of multiple sources in the smart grid [14], [15] and fault diagnosis in sensor networks [16]. Most of the proposed methods target binary classification and regression problems, and studies of multi-classification are few.

As one of the most popular machine learning methods, GBDT can efficiently extract the underlying rules from dense numerical data [17]. GBDT belongs to the gradient boosting family of algorithms, and it learns its predictions through a series of weak learners. Using decision trees as the weak learners gives GBDT strengths in discovering distinctive features, combining features, and explaining results [10]. In addition, the mechanism of GBDT for multi-classification is highly efficient [18]. In particular, GBDT trains more than one tree at each iteration to predict the probabilities of the classes separately. This lets it handle all classes of data at the same time, while retaining multiple learners at each iteration.
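The per-class mechanism described above can be sketched in a few lines. This is only an illustration, assuming the common softmax formulation of multi-class gradient boosting; the names `softmax`, `predict_proba`, and `toy_iterations`, the learning rate, and the tree outputs are all hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(iterations, learning_rate=0.1):
    """Accumulate per-class tree outputs over all boosting iterations.

    `iterations` holds, for one sample, one list per iteration with the
    output of each of the K class-specific trees (assumed layout).
    """
    num_classes = len(iterations[0])
    raw = [0.0] * num_classes
    for tree_outputs in iterations:
        for k, out in enumerate(tree_outputs):
            raw[k] += learning_rate * out  # class k only receives tree k's output
    return softmax(raw)

# Made-up outputs of 3 iterations x 3 class-specific trees for one sample.
toy_iterations = [[1.0, -0.5, -0.5], [0.8, -0.2, -0.6], [0.5, 0.1, -0.6]]
probs = predict_proba(toy_iterations)
```

Each class keeps its own running score, so outputs of trees belonging to different classes are never added to each other; only the final softmax couples the classes.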

On the other hand, GBDT has one main disadvantage: it cannot update the model incrementally, because decision trees are not differentiable. The usual way to keep a GBDT model effective is to retrain it from scratch frequently, but it is expensive, or even impossible, to re-collect and store the whole training data and rebuild the models. To solve this problem, researchers have proposed many improvements. In [19], Ke et al. proposed a new framework, DeepGBM, which handles both categorical and numerical features well while retaining the ability of online update. DeepGBM consists of two parts: CatNN, mainly for sparse categorical features, and GBDT2NN, aimed at dense numerical features. GBDT2NN uses knowledge distillation to approximate the function of the tree structure with a neural network (NN) [20] model. The framework combines the advantages of GBDT and NNs and realizes online update at the same time, which is significant for both GBDT and NNs. However, the GBDT2NN model is only applicable to regression and binary classification problems.
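The idea of distilling a tree into a differentiable model can be illustrated with a toy example. The sketch below is not the GBDT2NN architecture: it assumes a hypothetical one-feature decision stump as the teacher and a small one-hidden-layer tanh network as the student, trained by plain gradient descent to match the tree's outputs. All names and hyperparameters (`tree_predict`, `init_student`, `distill`, the learning rate, and so on) are illustrative assumptions.

```python
import math
import random

def tree_predict(x):
    """Hypothetical teacher: a depth-1 decision tree (stump) on one feature."""
    return 1.0 if x < 0.5 else -1.0

def init_student(hidden=3, seed=0):
    """Small random weights for a one-hidden-layer tanh network (assumed student)."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    v = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return w, b, v, 0.0

def student_predict(params, x):
    w, b, v, c = params
    return sum(vj * math.tanh(wj * x + bj) for wj, bj, vj in zip(w, b, v)) + c

def mse(params, xs, targets):
    """Mean squared error between student outputs and teacher outputs."""
    return sum((student_predict(params, x) - t) ** 2
               for x, t in zip(xs, targets)) / len(xs)

def distill(params, xs, targets, lr=0.05, epochs=1000):
    """Full-batch gradient descent on the squared error to the tree outputs."""
    w, b, v, c = params
    w, b, v = list(w), list(b), list(v)
    n, hidden = len(xs), len(w)
    for _ in range(epochs):
        gw, gb, gv, gc = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, t in zip(xs, targets):
            h = [math.tanh(wj * x + bj) for wj, bj in zip(w, b)]
            err = sum(vj * hj for vj, hj in zip(v, h)) + c - t
            d = 2.0 * err / n  # derivative of the mean squared error
            gc += d
            for j in range(hidden):
                gv[j] += d * h[j]
                local = d * v[j] * (1.0 - h[j] ** 2)  # back through tanh
                gw[j] += local * x
                gb[j] += local
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b = [bj - lr * g for bj, g in zip(b, gb)]
        v = [vj - lr * g for vj, g in zip(v, gv)]
        c -= lr * gc
    return w, b, v, c

xs = [i / 20 for i in range(21)]
targets = [tree_predict(x) for x in xs]  # the tree's outputs act as soft labels
initial = init_student()
loss_before = mse(initial, xs, targets)
trained = distill(initial, xs, targets)
loss_after = mse(trained, xs, targets)
```

Because the student is differentiable, the same gradient step could in principle be applied later to newly arriving samples, which is the property a tree lacks.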

Considering that most data collected from IoT networks are numerical, this paper focuses on improving GBDT2NN. Specifically, we propose a more general framework, M-GBDT2NN, which handles regression, binary classification, and also multi-classification problems. Because the decision trees within one iteration are parallel to each other, we define one iteration of GBDT as the smallest unit to ensure the additivity of different units, which means that the trees of one iteration are distilled into one NN model. M-GBDT2NN also adopts the Leaf Embedding Distillation and Tree Grouping strategies to improve efficiency. To evaluate the performance of the proposed M-GBDT2NN, we conduct several experiments on publicly available datasets. The results demonstrate that our method performs better than some other methods on multi-classification problems and that it achieves online update well.
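A small sketch may help to show why the iteration is a convenient unit. It assumes, purely for illustration, that the trees are stored in a flat list ordered iteration by iteration with one tree per class inside each iteration, and it uses plain Python callables as stand-ins for trees or distilled models. The helper names `group_by_iteration`, `unit_output`, and `ensemble_output` are hypothetical.

```python
def group_by_iteration(trees, num_classes):
    """Split a flat tree list into units of `num_classes` trees (one iteration each)."""
    if len(trees) % num_classes != 0:
        raise ValueError("tree count must be a multiple of the number of classes")
    return [trees[i:i + num_classes] for i in range(0, len(trees), num_classes)]

def unit_output(unit, x):
    """One unit maps a sample to a vector with one score per class."""
    return [tree(x) for tree in unit]

def ensemble_output(units, x):
    """Units are additive: their score vectors are summed element-wise."""
    total = [0.0] * len(units[0])
    for unit in units:
        for k, score in enumerate(unit_output(unit, x)):
            total[k] += score
    return total

# Toy stand-ins for 2 iterations x 3 classes (made-up functions).
trees = [
    lambda x: 0.5 if x[0] > 0 else -0.5, lambda x: 0.25, lambda x: -0.25,
    lambda x: 1.0 if x[1] > 0 else -1.0, lambda x: 0.5, lambda x: -0.5,
]
units = group_by_iteration(trees, num_classes=3)
sample = [1.0, -1.0]
scores = ensemble_output(units, sample)
```

Since every unit returns a full vector of class scores, no class index has to be tracked per tree, and the vectors of different units can be added directly.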

The rest of this paper is organized as follows. Section 2 gives a brief introduction to the improvements of GBDT for online update and the existing approaches to multi-classification problems. Section 3 introduces the mechanism of GBDT for multi-classification problems and the framework of GBDT2NN. Section 4 describes the proposed framework M-GBDT2NN and its theoretical analysis. Section 5 presents the datasets and the experimental results. Section 6 concludes the paper and discusses some guidelines for future work.

Section snippets

The improvements of GBDT for online update

Many methods have been proposed to achieve the online update of GBDT. Some studies modified the structure of the decision tree to make the model suitable for streaming data [21], [22]. XGBoost [23] and LightGBM [24] kept the tree structures fixed and only updated the leaf outputs to realize online update. However, these methods are still tree-based, so it is difficult for them to break through some bottlenecks caused by the inherent structure.
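The "fixed structure, updated leaves" strategy can be pictured with a toy stump. The sketch below is not the XGBoost or LightGBM implementation: it assumes a squared-error setting in which a leaf's fresh value is the mean target of the new samples that fall into it, blended with the old value through a hypothetical `decay` factor. The names `leaf_index` and `refit_leaves` and the threshold are illustrative.

```python
def leaf_index(x, threshold=0.5):
    """Fixed split of a hypothetical stump: the structure never changes."""
    return 0 if x < threshold else 1

def refit_leaves(leaf_values, xs, ys, decay=0.9):
    """Blend old leaf outputs with the mean target of new samples per leaf."""
    sums = [0.0] * len(leaf_values)
    counts = [0] * len(leaf_values)
    for x, y in zip(xs, ys):
        i = leaf_index(x)
        sums[i] += y
        counts[i] += 1
    updated = []
    for old, s, n in zip(leaf_values, sums, counts):
        if n == 0:
            updated.append(old)  # no new evidence for this leaf
        else:
            updated.append(decay * old + (1.0 - decay) * (s / n))
    return updated

# Made-up new batch: two samples reach the left leaf, one the right leaf.
new_leaves = refit_leaves([1.0, -1.0], [0.1, 0.2, 0.9], [3.0, 1.0, -3.0], decay=0.5)
```

The split threshold stays untouched, so the model can only adapt as far as its existing partition of the feature space allows, which is the structural bottleneck mentioned above.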

Meanwhile, NNs have advantages

Preliminaries

For better understanding, this section introduces the mechanism of GBDT for multi-classification and the framework of GBDT2NN.

Smallest unit

As mentioned in Section 3.1, GBDT for multi-classification trains multiple decision trees at each iteration, and each decision tree predicts the probability of one class. Thus the results learned from the decision trees of different classes cannot be added. Taking the single decision tree as the smallest unit of distillation, as in GBDT2NN, would greatly increase the complexity of the subsequent learning process, because we must record the index of the class that the decision tree belongs to, and then the Tree

Experiments

As discussed in Section 4.4, M-GBDT2NN degenerates into GBDT2NN when dealing with regression and binary classification problems, and the performance of GBDT2NN has been demonstrated in [19]. Therefore, this section mainly focuses on verifying the generalization and online update performance of the proposed M-GBDT2NN on multi-classification problems. We conduct several experimental comparisons on four benchmark datasets. Specifically, Pendigit1

Conclusion

To better handle the extreme volumes of data collected from IoT networks, this paper proposes a more generalized framework for online update, M-GBDT2NN, based on GBDT2NN. M-GBDT2NN takes all decision trees at each iteration of GBDT as the smallest unit to ensure the additive relation between the distilled NNs. In addition, the output of M-GBDT2NN is a vector rather than a numerical value. We give the theoretical interpretation about the generalization and the online update ability of

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work is supported by The Science and Technology Project of State Grid “Security Protection Technology of Embedded Components and Control Units in Power System Terminal”, China (2019GW-12).

Jinchao Huang received the B.S. degree from the School of Electronic Information and Communications, Huazhong University of Science and Technology, Hubei, China, in 2015. She is currently pursuing her Ph.D. degree with the Department of Electronic Engineering, Shanghai Jiao Tong University. Her research interests include dimensionality reduction, machine learning, and recommendation systems.

References (36)

  • Calvo-Zaragoza J. et al. Improving kNN multi-label classification in prototype selection scenarios using class proposals. Pattern Recognit. (2015)
  • Luong N.C. et al. Data collection and wireless communication in internet of things (IoT) using economic analysis and pricing models: A survey. IEEE Commun. Surv. Tutor. (2016)
  • Shah S.H. et al. A survey: Internet of things (IoT) technologies, applications and challenges
  • Yaqoob I. et al. Internet of things architecture: Recent advances, taxonomy, requirements, and open challenges. IEEE Wirel. Commun. (2017)
  • Cramer J.S. The Origins of Logistic Regression (2002)
  • Suykens J.A. et al. Least squares support vector machine classifiers. Neural Process. Lett. (1999)
  • Duda R.O. et al. Pattern Classification and Scene Analysis, Vol. 3 (1973)
  • Friedman J.H. Greedy function approximation: a gradient boosting machine. Ann. Statist. (2001)

    Tong Li received the Master degree from Beijing Jiao Tong University, Beijing, China, in 2016. He is now working in the State Grid Liaoning Electric Power Research Institute, and he is mainly responsible for the security testing of Liaoning Company’s information system, the application and implementation of scientific and technological projects, and the technology supervision of information and communication.

    Yidong Yuan received the B.E degree in electronic science and technology from Tianjin University in 2005, and the M.S degree in microelectronics and solid-state electronics from Tianjin University, Tianjin, China, in 2007. He is currently studying for a Ph.D. in Microelectronics. His current research interests include IC design and security technology.

    Shenghong Li received the Ph.D. degree in radio engineering from the Beijing University of Posts and Telecommunications in 1999. Since 1999, he has been with Shanghai Jiao Tong University, as a Research Fellow, an Associate Professor, and a Professor, successively. In 2010, he was a Visiting Scholar with Nanyang Technological University, Singapore. His research interests include information security, signal and information processing, and artificial intelligence.
