Abstract
Because of concept drift, keeping a model up to date is a challenging task for most classification approaches currently used in data stream mining. Both incremental classifiers and ensemble classifiers spend most of their time updating their temporary models, and most of them also need a large sample buffer to train a classifier. These two drawbacks limit their further application to data stream classification. In this paper, we present a hormone-based nearest neighbor classification algorithm for data streams, in which the classifier is updated every time a new record arrives. Records are treated as locations in the feature space, and each location can accommodate only one endocrine cell. The classifier consists of endocrine cells located on the boundaries between classes. Every time a new record arrives, the cell residing in the least fit location moves to the newly arrived record. In this way, the changing boundaries between classes are recorded by the locations in which the endocrine cells reside. The main advantages of the proposed method are a reduced sample buffer and improved classification accuracy. This is important where hardware resources are expensive or main memory is limited. Experiments on synthetic and real-life data sets show that the proposed algorithm classifies data streams with less memory and lower classification error.
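The update rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define how fitness is measured, so the `unfitness` function below (distance from a cell to the nearest cell of another class) is an assumption chosen only to keep cells near class boundaries. The names `Cell`, `classify`, `unfitness`, and `update` are hypothetical, as is the choice that a relocated cell takes the label of the new record.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cell:
    """One endocrine cell: a location in feature space plus a class label."""
    location: Tuple[float, ...]
    label: str


def classify(cells: List[Cell], x: Tuple[float, ...]) -> str:
    """Nearest neighbor rule: return the label of the cell closest to x."""
    return min(cells, key=lambda c: math.dist(c.location, x)).label


def unfitness(cell: Cell, cells: List[Cell]) -> float:
    """ASSUMED fitness measure (not from the paper): a cell far from every
    cell of another class is far from the boundary, hence less fit."""
    gaps = [math.dist(cell.location, other.location)
            for other in cells if other.label != cell.label]
    return min(gaps) if gaps else math.inf


def update(cells: List[Cell], x: Tuple[float, ...], label: str) -> None:
    """Move the least fit cell to the newly arrived labeled record."""
    # Each location can accommodate only one cell.
    if any(c.location == x for c in cells):
        return
    worst = max(cells, key=lambda c: unfitness(c, cells))
    worst.location = x
    worst.label = label  # assumption: the moved cell adopts the record's label


# Four cells on a line: class "a" on the left, class "b" on the right.
cells = [Cell((0.0, 0.0), "a"), Cell((1.0, 0.0), "a"),
         Cell((4.0, 0.0), "b"), Cell((10.0, 0.0), "b")]
before = classify(cells, (3.0, 0.0))
update(cells, (2.5, 0.0), "a")  # the boundary drifts toward class "b"
after = classify(cells, (3.0, 0.0))
```

In this toy run the cell at (10, 0) is the farthest from any cell of the other class, so it is the one that relocates. The number of cells stays fixed, which is the property that lets this kind of classifier run without a growing sample buffer.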
Cite this article
Zhao, L., Wang, L. & Xu, Q. Data stream classification with artificial endocrine system. Appl Intell 37, 390–404 (2012). https://doi.org/10.1007/s10489-011-0334-8
DOI: https://doi.org/10.1007/s10489-011-0334-8