
Recurring concept memory management in data streams: exploiting data stream concept evolution to improve performance and transparency

Data Mining and Knowledge Discovery

Abstract

A data stream is a sequence of observations produced by a generating process which may evolve over time. In such a time-varying stream the relationship between input features and labels, or concepts, can change. Adapting to changes in concept is most often done by destroying and incrementally rebuilding the current classifier. Many systems additionally store and reuse previously built models to more efficiently adapt when stream conditions drift to a previously seen state. Reusing a model offers increased classification performance over rebuilding, and provides an indicator, or transparency, into the hidden state of the generating process. When only a subset of past models can be stored for reuse, for example due to memory constraints, the choice of which models to store for optimal future reuse is an important problem. Current methods of evaluating which models to store use valuation policies such as age, time since last use, accuracy and diversity. These policies are often not optimal, losing predictive performance by undervaluing complex models. We propose a new valuation policy based on advantage, the misclassifications avoided by reusing a model rather than training a new model, which more accurately reflects the true value of model storage. We evaluate our method on synthetic and real world data, including a real world air pollution dataset. Our results show accuracy increases of up to 6% using our valuation policy, while preserving transparency.
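The advantage policy described above can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the paper's implementation (the linked repository contains that): the model names, error counts, and dictionary layout are all hypothetical.

```python
# Minimal sketch of advantage-based model valuation: a stored model's value
# is the number of misclassifications avoided by reusing it instead of
# training a fresh model over the same evaluation window. All names and
# error counts below are illustrative assumptions, not values from the paper.

def advantage(stored_model_errors, fresh_model_errors):
    """Misclassifications avoided by reusing the stored model."""
    return fresh_model_errors - stored_model_errors

def select_models_to_keep(repository, capacity):
    """Under a memory constraint, keep the `capacity` highest-advantage models."""
    ranked = sorted(repository, key=lambda m: m["advantage"], reverse=True)
    return ranked[:capacity]

repository = [
    {"name": "model_A", "advantage": advantage(120, 180)},  # avoids 60 errors
    {"name": "model_B", "advantage": advantage(150, 160)},  # avoids 10 errors
    {"name": "model_C", "advantage": advantage(90, 175)},   # avoids 85 errors
]
kept = select_models_to_keep(repository, capacity=2)
print([m["name"] for m in kept])  # → ['model_C', 'model_A']
```

Note that, unlike age- or accuracy-based policies, this ranking can prefer a complex model whose absolute accuracy is modest but which would be expensive to relearn from scratch.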



Source code

The source code for the implementation can be found at https://github.com/BenHals/AdvantageMemManagement.

References

  • Ahmadi Z, Kramer S (2018) Modeling recurring concepts in data streams: a graph-based framework. Knowl Inf Syst 55(1):15–44


  • Alippi C, Boracchi G, Roveri M (2013) Just-in-time classifiers for recurrent concepts. IEEE Trans Neural Netw Learn Syst 24(4):620–634


  • Anderson R, Koh YS, Dobbie G (2016) CPF: concept profiling framework for recurring drifts in data streams. In: Australasian joint conference on artificial intelligence. Springer, pp 203–214

  • Baena-García M, del Campo-Ávila J, Fidalgo R, Bifet A, Gavaldà R, Morales-Bueno R (2006) Early drift detection method. In: Fourth international workshop on knowledge discovery from data streams, vol 6, pp 77–86

  • Bifet A, Gavaldà R (2007) Learning from time-changing data with adaptive windowing. In: Proceedings of the 2007 SIAM international conference on data mining. SIAM, pp 443–448

  • Bifet A, Holmes G, Kirkby R, Pfahringer B (2010) Moa: massive online analysis. J Mach Learn Res 11(May):1601–1604


  • Borchani H, Martínez AM, Masegosa AR, Langseth H, Nielsen TD, Salmerón A, Fernández A, Madsen AL, Sáez R (2015) Modeling concept drift: a probabilistic graphical model based approach. In: International symposium on intelligent data analysis. Springer, pp 72–83

  • Brzezinski D, Stefanowski J (2014) Combining block-based and online methods in learning ensembles from concept drifting data streams. Inf Sci 265:50–67


  • Chen K, Koh YS, Riddle P (2015) Tracking drift severity in data streams. In: Australasian joint conference on artificial intelligence. Springer, pp 96–108

  • Chiu CW, Minku LL (2018) Diversity-based pool of models for dealing with recurring concepts. In: 2018 international joint conference on neural networks (IJCNN). IEEE, pp 1–8

  • Domingos P, Hulten G (2000) Mining high-speed data streams. In: Proceedings of the sixth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD’00, pp 71–80, https://doi.org/10.1145/347090.347107

  • Gama J, Kosina P (2014) Recurrent concepts in data streams classification. Knowl Inf Syst 40(3):489–507


  • Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. In: Brazilian symposium on artificial intelligence. Springer, pp 286–295

  • Gama J, Žliobaitė I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44


  • Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfharinger B, Holmes G, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 106(9–10):1469–1495


  • Gomes JB, Menasalvas E, Sousa PA (2010) Tracking recurrent concepts using context. In: International conference on rough sets and current trends in computing. Springer, pp 168–177

  • Gonçalves PM Jr, De Barros RSM (2013) RCD: a recurring concept drift framework. Pattern Recognit Lett 34(9):1018–1025


  • Haque A, Khan L, Baron M (2016) SAND: semi-supervised adaptive novel class detection and classification over data stream. In: Proceedings of the thirtieth AAAI conference on artificial intelligence, AAAI’16. AAAI Press, pp 1652–1658, http://dl.acm.org/citation.cfm?id=3016100.3016130

  • Harries MB, Sammut C, Horn K (1998) Extracting hidden context. Mach Learn 32(2):101–126


  • Hosseini MJ, Ahmadi Z, Beigy H (2012) New management operations on classifiers pool to track recurring concepts. In: International conference on data warehousing and knowledge discovery. Springer, pp 327–339

  • Iwama K, Zhang G (2007) Optimal resource augmentations for online knapsack. In: Approximation, randomization, and combinatorial optimization: algorithms and techniques. Springer, pp 180–188

  • Jaber G, Cornuéjols A, Tarroux P (2013) Online learning: searching for the best forgetting strategy under concept drift. In: International conference on neural information processing. Springer, pp 400–408

  • Katakis I, Tsoumakas G, Vlahavas I (2010) Tracking recurring contexts using ensemble classifiers: an application to email filtering. Knowl Inf Syst 22(3):371–391


  • Kauschke S, Fürnkranz J (2018) Batchwise patching of classifiers. In: Proceedings of the thirty-second AAAI conference on artificial intelligence, (AAAI-18), the 30th innovative applications of artificial intelligence (IAAI-18), and the 8th AAAI symposium on educational advances in artificial intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, pp 3374–3381

  • Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174


  • Lu J, Liu A, Dong F, Gu F, Gama J, Zhang G (2018) Learning under concept drift: a review. IEEE Trans Knowl Data Eng 31(12):2346–2363


  • Montiel J, Read J, Bifet A, Abdessalem T (2018) Scikit-multiflow: a multi-output streaming framework. J Mach Learn Res 19(72):1–5. http://jmlr.org/papers/v19/18-251.html

  • Olivares G, Kachhara A, Longley I, Barraza F (2019) ODIN Arrowtown dataset. https://doi.org/10.6084/m9.figshare.97707381

  • Oza N (2011) FLTz flight simulator. https://c3.nasa.gov/dashlink/projects/42/resources/?type=ds

  • Parker BS, Khan L (2015) Detecting and tracking concept class drift and emergence in non-stationary fast data streams. In: Proceedings of the twenty-ninth AAAI conference on artificial intelligence, AAAI’15. AAAI Press, pp 2908–2913, http://dl.acm.org/citation.cfm?id=2888116.2888121

  • Shaker A, Senge R, Hüllermeier E (2013) Evolving fuzzy pattern trees for binary classification on data streams. Inf Sci 220:34–45


  • Sripirakas S, Pears R (2014) Mining recurrent concepts in data streams using the discrete Fourier transform. In: International conference on data warehousing and knowledge discovery. Springer, pp 439–451

  • Suárez-Cetrulo A, Cervantes A, Quintana D (2019) Incremental market behavior classification in presence of recurring concepts. Entropy 21(1):25


  • Webb GI, Hyde R, Cao H, Nguyen HL, Petitjean F (2016) Characterizing concept drift. Data Min Knowl Disc 30(4):964–994


  • Young J (2019) Rain in Australia. https://www.kaggle.com/jsphyg/weather-dataset-rattle-package

  • Zhu X (2010) Stream data mining repository. http://www.cse.fau.edu/~xqzhu/stream.html

  • Žliobaitė I, Bifet A, Read J, Pfahringer B, Holmes G (2015) Evaluation methods and decision theory for classification of streaming data with temporal dependence. Mach Learn 98(3):455–482



Funding

The work was supported by the Marsden Fund Council from New Zealand Government funding (Project ID 18-UOA-005), managed by Royal Society Te Apārangi.

Author information

Correspondence to Ben Halstead.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Responsible editor: Ira Assent, Carlotta Domeniconi, Aristides Gionis, Eyke Hüllermeier.


Appendices

Appendix A: Description of datasets

We have three important considerations for selecting real world data streams. The first is that they are proper data streams, with each observation following the previous observation in time, preventing information leakage. The second is the presence of concept drift as well as recurrences in concept. Recurrences are needed to ensure that repository models are actually reused; otherwise no differences in performance due to valuation policy will emerge. The third is a large number of distinct concepts. We perform evaluation by restricting repository size, so we need at least as many concepts appearing in a stream as our maximum repository size. This poses problems, as standard datasets used in other works often do not satisfy these criteria. For example, a common evaluation set is the ‘Poker Hands’ set, where 5 card poker hands are classified by hand type. This set is not a stream and does not feature drift, so we choose not to evaluate against it. Another is the ‘Electricity’ set, featuring electricity prices in the Australian state of New South Wales. This is a time based stream, and it features a single significant concept drift, making it a standard concept drift testing stream. However, the lack of concept recurrence means that a repository, and thus a valuation policy, has minimal effect on accuracy. Because of this, testing against it produces no interesting results.

We used four real world datasets used in other works as well as one new dataset in which we see the potential for a large number of concepts as well as concept recurrence.

  • Airlines: This dataset is included in the MOA framework (Bifet et al. 2010). It contains 539,383 observations of commercial flights within the US between October 1987 and April 2008, with 7 attributes and one label of two possible classes. Each label signifies whether or not a flight was delayed, with the features used to predict being Unique Carrier, Flight Number, Actual Elapsed Time, Origin, Destination, Distance and Diverted. The length of the dataset, as well as the possibility of underlying contexts, make it likely to contain a number of recurring concepts.

  • PowerSupply (Zhu 2010): Observations consist of hourly readings from an Italian electricity company, with 2 attributes (combinations of power supply sources) and 24 label classes, one for each hour of the day. The dataset contains 29,928 observations. The target of this dataset is to predict the hour a current power supply combination comes from. Concept drift in this set comes from underlying changes in season, weather and day of the week.

  • AusRain (Young 2019): Observations consist of daily meteorological information from a selection of Australian weather stations, with a classification task to predict whether or not it will rain the next day. Each of the 142,000 observations contains 24 features such as temperature, humidity, wind speed and pressure with a binary target for rain the following day. We hypothesize concepts here will be driven by underlying weather conditions, such as seasons or large scale events.

  • NasaFlight (Oza 2011): Observations consist of 30 sensor readings taken during simulated flights. Sensor readings correspond to quantities such as rudder angle, degree of yaw and velocity. These flights were randomly simulated using the “FLTz” software created by the Intelligent Flight Control (IFC) group at NASA ARC. The dataset contains 25,043 observations, which were discretized. The classification task is to predict an increase or decrease in the moving average of the velocity feature. Concepts in this dataset come from differences in flight pattern, for example, take off, cruising, descent and landing.

  • WindM (Olivares et al. 2019): Observations consist of wood smoke pollution levels from sensors distributed throughout a small town. We consider the \(PM_{2.5}\) reading, the levels of particles in the air smaller than 2.5 micrometers. These are considered the most dangerous. Not all sensors contributing to this dataset are active at every time point. We select 9 which are active for a contiguous 22,626 sample portion of the stream. The serial numbers for these sensors were: ‘ODIN-0175 (81992)’, ‘ODIN-0026 (90358)’, ‘ODIN-0016 (90259)’, ‘ODIN-0014 (90234)’, ‘ODIN-0063 (90002)’, ‘ODIN-0178 (81976)’, ‘ODIN-0179 (81968)’, ‘ODIN-0177 (81943)’, ‘ODIN-0183 (81877)’.

    The sensor at the center of this geographic area, ‘ODIN-0175 (81992)’, was chosen as the target sensor. Readings from this sensor were lagged by one timestep and categorized as increase, decrease or no change. The classification task is then to predict the change in target sensor readings for the next time step, given the readings from surrounding sensors. Concepts in this dataset come from a wide range of causes; we hypothesize the largest impacts will be from wind speed, wind direction and location of smoke source.
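The target construction described above, lagging the target sensor by one timestep and labeling the direction of change, can be sketched as follows. This is an illustrative reconstruction; the PM2.5 values and the exact tie handling are assumptions, not taken from the dataset.

```python
# Sketch of the WindM target construction: compare each target-sensor reading
# with the next one and label the change as increase, decrease or no change.
# The readings below are made-up illustrative PM2.5 values.

def label_changes(readings):
    """Return one direction-of-change label per consecutive pair of readings."""
    labels = []
    for prev, curr in zip(readings, readings[1:]):
        if curr > prev:
            labels.append("increase")
        elif curr < prev:
            labels.append("decrease")
        else:
            labels.append("no change")
    return labels

pm25 = [12.0, 14.5, 14.5, 9.8]
print(label_changes(pm25))  # → ['increase', 'no change', 'decrease']
```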

Fig. 14 Sensitivity for filter threshold

Appendix B: Parameter sensitivity

We introduce two new parameters: a filtering threshold on past accuracy, which models must score above to be considered, and a Kappa Agreement threshold for merging models which refer to the same concept. The filter threshold sets how far models can be from ‘normal’ operation before we conclude they do not represent a concept, i.e., a threshold value of 0.8 means a model is filtered out if it scores below 80% of its normal accuracy. We take this to indicate the period being tested is not a recurrence, even if the model is still relatively accurate. Sensitivity of accuracy to this threshold is shown in Fig. 14. We can see that a threshold set too low displays a constant accuracy level. This effect occurs because the filter is too relaxed and behaviour is equal to the standard method of not filtering. On the other hand, setting the threshold too high filters out too many models and accuracy falls sharply. We found a value of 0.85 to work well on all datasets tested, filtering out enough models to provide a benefit but not so many as to prevent models from being reused.
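The filter test above reduces to a single comparison against a fraction of a model's recorded 'normal' accuracy. A minimal sketch, with the accuracy figures and function name being illustrative assumptions:

```python
# Sketch of the accuracy filter: a stored model is only considered for reuse
# if its accuracy on the current window is at least `threshold` times the
# 'normal' accuracy it achieved while active. 0.85 is the value the paper
# reports working well; the example accuracies are made up.

def passes_filter(current_accuracy, normal_accuracy, threshold=0.85):
    """True if the model is close enough to normal operation to be considered."""
    return current_accuracy >= threshold * normal_accuracy

# A model that normally scores 0.90 accuracy must now score >= 0.765:
print(passes_filter(0.80, 0.90))  # → True
print(passes_filter(0.70, 0.90))  # → False
```

Note the filter is relative, so a model for an intrinsically hard concept is not penalised for a low absolute accuracy, only for a drop from its own baseline.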

Fig. 15 Sensitivity for Kappa agreement threshold

The Kappa Agreement threshold sets a limit for how similar the predictions made by two models must be before we merge them. Merging two models which do not refer to the same concept can destroy some predictive ability on either concept, and can reduce the transparency benefits gained from reuse. Kappa Agreement measures how similar two labeling functions are by comparing the predictions they make on a shared set of inputs, in our case the warning period when a drift is detected. A Kappa Agreement value above 0.8 indicates it is highly likely the labeling functions are similar (Landis and Koch 1977). The sensitivity analysis in Fig. 15 shows that any value above 0.7 does not have a significant impact on accuracy, so we selected 0.9 to be conservative in retaining transparency.

Fig. 16 Number of false alarms and average number of observations between a drift and its detection

Appendix C: False alarm and detection delay

Figure 16 shows the number of false alarm drift detections and the average number of observations between a drift and the first detection following it. This information is also shown in Table 3; however, here all repository sizes tested are shown. As repository size increases, the number of false alarms and the average drift detection delay decrease more under our policies (#E, AAC and EP) than under the baselines.
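The two metrics in Fig. 16 can be computed as below. This is a generic sketch, not the paper's evaluation code: the matching rule (a detection counts as a false alarm unless it falls within `max_delay` observations after an as-yet-undetected drift) and all positions are assumptions.

```python
# Sketch of false-alarm and detection-delay metrics over known drift points.
# `drifts` and `detections` are observation indices; both lists are sorted.
# The max_delay cutoff and the example positions are illustrative assumptions.

def detection_metrics(drifts, detections, max_delay=500):
    """Return (false_alarms, average detection delay in observations)."""
    delays, matched = [], set()
    false_alarms = 0
    for det in detections:
        # Most recent drift at or before this detection that is not yet matched.
        candidates = [d for d in drifts if d <= det and d not in matched]
        if candidates and det - candidates[-1] <= max_delay:
            matched.add(candidates[-1])
            delays.append(det - candidates[-1])
        else:
            false_alarms += 1
    avg_delay = sum(delays) / len(delays) if delays else None
    return false_alarms, avg_delay

drifts = [1000, 3000, 5000]
detections = [1040, 2200, 3015, 5095]
print(detection_metrics(drifts, detections))  # → (1, 50.0)
```

In the example, the detection at 2200 matches no recent drift and counts as a false alarm, while the other three yield delays of 40, 15 and 95 observations.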


Cite this article

Halstead, B., Koh, Y.S., Riddle, P. et al. Recurring concept memory management in data streams: exploiting data stream concept evolution to improve performance and transparency. Data Min Knowl Disc 35, 796–836 (2021). https://doi.org/10.1007/s10618-021-00736-w
