Prediction of time series using an analysis filter bank of LSTM units

https://doi.org/10.1016/j.cie.2021.107371

Highlights

  • A combination of LSTM and convolutional layers in an architecture similar to a filter bank is proposed.

  • The proposed architecture performs better when the time series is noisy.

  • The frequency responses of the filters in the convolutional layer resemble a complementary filter bank response.

Abstract

Time series emerge in various applications such as financial and production data; however, most of the generated data exhibit nonlinear inter-dependency between samples as well as noise, making it necessary to develop methods capable of handling such nonlinearities and other abnormalities. In this paper we present an architecture for the prediction of time series embedded in noise. The proposed architecture combines convolutional and long short-term memory (LSTM) layers into a structure similar to a two-channel analysis filter bank. The first element of each channel is a convolutional layer, followed by an LSTM, which is able to find temporal dependencies in the signal. Finally, the outputs of the channels are summed to obtain a prediction. We found that the frequency responses of the filters resemble a complementary filter bank response, with each channel having a maximum at a different band, which suggests that the network characterizes the incoming signal in frequency. Comparisons with other methods demonstrate that the proposed method offers much better results in terms of several error measures.

Introduction

Time series arise in several branches of science and technology, including financial data, sensor networks (Tulone and Madden, 2006), weather records, industrial observations, and many other sources. However, most of the data generated by these applications exhibit nonlinear inter-dependency between samples, and measurements are often contaminated by noise coming from the sensor or the environment in which the measurements were made. As a result, nonlinear approaches are often required for analysis and forecasting. The methods used must also be robust to the noise and outliers that contaminate the data, making it crucial to develop methods capable of handling such defects and nonlinearities.

Several algorithms for time series analysis and forecasting have been proposed in the literature, as pointed out in (Längkvist et al., 2014, Rahimi and Khashei, 2018); conventional approaches include autoregressive models such as ARMA and ARIMA (Chatfield, 2016), and hidden Markov models (Juang and Rabiner, 1991). However, with the recent interest in deep neural networks (Krizhevsky et al., 2012), the exploration of time series forecasting methods using new network architectures has increased, especially the use of LSTMs (Hochreiter and Schmidhuber, 1997, Rodriguez et al., 2018), because of their ability to capture long-term and short-term dependencies in a sequence. This type of network has been successfully applied to problems in natural language processing (Gers and Schmidhuber, 2001, Liu et al., 2015), recognition of handwritten sequences (Graves et al., 2009), and electric power forecasting (Gensler et al., 2016).

However, modeling long sequences, such as documents or physiological signals, requires that the LSTM network maintain dependencies between elements of the series over long periods, and some important features can be lost in the process (Liu et al., 2015).

One approach to overcoming this problem is multiscale analysis (Soltani, 2002, Costa et al., 2002, Ferreira et al., 2006, Liu et al., 2015), which allows important features to be analyzed at multiple time scales and shortens the intervals over which dependencies must be maintained. To this end, the time series is decomposed into a hierarchy of new time series that are easier to model and predict, separating the fast dynamics from the slow ones and facilitating the analysis of long-range correlations (Ercolano et al., 2017).
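The decomposition described above can be sketched with a simple two-scale split, in which a moving-average smoother extracts the slow dynamics and the residual carries the fast ones (a minimal illustration with an assumed window width, not the paper's exact decomposition):

```python
import numpy as np

def two_scale(x, width=4):
    """One level of a simple multiscale split: a smoothed (slow)
    component plus the residual (fast) detail; their sum recovers x."""
    kernel = np.ones(width) / width
    smooth = np.convolve(x, kernel, mode="same")
    detail = x - smooth
    return smooth, detail

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(256))   # a random-walk "time series"
smooth, detail = two_scale(x)             # scale 1: fast vs. slow dynamics
smooth2, detail2 = two_scale(smooth)      # scale 2: even slower dynamics
```

Each level of the hierarchy can then be modeled separately, so no single model has to track dependencies across the full length of the series.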

In this work we propose a multiscale network based on an LSTM architecture as the main prediction element, preceded by convolutional layers. We expect that, in this configuration, the filters of the convolutional layers make the network more robust to noise and outliers in the data. LSTMs have been combined with convolutional networks before; for example, in (Xingjian et al., 2015) the network topology is adapted to two-dimensional data or images, and in (Oh et al., 2018) LSTM and convolutional layers are used to classify arrhythmias. In contrast, in this work a convolutional network is introduced as a filter whose coefficients adapt to the bandwidth of a one-dimensional signal. Another related work is (Kim and Cho, 2019), which focuses on predicting residential energy consumption using multiple inputs to exploit other data that could help in the prediction, such as, in their case, voltage, intensity, and sub-metering; here, however, we are interested in single-input time series.
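As a rough sketch of this idea (not the paper's exact configuration; the hidden size, kernel length, and random, untrained weights below are purely illustrative), each channel convolves the input with a filter, feeds the filtered sequence to an LSTM, and the channel outputs are summed into a single prediction:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(seq, Wx, Wh, b):
    """Run an LSTM with hidden size H over a scalar sequence.
    Wx: (4H,), Wh: (4H, H), b: (4H,) stack the i, f, o, g gate weights."""
    H = Wh.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for xt in seq:
        z = Wx * xt + Wh @ h + b
        i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
        o, g = sigmoid(z[2 * H:3 * H]), np.tanh(z[3 * H:])
        c = f * c + i * g             # cell state update
        h = o * np.tanh(c)            # hidden state
    return h                          # final hidden state summarizes the sequence

rng = np.random.default_rng(0)
H, K = 4, 5                           # hidden size and kernel length (illustrative)
x = np.sin(np.linspace(0, 6 * np.pi, 64)) + 0.1 * rng.standard_normal(64)

pred = 0.0
for _ in range(2):                    # two filter-bank channels
    kernel = rng.standard_normal(K) / K              # conv filter (learned in practice)
    Wx = 0.1 * rng.standard_normal(4 * H)
    Wh = 0.1 * rng.standard_normal((4 * H, H))
    b = np.zeros(4 * H)
    w_out = 0.1 * rng.standard_normal(H)
    filtered = np.convolve(x, kernel, mode="valid")  # convolutional front end
    h = lstm_forward(filtered, Wx, Wh, b)            # temporal dependencies
    pred += w_out @ h                                # channel outputs are summed
```

In training, the convolution kernels and LSTM weights would be fitted jointly by backpropagation; the frequency responses of the learned kernels are what the paper later compares to a complementary filter bank.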

The rest of the document is organized as follows: Section 2 offers an introduction to neural networks, Section 3 explains the proposed methodology, Section 4 presents the results obtained, and finally Section 5 offers conclusions and future directions.

Section snippets

Neural networks

This section offers a brief introduction to the network architectures used in the proposed scheme. For a more detailed treatment, the reader may consult (LeCun et al., 2015, Goodfellow et al., 2016).

Proposed scheme

Given a time series, represented by X = {x(1), x(2), …, x(n)}, the prediction problem consists of obtaining a future value x(n+1), a challenging task due to undesirable disturbances in the data such as noise or outliers. A common way of dealing with such unwanted local fluctuations is through the application of filters, particularly linear-phase finite impulse response filters (Chatfield, 2016). The classic filters used are generally of the moving-average type.
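The effect of such a moving-average filter can be illustrated directly (the window length and noise level below are arbitrary choices for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
clean = np.sin(t)
noisy = clean + 0.3 * rng.standard_normal(t.size)

kernel = np.ones(5) / 5.0                      # length-5 moving average (linear phase)
smoothed = np.convolve(noisy, kernel, mode="same")

mse_raw = np.mean((noisy - clean) ** 2)        # error before filtering
mse_smooth = np.mean((smoothed - clean) ** 2)  # error after filtering
```

Because the symmetric kernel has linear phase, the smoothed series is not shifted relative to the original, which matters when the filtered output feeds a predictor.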

In this work, we propose a network that

Results

In this section, the results of a series of experiments for validation and comparison of the proposed algorithm are presented. The methods evaluated are ARIMA, a dense feed-forward network (NN), a BiLSTM, and a simple LSTM network. The ARIMA model used consists of a fourth-order AR term and a seventh-order MA term; two non-seasonal differences were applied to approximate stationarity, and the coefficients were determined by a conjugate direction method (Powell, 1964). The NN consists of three
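For intuition about the autoregressive part of this baseline, an AR(p) fit can be sketched by least squares (this is not the conjugate-direction procedure or the full ARIMA model used in the comparison, just an illustrative autoregressive fit on synthetic data):

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of an AR(p) model:
    x[t] ≈ a[0]*x[t-1] + ... + a[p-1]*x[t-p]."""
    rows = np.array([x[t - p:t][::-1] for t in range(p, len(x))])
    coeffs, *_ = np.linalg.lstsq(rows, x[p:], rcond=None)
    return coeffs

rng = np.random.default_rng(0)
# simulate a stable AR(2) process: x[t] = 0.6 x[t-1] - 0.2 x[t-2] + noise
x = np.zeros(2000)
for t in range(2, len(x)):
    x[t] = 0.6 * x[t - 1] - 0.2 * x[t - 2] + 0.1 * rng.standard_normal()

a = fit_ar(x, 2)   # estimates should land close to (0.6, -0.2)
```

The two non-seasonal differences in the paper's ARIMA model correspond to fitting such a structure on `np.diff(x, n=2)` rather than on the raw series.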

Conclusions

An algorithm for the prediction of time series was presented, based on a combination of LSTM and convolutional networks. The architecture of the proposed network behaves similarly to a filter bank. In this case, the filters adapt to the signals according to the training set by adjusting the coefficients of the convolution layers. It is expected that each of the filters that make up the network extracts different characteristics of interest from the signal, so that later these characteristics can be

References (38)

  • M. Costa et al.

    Multiscale entropy analysis of complex physiologic time series

    Physical Review Letters

    (2002)
  • G. Ercolano et al.

    Two deep approaches for ADL recognition: A multi-scale LSTM and a CNN-LSTM with a 3D matrix skeleton representation

  • M.A. Ferreira et al.

    Multi-scale and hidden resolution time series models

    Bayesian Analysis

    (2006)
  • Gensler, A., Henze, J., Sick, B., & Raabe, N. (2016). Deep learning for solar power forecasting—an approach using...
  • F.A. Gers et al.

    LSTM recurrent networks learn simple context-free and context-sensitive languages

    IEEE Transactions on Neural Networks

    (2001)
  • I. Goodfellow et al.
    (2016)
  • A. Graves et al.

    A novel connectionist system for unconstrained handwriting recognition

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2009)
  • S. Hochreiter

    Untersuchungen zu dynamischen neuronalen netzen

    Diploma thesis, Technische Universität München

    (1991)
  • S. Hochreiter et al.

    Long short-term memory

    Neural Computation

    (1997)