Authors:
Georgios Dimitropoulos; Estela Papagianni and Vasileios Megalooikonomou
Affiliation:
University of Patras, Greece
Keyword(s):
Data Mining, HSLC Algorithm, Data Streams, Time Series, Classification, Lag Correlation, Incremental SVM Learning Algorithm, Stock Trend Prediction, Feature Extraction.
Abstract:
Time series data are ubiquitous, and their analysis requires effective data mining methods to support decision making. This paper studies two mining problems: lag correlation discovery and classification. For the first problem, a new lag correlation algorithm for time series, the Highly Sparse Lag Correlation (HSLC) algorithm, is proposed. This algorithm combines the Boolean Lag Correlation (BLC) and Hierarchical Boolean Representation (HBR) algorithms and aims to improve on the time performance of the Pearson Lag Correlation (PLC) algorithm. The classification algorithm employed for data streams is an incremental support vector machine (SVM) learning algorithm. To verify the effectiveness and efficiency of the proposed schemes, the lag correlation discovery algorithm is experimentally tested on electroencephalography (EEG) data, whereas the classification algorithm that operates on streams is tested on real financial data. The HSLC algorithm achieves better time performance than previous state-of-the-art methods such as the PLC algorithm, and the incremental SVM learning algorithm that we adopt improves on the accuracy achieved by non-incremental models.
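For readers unfamiliar with the lag correlation problem, the following is a minimal sketch of a brute-force search based on the Pearson correlation coefficient, which is the general idea behind a Pearson-style baseline. It is not the HSLC algorithm and is not taken from the paper; the function names `pearson` and `best_lag`, the `max_lag` parameter, and the synthetic sine-wave data are all illustrative assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; treated as 0 here (assumption).
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def best_lag(x, y, max_lag):
    """Return (lag, r) maximising |r|, assuming y trails x by `lag` samples."""
    best = (0, pearson(x, y))
    for lag in range(1, max_lag + 1):
        # Align x[t] with y[t + lag] by trimming the non-overlapping ends.
        r = pearson(x[:-lag], y[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Synthetic example (hypothetical data): y is x delayed by 5 samples.
x = [math.sin(0.3 * t) for t in range(200)]
y = [0.0] * 5 + x[:-5]
lag, r = best_lag(x, y, max_lag=20)
print(lag, r)
```

Each candidate lag costs a full pass over the series, so the search is linear in both the series length and `max_lag`. That cost is the kind of time performance a faster lag correlation method would aim to reduce.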