Abstract:
Sequence Series Data (SSD) is multi-dimensional data consisting of measurements taken over sequences that can be ordered. Such data is frequently encountered in genomic data sets and text sentiment analysis data sets, but collecting it is time-consuming and labour-intensive, so the available data sets are often low-resolution. We therefore employ six machine learning regression methods to perform SSD super-resolution, i.e. to recover high-resolution data sets by exploiting self-similarity in low-resolution data sets. We also propose a novel Long Short-Term Memory (LSTM) network, the Interaction Encoded LSTM (IELSTM) network, which is capable of handling multiple distant interactions among sequences. On four genomic data sets, the IELSTM network generally shows better overall reconstruction quality than ridge regression, LASSO regression, orthogonal matching pursuit regression, multilayer perceptron regression, and random forest regression.
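To make the idea of regression-based super-resolution concrete, the following is a minimal sketch of the ridge regression baseline only, not the IELSTM network. It assumes a one-dimensional series and a doubling of resolution, and it assumes that each missing sample can be predicted from its two nearest neighbours; the function names `ridge_fit` and `super_resolve` and the feature layout are hypothetical choices for illustration, not taken from the paper.

```python
def ridge_fit(X, y, lam=1e-3):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y."""
    d = len(X[0])
    # Normal equations with the ridge penalty added on the diagonal.
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, d):
            f = A[r][c] / A[c][c]
            for k in range(c, d):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    # Back substitution.
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(A[i][k] * w[k] for k in range(i + 1, d))) / A[i][i]
    return w


def super_resolve(low, lam=1e-3):
    """Double the resolution of `low` with a model trained on `low` itself."""
    # Self-similarity (assumed scheme): drop every other sample of `low`
    # and learn to predict each dropped sample from its two neighbours.
    idx = range(1, len(low) - 1, 2)
    X = [[low[i - 1], low[i + 1]] for i in idx]
    y = [low[i] for i in idx]
    w = ridge_fit(X, y, lam)
    # Apply the learned weights between every adjacent pair in `low`.
    high = []
    for left, right in zip(low, low[1:]):
        high.append(left)
        high.append(w[0] * left + w[1] * right)
    high.append(low[-1])
    return high
```

For a series of length n, `super_resolve` returns 2n - 1 values: the original samples interleaved with one predicted sample between each adjacent pair. The training pairs come from the low-resolution series alone, which is the sense in which the sketch relies on self-similarity.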
Date of Conference: 27 November 2017 - 01 December 2017
Date Added to IEEE Xplore: 05 February 2018