Abstract
We consider the problem of learning from a database in which each sample consists of several time series and a single response. We seek the maximum data reduction that preserves the predictive power of the original time series while still allowing reasonable reconstruction quality of the original signals. Each signal is decomposed into a set of wavelet features that are coded according to an importance score consisting of two terms: the first depends on the influence of the feature on the expected signal reconstruction error, and the second is determined by the feature's importance for predicting the response, calculated by building a series of boosted decision tree ensembles. We demonstrate that this combination maintains low signal distortion rates and, unlike unsupervised compression at the same reduction ratio, does not increase the prediction error.
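The abstract describes a two-term importance score for wavelet coefficients. The following is a minimal sketch of that idea, not the authors' implementation: it uses coefficient energy as a proxy for the reconstruction-error term, a single scikit-learn GradientBoostingRegressor as a stand-in for the paper's series of boosted tree ensembles, and an assumed mixing weight `alpha` that the paper does not specify.

```python
# Hedged sketch: rank wavelet features by a combined unsupervised/supervised
# importance score. `alpha`, the energy proxy, and the choice of a single
# GradientBoostingRegressor are assumptions, not the paper's exact method.
import numpy as np
import pywt
from sklearn.ensemble import GradientBoostingRegressor

def combined_feature_ranking(signals, response, wavelet="db4", level=3, alpha=0.5):
    """Rank wavelet coefficients of each signal by a combined score.

    signals:  array of shape (n_samples, n_points), one time series per row
    response: array of shape (n_samples,), the scalar response
    """
    # Unsupervised term: mean coefficient energy, a proxy for the increase
    # in expected reconstruction error if the coefficient were dropped.
    coeffs = [pywt.wavedec(s, wavelet, level=level) for s in signals]
    features = np.array([np.concatenate(c) for c in coeffs])  # (n_samples, n_feats)
    recon_term = np.mean(features ** 2, axis=0)
    recon_term /= recon_term.sum()

    # Supervised term: feature importances from a boosted tree ensemble
    # trained to predict the response from the wavelet features.
    gbt = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
    gbt.fit(features, response)
    pred_term = gbt.feature_importances_

    # Combined importance: a coefficient is kept if it matters for
    # reconstruction or for prediction; alpha trades off the two criteria.
    score = alpha * recon_term + (1 - alpha) * pred_term
    return np.argsort(score)[::-1]  # feature indices, most important first
```

A compressor built on this ranking would retain only the top-k coefficients per signal and zero out the rest, which is one plausible way to realize the reduction-ratio/prediction-error trade-off the abstract claims.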
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Eruhimov, V., Martyanov, V., Raulefs, P., Tuv, E. (2006). Combining Unsupervised and Supervised Approaches to Feature Selection for Multivariate Signal Compression. In: Corchado, E., Yin, H., Botti, V., Fyfe, C. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2006. IDEAL 2006. Lecture Notes in Computer Science, vol 4224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875581_58
DOI: https://doi.org/10.1007/11875581_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45485-4
Online ISBN: 978-3-540-45487-8