Essential Attributes Generation for Some Data Mining Tasks

Krawczak, Maciej; Szkatuła, Grażyna

doi:10.1007/978-3-319-07176-3_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8468)

Included in the following conference series:

International Conference on Artificial Intelligence and Soft Computing

Abstract

In this paper, we introduce a new approach, Essential Attributes Generation (EAG), for reducing the dimensionality of multidimensional real-valued data series. The approach builds a new representation of the original data from essential attributes generated by a multilayer neural network. EAG produces a vector of new real-valued attributes that forms a compressed representation of the original data. These attributes are synthetic and not directly interpretable, but they retain important features of the original data series. The approach has been applied to both classification and clustering tasks.
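
The abstract does not specify the network architecture or training procedure, so the following is only a minimal sketch of the general idea, assuming a one-hidden-layer bottleneck network trained to reconstruct its input, with the hidden activations read off as the compressed attributes. The function names (make_series, train_bottleneck_network, generate_attributes), the synthetic data, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def make_series(n_samples=200, length=20, seed=0):
    """Synthetic real-valued series (illustrative data, not from the paper)."""
    rng = np.random.default_rng(seed)
    t = np.arange(length)
    a = rng.normal(0.0, 0.3, (n_samples, 1))
    b = rng.normal(0.0, 0.3, (n_samples, 1))
    noise = rng.normal(0.0, 0.01, (n_samples, length))
    return a * np.sin(2 * np.pi * t / length) + b * np.cos(2 * np.pi * t / length) + noise


def train_bottleneck_network(X, n_attributes, epochs=500, lr=0.1, seed=1):
    """Train a tanh bottleneck network to reconstruct X with plain gradient descent.

    Returns the encoder parameters and the per-epoch mean squared error.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = X.shape
    W1 = rng.normal(0.0, 0.1, (n_inputs, n_attributes))
    b1 = np.zeros(n_attributes)
    W2 = rng.normal(0.0, 0.1, (n_attributes, n_inputs))
    b2 = np.zeros(n_inputs)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)   # hidden layer: candidate attributes
        Y = H @ W2 + b2            # linear reconstruction of the input
        err = Y - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared error, averaged over samples.
        gY = 2.0 * err / n_samples
        gW2 = H.T @ gY
        gb2 = gY.sum(axis=0)
        gZ = (gY @ W2.T) * (1.0 - H ** 2)   # derivative of tanh
        gW1 = X.T @ gZ
        gb1 = gZ.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1), losses


def generate_attributes(X, encoder):
    """Map each series to its compressed vector of real-valued attributes."""
    W1, b1 = encoder
    return np.tanh(X @ W1 + b1)


X = make_series()
encoder, losses = train_bottleneck_network(X, n_attributes=2)
attributes = generate_attributes(X, encoder)
print(attributes.shape, losses[0], losses[-1])
```

In this sketch each series of length 20 is replaced by 2 attributes, which could then be passed to any standard classifier or clustering routine in place of the raw series.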

Author information

Authors and Affiliations

Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447, Warsaw, Poland
Maciej Krawczak & Grażyna Szkatuła
Warsaw School of Information Technology, Newelska 6, 01-447, Warsaw, Poland
Maciej Krawczak

Editor information

Editors and Affiliations

Częstochowa University of Technology, Armii Krajowej 36, 42-200, Częstochowa, Poland
Leszek Rutkowski & Rafał Scherer
Częstochowa University of Technology, 42-200, Częstochowa, Poland
Marcin Korytkowski
AGH University of Science and Technology, Mickiewicza 30, 30-059, Kraków, Poland
Ryszard Tadeusiewicz
Computer Science Division, Department of Electrical Engineering and Computer Sciences, University of California Berkeley, 94720-1776, Berkeley, CA, USA
Lotfi A. Zadeh
Computational Intelligence Laboratory, Electrical and Computer Engineering, University of Louisville, 405 Lutz Hall, 40292, Louisville, KY, USA
Jacek M. Zurada

About this paper

Cite this paper

Krawczak, M., Szkatuła, G. (2014). Essential Attributes Generation for Some Data Mining Tasks. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2014. Lecture Notes in Computer Science, vol 8468. Springer, Cham. https://doi.org/10.1007/978-3-319-07176-3_5

DOI: https://doi.org/10.1007/978-3-319-07176-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07175-6
Online ISBN: 978-3-319-07176-3
eBook Packages: Computer Science, Computer Science (R0)
