Feature selection on probabilistic symbolic objects

Ziani, Djamal

doi:10.1007/s11704-014-3359-4

Research Article
Published: 23 October 2014

Volume 8, pages 933–947, (2014)

Frontiers of Computer Science

Djamal Ziani¹

Abstract

In data analysis tasks, we are often confronted with very high-dimensional data. Depending on the purpose of the study, feature selection finds and selects the relevant subset of the original features. Many feature selection algorithms have been proposed in classical data analysis, but very few in symbolic data analysis (SDA), an extension of classical data analysis that uses rich objects instead of simple matrices. Unlike the data used in classical data analysis, a symbolic object can describe not only individuals but also, most of the time, a cluster of individuals. In this paper we present an unsupervised feature selection algorithm on probabilistic symbolic objects (PSOs), with the purpose of discrimination. A PSO is a symbolic object that describes a cluster of individuals by modal variables, using a relative frequency distribution associated with each value. This paper presents new dissimilarity measures between PSOs, which are used as feature selection criteria, and explains how to reduce the complexity of the algorithm by using the discrimination matrix.
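
As a rough illustration of the ideas named in the abstract, the sketch below represents each PSO as a mapping from modal variables to relative frequency distributions, builds a discrimination matrix recording which pairs of PSOs each feature separates, and picks features greedily. It is not the paper's algorithm: the abstract does not state the new dissimilarity measures, so total variation distance is used as a stand-in, and the threshold, the greedy covering strategy, the toy data, and all function names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy data: each PSO maps a modal variable to a relative
# frequency distribution over that variable's categories.
PSOS = {
    "cluster_a": {
        "color": {"red": 0.8, "blue": 0.2},
        "size": {"small": 0.5, "large": 0.5},
        "shape": {"round": 0.9, "square": 0.1},
    },
    "cluster_b": {
        "color": {"red": 0.1, "blue": 0.9},
        "size": {"small": 0.5, "large": 0.5},
        "shape": {"round": 0.8, "square": 0.2},
    },
    "cluster_c": {
        "color": {"red": 0.7, "blue": 0.3},
        "size": {"small": 0.4, "large": 0.6},
        "shape": {"round": 0.1, "square": 0.9},
    },
}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Stand-in dissimilarity (an assumption, not the paper's measure).
    """
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def discrimination_matrix(psos, threshold):
    """Map each feature to the set of PSO pairs it separates by >= threshold."""
    features = sorted(next(iter(psos.values())))
    matrix = {f: set() for f in features}
    for a, b in combinations(sorted(psos), 2):
        for f in features:
            if total_variation(psos[a][f], psos[b][f]) >= threshold:
                matrix[f].add((a, b))
    return matrix


def select_features(psos, threshold=0.5):
    """Greedily pick features until every separable pair is covered."""
    matrix = discrimination_matrix(psos, threshold)
    uncovered = set().union(*matrix.values())
    selected = []
    while uncovered:
        # Ties are broken alphabetically because max keeps the first maximum.
        best = max(sorted(matrix), key=lambda f: len(matrix[f] & uncovered))
        selected.append(best)
        uncovered -= matrix[best]
    return selected


if __name__ == "__main__":
    print(select_features(PSOS))
```

On this toy data the "size" variable separates no pair at the chosen threshold, so it is never selected. Because the dissimilarities are computed once when the matrix is built, the selection loop only performs set operations, which is one plausible reading of how a discrimination matrix can reduce the cost of the search.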

Author information

Authors and Affiliations

Information Systems Department, College of Computer and Information Sciences, King Saud University, Riyadh, 11543, Saudi Arabia
Djamal Ziani

Corresponding author

Correspondence to Djamal Ziani.

Additional information

Djamal Ziani has been an assistant professor in the College of Computer and Information Sciences (CCIS), King Saud University, Saudi Arabia, since 2009. He is a researcher in ERP and in the data management group of CCIS, King Saud University. He received his MS degree in computer science from the University of Valenciennes, France, in 1992 and his PhD degree in computer science from the University of Paris Dauphine, France, in 1996. He was a researcher in the CLOREC project at INRIA Rocquencourt, France, from 1992 to 1996, and a postdoctoral researcher in the Department of Computer Science and Operations Research at the University of Montreal, Canada, from 1997 to 1998. From 1998 to 2009 he worked as a consultant and project manager for many companies in Canada.

About this article

Cite this article

Ziani, D. Feature selection on probabilistic symbolic objects. Front. Comput. Sci. 8, 933–947 (2014). https://doi.org/10.1007/s11704-014-3359-4

Received: 24 September 2013
Accepted: 10 June 2014
Published: 23 October 2014
Issue Date: December 2014
DOI: https://doi.org/10.1007/s11704-014-3359-4
