A PageRank-Based Method to Extract Fuzzy Expressions as Features in Supervised Classification Problems

Carmona, Pablo; Castro, Juan Luis; Lozano, Jesús; Suárez, José Ignacio

doi:10.1007/978-3-030-00374-6_15

Conference paper
First Online: 27 September 2018

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11160)

Abstract

This work presents a new ranking method inspired by PageRank to reduce the dimensionality of the feature space in supervised classification problems. More precisely, because it relies on a weighted directed graph, it is ultimately inspired by TextRank, a PageRank-based method that adds weights to the edges to express the strength of the connections between nodes. The method divides each original feature used to describe the training set into a set of fuzzy predicates and then ranks all of them by their ability to differentiate among classes in the light of this training set. The fuzzy predicates with the best scores can then be used as new features, replacing the original ones. The novelty of the proposal lies in being an approach halfway between feature selection and feature extraction: it can improve the discrimination ability of the original features while preserving the interpretability of the new features, in the sense that they are fuzzy expressions. Preliminary results support the suitability of the proposal.
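
As a rough illustration of the two ingredients named in the abstract, the following Python sketch (1) splits a numeric feature into fuzzy predicates and (2) ranks the nodes of a weighted directed graph with the TextRank-style weighted PageRank update. The uniform triangular partition, the function names `triangular_partition` and `weighted_pagerank`, the damping factor 0.85, and above all the toy edge weights are assumptions made for illustration. The abstract does not say how the paper builds its graph or computes its weights, so this is not the authors' method.

```python
import numpy as np


def triangular_partition(x, n_labels=3):
    """Membership degrees of the values x in n_labels triangular fuzzy sets.

    Assumption: a uniform partition over [min(x), max(x)] with n_labels >= 2.
    Returns an array of shape (len(x), n_labels).
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if n_labels < 2 or hi == lo:
        raise ValueError("need n_labels >= 2 and a non-constant feature")
    centers = np.linspace(lo, hi, n_labels)
    width = (hi - lo) / (n_labels - 1)
    # Membership is 1 at a set's center and falls linearly to 0 one width away.
    return np.clip(1.0 - np.abs(x[:, None] - centers[None, :]) / width, 0.0, 1.0)


def weighted_pagerank(W, d=0.85, tol=1e-10, max_iter=1000):
    """TextRank-style scores for a weighted directed graph.

    W[j, i] is the weight of the edge j -> i. Each node passes its score to
    its successors in proportion to the outgoing edge weights.
    """
    W = np.asarray(W, dtype=float)
    out_sum = W.sum(axis=1, keepdims=True)
    # Normalise outgoing weights; nodes with no outgoing edges pass nothing on.
    P = np.divide(W, out_sum, out=np.zeros_like(W), where=out_sum > 0)
    scores = np.ones(W.shape[0])
    for _ in range(max_iter):
        new_scores = (1.0 - d) + d * (P.T @ scores)
        if np.abs(new_scores - scores).max() < tol:
            return new_scores
        scores = new_scores
    return scores


# Hypothetical feature values split into three predicates (e.g. low/medium/high).
memberships = triangular_partition([0.0, 2.5, 10.0], n_labels=3)

# Toy graph over three predicates; the weights are made up for this example.
toy_weights = [[0, 1, 1],
               [3, 0, 1],
               [3, 1, 0]]
toy_scores = weighted_pagerank(toy_weights)
ranking = np.argsort(-toy_scores)  # predicate indices, best score first
print(memberships)
print(toy_scores, ranking)
```

In a pipeline of this kind, the top-ranked predicates would then supply the new feature columns (their membership degrees), which is what keeps the resulting features readable as fuzzy expressions.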

Notes

1. In the case of categorical features, each value of the feature will be represented by a fuzzy singleton.

References

Cai, J., Luo, J., Wang, S., Yang, S.: Feature selection in machine learning: a new perspective. Neurocomputing 300, 70–79 (2018)
Dheeru, D., Karra Taniskidou, E.: UCI machine learning repository (2017). http://archive.ics.uci.edu/ml
Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley, New York (2000)
Gupta, P., Goel, A., Lin, J., Sharma, A., Wang, D., Zadeh, R.: WTF: the who to follow service at Twitter. In: Proceedings of the 22nd International Conference on World Wide Web - WWW 2013. ACM Press (2013)
Khalid, S., Khalil, T., Nasreen, S.: A survey of feature selection and feature extraction techniques in machine learning. In: 2014 Science and Information Conference. IEEE, August 2014
Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R.P., Tang, J., Liu, H.: Feature selection. ACM Comput. Surv. 50(6), 1–45 (2017)
Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing - EMNLP 2004, July 2004
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. Technical report 1999–66, Stanford InfoLab, November 1999. http://ilpubs.stanford.edu:8090/422/. Previous number = SIDL-WP-1999-0120
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Peng, H., Long, F., Ding, C.: Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1226–1238 (2005)
Robnik-Šikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53(1), 23–69 (2003)


Acknowledgments

This work has been co-funded by the grants to research groups, the IB16048 Project (Government of Extremadura), and the NanoSen-AQM Project (SOE2/P1/E0569).

Author information

Authors and Affiliations

Industrial Engineering School, University of Extremadura, Badajoz, Spain
Pablo Carmona, Jesús Lozano & José Ignacio Suárez
Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
Juan Luis Castro

Corresponding author

Correspondence to Pablo Carmona.

Editor information

Editors and Affiliations

Andalusian Research Institute on Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
Francisco Herrera
Andalusian Research Institute on Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
Sergio Damas
Andalusian Research Institute on Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
Rosana Montes
Andalusian Research Institute on Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
Sergio Alonso
Andalusian Research Institute on Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
Óscar Cordón
Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
Antonio González
School of Engineering, Pablo de Olavide University, Seville, Spain
Alicia Troncoso

About this paper

Cite this paper

Carmona, P., Castro, J.L., Lozano, J., Suárez, J.I. (2018). A PageRank-Based Method to Extract Fuzzy Expressions as Features in Supervised Classification Problems. In: Herrera, F., et al. Advances in Artificial Intelligence. CAEPIA 2018. Lecture Notes in Computer Science, vol 11160. Springer, Cham. https://doi.org/10.1007/978-3-030-00374-6_15

DOI: https://doi.org/10.1007/978-3-030-00374-6_15
Published: 27 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00373-9
Online ISBN: 978-3-030-00374-6
eBook Packages: Computer Science, Computer Science (R0)
