A PageRank-Based Method to Extract Fuzzy Expressions as Features in Supervised Classification Problems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 11160))

Abstract

This work presents a new ranking method inspired by PageRank to reduce the dimensionality of the feature space in supervised classification problems. More precisely, because it relies on a weighted directed graph, it is ultimately inspired by TextRank, a PageRank-based method that adds weights to the edges to express the strength of the connections between nodes. The method divides each original feature used to describe the training set into a set of fuzzy predicates and then ranks all of them by their ability to differentiate among classes in light of this training set. The fuzzy predicates with the best scores can then be used as new features, replacing the original ones. The novelty of the proposal lies in being halfway between feature selection and feature extraction: it can improve the discrimination ability of the original features while preserving the interpretability of the new features, since they are fuzzy expressions. Preliminary results support the suitability of the proposal.
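The pipeline outlined in the abstract (fuzzy predicates per feature, then a weighted PageRank-style ranking) might be sketched roughly as follows. The abstract does not say how the graph or its edge weights are built, so everything specific here is an assumption made for illustration: the three triangular predicates per feature, the `purity` measure, and the rule that every predicate links to every other predicate with a weight equal to the target's purity. Only the score update follows the weighted formula published for TextRank. All function names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Membership = Callable[[float], float]


def triangular(a: float, b: float, c: float) -> Membership:
    """Triangular membership function with support (a, c) and peak at b."""
    def mu(x: float) -> float:
        if x == b:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 0.0
    return mu


def fuzzy_predicates(name: str, values: Sequence[float]) -> Dict[str, Membership]:
    """ASSUMPTION: three evenly spaced predicates (low, medium, high) per feature."""
    lo, hi = min(values), max(values)
    mid, half = (lo + hi) / 2.0, (hi - lo) / 2.0
    return {
        f"{name} is low": triangular(lo - half, lo, mid),
        f"{name} is medium": triangular(lo, mid, hi),
        f"{name} is high": triangular(mid, hi, hi + half),
    }


def purity(mu: Membership, values: Sequence[float], labels: Sequence[str]) -> float:
    """ASSUMPTION: share of a predicate's total membership held by its dominant class."""
    mass: Dict[str, float] = {}
    for x, y in zip(values, labels):
        mass[y] = mass.get(y, 0.0) + mu(x)
    total = sum(mass.values())
    return max(mass.values()) / total if total > 0 else 0.0


def weighted_pagerank(nodes: List[str], weights: Dict[Tuple[str, str], float],
                      d: float = 0.85, max_iter: int = 100,
                      tol: float = 1e-9) -> Dict[str, float]:
    """TextRank-style update: WS(i) = (1 - d) + d * sum_j w_ji / sum_k w_jk * WS(j)."""
    out_sum = {n: 0.0 for n in nodes}
    for (src, _dst), w in weights.items():
        out_sum[src] += w
    score = {n: 1.0 for n in nodes}
    for _ in range(max_iter):
        new = {n: 1.0 - d for n in nodes}
        for (src, dst), w in weights.items():
            if out_sum[src] > 0:  # nodes without outgoing weight pass nothing on
                new[dst] += d * w / out_sum[src] * score[src]
        delta = max(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


def rank_predicates(features: Dict[str, Sequence[float]],
                    labels: Sequence[str]) -> List[Tuple[str, float]]:
    """Rank all fuzzy predicates, best first, under the assumptions above."""
    quality: Dict[str, float] = {}
    for name, values in features.items():
        for pred, mu in fuzzy_predicates(name, values).items():
            quality[pred] = purity(mu, values, labels)
    nodes = list(quality)
    # ASSUMPTION: complete directed graph; the edge p -> q carries the purity of q.
    weights = {(p, q): quality[q] for p in nodes for q in nodes if p != q}
    scores = weighted_pagerank(nodes, weights)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: x1 separates the two classes, x2 does not.
toy_features = {"x1": [0.0, 0.1, 0.2, 0.8, 0.9, 1.0],
                "x2": [0.0, 0.5, 1.0, 0.0, 0.5, 1.0]}
toy_labels = ["a", "a", "a", "b", "b", "b"]
for predicate, score in rank_predicates(toy_features, toy_labels):
    print(f"{predicate}: {score:.3f}")
```

On this toy data the two extreme predicates of `x1` should rank above the rest, because each fires for one class only. Selecting the top-scoring predicates as new features would then be a matter of slicing the returned list.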

Notes

  1. In the case of categorical features, each value of the feature will be represented by a fuzzy singleton.

Acknowledgments

This work has been co-funded by the grants to research groups, the IB16048 Project (Government of Extremadura), and the NanoSen-AQM Project (SOE2/P1/E0569).

Author information

Corresponding author

Correspondence to Pablo Carmona.

Copyright information

© 2018 Springer Nature Switzerland AG

Cite this paper

Carmona, P., Castro, J.L., Lozano, J., Suárez, J.I. (2018). A PageRank-Based Method to Extract Fuzzy Expressions as Features in Supervised Classification Problems. In: Herrera, F., et al. Advances in Artificial Intelligence. CAEPIA 2018. Lecture Notes in Computer Science, vol 11160. Springer, Cham. https://doi.org/10.1007/978-3-030-00374-6_15

  • DOI: https://doi.org/10.1007/978-3-030-00374-6_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00373-9

  • Online ISBN: 978-3-030-00374-6

  • eBook Packages: Computer Science, Computer Science (R0)