Novel Fuzzy Correlation Coefficient and Variable Selection Method for Fuzzy Regression Analysis Based on Distance Approach

International Journal of Fuzzy Systems

Abstract

In data analysis, examining the relationships between variables through correlation analysis and regression analysis is essential. These analyses matter not only for assessing how variables influence one another and whether they are causally related, but also as the foundation of statistical analysis. They are likewise a basic step in machine learning, including deep learning: when analyzing the inputs and outputs of a deep learning model, highly correlated variables are selected first, and the study of causal relationships usually begins with a basic analysis such as regression. When the data are observed as fuzzy data carrying ambiguous information, however, the added complexity makes it difficult to propose unique methods for these analyses. The application of fuzzy theory to correlation analysis for such data has not yet been studied effectively; the studies that exist were conducted for cases in which the data are not general fuzzy data, or in which interval estimation is used. As a result, the effectiveness of fuzzy theory has not been highlighted. In multiple regression analysis in particular, selecting the important variables is an essential step. A variable that is significant in simple regression may not be significant in multiple regression because of its relationship with other variables, so not every variable that affects the dependent variable can be used as an independent variable, and some variables must be excluded. Until now, however, fuzzy multiple regression analysis has been carried out without a variable selection method, and the significance of the important variables has received little emphasis.
In this paper, a fuzzy correlation coefficient and a multiple fuzzy regression analysis using a variable selection method are proposed. To this end, defuzzification and fuzzy ordering are first defined. A fuzzy correlation coefficient is then proposed using the \(L_{2}\) distance. Next, fuzzy sums of squares are defined to construct an F-statistic for testing the significance of the regression model. Using this F-statistic, the fuzzy \(R^{2}\), and the fuzzy RMSE, several variable selection methods based on the distance approach are proposed. For the data analysis, foreign exchange reserve data and house price data for South Korea, both important indicators of economic crisis, are used. Financial data are mostly recorded as closing values, but a closing value cannot represent the whole of a given period. We therefore treat financial data as fuzzy data whose fluctuation reflects the vagueness inherent in the data. The proposed fuzzy correlation coefficient and variable selection methods for fuzzy regression analysis are applied to the foreign exchange reserve and house price data together with several financial variables.
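The abstract does not state the formulas, so the following Python sketch only illustrates the general idea of a distance-based correlation for fuzzy financial data; it is not the coefficient defined in the paper. It assumes that each period is summarized as a triangular fuzzy number (low, close, high), that the squared \(L_{2}\) distance is the sum of squared differences of those three components, and that the fuzzy mean is taken componentwise. The names `to_fuzzy`, `fuzzy_mean`, and `fuzzy_corr` are hypothetical.

```python
import math

# A triangular fuzzy number stored as (left, mode, right).
# This representation is an assumption made for illustration only.
Fuzzy = tuple[float, float, float]


def to_fuzzy(low: float, close: float, high: float) -> Fuzzy:
    """Treat one period's low/close/high values as a triangular fuzzy number."""
    if not low <= close <= high:
        raise ValueError("expected low <= close <= high")
    return (low, close, high)


def fuzzy_mean(xs: list[Fuzzy]) -> Fuzzy:
    """Componentwise mean of the fuzzy observations."""
    n = len(xs)
    return tuple(sum(x[k] for x in xs) / n for k in range(3))


def _inner(a: Fuzzy, b: Fuzzy, center_a: Fuzzy, center_b: Fuzzy) -> float:
    """Inner product of the deviations (a - center_a) and (b - center_b)."""
    return sum((a[k] - center_a[k]) * (b[k] - center_b[k]) for k in range(3))


def fuzzy_corr(xs: list[Fuzzy], ys: list[Fuzzy]) -> float:
    """Illustrative distance-based correlation between two fuzzy samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = fuzzy_mean(xs), fuzzy_mean(ys)
    # Sums of squared L2 distances to the fuzzy means (assumed metric).
    sxx = sum(_inner(x, x, mx, mx) for x in xs)
    syy = sum(_inner(y, y, my, my) for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample with zero spread has no correlation")
    # Cross term induced by the same metric.
    sxy = sum(_inner(x, y, mx, my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Made-up values, not data from the paper.
reserves = [to_fuzzy(1.0, 2.0, 3.0), to_fuzzy(2.0, 3.0, 5.0), to_fuzzy(4.0, 5.0, 6.0)]
prices = [to_fuzzy(3.0, 5.0, 7.0), to_fuzzy(5.0, 7.0, 11.0), to_fuzzy(9.0, 11.0, 13.0)]
r = fuzzy_corr(reserves, prices)
```

Under these assumptions the coefficient is an ordinary Pearson-type ratio on the stacked components, so by the Cauchy-Schwarz inequality it lies in [-1, 1]. A different fuzzy representation or metric would change the value.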

Data Availability

The data described in this article are openly available from the Bank of Korea at http://ecos.bok.or.kr and from the KB real estate data bank at https://kbland.kr/.

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2C1A01011131).

Funding

National Research Foundation of Korea, No. 2020R1A2C1A01011131, Jin Hee Yoon

Author information

Corresponding authors

Correspondence to Jin Hee Yoon or Yoo Young Koo.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yoon, J.H., Kim, D.J. & Koo, Y.Y. Novel Fuzzy Correlation Coefficient and Variable Selection Method for Fuzzy Regression Analysis Based on Distance Approach. Int. J. Fuzzy Syst. 25, 2969–2985 (2023). https://doi.org/10.1007/s40815-023-01546-6

  • DOI: https://doi.org/10.1007/s40815-023-01546-6
