
An Analysis of Human Perception of Partitions of Numerical Factor Domains

  • Conference paper
Information Integration and Web Intelligence (iiWAS 2022)

Abstract

In machine learning (ML), several discretization techniques and mathematical approaches are used to partition numerical data attributes. However, the cut-points produced by discretization techniques often do not match the cut-points perceived by humans. Understanding how humans perceive the discretization of a numerical attribute is therefore important for developing effective discretization techniques. In this paper, we conduct a study of human perception of the partitions of numerical data that best reflect the impact of one independent numerical attribute on a dependent numerical attribute. We aim to understand how expert data scientists and statisticians partition numerical attributes under different types of data points, such as dense data points, outliers, and uneven random points. The findings lead to a discussion of the importance of human perception under distinct kinds of data points for finding partitions of numerical attributes.
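As a rough illustration of why automatically derived cut-points can differ from what a person would choose, the following minimal sketch compares two standard unsupervised discretization schemes, equal-width and equal-frequency binning, on a small made-up sample containing one outlier. The function names and the data are hypothetical and are not taken from the paper; the sketch does not reproduce the authors' method or study.

```python
def equal_width_cut_points(values, k):
    """Return the k-1 interior cut-points that split [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cut_points(values, k):
    """Return k-1 interior cut-points so that each bin holds roughly len(values)/k points."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, k):
        idx = i * n // k
        # Place the cut midway between the last value of one bin and the first of the next.
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


# Hypothetical sample: a dense cluster of small values plus a single outlier.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 100]

ew = equal_width_cut_points(values, 2)
ef = equal_frequency_cut_points(values, 2)

print("equal-width cut-points:    ", ew)
print("equal-frequency cut-points:", ef)
```

On this sample, the equal-width cut-point falls at 50.5, which merely separates the outlier from everything else, while the equal-frequency cut-point falls at 3.0, inside the dense cluster. Neither is guaranteed to coincide with the cut-point a human analyst would pick.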

This work was partially conducted in the project “ICT programme”, which was supported by the European Union through the European Social Fund.



Author information


Corresponding author

Correspondence to Minakshi Kaushik.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kaushik, M., Sharma, R., Shahin, M., Peious, S.A., Draheim, D. (2022). An Analysis of Human Perception of Partitions of Numerical Factor Domains. In: Pardede, E., Delir Haghighi, P., Khalil, I., Kotsis, G. (eds) Information Integration and Web Intelligence. iiWAS 2022. Lecture Notes in Computer Science, vol 13635. Springer, Cham. https://doi.org/10.1007/978-3-031-21047-1_13


  • DOI: https://doi.org/10.1007/978-3-031-21047-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21046-4

  • Online ISBN: 978-3-031-21047-1

  • eBook Packages: Computer Science, Computer Science (R0)
