Abstract
In recent years, the information technology industry has grown strongly around the world. At the same time, the explosion in the amount of information poses a new challenge: although a huge amount of data is available, the useful information actually obtained from it is limited, and the implications behind the data have not been fully exploited. Researchers have therefore sought new ways to exploit the information contained in databases. The concept of knowledge discovery in databases was first introduced in the late 1980s. It is the process of detecting latent, previously unknown, and useful knowledge in large databases, overcoming the limitation of traditional database models, whose query tools can retrieve stored data but cannot uncover new information hidden in it. In the early 1980s, Z. Pawlak proposed rough set theory, which rests on a very solid mathematical basis. Many research groups in information technology in general, and in knowledge discovery in databases in particular, have since studied and applied the theory. Rough set theory is now widely applied in knowledge discovery: it is useful for data classification and for discovering association rules, and especially for problems involving ambiguous and uncertain data. Specifically, the theory represents a data set as an information system, that is, a table. For large data tables containing imperfect, redundant, continuous, or symbolic data, rough set theory makes it possible to detect hidden knowledge in these "raw" blocks of data. The knowledge found is expressed in the form of rules and patterns.
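As a rough illustration of how an information table gives rise to rough set approximations, the following minimal Python sketch computes the lower and upper approximations of a target set. It is not the authors' implementation (theirs is stated to be in C#); the function names, the attribute names, and the four-row customer table are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group object ids that are indiscernible on the given attributes."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of a target set of object ids."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy information table: customer id -> attribute values.
table = {
    1: {"age": "young", "income": "low", "buys": "no"},
    2: {"age": "young", "income": "low", "buys": "yes"},
    3: {"age": "old", "income": "high", "buys": "yes"},
    4: {"age": "old", "income": "low", "buys": "no"},
}

# Target concept: customers who buy.
buyers = {i for i, row in table.items() if row["buys"] == "yes"}
lower, upper = approximations(table, ["age", "income"], buyers)
print("lower:", sorted(lower))
print("upper:", sorted(upper))
```

Customers 1 and 2 share the same condition values but differ in the decision, so they fall in the upper approximation only; this boundary region is how the theory expresses uncertain data.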
After finding the most general rules that represent the data, one can calculate the strength of, and the dependence between, attributes in the information system. In this paper, the authors study recommendation systems, rough set theory, approximation theory, and fuzzy rough set theory, and on that basis build a partial model. The resulting software enables users to mine association rules from their databases, thereby supporting appropriate purchase or import decisions. The system lets users choose which database features to use, loads data from SQL Server through Apache Spark, and exports the statistics to a website for reporting.
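To make the notions of attribute dependence and rule strength concrete, here is a small Python sketch under stated assumptions. It uses the standard rough set dependency degree, gamma(C, D) = |POS_C(D)| / |U|, and the usual support and confidence measures for an association rule; whether the paper uses exactly these measures is an assumption. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict


def partition(rows, attributes):
    """Split row indices into blocks of rows that agree on the given attributes."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attributes)].add(i)
    return list(blocks.values())


def dependency_degree(rows, condition, decision):
    """gamma(C, D): fraction of rows whose condition block fits one decision block."""
    decision_blocks = partition(rows, decision)
    positive = set()
    for block in partition(rows, condition):
        if any(block <= d for d in decision_blocks):
            positive |= block
    return len(positive) / len(rows)


def rule_strength(rows, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    matches_a = [r for r in rows
                 if all(r[k] == v for k, v in antecedent.items())]
    matches_both = [r for r in matches_a
                    if all(r[k] == v for k, v in consequent.items())]
    support = len(matches_both) / len(rows)
    confidence = len(matches_both) / len(matches_a) if matches_a else 0.0
    return support, confidence


# Hypothetical sample rows; attribute names are illustrative only.
rows = [
    {"age": "young", "income": "low", "buys": "no"},
    {"age": "young", "income": "low", "buys": "yes"},
    {"age": "old", "income": "high", "buys": "yes"},
    {"age": "old", "income": "low", "buys": "no"},
]

gamma = dependency_degree(rows, ["age", "income"], ["buys"])
support, confidence = rule_strength(rows, {"income": "low"}, {"buys": "no"})
print(f"gamma={gamma:.2f} support={support:.2f} confidence={confidence:.2f}")
```

A dependency degree of 1 would mean the decision attribute is fully determined by the condition attributes; lower values indicate that some rows cannot be classified with certainty.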
Availability of data and materials
Please contact the corresponding author for data requests. The C# code and a sample rough set database are available.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
About this article
Cite this article
Tran, D.T., Huh, JH. Building a model to exploit association rules and analyze purchasing behavior based on rough set theory. J Supercomput 78, 11051–11091 (2022). https://doi.org/10.1007/s11227-021-04275-5
DOI: https://doi.org/10.1007/s11227-021-04275-5