Abstract
The rapid advancement of industrial Internet of Things (IoT) technology has turned the supply chain network into an open system that meets the high demand for individualized products and provides better customer experiences. However, the open-system supply chain has forced many small and midsize enterprises (SMEs) to adopt vertical integration: they divide into smaller companies, each with a distinctive business, under a central alliance that produces a range of products and gains competencies. Existing models, however, do not guarantee the data privacy of the individual SMEs. Moreover, collecting data securely and revealing valuable knowledge is especially difficult in an IoT network. How to share data within a secure framework is therefore of paramount importance in the Internet of Behavior field. In this article, a privacy-preserving data-mining framework is proposed for joint-venture industrial collaborative activities by combining federated learning with the “pre-large concept” from data mining. The novelty of the proposed approach is that it mines high-utility itemsets (HUIs) from multiple datasets without requiring direct data sharing. In the proposed method, the federated-learning framework learns from aggregated learning parameters without scanning all data from the different datasets. The pre-large concept reduces the number of scans over the different datasets. Thus, the approach makes it possible to train federated learning more quickly while protecting the privacy of individual data owners. The approach has been tested on real industrial datasets in a collaborative environment. Extensive experimental results show that the approach achieves high accuracy compared with conventional data-mining techniques while preserving the privacy of the datasets.
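The idea of mining HUIs from aggregated summaries rather than from shared raw data can be illustrated with a minimal Python sketch. It is not the protocol proposed in the article, which the abstract does not specify. The function names (`local_utilities`, `classify`), the toy transactions, and the threshold values are all assumptions made for illustration, and sending plain per-itemset totals to a coordinator is only a stand-in for the aggregated learning parameters; by itself it offers no formal privacy guarantee.

```python
from collections import Counter
from itertools import combinations


def local_utilities(transactions, max_size=2):
    """Summarise one party's data without exposing its transactions.

    Each transaction maps item -> utility (for example quantity * unit profit).
    Returns (utility per itemset, total utility of the local dataset).
    """
    totals = Counter()
    for tx in transactions:
        items = sorted(tx)
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                totals[itemset] += sum(tx[i] for i in itemset)
    return totals, sum(sum(tx.values()) for tx in transactions)


def classify(parties, upper, lower, max_size=2):
    """Merge the per-party summaries and label each itemset.

    Hypothetical thresholds: a utility ratio >= upper is "large" (an HUI),
    a ratio in [lower, upper) is "pre-large", anything lower is dropped.
    """
    merged, total_utility = Counter(), 0
    for transactions in parties:
        utilities, total = local_utilities(transactions, max_size)
        merged.update(utilities)  # Counter.update adds the counts
        total_utility += total
    labels = {}
    for itemset, utility in merged.items():
        ratio = utility / total_utility
        if ratio >= upper:
            labels[itemset] = "large"
        elif ratio >= lower:
            labels[itemset] = "pre-large"
    return labels


# Toy data for two collaborating parties (illustrative values only).
party_a = [{"a": 5, "b": 2}, {"a": 3, "c": 4}]
party_b = [{"b": 6, "c": 1}, {"a": 2, "b": 3, "c": 1}]

labels = classify([party_a, party_b], upper=0.40, lower=0.30)
print(labels)
```

In this toy run the itemsets `("b",)`, `("a", "b")`, and `("b", "c")` come out as large, `("a",)` and `("a", "c")` as pre-large, and `("c",)` is dropped. Keeping the pre-large itemsets as a buffer is what lets an incremental miner absorb a limited amount of new data before it has to rescan the original datasets.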
Index Terms
- A Privacy Frequent Itemsets Mining Framework for Collaboration in IoT Using Federated Learning