Conquering insufficient/imbalanced data learning for the Internet of Medical Things

Lan, Zi-Ching; Huang, Guan-Yu; Li, Yun-Pei; Rho, Seungmin; Vimal, S.; Chen, Bo-Wei

doi:10.1007/s00521-022-06897-z

S.I.: Neural Computing for IOT based Intelligent Healthcare Systems
Published: 04 February 2022

Volume 35, pages 22949–22958, (2023)
Neural Computing and Applications

Zi-Ching Lan¹,
Guan-Yu Huang¹,
Yun-Pei Li¹,
Seungmin Rho²,
S. Vimal³ &
…
Bo-Wei Chen ORCID: orcid.org/0000-0001-6526-9017¹


Abstract

This study presents a data augmentation technique that addresses insufficient and imbalanced data in crowdsensing with the Internet of Medical Things (IoMT) or wireless sensor networks (WSNs). Because sensing takes place at diverse locations and under heterogeneous conditions, the number of samples collected can differ widely across categories, which skews the class distribution. In addition, pattern analysis based on too few observed samples yields biased models. To address this, the work proposes synthetic minority oversampling generative adversarial networks (SMOGANs) for processing imbalanced data: classes with too few samples are expanded automatically until all classes contain equal numbers of samples, which avoids biased modeling. A SMOGAN consists of two modules, the synthetic minority oversampling technique (SMOTE) and a generative adversarial network (GAN). SMOTE initializes the system by coarsely augmenting the insufficient/imbalanced samples in quantity. The GAN then enriches the feature diversity of the pseudoreal samples produced by SMOTE. Experiments were carried out on open datasets. To assess the capability of data augmentation, only 4.00% of the real data were reserved as minority classes and then passed to the different data augmentation methods for comparison. The results showed that the proposed SMOGANs outperformed the baselines in accuracy, indicating that a SMOGAN can mitigate the data collection problems of insufficient/imbalanced datasets by improving both the quantity and the quality of the data.
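The first SMOGAN module described above, SMOTE, can be illustrated with a minimal sketch. This is plain SMOTE-style interpolation between nearest minority neighbours, not the authors' implementation, which is not shown in this preview. The function names `smote_oversample` and `balance_with_smote`, the parameter `k`, and the fixed seed are assumptions made for illustration, and the GAN refinement stage is deliberately omitted.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration only.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # Precompute the k nearest neighbours (Euclidean) of every minority sample.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))   # pick a seed sample
        j = rng.choice(neighbours[i])      # pick one of its neighbours
        gap = rng.random()                 # interpolation weight in [0, 1)
        x, z = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, z)])
    return synthetic


def balance_with_smote(majority, minority, k=5, seed=0):
    """Top up the minority class until both classes have the same size."""
    n_new = max(0, len(majority) - len(minority))
    if n_new == 0:
        return list(minority)
    return list(minority) + smote_oversample(minority, n_new, k=k, seed=seed)
```

As a usage example, calling `balance_with_smote` with 96 majority vectors and 4 minority vectors (a 4% minority share, echoing the ratio in the abstract) returns 96 minority vectors: the 4 originals followed by 92 interpolated ones. In a SMOGAN, according to the abstract, such coarsely augmented samples would then be passed to a GAN to enrich their feature diversity.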

Acknowledgements

This work is supported in part by the Ministry of Science and Technology, Taiwan (107-2218-E-110-013-MY3) and by 2019 NVIDIA Data Science GPU Grants (Project: SMO-GANs for Extremely Imbalanced Data).

Author information

Authors and Affiliations

National Sun Yat-Sen University, Kaohsiung, Taiwan
Zi-Ching Lan, Guan-Yu Huang, Yun-Pei Li & Bo-Wei Chen
Chung-Ang University, Seoul, Korea
Seungmin Rho
Ramco Institute of Technology, Rajapalayam, India
S. Vimal

Corresponding author

Correspondence to Bo-Wei Chen.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest between themselves.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lan, ZC., Huang, GY., Li, YP. et al. Conquering insufficient/imbalanced data learning for the Internet of Medical Things. Neural Comput & Applic 35, 22949–22958 (2023). https://doi.org/10.1007/s00521-022-06897-z

Received: 22 May 2021
Accepted: 04 January 2022
Published: 04 February 2022
Issue Date: November 2023
DOI: https://doi.org/10.1007/s00521-022-06897-z
