Knowledge process of health big data using MapReduce-based associative mining

Choi, So-Young; Chung, Kyungyong

doi:10.1007/s00779-019-01230-3

Original Article
Published: 31 May 2019

Volume 24, pages 571–581, (2020)
Personal and Ubiquitous Computing

So-Young Choi¹ &
Kyungyong Chung²

Abstract

Big-data knowledge processing technology facilitates efficient health management services by systematically collecting and processing information using distributed/parallel processing with the health platform’s common data model. Thus, it enables knowledge expansion for healthcare data. In this study, we propose a big-data knowledge process for the health industry using Hadoop’s MapReduce software for association mining. The proposed method provides efficient health management knowledge services by collecting and processing heterogeneous health information using WebBot and the common data model. Hadoop is an open-source framework for effectively processing distributed big data. The proposed method is a knowledge processing model that combines MapReduce-based distributed processing with mining-based association discovery. The input to MapReduce is a corpus of chronic disease nomenclature extracted from health big data. The corpus is divided into several blocks of a fixed size, and a map task is created for each block. The map function of each map task’s mapper creates a set of <key, value> pairs. In the map process, keys are created in the same way as the frequent item sets of the Apriori algorithm. The keys form a set of 2^p keys, and each value is set to the occurrence frequency of its key. A combining step sums the values of identical keys, which reduces both the data size and the processing load. For each key, a reducer is then designated through hash partitioning, and the pair is stored in the corresponding reduce task. In the reduce process, the map results are allocated to the reducers and are sorted and merged by key. The reduce function sums the values of identical keys, and keys whose totals fail to meet the minimum support criterion are eliminated.
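The map, combine, partition, and reduce steps described above can be sketched in a single Python process. This is only an illustration under stated assumptions, not the authors' Hadoop implementation: the function names, the sample records, the block size, the number of reducers, and the CRC32-based partitioner are all hypothetical, and for simplicity each record emits every non-empty subset of its items as a key (2^p - 1 keys for p items) instead of generating Apriori candidates level by level.

```python
import zlib
from collections import Counter
from itertools import combinations


def map_block(block):
    """Map + combine: count every non-empty item subset within one block."""
    counts = Counter()
    for record in block:
        items = sorted(set(record))
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1  # combiner: identical keys are summed locally
    return counts


def partition(key, num_reducers):
    """Hash partitioning: choose a reducer for a key (CRC32 is an assumption)."""
    return zlib.crc32("|".join(key).encode("utf-8")) % num_reducers


def reduce_partition(pairs, min_support):
    """Sort by key, sum values per key, and drop keys below the minimum support."""
    totals = Counter()
    for key, value in sorted(pairs):
        totals[key] += value
    return {key: total for key, total in totals.items() if total >= min_support}


def frequent_itemsets(records, block_size=2, num_reducers=2, min_support=2):
    """Simulate the whole job: split into blocks, map, shuffle, reduce."""
    blocks = [records[i:i + block_size] for i in range(0, len(records), block_size)]
    reducer_inputs = [[] for _ in range(num_reducers)]
    for block in blocks:
        for key, value in map_block(block).items():
            reducer_inputs[partition(key, num_reducers)].append((key, value))
    frequent = {}
    for pairs in reducer_inputs:
        frequent.update(reduce_partition(pairs, min_support))
    return frequent


# Hypothetical records: each one lists chronic disease names that occur together.
RECORDS = [
    ["diabetes", "hypertension"],
    ["diabetes", "hypertension", "obesity"],
    ["hypertension", "obesity"],
    ["diabetes"],
]

if __name__ == "__main__":
    for itemset, count in sorted(frequent_itemsets(RECORDS).items()):
        print(itemset, count)
```

Because the counts are summed per key, the result does not depend on the block size or the number of reducers. Those parameters only change how the work is split.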
Therefore, from a set of <key, value> pairs, the frequent item sets that meet the minimum support criterion are extracted. The association rules between the items constituting each frequent item set are determined, and their support and reliability (confidence) are calculated to examine whether the items are actually associated. The higher the value of a frequent item set, the higher its support and reliability, which indicates a strong association. A knowledge base is then constructed from the extracted association rules by repeatedly performing the MapReduce process. Closely associated knowledge bases are created and semantically related in real time with high probability. Furthermore, mining-based knowledge processing of health big data infers more meaningful associations between chronic diseases. The proposed method adds technological value and intelligent efficiency, helping the health and medical fields promote healthy lives.
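As a rough sketch of the rule evaluation step, the following Python code derives rules from frequent item set counts using the standard definitions support(X → Y) = count(X ∪ Y) / N and confidence(X → Y) = count(X ∪ Y) / count(X). It assumes that "reliability" corresponds to this standard confidence measure. The counts, the number of records, and the `min_confidence` threshold are hypothetical values chosen for illustration, not results from the article.

```python
from itertools import combinations


def association_rules(itemset_counts, num_records, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for qualifying rules.

    itemset_counts maps sorted tuples of items to their occurrence counts.
    """
    rules = []
    for itemset, count in itemset_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                antecedent_count = itemset_counts.get(antecedent)
                if not antecedent_count:
                    continue  # antecedent was pruned, so confidence is unknown
                consequent = tuple(i for i in itemset if i not in antecedent)
                support = count / num_records
                confidence = count / antecedent_count
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical frequent item set counts over 4 records.
COUNTS = {
    ("diabetes",): 3,
    ("hypertension",): 3,
    ("obesity",): 2,
    ("diabetes", "hypertension"): 2,
    ("hypertension", "obesity"): 2,
}

if __name__ == "__main__":
    for antecedent, consequent, support, confidence in association_rules(COUNTS, 4):
        print(antecedent, "->", consequent, support, confidence)
```

With these sample counts, only the rule obesity → hypertension reaches the 0.7 threshold (support 0.5, confidence 1.0). The other three candidate rules each have a confidence of 2/3.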

Acknowledgements

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).

Author information

Authors and Affiliations

Data Mining Lab., Department of Computer Science, Kyonggi University, 154-42, Gwanggyosan-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16227, South Korea
So-Young Choi
Division of Computer Science and Engineering, Kyonggi University, 154-42, Gwanggyosan-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16227, South Korea
Kyungyong Chung

Corresponding author

Correspondence to Kyungyong Chung.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Choi, SY., Chung, K. Knowledge process of health big data using MapReduce-based associative mining. Pers Ubiquit Comput 24, 571–581 (2020). https://doi.org/10.1007/s00779-019-01230-3

Received: 27 September 2018
Accepted: 03 May 2019
Published: 31 May 2019
Issue Date: October 2020
DOI: https://doi.org/10.1007/s00779-019-01230-3
