What Can the Big Data Eco-System and Data Analytics Do for E-Health? A Smooth Review Study

Benabderrahmane, Sidahmed

doi:10.1007/978-3-319-56148-6_56

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10208)

Included in the following conference series:

International Conference on Bioinformatics and Biomedical Engineering

Abstract

In this paper we present a global overview of the present usage and future trends of the different big data ecosystems in the scientific domains of E-Health. Bioinformaticians and medical practitioners are currently generating very large amounts of data, so storing, managing, and analyzing these large-scale data sets remains a major challenge. Big data ecosystems are involved at different steps of the production chain: the acquisition of both structured and unstructured data, storage in traditional and/or NoSQL databases, and finally analytics using the MapReduce framework. In this smooth survey we discuss all these parts of the ecosystem and give some use cases on real data sets in the domain.
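As a rough illustration of the analytics step named in the abstract, the sketch below mimics the map, shuffle, and reduce phases of the MapReduce model in plain Python, on a single machine. It is not taken from the paper: the patient records, the diagnosis codes, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_mapreduce`) are all hypothetical, and a real deployment would run on a distributed framework such as Hadoop rather than in one process.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical input: (patient_id, list of diagnosis codes).
# These records are invented for illustration only.
RECORDS = [
    ("p1", ["E11", "I10"]),
    ("p2", ["I10"]),
    ("p3", ["E11", "I10", "J45"]),
]


def map_phase(record):
    """Map: emit one (code, 1) pair per diagnosis code in a record."""
    _patient_id, codes = record
    return [(code, 1) for code in codes]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one key."""
    return key, sum(values)


def run_mapreduce(records):
    """Run map, shuffle, and reduce sequentially in a single process."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(run_mapreduce(RECORDS))
```

In a distributed setting the map and reduce functions would run in parallel on different nodes, with the framework handling the shuffle; the single-process version only shows how the three phases fit together.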

Author information

Authors and Affiliations

Paris Dauphine University, Place du Maréchal de Lattre de Tassigny, 75016, Paris, France
Sidahmed Benabderrahmane

Corresponding author

Correspondence to Sidahmed Benabderrahmane.

Editor information

Editors and Affiliations

Universidad de Granada, Granada, Spain
Ignacio Rojas
Universidad de Granada, Granada, Spain
Francisco Ortuño

About this paper

Cite this paper

Benabderrahmane, S. (2017). What Can the Big Data Eco-System and Data Analytics Do for E-Health? A Smooth Review Study. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10208. Springer, Cham. https://doi.org/10.1007/978-3-319-56148-6_56

DOI: https://doi.org/10.1007/978-3-319-56148-6_56
Published: 01 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56147-9
Online ISBN: 978-3-319-56148-6
eBook Packages: Computer Science, Computer Science (R0)