BioHIPI: Biomedical Hadoop Image Processing Interface

  • Conference paper
Machine Learning, Optimization, and Big Data (MOD 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10710)

Abstract

The collection of large amounts of data, together with efficient and effective analysis techniques, is becoming increasingly crucial in many areas. One of the most important fields in which Big Data is gaining fundamental importance is the biomedical domain, partly because of the decreasing cost of acquiring and analyzing biomedical data. Furthermore, the emergence of more accessible technologies and the increasing speed of algorithms, thanks in part to parallelization techniques, are helping to make Big Data in healthcare a fast-growing field.

This paper presents a novel framework, the Biomedical Hadoop Image Processing Interface (BioHIPI), which stores biomedical image collections in a Distributed File System (DFS) so that they can be processed in parallel on a cluster of machines. The work is based on Apache Hadoop: it uses the Hadoop Distributed File System (HDFS) to store images, the MapReduce libraries to process them in parallel, and Yet Another Resource Negotiator (YARN) to run processes on the cluster.
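
As a rough illustration of the MapReduce pattern mentioned above, the following is a minimal in-memory sketch in plain Python. It does not use Hadoop, HDFS, YARN, or the BioHIPI API; the record layout, the function names, and the mean-intensity task are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout (not BioHIPI's format):
# (image_id, modality, flat list of pixel intensities).
ImageRecord = Tuple[str, str, List[int]]


def map_image(record: ImageRecord) -> Iterator[Tuple[str, Tuple[int, int]]]:
    """Map step: emit (modality, (pixel_sum, pixel_count)) for one image."""
    _image_id, modality, pixels = record
    yield modality, (sum(pixels), len(pixels))


def reduce_modality(key: str, values: Iterable[Tuple[int, int]]) -> Tuple[str, float]:
    """Reduce step: merge partial sums into a mean intensity per modality."""
    total = 0
    count = 0
    for pixel_sum, pixel_count in values:
        total += pixel_sum
        count += pixel_count
    mean = (total / count) if count else 0.0
    return key, mean


def run_job(records: Iterable[ImageRecord]) -> Dict[str, float]:
    """Run map, shuffle, and reduce sequentially in one process."""
    # Shuffle: group mapper output by key, which a real framework
    # would do across machines between the two phases.
    grouped: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for record in records:
        for key, value in map_image(record):
            grouped[key].append(value)
    return dict(reduce_modality(k, vs) for k, vs in sorted(grouped.items()))


# Toy "image bundle" with made-up identifiers and pixel values.
bundle: List[ImageRecord] = [
    ("scan-001", "MRI", [10, 20, 30]),
    ("scan-002", "MRI", [40]),
    ("scan-003", "CT", [5, 15]),
]
result = run_job(bundle)
```

On a cluster, the map calls would run in parallel on the nodes that hold the image data, and only the small per-image summaries would cross the network to the reducers. The sketch keeps everything in one process to show the data flow alone.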

Notes

  1. Source code is available at https://github.com/memoclaudio/BioHipi.

  2. http://hipi.cs.virginia.edu/.

Acknowledgments

Claudio Stamile is funded by an EU MC ITN TRANSACT 2012 (316679) project. Francesco Calimeri has been partially supported by the Italian Ministry for Economic Development (MISE) under project “PIUCultura – Paradigmi Innovativi per l’Utilizzo della Cultura” (n. F/020016/01-02/X27), and by the EU under project “Smarter Solutions in the Big Data World (S2BDW)” (n. F/050389/01-03/X32) funded within the call “HORIZON2020” PON I&C 2014-2020.

Author information

Corresponding author

Correspondence to Aldo Marzullo.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Calimeri, F., Caracciolo, M., Marzullo, A., Stamile, C. (2018). BioHIPI: Biomedical Hadoop Image Processing Interface. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R. (eds) Machine Learning, Optimization, and Big Data. MOD 2017. Lecture Notes in Computer Science, vol 10710. Springer, Cham. https://doi.org/10.1007/978-3-319-72926-8_45

  • DOI: https://doi.org/10.1007/978-3-319-72926-8_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-72925-1

  • Online ISBN: 978-3-319-72926-8

  • eBook Packages: Computer Science, Computer Science (R0)
