SEIP: System for Efficient Image Processing on Distributed Platform

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Vast numbers of images are now available on the Internet, and with the growth of cloud computing and big data applications, many of them must be processed with specific image processing algorithms to serve different kinds of applications. Many image processing algorithms and their variations already exist, and new algorithms continue to emerge. An ongoing problem is therefore how to improve the efficiency of massive image processing while supporting the integration of existing implementations of image processing algorithms into the system. This paper proposes a distributed image processing system named SEIP, which is built on Hadoop and employs an extensible in-node architecture to support various kinds of image processing algorithms on distributed platforms with GPU accelerators. The system also uses a pipeline-based framework to accelerate the processing of massive numbers of image files. A demonstration application for image feature extraction is designed. The system is evaluated on a small-scale Hadoop cluster with GPU accelerators, and the experimental results show the usability and efficiency of SEIP.
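
The abstract does not describe the pipeline stages, so the following is only a minimal sketch of the general idea of pipelined file processing, not SEIP's implementation. It assumes three hypothetical stages (load, extract a feature, format a record), each running in its own thread and connected by bounded queues, so that loading one image can overlap with processing another. All names here (`run_pipeline`, `load_image`, `extract_feature`, `format_record`) are illustrative, and the "feature" is a toy 4-bin histogram over fake pixel values rather than a real descriptor.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def _stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(items, stages, capacity=4):
    """Run items through stages, one thread per stage, preserving input order."""
    queues = [queue.Queue(maxsize=capacity) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for worker in workers:
        worker.start()

    def feed():
        # Runs in its own thread so bounded queues cannot block the collector.
        for item in items:
            queues[0].put(item)
        queues[0].put(_DONE)

    feeder = threading.Thread(target=feed)
    feeder.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)

    feeder.join()
    for worker in workers:
        worker.join()
    return results


# Hypothetical stand-in stages (assumptions, not taken from the paper).
def load_image(name):
    """Stand-in for reading a file: derive fake 8-bit pixels from the name."""
    return name, [ord(c) % 256 for c in name]


def extract_feature(named_pixels):
    """Stand-in for a feature extractor: a 4-bin intensity histogram."""
    name, pixels = named_pixels
    hist = [0, 0, 0, 0]
    for p in pixels:
        hist[p // 64] += 1
    return name, hist


def format_record(named_hist):
    """Stand-in for the output stage: one tab-separated text record."""
    name, hist = named_hist
    return f"{name}\t{','.join(map(str, hist))}"


if __name__ == "__main__":
    stages = [load_image, extract_feature, format_record]
    for record in run_pipeline(["a.jpg", "b.png"], stages):
        print(record)
```

The bounded queues cap how many items sit between stages, which keeps memory use predictable when the input is large. This sketch has no error handling: an exception inside a stage would stall the pipeline, so a real system would need to catch and forward failures.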

Author information

Corresponding author

Correspondence to Tao Liu.

Additional information

Special Section on Networking and Distributed Computing for Big Data

The work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61133004, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA01A302, and the NSFC Projects of International Cooperation and Exchanges under Grant No. 61361126011.

About this article

Cite this article

Liu, T., Liu, Y., Li, Q. et al. SEIP: System for Efficient Image Processing on Distributed Platform. J. Comput. Sci. Technol. 30, 1215–1232 (2015). https://doi.org/10.1007/s11390-015-1595-1

  • DOI: https://doi.org/10.1007/s11390-015-1595-1
