Parallelization Strategies for the Points of Interests Algorithm on the Cell Processor

  • Conference paper
Parallel and Distributed Processing and Applications (ISPA 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4742)

Abstract

The Cell processor is a typical example of a heterogeneous multiprocessor-on-chip architecture that uses several levels of parallelism to deliver high performance. Closing the gap between peak performance and sustained performance is the challenge for both software tool developers and application developers. Image processing and media applications are typical "mainstream" applications. In this paper, we use the Harris algorithm for detecting points of interest (PoI) in an image as a benchmark to compare the performance of several parallelization schemes on a Cell processor. The impact of DMA-controlled data transfers and of synchronization between SPEs explains the performance differences between the parallelization schemes. These results will be used to design a tool for efficiently mapping image processing applications onto multi-core architectures.
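The abstract names the Harris algorithm without spelling out its steps. As a rough illustration only, the textbook Harris response can be sketched as below: image gradients, windowed sums of gradient products, then a corner score per pixel. This is not the authors' SPE implementation. The function name `harris_response`, the central-difference gradient, the 3x3 box window, and the constant `k = 0.04` are all assumptions chosen for brevity.

```python
def harris_response(img, k=0.04):
    """Textbook Harris response for a 2D list of floats (illustrative sketch).

    Assumptions: central-difference gradients, an unweighted 3x3 window,
    and k = 0.04. Border pixels are left at 0.0.
    """
    h, w = len(img), len(img[0])

    # Step 1: first derivatives Ix and Iy by central differences.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    # Steps 2 and 3: sum the gradient products over a 3x3 window,
    # then score each pixel with det(M) - k * trace(M)^2.
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp
```

In this formulation a positive score suggests a corner, a negative score an edge, and a score near zero a flat region. Each stage reads a small neighbourhood of the previous stage's output, which is one reason data transfers and synchronization matter when such stages are split across processing elements.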

Editor information

Ivan Stojmenovic, Ruppa K. Thulasiram, Laurence T. Yang, Weijia Jia, Minyi Guo, Rodrigo Fernandes de Mello

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saidani, T., Lacassagne, L., Bouaziz, S., Khan, T.M. (2007). Parallelization Strategies for the Points of Interests Algorithm on the Cell Processor. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_12

  • DOI: https://doi.org/10.1007/978-3-540-74742-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74741-3

  • Online ISBN: 978-3-540-74742-0

  • eBook Packages: Computer Science, Computer Science (R0)
