Analyzing the Performance-Hardware Trade-off of an ASIP-based SIFT Feature Extraction

Journal of Signal Processing Systems

Abstract

One of the key problems in computer vision is recovering scene geometry from multiple views of the same scene. Once the homography between two images is known, the motion of a stereo camera system can be determined, images can be rectified, or image registration can be performed. A feature-based approach to determining the homography between two images is based on the extraction and matching of SIFT (Scale-Invariant Feature Transform) features. By extracting image features from two different images of one scene and finding corresponding features in both images, the homography of the scene can be determined. Extracting image features of sufficient quality for computing the homography leads to an algorithmic complexity that prevents real-time operation on conventional CPUs. Therefore, we present and discuss application-specific instruction-set extensions for a Tensilica Xtensa LX5 ASIP (Application-Specific Instruction-set Processor) that accelerate SIFT feature extraction. In total, the complete SIFT feature extraction executed on the extended processor is accelerated by a factor of 125 compared to the baseline processor. At the same time, the accuracy of the SIFT features is preserved. In addition, the proposed processor extensions maintain the full flexibility of an ASIP, allowing fast integration of further feature extractors.
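To make the step from matched features to a homography concrete, the following is a minimal sketch, not the authors' implementation. It assumes exactly four already-matched point pairs with no three points collinear, fixes the bottom-right entry of the homography to 1, and solves the resulting 8x8 linear system. The function names `solve_linear`, `homography_from_matches`, and `apply_homography` are hypothetical. A practical pipeline would typically use many more matches together with a robust estimator.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_matches(src, dst):
    """Estimate a 3x3 homography from four matched points (assumes h33 = 1)."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("this sketch needs exactly four correspondences")
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each match (x, y) -> (u, v) contributes two linear equations.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through the homography h (projective division included)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


if __name__ == "__main__":
    # Hypothetical matched keypoint locations in two images.
    src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    dst = [(1.0, 2.0), (2.0, 1.0), (2.5, 3.0), (1.0, 4.0)]
    h = homography_from_matches(src, dst)
    print(apply_homography(h, (0.5, 0.5)))
```

Fixing the last entry to 1 keeps the sketch short, but it fails for homographies whose true bottom-right entry is zero; a least-squares formulation over all matches avoids that restriction.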

Notes

  1. PGM: Portable GrayMap, an image file format that stores image data without compression.

References

  1. Alahi, A., Ortiz, R., & Vandergheynst, P. (2012). FREAK: Fast retina keypoint. In 2012 IEEE conference on computer vision and pattern recognition (CVPR), IEEE (pp. 510–517).

  2. Alvarez, J.S. (2012). Streamlining digital signal processing: a tricks of the trade guidebook: Wiley - IEEE Press.

  3. Banz, C., Dolar, C., Cholewa, F., & Blume, H. (2011). Instruction set extension for high throughput disparity estimation in stereo image processing. In Application-specific Systems, architectures and processors (ASAP), IEEE (pp. 169–175).

  4. Bay, H., Ess, A., Tuytelaars, T., & Van Gool, L. (2008). Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3), 346–359.

  5. Beucher, N., Bélanger, N., Savaria, Y., & Bois, G. (2009). High acceleration for video processing applications using specialized instruction set based on parallelism and data reuse. Journal of Signal Processing Systems, 56(2-3), 155–165.

  6. Bonato, V., Marques, E., & Constantinides, G.A. (2008). A parallel hardware architecture for scale and rotation invariant feature detection. IEEE Transactions on Circuits and Systems for Video Technology, 18(12), 1703–1712.

  7. Calonder, M., Lepetit, V., Strecha, C., & Fua, P. (2010). BRIEF: Binary Robust Independent Elementary Features. In Proceedings of the 11th European conference on computer vision: part IV, ECCV’10 (pp. 778–792). Berlin, Heidelberg: Springer.

  8. Chiu, L.C., Chang, T.S., Chen, J.Y., & Chang, N.Y.C. (2013). Fast SIFT design for real-time visual feature extraction. IEEE Transactions on Image Processing, 22(8), 3158–3167.

  9. Crenshaw, J.W. (1998). Integer square roots. http://www.embedded.com/electronics-blogs/programmer-s-toolbox/4219659/Integer-Square-Roots.

  10. Deng, W., Zhu, Y., Feng, H., & Jiang, Z. (2012). An efficient hardware architecture of the optimised SIFT descriptor generation. In D. Koch, S. Singh, & J. Tørresen (Eds.), FPL, IEEE (pp. 345–352).

  11. DESERVE: DEvelopment platform for safe and efficient dRiVE (2012). http://www.deserve-project.eu.

  12. Duan, L.Y., Gao, F., Chen, J., Lin, J., & Huang, T. (2013). Compact descriptors for mobile visual search and MPEG CDVS standardization. In 2013 IEEE international symposium on circuits and systems (ISCAS), IEEE (pp. 885–888).

  13. Ekmekcioglu, E., Worrall, S., Velisavljevic, V., De Silva, D., & Kondoz, A. International Organisation for Standardisation ISO/IEC JTC1/SC29/WG11 Coding of Moving Pictures and Audio.

  14. Fenzi, M., Ostermann, J., Mentzer, N., Payá-Vayá, G., Blume, H., Nguyen, T.N., & Risse, T. (2014). ASEV - automatic situation assessment for event-driven video analysis. In 2014 11th IEEE international conference on advanced video and signal based surveillance (AVSS), IEEE (pp. 37–43).

  15. Fontaine, S., Goyette, S., Langlois, J.M.P., & Bois, G. (2008). Acceleration of a 3D target tracking algorithm using an application specific instruction set processor. In ICCD, IEEE (pp. 255–259).

  16. Gonzalez, R.E. (2000). Xtensa: a configurable and extensible processor. IEEE Micro, 20(2), 60–70.

  17. Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision, 2nd edn. New York: Cambridge University Press.

  18. Huang, F.C., Huang, S.Y., Ker, J.W., & Chen, Y.C. (2012). High-Performance SIFT hardware accelerator for real-time image feature extraction. IEEE Transactions on Circuits and Systems for Video Technology, 22(3), 340–351.

  19. Ke, Y., & Sukthankar, R. (2004). PCA-SIFT: A more distinctive representation for local image descriptors. In Proceedings of the 2004 IEEE computer society conference on computer vision and pattern recognition, CVPR 2004, IEEE (Vol. 2, pp. II–506).

  20. Leutenegger, S., Chli, M., & Siegwart, R.Y. (2011). BRISK: Binary Robust Invariant Scalable Keypoints. In Proceedings of the 2011 international conference on computer vision, ICCV ’11 (pp. 2548–2555). Washington: IEEE Computer Society.

  21. Lowe, D.G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.

  22. Mentzer, N., Payá-Vayá, G., Blume, H., von Egloffstein, N., & Ritter, W. (2014). Instruction-set extension for an ASIP-based SIFT feature extraction. In International conference on embedded computer systems: architectures, modeling, and simulation (SAMOS XIV), 2014.

  23. Mikolajczyk, K., & Schmid, C. (2005). A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615–1630.

  24. Ravasi, M., Tenze, L., & Mattavelli, M. (2002). A scalable and programmable architecture for 2-D DWT decoding. IEEE Transactions on Circuits and Systems for Video Technology, 12(8), 671–677.

  25. Rosten, E., Porter, R., & Drummond, T. (2010). Faster and better: a machine learning approach to corner detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(1), 105–119.

  26. Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011). ORB: an efficient alternative to SIFT or SURF. In 2011 IEEE International conference on computer vision (ICCV), IEEE (pp. 2564–2571).

  27. Cadence Design Systems (2014). Xtensa LX5 microprocessor data book. Cadence Design Systems.

  28. Wang, G., Rister, B., & Cavallaro, J.R. (2013). Workload analysis and efficient OpenCL-based implementation of SIFT algorithm on a smartphone. In Proceedings in IEEE global conference signal and information processing (GlobalSIP) (pp. 759–762).

  29. Wang, W., Zhang, Y., Guoping, L., Yan, S., & Jia, H. (2013). CLSIFT: An optimization study of the scale invariance feature transform on GPUs. IEEE (pp. 93–100).

  30. Wu, C. (2007). SiftGPU: a GPU implementation of scale invariant feature transform (SIFT). http://cs.unc.edu/ccwu/siftgpu.

  31. Yonglong, Z., Kuizhi, M., Xiang, J., & Peixiang, D. (2013). Parallelization and optimization of SIFT on GPU using CUDA. In 2013 IEEE 10th international conference on high performance computing and communications & 2013 IEEE international conference on embedded and ubiquitous computing (HPCC_EUC), IEEE (pp. 1351–1358).

  32. Zhang, Q., Chen, Y., Zhang, Y., & Xu, Y. (2008). SIFT implementation and optimization for multi-core systems. In IEEE international symposium on parallel and distributed processing, IPDPS 2008, IEEE (pp. 1–8).

Acknowledgments

This work was partially supported by the European Commission under the ECSEL Joint Undertaking in the scope of the DESERVE [11] project.

Author information

Corresponding author

Correspondence to Nico Mentzer.

About this article

Cite this article

Mentzer, N., Payá-Vayá, G. & Blume, H. Analyzing the Performance-Hardware Trade-off of an ASIP-based SIFT Feature Extraction. J Sign Process Syst 85, 83–99 (2016). https://doi.org/10.1007/s11265-015-0986-4

  • DOI: https://doi.org/10.1007/s11265-015-0986-4
