Abstract
Local binary pattern (LBP) is a texture operator used in many computer vision applications that often require real-time operation on multiple computing platforms. The emergence of new video standards has increased typical resolutions and frame rates, which demand considerable computational performance. Since LBP is essentially a pixel operator whose cost scales with image size, straightforward implementations are usually insufficient to meet these requirements. To identify the solutions that maximize the performance of real-time LBP extraction, we compare a series of implementations in terms of computational performance and energy efficiency, and we analyze the optimizations that can be made to reach real-time performance on multiple platforms with their different available computing resources. Our contribution includes an extensive survey of the LBP implementations on different platforms that can be found in the literature. To provide a more complete evaluation, we have implemented the LBP algorithms on several platforms, such as graphics processing units, mobile processors and a hybrid programming model image coprocessor. We have also extended the evaluation of some of the solutions found in previous work. In addition, we publish the source code of our implementations.
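To make concrete why LBP is a per-pixel operator whose cost grows with image size, the following is a minimal sketch of the basic 3x3 LBP as it is commonly defined: each of the 8 neighbours is thresholded against the centre pixel and the results are packed into one 8-bit code. It is a plain, unoptimized illustration and not one of the implementations evaluated in the article; the function name `lbp_3x3`, the neighbour ordering and the list-of-lists image format are assumptions made for this example.

```python
def lbp_3x3(image):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2D list of ints."""
    # Neighbour offsets (row, col), clockwise from the top-left pixel.
    # The starting neighbour and direction are a convention chosen for this sketch.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    # Border pixels lack a full neighbourhood, so the output is 2 smaller per axis.
    codes = [[0] * (width - 2) for _ in range(height - 2)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            codes[r - 1][c - 1] = code
    return codes


if __name__ == "__main__":
    sample = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]
    print(lbp_3x3(sample))
```

Every interior pixel requires eight comparisons and one packed write, so the work is proportional to the number of pixels; this is the loop that platform-specific optimizations would target.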










Cite this article
Bordallo López, M., Nieto, A., Boutellier, J. et al. Evaluation of real-time LBP computing in multiple architectures. J Real-Time Image Proc 13, 375–396 (2017). https://doi.org/10.1007/s11554-014-0410-5
DOI: https://doi.org/10.1007/s11554-014-0410-5