Real-time object recognition using local features on a DSP-based embedded system

Arth, Clemens; Bischof, Horst

doi:10.1007/s11554-008-0083-z

Original Research Paper
Published: 06 May 2008

Volume 3, pages 233–253, (2008)

Journal of Real-Time Image Processing

Clemens Arth¹ &
Horst Bischof¹

Abstract

In the last few years, object recognition has become one of the most popular tasks in computer vision, driven in particular by the development of powerful new algorithms for local appearance-based object recognition. So-called “smart cameras” with enough power for decentralized image processing have become increasingly popular for all kinds of tasks, especially in the field of surveillance. Recognition is an important tool here, as the robust recognition of suspicious vehicles, persons or objects is a matter of public safety, which makes the deployment of recognition capabilities on embedded platforms necessary. In our work we investigate the task of object recognition based on state-of-the-art algorithms in the context of a DSP-based embedded system. We implement several powerful algorithms for object recognition, namely an interest point detector together with a region descriptor, and build a medium-sized object database based on a vocabulary tree, which is suitable for our dedicated hardware setup. We carefully investigate the parameters of the algorithm with respect to its performance on the embedded platform. We show that state-of-the-art object recognition algorithms can be successfully deployed on today's smart cameras, even with strictly limited computational and memory resources.
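
As a rough illustration of how a vocabulary tree maps a region descriptor to a database entry, the following minimal sketch descends a small hand-built tree by choosing the nearest child center at each level. It is not the authors' implementation: the `Node` class, the `quantize` function, the two-dimensional toy descriptors and the branching factor of 2 are all assumptions made for this example, and a real system would learn the centers by hierarchical clustering of high-dimensional descriptors.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Node:
    """One node of a hypothetical vocabulary tree (illustrative only)."""
    center: Sequence[float]
    children: List["Node"] = field(default_factory=list)
    word_id: int = -1  # meaningful only on leaf nodes


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(root: Node, descriptor: Sequence[float]) -> int:
    """Walk from the root to a leaf, picking the closest child each time."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: sq_dist(c.center, descriptor))
    return node.word_id


# Toy tree: branching factor 2, depth 2, 2-D descriptors (all values made up).
root = Node(center=(0.0, 0.0), children=[
    Node(center=(0.0, 0.0), children=[
        Node(center=(0.0, 0.0), word_id=0),
        Node(center=(0.0, 1.0), word_id=1),
    ]),
    Node(center=(10.0, 10.0), children=[
        Node(center=(10.0, 10.0), word_id=2),
        Node(center=(10.0, 12.0), word_id=3),
    ]),
])

word = quantize(root, (9.0, 12.0))  # leaf reached for this toy descriptor
```

With branching factor k and depth L, each lookup costs about k·L distance computations instead of one per leaf, which is one reason a tree-structured vocabulary is attractive when computation and memory are tightly limited.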

Notes

The algorithm described has complexity \(\mathcal{O}(N \log \log N)\). A more efficient algorithm based on component tree analysis can be found in [10].
Note that we also presented an application of our approach for vehicle reacquisition in smart camera networks in [2].

References

Arth, C., Bischof, H., Leistner, C.: TRICam: An embedded platform for remote traffic surveillance. In: Embedded Computer Vision Workshop (held in conjunction with CVPR) (2006)
Arth, C., Leistner, C., Bischof, H.: Object reacquisition and tracking in large-scale smart camera networks. In: Proceedings of the IEEE International Conference on Distributed Smart Cameras (ICDSC), pp. 156–163 (2007)
Arth, C., Leistner, C., Bischof, H.: Robust local features and their application in self-calibration and object recognition on embedded systems. In: Embedded Computer Vision Workshop (held in conjunction with CVPR), pp. 1–8 (2007)
Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24(4):509–522 (2002)
Bishnu, A., Bhattacharya, B.B., Kundu, M.K., Murthy, C.A., Acharya, T.: A pipeline architecture for computing the Euler number of a binary image. J. Syst. Archit. 51(8):470–487 (2005)
Bishnu, A., Bhunre, P.K., Bhattacharya, B.B., Kundu, M.K., Murthy, C.A., Acharya, T.: Content based image retrieval: related issues using Euler vector. In: Proc. of the ICIP, vol. 2, pp. II-585–II-588 (2002)
Brown, M., Szeliski, R., Winder, S.: Multi-image matching using multi-scale oriented patches. In: Proc. of the CVPR, vol. 1, pp. 510–517, 20–25 June 2005
Carson, C., Belongie, S., Greenspan, H., Malik, J.: Region-based image querying. In: Workshop on Content-Based Access of Image and Video Libraries (held in conjunction with CVPR) (1997)
Dey, S., Bhattacharya, B.B., Kundu, M.K., Acharya, T.: A fast algorithm for computing the Euler number of an image and its VLSI implementation. In: Proc. of the 13th International Conference on VLSI Design, pp. 330–335 (2000)
Donoser, M., Bischof, H.: Efficient maximally stable extremal region (MSER) tracking. In: Proc. CVPR, vol. 1, pp. 553–560 (2006)
Estevez, L., Kehtarnavaz, N.: A real-time histographic approach to road sign recognition. In: Southwest Symposium on Image Analysis and Interpretation, pp. 95–100 (1996)
Freeman, W.T., Adelson, E.H.: The design and use of steerable filters. IEEE Trans. Pattern Anal. Mach. Intell. 13(9):891–906 (1991)
Geusebroek, J.-M., Burghouts, G.J., Smeulders, A.W.M.: The Amsterdam library of object images. Int. J. Comput. Vis. 61(1):103–112 (2005)
Harris, C., Stephens, M.J.: A combined corner and edge detector. In: Alvey Vision Conference, pp. 147–152 (1988)
Helmbold, D.P., Schapire, R.E.: Predicting nearly as well as the best pruning of a decision tree. In: Computational Learning Theory, pp. 61–68 (1995)
Kadir, T., Zisserman, A., Brady, M.: An affine invariant salient region detector. In: Proc. of the ECCV, vol. 1, pp. 228–241 (2004)
Ke, Y., Sukthankar, R.: PCA-SIFT: a more distinctive representation for local image descriptors. In: Proc. of the CVPR, vol. 2, pp. 506–513 (2004)
Kuo, S.M., Lee, B.H., Tian, W.: Real-Time Digital Signal Processing: Implementations and Applications. Wiley, New York (2006)
Lazebnik, S., Schmid, C., Ponce, J.: A sparse texture representation using local affine regions. IEEE Trans. Pattern Anal. Mach. Intell. 27(8):1265–1278 (2005)
Lepetit, V., Lagger, P., Fua, P.: Randomized trees for real-time keypoint recognition. In: Proc. CVPR, vol. 2, pp. 775–781 (2005)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2):91–110 (2004)
Mansour, Y.: Pessimistic decision tree pruning based on tree size. In: Proc. of the International Conference on Machine Learning, Morgan Kaufmann, pp. 195–201 (1997)
Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide baseline stereo from maximally stable extremal regions. In: Rosin, P.L., Marshall, D. (eds.) Proc. of the BMVC, London, UK, vol. 1, pp. 384–393. BMVA (2002)
Mikolajczyk, K.: Interest Point Detection Invariant to Affine Transformations. Ph.D. thesis, Institut National Polytechnique de Grenoble (2002)
Mikolajczyk, K., Schmid, C.: Indexing based on scale invariant interest points. In: Proc. of the ICCV, pp. 525–531 (2001)
Mikolajczyk, K., Schmid, C.: An affine invariant interest point detector. In: Proc. of the ECCV, vol. 1, pp. 128–142 (2002)
Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27(10):1615–1630 (2005)
Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L.: A comparison of affine region detectors. Int. J. Comput. Vis. 65(1–2):43–72 (2005)
Munich, M.E., Pirjanian, P., DiBernardo, E., Goncalves, L., Karlsson, N., Lowe, D.: Break-through visual pattern recognition for robotics and automation. In: IEEE International Conference on Robotics and Automation (2005)
Nistér, D., Stewénius, H.: Scalable recognition with a vocabulary tree. In: Proc. of the CVPR, vol. 2, pp. 2161–2168 (2006)
Obdržálek, S., Matas, J.: Sub-linear indexing for large scale object recognition. In: Proc. of the BMVC, vol. 2 (2005)
Ober, S., Winter, M., Arth, C., Bischof, H.: Dual-layer visual vocabulary tree hypotheses for object recognition. In: Proc. of the ICIP (2007)
Ortmann, V., Eckmiller, R.: Real-time object recognition based on active vision and sequential analysis. In: Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE Comp. Society, Washington, DC, USA, pp. 3325–3328 (1999)
Samet, H.: The quadtree and related hierarchical data structures. ACM Comput. Surv. 16(2):187–260 (1984)
Schaffalitzky, F., Zisserman, A.: Multi-view matching for unordered image sets, or “How Do I Organize My Holiday Snaps?”. In: Proc. of the ECCV, vol. 1, pp. 414 (2002)
Schiele, B., Crowley, J.L.: Object recognition using multidimensional receptive field histograms. In: Proc. of the ECCV, vol. 1, pp. 610–619 (1996)
Schiele, B., Crowley, J.L.: Recognition without correspondence using multidimensional receptive field histograms. Int. J. Comput. Vis. 36(1):31–50 (2000)
Schmid, C., Mohr, R.: Local grayvalue invariants for image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 19(5):530–535 (1997)
Sivic, J., Zisserman, A.: Video Google: a text retrieval approach to object matching in videos. In: Proc. of the ICCV, IEEE Computer Society, Los Alamitos, CA, USA, vol. 2, p. 1470 (2003)
Squire, D., Muller, W., Muller, H., Raki, J.: Content-based query of image databases, inspirations from text retrieval: inverted files, frequency-based weights and relevance feedback. In: Proc. of the Scandinavian Conference on Image Analysis (1999)
Tuytelaars, T., Van Gool, L.J.: Matching widely separated views based on affine invariant regions. Int. J. Comput. Vis. 59(1):61–85 (2004)
Wolf, W., Ozer, B., Lv, T.: Smart cameras as embedded systems. Computer 35(9):48–53 (2002)
Yeh, T., Grauman, K., Tollmar, K., Darrell, T.: A picture is worth a thousand keywords: image-based object search on a mobile platform. In: CHI Extended Abstracts, pp. 2025–2028 (2005)

Author information

Authors and Affiliations

Graz University of Technology, Institute for Computer Graphics and Vision, Inffeldgasse 16/2, 8010, Graz, Austria
Clemens Arth & Horst Bischof

Corresponding author

Correspondence to Clemens Arth.

Additional information

This work was done in the scope of the VM-GPU Project No. 813396, financed by the Austrian Research Promotion Agency (http://www.ffg.at), and has been supported by the Austrian Joint Research Project Cognitive Vision under projects S9103-N04 and S9104-N04.

Appendix

In Table 5 the IDs of the 250 objects selected from the ALOI database for our experiments are listed. These objects have been selected because they deliver the highest number of DoG points on the resized ALOI images (352 × 288 pixels). To illustrate this, in Fig. 25 the number of DoG points for the top 500 ALOI images is depicted.

Table 5 Object IDs selected for experiments
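
The selection rule described above (keep the objects with the most DoG points) can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_top_objects`, the toy counts, and the tie-breaking by smaller object ID are hypothetical and are not taken from the paper, which does not say how ties were handled.

```python
from typing import Dict, List


def select_top_objects(dog_counts: Dict[int, int], k: int = 250) -> List[int]:
    """Return the IDs of the k objects with the most DoG points.

    Ties are broken by the smaller object ID (an assumption for this sketch).
    The result is sorted by ID, as one might list it in a table.
    """
    ranked = sorted(dog_counts, key=lambda oid: (-dog_counts[oid], oid))
    return sorted(ranked[:k])


# Toy data: object ID -> number of detected DoG points (made-up values).
toy_counts = {1: 120, 2: 45, 3: 300, 4: 120, 5: 10}
selected = select_top_objects(toy_counts, k=3)
```

For the experiments described in the text, `k` would be 250 and the counts would come from running the DoG detector on the resized 352 × 288 ALOI images.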

About this article

Cite this article

Arth, C., Bischof, H. Real-time object recognition using local features on a DSP-based embedded system. J Real-Time Image Proc 3, 233–253 (2008). https://doi.org/10.1007/s11554-008-0083-z

Received: 09 November 2007
Accepted: 09 April 2008
Published: 06 May 2008
Issue Date: December 2008
DOI: https://doi.org/10.1007/s11554-008-0083-z
