Semantic superpixel extraction via a discriminative sparse representation

Multimedia Tools and Applications

Abstract

Superpixel extraction is becoming increasingly important in pattern recognition applications. Different superpixel generation methods have different properties and produce different over-segmentation results. In this paper, we treat over-segmentation as an image decomposition problem and propose a novel discriminative sparse coding (DSC) algorithm to extract semantic superpixels effectively. Specifically, the DSC algorithm incorporates a new discriminative regularization term into the traditional sparse representation model. This term is combined with the reconstruction error and the sparsity constraint to form a unified objective function. The extracted superpixels not only respect local image boundaries but are also dissimilar to one another, and the resulting set of segments is sparse. These properties benefit semantic superpixel extraction. The final refined superpixels are generated in a post-processing step based on a Bayesian classification criterion. Experimental results show that the over-segmentation quality of the DSC algorithm exceeds that of state-of-the-art methods.
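To make the structure of such an objective concrete, the following is a minimal sketch of sparse coding with one extra regularizer, solved by iterative soft-thresholding (ISTA). The abstract does not give the form of the discriminative term, so the quadratic penalty with matrix `M`, the weights `lam` and `gamma`, and the function names are all assumptions for illustration and should not be read as the DSC algorithm itself.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, D, lam=0.1, gamma=0.0, M=None, n_iter=200):
    """Minimise 0.5*||x - D a||^2 + lam*||a||_1 + 0.5*gamma*a^T M a by ISTA.

    The quadratic term in M is a hypothetical stand-in for an extra
    regulariser (M assumed symmetric positive semidefinite); it is NOT
    the paper's discriminative term. D is assumed to be non-zero.
    """
    k = D.shape[1]
    if M is None:
        M = np.zeros((k, k))
    # Hessian of the smooth part; its spectral norm bounds the step size.
    H = D.T @ D + gamma * M
    L = np.linalg.norm(H, 2)
    a = np.zeros(k)
    for _ in range(n_iter):
        grad = H @ a - D.T @ x                      # gradient of the smooth terms
        a = soft_threshold(a - grad / L, lam / L)   # shrinkage enforces sparsity
    return a
```

With `gamma=0` this reduces to the traditional sparse representation (lasso) model; a non-zero `gamma` shows how an additional term enters the same unified objective alongside the reconstruction error and the sparsity penalty.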

Acknowledgements

This work was partially supported by the NSFC (Nos. 61179060 and 61101091), the National High Technology Research and Development Program of China (863 Program, No. 2012AA011503), and the Fundamental Research Funds for the Central Universities (ZYGX2012J019).

Author information

Corresponding author

Correspondence to Yurui Xie.

About this article

Cite this article

Xie, Y., Huang, C. & Xu, L. Semantic superpixel extraction via a discriminative sparse representation. Multimed Tools Appl 73, 1247–1268 (2014). https://doi.org/10.1007/s11042-013-1626-2

  • DOI: https://doi.org/10.1007/s11042-013-1626-2
