Online Dictionary Learning Based Intra-frame Video Coding

Wireless Personal Communications

Abstract

In this paper, we propose an online-learning-based intra-frame video coding approach that exploits the texture sparsity of natural images. The proposed method learns basic texture elements from previous frames with guaranteed convergence, yielding dictionaries that represent incoming frames more sparsely. Thanks to online learning, the proposed online dictionary learning based codec (ODL codec) achieves the goal that the more video frames are coded, the fewer non-zero coefficients need to be transmitted. The non-zero coefficients of the image patches are then quantized and coded, in combination with dictionary synchronization. Experimental results show that the number of non-zero coefficients per frame decreases rapidly as more frames are encoded. Compared with off-line training, the proposed ODL codec learns from the video on the fly and reduces computational complexity while converging quickly. Finally, the rate-distortion performance, measured in PSNR, improves on both K-SVD dictionary based compression and H.264/AVC for intra-frame video at low bit rates.
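
The loop described in the abstract (sparse-code the patches of a frame, then refine the dictionary before the next frame arrives) can be illustrated with a minimal sketch. It is not the authors' codec. It assumes orthogonal matching pursuit for the sparse coding step and a block-coordinate dictionary update in the style of Mairal et al.'s online dictionary learning. The function names (`omp`, `update_dictionary`, `code_frame`, `psnr`), the patch and dictionary sizes, and the error tolerance are all hypothetical choices made for illustration. Random synthetic patches stand in for video frames, and quantization, entropy coding and dictionary synchronization are left out.

```python
import numpy as np


def omp(D, x, max_atoms, tol):
    """Orthogonal matching pursuit: greedily pick atoms of D until the
    residual norm drops to tol or max_atoms atoms are in use."""
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support, sol = [], np.zeros(0)
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:  # no new atom helps; stop early
            break
        support.append(j)
        # Least-squares refit on all selected atoms (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef


def update_dictionary(D, A, B, eps=1e-8):
    """One pass of block-coordinate descent over the atoms, using the
    accumulated statistics A = sum(a a^T) and B = sum(x a^T)."""
    for j in range(D.shape[1]):
        if A[j, j] < eps:
            continue  # atom has not been used yet; leave it unchanged
        u = (B[:, j] - D @ A[:, j]) / A[j, j] + D[:, j]
        # Assumption: atoms are renormalized to unit norm so that OMP's
        # correlation-based atom selection stays meaningful.
        D[:, j] = u / max(np.linalg.norm(u), eps)
    return D


def code_frame(D, A, B, patches, max_atoms, tol):
    """Sparse-code every patch (one per column) with the current D, then
    fold the new codes into A and B and refine D in place."""
    codes = np.column_stack([omp(D, p, max_atoms, tol) for p in patches.T])
    A += codes @ codes.T
    B += patches @ codes.T
    update_dictionary(D, A, B)
    return codes


def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between two arrays."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(rec, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 16, 32  # 4x4 patches, 32 atoms (illustrative sizes only)
    hidden = rng.standard_normal((n, m))
    hidden /= np.linalg.norm(hidden, axis=0)
    D = rng.standard_normal((n, m))
    D /= np.linalg.norm(D, axis=0)
    A, B = np.zeros((m, m)), np.zeros((n, m))
    for t in range(5):
        # Synthetic stand-in for a frame: 100 patches mixing 3 hidden atoms.
        true_codes = np.zeros((m, 100))
        for c in range(100):
            idx = rng.choice(m, size=3, replace=False)
            true_codes[idx, c] = rng.standard_normal(3)
        patches = hidden @ true_codes
        codes = code_frame(D, A, B, patches, max_atoms=8, tol=0.1)
        print(f"frame {t}: {np.count_nonzero(codes)} non-zero coefficients")
```

Because `omp` stops as soon as the residual falls below `tol`, the number of non-zero coefficients per frame is the quantity to watch as the dictionary adapts. Whether and how quickly it falls in this toy setting depends on the synthetic data and the chosen sizes, so the printed counts should not be read as a reproduction of the paper's results.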

Notes

  1. The state-of-the-art K-SVD dictionary is learned off-line from a large amount of training data.

References

  1. Aharon, M., & Elad, M. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11), 4311–4322.

  2. Bross, B., Han, W. J., Ohm, J. R., & Sullivan, G. (2012). High efficiency video coding (HEVC) text specification draft 8. document JCTVC-J1003.

  3. Bryt, O., & Elad, M. (2008). Compression of facial images using the K-SVD algorithm. Journal of Visual Communication and Image Representation, 19(4), 270–282.

  4. Candes, E. J., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2), 489–509.

  5. Cisco visual networking index (VNI). (2012). Global mobile data traffic forecast 2012–2017.

  6. Dai, W., & Milenkovic, O. (2009). Subspace pursuit for compressive sensing signal reconstruction. IEEE Transactions on Information Theory, 55(5), 2230–2249.

  7. ISO/IEC 15444–1 (2000). JPEG 2000 Part I Final Committee Draft Version 1.0.

  8. Kang, J. W., Kuo, C. C., Cohen, R., & Vetro, A. (2011). Efficient dictionary based video coding with reduced side information. In 2011 IEEE International Symposium on Circuits and Systems (ISCAS) (pp. 109–112).

  9. Karklin, Y., & Lewicki, M. (2008). Emergence of complex cell properties by learning to generalize in natural scenes. Nature, 457(7225), 83–86.

  10. Lee, H., Battle, A., Raina, R., & Ng, A.Y. (2006). Efficient sparse coding algorithms. In Advances in neural information processing systems (NIPS’06) (pp. 801–808).

  11. Mairal, J., & Bach, F. (2009). Online dictionary learning for sparse coding. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML’09 (pp. 689–696). ACM.

  12. Mairal, J., & Bach, F. (2010). Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11, 19–60.

  13. Mallat, S., & Zhang, Z. (1993). Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12), 3397–3415.

  14. Marpe, D. (2006). The H.264/MPEG4 advanced video coding standard and its applications. IEEE Communications Magazine, 44(8), 134–143.

  15. Needell, D., & Tropp, J. (2009). CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3), 301–321.

  16. Neff, R., & Zakhor, A. (1997). Very low bit-rate video coding based on matching pursuits. IEEE Transactions on Circuits and Systems for Video Technology, 7(1), 158–171.

  17. Neff, R., & Zakhor, A. (2002). Matching pursuit video coding. I. Dictionary approximation. IEEE Transactions on Circuits and Systems for Video Technology, 12(1), 13–26.

  18. Olshausen, B., & Field, D. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607–609.

  19. Olshausen, B. A., & Field, D. J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37, 3311–3325.

  20. Pati, Y., & Rezaiifar, R. (1993). Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In 1993 Conference Record of The Twenty-Seventh Asilomar Conference on Signals Systems and Computers (Vol. 1, pp. 40–44).

  21. Rubinstein, R. (2010). Dictionaries for sparse representation modeling. Proceedings of the IEEE, 98(6), 1045–1057.

  22. Skretting, K., & Engan, K. (2011). Image compression using learned dictionaries by RLS-DLA and compared with K-SVD. In 2011 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) (pp. 1517–1520).

  23. Skretting, K., & Engan, K. (2010). Recursive least squares dictionary learning algorithm. IEEE Transactions on Signal Processing, 58(4), 2121–2130.

  24. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.

  25. Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32(2), 407–499.

  26. Tseng, P. (2001). Convergence of a block coordinate descent method for nondifferentiable minimization. Journal of Optimization Theory and Applications, 109(3), 475–494.

  27. Turkan, M., & Guillemot, C. (2011). Online dictionaries for image prediction. In 2011 18th IEEE International Conference on Image Processing (ICIP) (pp. 293–296).

  28. Wiegand, T., Sullivan, G., Bjontegaard, G., & Luthra, A. (2003). Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology, 13(7), 560–576.

  29. Zepeda, J., & Guillemot, C. (2011). Image compression using sparse representations and the iteration-tuned and aligned dictionary. IEEE Journal of Selected Topics in Signal Processing, 5(5), 1061–1073.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions, which greatly improved the quality of the paper. The authors would also like to thank Xuan Dong, Ph.D. candidate at the Dept. of Computer Science, Tsinghua University, for his help. This work was supported by the National Basic Research Project of China (973) (2013CB329000, 2013CB329006), the National Natural Science Foundation of China (NSFC, No. 61101071, 61021001, 60972021, 61202139) and the Tsinghua-Qualcomm Joint Research Program.

Author information

Corresponding author

Correspondence to Xiaoming Tao.

About this article

Cite this article

Sun, Y., Xu, M., Tao, X. et al. Online Dictionary Learning Based Intra-frame Video Coding. Wireless Pers Commun 74, 1281–1295 (2014). https://doi.org/10.1007/s11277-013-1577-y

  • DOI: https://doi.org/10.1007/s11277-013-1577-y
