Online Dictionary Learning Based Intra-frame Video Coding

Sun, Yipeng; Xu, Mai; Tao, Xiaoming; Lu, Jianhua

doi:10.1007/s11277-013-1577-y

Published: 09 January 2014

Wireless Personal Communications, Volume 74, pages 1281–1295 (2014)

Yipeng Sun¹, Mai Xu², Xiaoming Tao¹ & Jianhua Lu¹

Abstract

In this paper, we propose an online-learning-based intra-frame video coding approach that exploits the texture sparsity of natural images. The proposed method learns basic texture elements from previous frames with guaranteed convergence, yielding dictionaries that represent incoming frames more sparsely. As a result, the proposed online dictionary learning based codec (ODL codec) requires fewer non-zero coefficients to be transmitted as more video frames are coded. These non-zero coefficients of the image patches are then quantized and coded, together with dictionary synchronization. Experimental results show that the number of non-zero coefficients per frame decreases rapidly as more frames are encoded. Compared with off-line training, the ODL codec, which learns from the video on the fly, reduces computational complexity and converges quickly. Finally, the rate-distortion performance shows a PSNR improvement over K-SVD dictionary based compression and H.264/AVC for intra-frame video at low bit rates.
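
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea: sparse-code each patch of a frame against the current dictionary, accumulate statistics, and refresh the dictionary before the next frame arrives. It assumes matching pursuit as the sparse coder and a block-coordinate dictionary update in the style of online dictionary learning (Mairal et al.), with atoms renormalized to unit norm for simplicity. All function names (`sparse_code`, `update_dictionary`, `learn_online`) and the synthetic data are hypothetical; quantization, entropy coding, and dictionary synchronization are omitted.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def sparse_code(x, atoms, max_nonzero):
    """Matching pursuit: repeatedly pick the unit-norm atom most
    correlated with the residual (assumed coder, not the paper's)."""
    residual = list(x)
    coeffs = [0.0] * len(atoms)
    for _ in range(max_nonzero):
        corr = [dot(residual, d) for d in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(corr[j]))
        if abs(corr[k]) < 1e-12:
            break  # residual is orthogonal to every atom
        coeffs[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

def update_dictionary(atoms, A, B):
    """One block-coordinate pass over the atoms, in place.
    A[j][k] accumulates alpha_j * alpha_k, B[j] accumulates alpha_j * x."""
    n = len(atoms[0])
    for j in range(len(atoms)):
        if A[j][j] < 1e-12:
            continue  # atom never used so far; leave it unchanged
        Da = [sum(atoms[k][i] * A[k][j] for k in range(len(atoms)))
              for i in range(n)]
        u = [(B[j][i] - Da[i]) / A[j][j] + atoms[j][i] for i in range(n)]
        norm = math.sqrt(dot(u, u))
        if norm > 1e-12:
            atoms[j] = [c / norm for c in u]  # simplification: unit norm

def learn_online(frames, n_atoms, max_nonzero, seed=0):
    """frames: list of frames, each a list of flattened patches."""
    rng = random.Random(seed)
    n = len(frames[0][0])
    atoms = [unit([rng.gauss(0, 1) for _ in range(n)]) for _ in range(n_atoms)]
    A = [[0.0] * n_atoms for _ in range(n_atoms)]
    B = [[0.0] * n for _ in range(n_atoms)]
    errors = []
    for patches in frames:
        err = 0.0
        for x in patches:
            alpha, res = sparse_code(x, atoms, max_nonzero)
            err += dot(res, res)
            for j in range(n_atoms):
                if alpha[j] != 0.0:
                    for k in range(n_atoms):
                        A[j][k] += alpha[j] * alpha[k]
                    for i in range(n):
                        B[j][i] += alpha[j] * x[i]
        update_dictionary(atoms, A, B)  # dictionary improves between frames
        errors.append(err)
    return atoms, errors

# Synthetic demo: 5 "frames" of 30 random 4-dimensional patches each.
demo_rng = random.Random(42)
demo_frames = [[[demo_rng.gauss(0, 1) for _ in range(4)] for _ in range(30)]
               for _ in range(5)]
demo_atoms, demo_errors = learn_online(demo_frames, n_atoms=6, max_nonzero=2)
print("residual energy per frame:", demo_errors)
```

Each frame is coded with the dictionary learned from the frames before it, so no separate off-line training pass is needed. The sketch fixes the number of non-zero coefficients per patch; a codec would more plausibly fix a distortion target and let the coefficient count vary, which is the quantity the abstract reports as decreasing.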

Notes

The state-of-the-art K-SVD dictionary is learned off-line from a large amount of training data.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions, which greatly improved the quality of the paper. The authors would also like to thank Xuan Dong, Ph.D. candidate at the Dept. of Computer Science, Tsinghua University, for his help. This work was supported by the National Basic Research Project of China (973) (2013CB329000, 2013CB329006), the National Natural Science Foundation of China (NSFC, Nos. 61101071, 61021001, 60972021, 61202139), and the Tsinghua-Qualcomm Joint Research Program.

Author information

Authors and Affiliations

Tsinghua National Laboratory for Information Science and Technology (TNList), State Key Laboratory on Microwave and Digital Communications, Department of Electronic Engineering, Tsinghua University, Beijing, People’s Republic of China
Yipeng Sun, Xiaoming Tao & Jianhua Lu
School of Electronic and Information Engineering, Beihang University, Beijing, People’s Republic of China
Mai Xu

Corresponding author

Correspondence to Xiaoming Tao.

About this article

Cite this article

Sun, Y., Xu, M., Tao, X. et al. Online Dictionary Learning Based Intra-frame Video Coding. Wireless Pers Commun 74, 1281–1295 (2014). https://doi.org/10.1007/s11277-013-1577-y

Issue Date: February 2014