Abstract
The wavelet packet transform is an effective approach to texture analysis based on sub-band filtering. Different texture patterns respond distinctively to the sub-bands of wavelet packets, and these responses are valuable for texture description. Using the responses of all sub-bands at different resolutions can improve the power to discriminate between texture patterns. In this paper, effective texture descriptors based on the hierarchical wavelet packet (HWVP) transform are proposed. The fine-grained sub-bands of the wavelet packet transform improve the discrimination power of HWVP descriptors for images in different categories. The scene categorization performance of the HWVP descriptors under various decomposition levels and wavelet bases is discussed. The performance of HWVP descriptors computed on global and local images with different partition patterns is also analyzed. The advantages of HWVP descriptors are attributable to two aspects. First, sub-band filtering improves the discrimination power of the descriptors by capturing subtle differences between texture patterns. Second, the hierarchical feature representation makes the descriptors robust to resolution variations. Comparisons are made with several existing robust descriptors on scene categorization and semantic concept retrieval. Experimental results on the widely used OT, Scene-13, Sport Event, and TRECVID 2007 datasets show the effectiveness of the proposed HWVP descriptors.
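To make the idea of a hierarchical wavelet packet descriptor concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Haar basis, a full packet tree in which every sub-band is split again at each level, and the mean and standard deviation of each sub-band as features. The abstract specifies none of these choices, and the function names `haar_split`, `band_stats`, and `hierarchical_packet_features` are hypothetical.

```python
import math


def haar_split(band):
    """One 2-D orthonormal Haar step: split a band into four half-size sub-bands."""
    rows, cols = len(band) // 2, len(band[0]) // 2
    approx = [[0.0] * cols for _ in range(rows)]
    diff_h = [[0.0] * cols for _ in range(rows)]
    diff_v = [[0.0] * cols for _ in range(rows)]
    diff_d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = band[2 * i][2 * j], band[2 * i][2 * j + 1]
            c, d = band[2 * i + 1][2 * j], band[2 * i + 1][2 * j + 1]
            approx[i][j] = (a + b + c + d) / 2.0  # low-pass in both directions
            diff_h[i][j] = (a - b + c - d) / 2.0  # difference between left and right pixels
            diff_v[i][j] = (a + b - c - d) / 2.0  # difference between top and bottom pixels
            diff_d[i][j] = (a - b - c + d) / 2.0  # diagonal difference
    return [approx, diff_h, diff_v, diff_d]


def band_stats(band):
    """Mean and standard deviation of one sub-band (assumed statistics)."""
    values = [v for row in band for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def hierarchical_packet_features(image, levels=2):
    """Concatenate sub-band statistics from every level of a full packet tree.

    Unlike the plain wavelet transform, which only re-splits the approximation
    band, the packet transform re-splits all sub-bands, giving 4**k sub-bands
    at level k. Keeping the statistics of every level, not just the deepest
    one, is what makes the representation hierarchical in this sketch.
    Image height and width must be divisible by 2**levels.
    """
    bands = [image]
    features = []
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_split(band)]
        for band in bands:
            features.extend(band_stats(band))
    return features
```

With two levels, the feature vector has 2 × (4 + 16) = 40 entries. A local variant in the spirit of the partition patterns mentioned in the abstract could apply the same function to each image block and concatenate the results.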
Notes
TRECVID 2009 Website: http://www-nlpir.nist.gov/projects/tv2009/tv9.hlf.for.eval.txt
Additional information
This work is supported in part by the National Natural Science Foundation of China (NSFC) Project No. 60903121 and No. 61173109, and by the Foundations of Microsoft Research Asia.
Cite this article
Qian, X., Guo, D., Hou, X. et al. HWVP: hierarchical wavelet packet descriptors and their applications in scene categorization and semantic concept retrieval. Multimed Tools Appl 69, 897–920 (2014). https://doi.org/10.1007/s11042-012-1151-8
DOI: https://doi.org/10.1007/s11042-012-1151-8