Abstract
As lovely as bunnies are, your sketched version would probably not do them justice (Fig. 1). This paper recognises this very problem and studies sketch quality assessment for the first time, letting you find those badly drawn ones. Our key discovery lies in exploiting the magnitude (\(L_2\) norm) of a sketch feature as a quantitative quality metric. We propose Geometry-Aware Classification Layer (GACL), a generic method that makes feature-magnitude-as-quality-metric possible and, importantly, does so without the need for specific quality annotations from humans. GACL treats feature magnitude and recognisability learning as a dual task, which can be simultaneously optimised under a neat cross-entropy classification loss with a theoretical guarantee. This gives GACL a nice geometric interpretation (the better the quality, the easier the recognition) and makes it agnostic to both network architecture changes and the underlying sketch representation. Through a large-scale human study of 160,000 trials, we confirm the agreement between our GACL-induced metric and human quality perception. We further demonstrate how such a quality assessment capability can, for the first time, enable three practical sketch applications. Interestingly, we show that GACL not only works on abstract visual representations such as sketch but also extends well to natural images on the problem of image quality assessment (IQA). Last but not least, we spell out the general properties of GACL as a general-purpose data re-weighting strategy and demonstrate its applications in vertical problems such as noisy label cleansing. Code will be made publicly available at https://github.com/yanglan0225/SketchX-Quantifying-Sketch-Quality.
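At its core, the quality readout described above is a single norm computation. Below is a minimal sketch assuming a hypothetical `backbone` encoder trained under the GACL objective; it is illustrative only, not the authors' released implementation.

```python
import torch

# A minimal sketch of the core readout, not the full GACL pipeline:
# once a recogniser is trained under the GACL cross-entropy objective, the
# L2 norm of a sketch feature acts as its quality score. `backbone` is a
# hypothetical feature extractor standing in for any sketch encoder.
def quality_scores(backbone: torch.nn.Module, sketches: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        feats = backbone(sketches)   # (B, D) feature vectors f(x_i)
    return feats.norm(p=2, dim=1)    # q_i = ||f(x_i)||_2: larger means better quality
```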
Notes
For notational simplicity, we use \(q_i\) and \(\theta _{y_i}\) to represent \(\parallel f(x_i)\parallel \) and \(\theta _{i,y_i}\) respectively.
In practice, we apply a linear scaling on \(q_i\) so that it lies in the proper value range \([l_q,u_q]\); this is omitted here for simplicity.
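One plausible form of such a scaling, assuming a min-max normalisation over a batch (the exact scaling constants are not spelled out in this note):

```python
import torch

def rescale(q: torch.Tensor, l_q: float, u_q: float) -> torch.Tensor:
    # Hypothetical linear (min-max) mapping of raw magnitudes q_i into the
    # working range [l_q, u_q]; eps guards against a constant batch.
    eps = 1e-8
    return l_q + (u_q - l_q) * (q - q.min()) / (q.max() - q.min() + eps)
```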
(i) \({{\,\textrm{LogSumExp}\,}}(x)\) as a smooth approximation of \(\max (x)\); (ii) \({{\,\textrm{SoftPlus}\,}}(x)\) as a smooth approximation of \(\max (x,0)\).
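Both relaxations are available off the shelf in common frameworks; a quick numerical check, with illustrative values of our choosing:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([0.3, 1.7, -0.5])
smooth_max  = torch.logsumexp(x, dim=0)  # ~ max(x) = 1.7, a smooth upper bound
smooth_relu = F.softplus(x)              # ~ max(x, 0), applied elementwise
```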
It is theoretically verified that when \(\beta _{i}< q_i <\beta _{i+1}\), \(o_i\) approaches a one-hot categorical encoding that indicates the interval \((\beta _i, \beta _{i+1})\) as \(\tau \) approaches 0.
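The construction below is one concrete reading of this soft binning, in the spirit of the neural decision trees of Yang et al. (2018); the weight and bias choices follow that work and are not necessarily the paper's exact code.

```python
import torch

def soft_bin(q: torch.Tensor, betas: torch.Tensor, tau: float) -> torch.Tensor:
    # Cut points beta_1 < ... < beta_K induce K+1 bins. With weights 1..K+1
    # and cumulative-sum biases, the largest logit marks the bin containing q,
    # so softmax(logits / tau) tends to a one-hot encoding as tau -> 0.
    K = betas.numel() + 1
    w = torch.arange(1, K + 1, dtype=q.dtype)
    b = torch.cat([torch.zeros(1, dtype=q.dtype), -torch.cumsum(betas, dim=0)])
    return torch.softmax((q * w + b) / tau, dim=-1)

o = soft_bin(torch.tensor(2.3), betas=torch.tensor([1.0, 2.0, 3.0]), tau=0.05)
# o is nearly one-hot at the bin for the interval (2.0, 3.0)
```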
Modelling each sketch point as a Gaussian Mixture Model is adopted in most existing sketch generation works (Ha & Eck, 2018; Song et al., 2018; Su et al., 2020a). This is in contrast to the single-modal normal distribution that corresponds to the common \(L_2\) regression loss for maximum likelihood estimation.
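To make the contrast concrete, here is a toy 1-D comparison; the mixture parameters are chosen arbitrarily for illustration.

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, Normal

# Toy 1-D pen-offset model: a two-component Gaussian mixture can place mass
# on two plausible strokes, whereas L2 regression is the NLL of a single
# unit-variance Gaussian (up to additive constants).
gmm = MixtureSameFamily(
    Categorical(probs=torch.tensor([0.7, 0.3])),                 # mixture weights
    Normal(torch.tensor([0.0, 2.0]), torch.tensor([0.5, 0.5])),  # two modes
)
x = torch.tensor(1.9)
nll_gmm = -gmm.log_prob(x)      # low: x sits near the second mode
nll_l2  = 0.5 * (x - 0.0) ** 2  # single-mode loss penalises x heavily
```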
\({\textit{Gumbel}}(0,1)\) is sampled by first drawing \(u \sim {{\,\textrm{Uniform}\,}}(0,1)\) and computing \(g_i=-\log (-\log (u))\).
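This inverse-transform recipe is two lines in code:

```python
import torch

u = torch.rand(5).clamp(1e-9, 1 - 1e-9)  # u ~ Uniform(0, 1), kept off 0 and 1
g = -torch.log(-torch.log(u))            # g_i ~ Gumbel(0, 1)
```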
Admittedly without an exhaustive search, we do conduct some ablation on the number of cut points and its impact on quality discovery. We find that setting the right number for the first few epochs matters greatly, with 5 being a reasonable choice (over 3 and 7). Progressively climbing to a larger bin number in later epochs also proves slightly superior to an abrupt change, i.e., 5 to 20 without transitions in between.
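Read as code, such a schedule might look like the following, where the warm-up length and growth step are our illustrative assumptions rather than reported values:

```python
def num_bins(epoch: int, warmup: int = 5, target: int = 20, step: int = 3) -> int:
    # Hold 5 cut points for the first few epochs, then climb gradually
    # toward the target instead of jumping from 5 to 20 in one go.
    # `warmup`, `target` and `step` are hypothetical hyper-parameters.
    if epoch < warmup:
        return 5
    return min(target, 5 + step * (epoch - warmup + 1))
```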
Random scribbles that do not conform to any semantic concept.
In the Appendix, we showcase more applications of GACL as a general data re-weighting method, including filtering out ambiguous and destructive benchmark data and withstanding an ethics check, using face recognition as an example.
References
Ahn, S., Choi, Y., & Yoon, K. (2021). Deep learning-based distortion sensitivity prediction for full-reference image quality assessment. In CVPR (pp. 344–353).
Arpit, D., Jastrzebski, S., Ballas, N., Krueger, D., Bengio, E., Kanwal, M. S., Maharaj, T., Fischer, A., Courville, A., Bengio, Y., & Lacoste-Julien, S. (2017). A closer look at memorization in deep networks. In ICML (pp. 233–242).
Baldock, R., Maennel, H., & Neyshabur, B. (2021). Deep learning through the lens of example difficulty. In NeurIPS (pp. 10876–10889).
Bell, S., & Bala, K. (2015). Learning visual similarity for product design with convolutional neural networks. ACM Transactions on Graphics (Proc SIGGRAPH), 34(4), 1–10.
Bengio, Y., Léonard, N., & Courville, A. (2013). Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv:1308.3432
Bhunia, A. K., Das, A., Muhammad, U. R., Yang, Y., Hospedales, T. M., Xiang, T., Gryaditskaya, Y., & Song, Y.-Z. (2020a). Pixelor: A competitive sketching AI agent. So you think you can sketch? ACM Transactions on Graphics (Proc SIGGRAPH Asia), 39(6), 1–15.
Bhunia, A. K., Koley, S., Khilji, A. F. U. R., Sain, A., Chowdhury, P. N., Xiang, T., & Song, Y.-Z., (2022). Sketching without worrying: Noise-tolerant sketch-based image retrieval. In CVPR (pp. 999–1008).
Bhunia, A. K., Yang, Y., Hospedales, T. M., Xiang, T., & Song, Y.-Z., (2020b). Sketch less for more: On-the-fly fine-grained sketch-based image retrieval. In CVPR (pp. 9779–9788).
Bosse, S., Maniry, D., Müller, K. R., Wiegand, T., & Samek, W. (2017). Deep neural networks for no-reference and full-reference image quality assessment. IEEE Transactions on Image Processing, 27(1), 206–219.
Caron, M., Bojanowski, P., Joulin, A., & Douze, M. (2018). Deep clustering for unsupervised learning of visual features. In ECCV (pp. 132–149).
Chang, J., Lan, Z., Cheng, C., & Wei, Y. (2020). Data uncertainty learning in face recognition. In CVPR (pp. 5710–5719).
Chen, K. T., Wu, C. C., Chang, Y. C., & Lei, C.-L. (2009). A crowdsourceable QoE evaluation framework for multimedia content. In ACM MM (pp. 491–500).
Chen, S. Y., Su, W., Gao, L., Xia, S., & Fu, H. (2020). Deepfacedrawing: Deep generation of face images from sketches. ACM Transactions on Graphics (Proc SIGGRAPH), 39(4), Article 72.
Chiang, C. H., & Lee, H. (2023). Can large language models be an alternative to human evaluations? arXiv:2305.01937
Chun, S., Oh, S. J., de Rezende, R. S., Kalantidis, Y., & Larlus, D. (2021). Probabilistic embeddings for cross-modal retrieval. In CVPR (pp. 8415–8424).
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273–297.
Cui, Y., Jia, M., Lin, T. Y., Song, Y., & Belongie, S. (2019). Class-balanced loss based on effective number of samples. In CVPR (pp. 9268–9277).
Das, A., Yang, Y., Hospedales, T. M., Xiang, T., & Song, Y.-Z. (2021). Cloud2curve: Generation and vectorization of parametric sketches. In CVPR (pp. 7088–7097).
Deng, J., Guo, J., Xue, N., & Zafeiriou, S. (2019). Arcface: Additive angular margin loss for deep face recognition. In CVPR (pp. 4690–4699).
Dougherty, J., Kohavi, R., & Sahami, M. (1995). Supervised and unsupervised discretization of continuous features. In ICML (pp. 194–202).
Dukler, Y., Achille, A., Paolini, G., Ravichandran, A., Polito, M., & Soatto, S. (2022). Diva: Dataset derivative of a learning task. In ICLR.
Dutta, A., & Akata, Z. (2019). Semantically tied paired cycle consistency for zero-shot sketch-based image retrieval. In CVPR (pp. 5089–5098).
Eitz, M., Hays, J., & Alexa, M. (2012). How do humans sketch objects? ACM Transactions on Graphics (Proc SIGGRAPH), 31(4), 1–10.
Fu, Y., Hospedales, T. M., Xiang, T., Xiong, J., Gong, S., Wang, Y., & Yao, Y. (2015). Robust subjective visual property prediction from crowdsourced pairwise labels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(3), 563–577.
Gao, F., Tao, D., Gao, X., & Li, X. (2015). Learning to rank for blind image quality assessment. IEEE Transactions on Neural Networks and Learning Systems, 26(10), 2275–2290.
Ge, S., Goswami, V., Zitnick, C. L., & Parikh, D. (2020). Creative sketch generation. In ICLR.
Ghadiyaram, D., & Bovik, A. C. (2015). Massive online crowdsourced study of subjective and objective picture quality. IEEE Transactions on Image Processing, 25(1), 372–387.
Golestaneh, S. A., Dadsetan, S., & Kitani, K. M. (2022). No-reference image quality assessment via transformers, relative ranking, and self-consistency. In WACV (pp. 1220–1230).
Gryaditskaya, Y., Sypesteyn, M., Hoftijzer, J. W., Pont, S. C., Durand, F., & Bousseau, A. (2019). Opensketch: A richly-annotated dataset of product design sketches. ACM Transactions on Graphics (Proc SIGGRAPH Asia), 38(6), Article 232.
Ha, D., & Eck, D. (2018). A neural representation of sketch drawings. In ICLR.
Hadsell, R., Chopra, S., & LeCun, Y. (2006). Dimensionality reduction by learning an invariant mapping. In CVPR (pp. 1735–1742).
Hammou, D., Fezza, S. A., & Hamidouche, W. (2021). EGB: Image quality assessment based on ensemble of gradient boosting. In CVPR (pp. 541–549).
He, Y., Gan, Q., Wipf, D., Reinert, G. D., Yan, J., & Cucuringu, M. (2022). GNNRank: Learning global rankings from pairwise comparisons via directed graph neural networks. In ICML (pp. 8581–8612).
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Hu, W., Xiao, L., Adlam, B., & Pennington, J. (2020). The surprising simplicity of the early-time learning dynamics of neural networks. In NeurIPS (pp. 17116–17128).
Jang, E., Gu, S., & Poole, B. (2017). Categorical reparameterization with gumbel-softmax. In ICLR.
Jo, Y., & Park, J. (2019). Sc-fegan: Face editing generative adversarial network with user’s sketch and color. In CVPR (pp. 1745–1753).
Kalimeris, D., Kaplun, G., Nakkiran, P., Edelman, B., Yang, T., Barak, B., & Zhang, H. (2019). SGD on neural networks learns functions of increasing complexity. In NeurIPS.
Kao, Y., Wang, C., & Huang, K. (2015). Visual aesthetic quality assessment with a regression model. In ICIP (pp. 1583–1587).
Kendall, M. G., & Smith, B. B. (1940). On the method of paired comparisons. Biometrika, 31(3/4), 324–345.
Kim, J., & Lee, S. (2016). Fully deep blind image quality predictor. IEEE Journal of Selected Topics in Signal Processing, 11(1), 206–220.
Kim, J., & Lee, S. (2017). Deep learning of human visual sensitivity in image quality assessment framework. In CVPR (pp. 1676–1684).
Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In ICLR.
Kingma, D. P., & Welling, M. (2014). Auto-encoding variational Bayes. In ICLR.
Koh, P. W., & Liang, P. (2017). Understanding black-box predictions via influence functions. In ICML (pp. 1885–1894).
Lee, Y. J., Zitnick, C. L., & Cohen, M. F. (2011). Shadowdraw: Real-time user guidance for freehand drawing. ACM Transactions on Graphics (Proc SIGGRAPH), 30(4), 1–10.
Li, K., Pang, K., Song, J., Song, Y.-Z., Xiang, T., Hospedales, T. M., & Zhang, H. (2018). Universal sketch perceptual grouping. In ECCV (pp. 582–597).
Li, Y., Du, Y., Zhou, K., Wang, J., Zhao, W. X., & Wen, J. R. (2023). Evaluating object hallucination in large vision-language models. arXiv:2305.10355
Liang, L., & Grauman, K. (2014). Beyond comparing image pairs: Setwise active learning for relative attributes. In CVPR (pp. 208–215).
Lin, H., Fu, Y., Xue, X., & Jiang, Y.-G. (2020). Sketch-bert: Learning sketch bidirectional encoder representation from transformers by self-supervised learning of sketch gestalt. In CVPR (pp. 6758–6767).
Liu, B., Deng, W., Zhong, Y., Wang, M., Hu, J., Tao, X., & Huang, Y. (2019a). Fair loss: Margin-aware reinforcement learning for deep face recognition. In ICCV (pp. 10052–10061).
Liu, F., Deng, X., Lai, Y. K., Liu, Y.-J., Ma, C., & Wang, H. (2019b). Sketchgan: Joint sketch completion and recognition with generative adversarial network. In CVPR (pp. 5830–5839).
Liu, F., Zou, C., Deng, X., Zuo, R., Lai, Y.-K., Ma, C., Liu, Y.-J., & Wang, H. (2020). Scenesketcher: Fine-grained image retrieval with scene sketches. In ECCV (pp. 718–734).
Liu, H., Li, C., Wu, Q., & Lee, Y. J. (2023). Visual instruction tuning. arXiv:2304.08485
Liu, T.-Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3), 225–331.
Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., & Song, L. (2017a). Sphereface: Deep hypersphere embedding for face recognition. In CVPR (pp. 212–220).
Liu, X., Van De Weijer, J., & Bagdanov, A. D. (2017b). Rankiqa: Learning from rankings for no-reference image quality assessment. In ICCV (pp. 1040–1049).
Liu, Z., Mao, H., Wu, C. Y., Feichtenhofer, C., Darrell, T., & Xie, S. (2022). A convnet for the 2020s. In CVPR (pp. 11976–11986).
Loshchilov, I., & Hutter, F. (2017). Sgdr: Stochastic gradient descent with warm restarts. In ICLR.
Ma, K., Liu, W., Liu, T., Wang, Z., & Tao, D. (2017a). dipiq: Blind image quality assessment by learning-to-rank discriminable image pairs. IEEE Transactions on Image Processing, 26(8), 3951–3964.
Ma, K., Liu, W., Zhang, K., Duanmu, Z., Wang, Z., & Zuo, W. (2017b). End-to-end blind image quality assessment using deep neural networks. IEEE Transactions on Image Processing, 27(3), 1202–1213.
Ma, Y., Xiong, T., Zou, Y., & Wang, K. (2011). Person-specific age estimation under ranking framework. In ACM ICMR (pp. 1–7).
Matsui, Y., Shiratori, T., & Aizawa, K. (2016). Drawfromdrawings: 2d drawing assistance via stroke interpolation with a sketch database. IEEE Transactions on Visualization and Computer Graphics, 23(7), 1852–1862.
Meng, Q., Zhao, S., Huang, Z., & Zhou, F. (2021). Magface: A universal representation for face recognition and quality assessment. In CVPR (pp. 14225–14234).
Mittal, A., Moorthy, A. K., & Bovik, A. C. (2012). No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing, 21(12), 4695–4708.
Muhammad, U. R., Yang, Y., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018). Learning deep sketch abstraction. In CVPR (pp. 8014–8023).
Murray, N., Marchesotti, L., & Perronnin, F. (2012). AVA: A large-scale database for aesthetic visual analysis. In CVPR (pp. 2408–2415).
Pang, K., Li, K., Yang, Y., Zhang, H., Hospedales, T. M., Xiang, T, & Song, Y.-Z. (2019). Generalising fine-grained sketch-based image retrieval. In CVPR (pp. 677–686).
Pang, K., Yang, Y., Hospedales, T. M., Xiang, T., & Song, Y.-Z. (2020). Solving mixed-modal jigsaw puzzle for fine-grained sketch-based image retrieval. In CVPR (pp. 10347–10355).
Powers, D. (2011). Evaluation: From precision, recall and F-measure to ROC, informedness, markedness & correlation. Journal of Machine Learning Technologies, 2(1), 37–63.
Qu, Z., Gryaditskaya, Y., Li, K., Pang, K., Xiang, T., & Song, Y.-Z., (2023). Sketchxai: A first look at explainability for human sketches. In CVPR (pp. 23327–23337).
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., & Krueger, G. (2021). Learning transferable visual models from natural language supervision. In ICML (pp. 8748–8763).
Ren, M., Zeng, W., Yang, B., & Urtasun, R. (2018). Learning to reweight examples for robust deep learning. In ICML (pp. 4334–4343).
Ribeiro, L. S. F., Bui, T., Collomosse, J., & Ponti, M. (2020). Sketchformer: Transformer-based representation for sketched structure. In CVPR (pp. 14153–14162).
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In CVPR (pp. 10684–10695).
Saad, M. A., Bovik, A. C., & Charrier, C. (2012). Blind image quality assessment: A natural scene statistics approach in the DCT domain. IEEE Transactions on Image Processing, 21(8), 3339–3352.
Sain, A., Bhunia, A. K. , Yang, Y., Xiang, T., & Song, Y.-Z. (2021). Stylemeup: Towards style-agnostic sketch-based image retrieval. In CVPR (pp. 8504–8513).
Sangkloy, P., Burnell, N., Ham, C., & Hays, J. (2016). The sketchy database: Learning to retrieve badly drawn bunnies. ACM Transactions on Graphics (Proc SIGGRAPH), 35(4), 1–12.
Sangkloy, P., Lu, J., Fang, C., Yu, F., & Hays, J. (2017). Scribbler: Controlling deep image synthesis with sketch and color. In CVPR (pp. 5400–5409).
Sarvadevabhatla, R. K., & Kundu, J. (2016). Enabling my robot to play pictionary: Recurrent neural networks for sketch recognition. In ACM MM (pp. 247–251).
Sarvadevabhatla, R. K., Dwivedi, I., Biswas, A., & Manocha, S. (2017). Sketchparse: Towards rich descriptions for poorly drawn sketches using multi-task hierarchical deep networks. In ACM MM (pp. 10–18).
Schneider, R. G., & Tuytelaars, T. (2014). Sketch classification and classification-driven analysis using fisher vectors. ACM Transactions on Graphics (Proc SIGGRAPH Asia), 33(6), 1–9.
She, D., Lai, Y. K., Yi, G., & Xu, K. (2021). Hierarchical layout-aware graph convolutional network for unified aesthetics assessment. In CVPR (pp. 8475–8484).
Shi, Y., Cao, N., Ma, X., Chen, S., & Liu, P. (2020). Emog: Supporting the sketching of emotional expressions for storyboarding. In CHI (pp. 1–12).
Shu, J., Xie, Q., Yi, L., Zhao, Q., Zhou, S., Xu, Z., & Meng, D. (2019). Meta-weight-net: Learning an explicit mapping for sample weighting. In NeurIPS.
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In ICLR.
Song, J., Pang, K., Song, Y. Z., Xiang, T., & Hospedales, T. M. (2018). Learning to sketch with shortcut cycle consistency. In CVPR (pp. 801–810).
Su, G., Qi, Y., Pang, K., Yang, J., & Song, Y.-Z. (2020a). Sketchhealer: A graph-to-sequence network for recreating partial human sketches. In BMVC (pp. 1–14).
Su, S., Yan, Q., Zhu, Y., Zhang, C., Ge, X., Sun, J., & Zhang, Y. (2020b). Blindly assess image quality in the wild guided by a self-adaptive hyper network. In CVPR (pp. 3667–3676).
Talebi, H., & Milanfar, P. (2018). NIMA: Neural image assessment. IEEE Transactions on Image Processing, 27(8), 3998–4011.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 101(2), 266.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In NeurIPS.
Wan, W., Yang, Y., & Tu, W. (2021). Residual enhancement network for realistic face sketch-photo synthesis. In ICCEAI (pp. 191–195).
Wang, A., Ren, M., & Zemel, R. (2021a). Sketchembednet: Learning novel concepts by imitating drawings. In ICML (pp. 10870–10881).
Wang, F., Xiang, X., Cheng, J., & Yuille, A. L. (2017). Normface: L2 hypersphere embedding for face verification. In ACM MM (pp. 1041–1049).
Wang, H., Wang, Y., Zhou, Z., Ji, X., Gong, D., Zhou, J., Li, Z., & Liu, W. (2018). Cosface: Large margin cosine loss for deep face recognition. In CVPR (pp. 5265–5274).
Wang, J., Song, Y., Leung, T., Rosenberg, C., Wang, J., Philbin, J., Chen, B., & Wu, Y. (2014). Learning fine-grained image similarity with deep ranking. In CVPR (pp. 1386–1393).
Wang, S. Y., Bau, D., & Zhu, J. Y. (2021b). Sketch your own GAN. In ICCV (pp. 14050–14060).
Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612.
Wang, Z., Wang, H., Chen, T., Wang, Z., & Ma, K. (2021c). Troubleshooting blind image quality models in the wild. In CVPR (pp. 16256–16265).
Wu, J., Ma, J., Liang, F., Dong, W., Shi, G., & Lin, W. (2020). End-to-end blind image quality prediction with cascaded deep neural network. IEEE Transactions on Image Processing, 29, 7414–7426.
Wu, J., Zeng, J., Liu, Y., Shi, G., & Lin, W. (2017). Hierarchical feature degradation based blind image quality assessment. In ICCV Workshop.
Wu, O., Hu, W., & Gao, J. (2011). Learning to predict the perceived visual quality of photos. In ICCV (pp. 225–232).
Wu, Q., Liu, Y., Zhao, H., Kale, A., Bui, T., Yu, T., Lin, Z., Zhang, Y., & Chang, S. (2023). Uncovering the disentanglement capability in text-to-image diffusion models. In CVPR (pp. 1900–1910).
Xiao, T., Xia, T., Yang, Y., Huang, C., & Wang, X. (2015). Learning from massive noisy labeled data for image classification. In CVPR (pp. 2691–2699).
Xie, J., Hertzmann, A., Li, W., & Winnemöller, H. (2014). Portraitsketch: Face sketching assistance for novices. In ACM UIST (pp. 407–417).
Xu, P., Huang, Y., Yuan, T., Pang, K., Song, Y.-Z., Xiang, T., Hospedales, T. M., Ma, Z, & Guo, J. (2018). Sketchmate: Deep hashing for million-scale human sketch retrieval. In CVPR (pp. 8090–8098).
Yan, Q., Gong, D., & Zhang, Y. (2019). Two-stream convolutional networks for blind image quality assessment. IEEE Transactions on Image Processing, 28(5), 2200–2211.
Yang, L., Pang, K., Zhang, H., & Song, Y.-Z. (2021a). Sketchaa: Abstract representation for abstract sketch. In ICCV (pp. 10097–10106).
Yang, L., Pang, K., Zhang, H., & Song, Y.-Z. (2022). Finding badly drawn bunnies. In CVPR (pp. 7482–7491).
Yang, L., Zhuang, J., Fu, H., Wei, X., Zhou, K., & Zheng, Y. (2021b). Sketchgnn: Semantic sketch segmentation with graph neural networks. ACM Transactions on Graphics, 40(3), 1–13.
Yang, Y., Morillo, I. G., & Hospedales, T. M. (2018). Deep neural decision trees. In ICML Workshop.
Yelamarthi, S. K., Reddy, S. K., Mishra, A., & Mittal, A. (2018). A zero-shot framework for sketch based image retrieval. In ECCV (pp. 300–317).
Ying, Z., Niu, H., Gupta, P., Mahajan, D., Ghadiyaram, D., & Bovik, A. (2020). From patches to pictures (paq-2-piq): Mapping the perceptual space of picture quality. In CVPR (pp. 3575–3585).
You, J., & Korhonen, J. (2021). Transformer for image quality assessment. In ICIP (pp. 1389–1393).
Yu, Q., Yang, Y., Liu, F., Song, Y.-Z., Xiang, T., & Hospedales, T. M. (2017). Sketch-a-net: A deep neural network that beats humans. International Journal of Computer Vision, 122, 411–425.
Zeng, H., Zhang, L., & Bovik, A. C. (2018). Blind image quality assessment with a probabilistic quality representation. In ICIP (pp. 609–613).
Zhang, L., Rao, A., & Agrawala, M. (2023). Adding conditional control to text-to-image diffusion models. In ICCV (pp. 3836–3847).
Zhang, L., Zhang, L., & Bovik, A. C. (2015). A feature-enriched completely blind image quality evaluator. IEEE Transactions on Image Processing, 24(8), 2579–2591.
Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018a). The unreasonable effectiveness of deep features as a perceptual metric. In CVPR (pp. 586–595).
Zhang, W., Ma, K., Yan, J., Deng, D., & Wang, Z. (2018b). Blind image quality assessment using a deep bilinear convolutional neural network. IEEE Transactions on Circuits and Systems for Video Technology, 30(1), 36–47.
Zhang, W., Ma, K., Zhai, G., & Yang, X. (2021). Uncertainty-aware blind image quality assessment in the laboratory and wild. IEEE Transactions on Image Processing, 30, 3474–3486.
Zhang, X., Lin, W., & Huang, Q. (2021). Fine-grained image quality assessment: A revisit and further thinking. IEEE Transactions on Circuits and Systems for Video Technology, 32(5), 2746–2759.
Zhu, H., Li, L., Wu, J., Dong, W., & Shi, G. (2020a). Metaiqa: Deep meta-learning for no-reference image quality assessment. In CVPR (pp. 14143–14152).
Zhu, J. Y., Krähenbühl, P., Shechtman, E., & Efros, A. A. (2016). Generative visual manipulation on the natural image manifold. In ECCV (pp. 597–613).
Zhu, M., Li, J., Wang, N., & Gao, X. (2020b). Knowledge distillation for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, 33(2), 893–906.
Zhu, M., Li, J., Wang, N., & Gao, X. (2021). Learning deep patch representation for probabilistic graphical model-based face sketch synthesis. International Journal of Computer Vision, 129, 1820–1836.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant #62076034 and the STI2030-Major Projects under Grant #2021ZD0200600. Special thanks go to the China Scholarship Council (CSC) for funding the first author's entire project at SketchX Lab, under Grant #202006470075.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Communicated by Gang Hua.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, L., Pang, K., Zhang, H. et al. Annotation-Free Human Sketch Quality Assessment. Int J Comput Vis 132, 2743–2764 (2024). https://doi.org/10.1007/s11263-024-02001-1