Abstract
Visual dictionaries (Bag of Visual Words, BoVW) can be a very powerful technique for image description whenever only a reduced number of training images is available, making them an attractive alternative to deep learning techniques. Nevertheless, BoVW models are usually learned in an unsupervised way and rely on the same set of visual words for all images in the training set. We present a method that works with small supervised training sets. It first generates superpixels from multiple images of the same class for interest point detection, and then builds one visual dictionary per class. We show that the detected interest points can be more relevant than traditional ones (e.g., grid sampling) in the context of a given application: the classification of intestinal parasite images. The study uses three image datasets with a total of 15 different species of parasites, plus a diverse class, impurity, whose examples resemble all the remaining parasite classes and make the problem difficult.
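The per-class dictionary idea from the abstract can be illustrated with a minimal sketch: cluster the local descriptors of each class separately to obtain one codebook per class, then describe an image by concatenating its BoVW histograms over all class codebooks. This is only an illustrative simplification, not the authors' implementation; the function names, the plain k-means clustering, and the L2 descriptor matching are assumptions for the sketch (the paper's interest points come from superpixel segmentation, which is omitted here).

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means over the rows of X; returns k centroids."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each descriptor to its nearest centroid (L2 distance).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def build_class_dictionaries(descriptors_by_class, k):
    """One visual dictionary (k visual words) per class, as in the paper's idea."""
    return {c: kmeans(X, k) for c, X in descriptors_by_class.items()}

def encode(descriptors, dictionaries):
    """Concatenate the normalized BoVW histograms over every class dictionary."""
    descriptors = np.asarray(descriptors, dtype=float)
    parts = []
    for c in sorted(dictionaries):
        centers = dictionaries[c]
        d = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
        hist = np.bincount(d.argmin(axis=1), minlength=len(centers)).astype(float)
        parts.append(hist / hist.sum())
    return np.concatenate(parts)
```

The resulting feature vector has `k * n_classes` bins and can be fed to any standard classifier (e.g., an SVM); the supervision enters through the class-wise partition of the training descriptors before clustering.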
Acknowledgments
The authors thank FAPESP (grants 2014/12236-1 and 2017/03940-5) and CNPq (grant 303808/2018-7) for the financial support.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Castelo-Fernández, C., Falcão, A.X. (2019). Learning Visual Dictionaries from Class-Specific Superpixel Segmentation. In: Vento, M., Percannella, G. (eds) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol 11678. Springer, Cham. https://doi.org/10.1007/978-3-030-29888-3_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29887-6
Online ISBN: 978-3-030-29888-3
eBook Packages: Computer Science (R0)