Abstract
The iris flower dataset is a ubiquitous benchmark in the machine learning literature. With its 150 instances, four continuous features, and three balanced classes, one of which is linearly separable from the others, iris is generally considered an easy problem. Hence, researchers usually rely on other datasets when they need more challenging benchmarks. A similar situation arises with computer vision datasets such as MNIST and ImageNet, which have been widely explored. State-of-the-art models essentially solve these problems, motivating the search for more challenging tasks. Therefore, this paper introduces a new computer vision toy dataset featuring iris flowers. The pictures were taken by users of a nature photography application and therefore include noisy background information. Additionally, certain desirable properties, such as a single, similarly sized object at the center of each picture, are not guaranteed, which makes the task more challenging. Our benchmark results show that the dataset can be challenging for traditional machine learning algorithms without any pre-processing steps, while state-of-the-art deep learning architectures achieve around 82% accuracy, which means some effort will be necessary to drive this accuracy closer to what has been accomplished for MNIST and ImageNet.
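The properties of the classic tabular iris dataset quoted above can be checked with a minimal sketch. It assumes the copy of the dataset shipped with scikit-learn and takes the linearly separable class to be Iris setosa (label 0 in that copy); the helper name `summarize_iris` is hypothetical and not part of the paper. The sketch concerns only the tabular data, not the image dataset introduced here.

```python
import numpy as np
from sklearn.datasets import load_iris


def summarize_iris():
    """Return (shape, per-class counts, setosa separability flag).

    Assumption: uses scikit-learn's bundled copy of the tabular iris data,
    in which label 0 is Iris setosa and column 2 is petal length.
    """
    iris = load_iris()
    X, y = iris.data, iris.target

    # Number of instances per class; balanced classes give equal counts.
    class_counts = np.bincount(y).tolist()

    # A threshold on a single feature is a linear decision boundary.
    # If every setosa petal is shorter than every non-setosa petal,
    # setosa is linearly separable from the other two classes.
    petal_length = X[:, 2]
    setosa_separable = bool(petal_length[y == 0].max() < petal_length[y != 0].min())

    return X.shape, class_counts, setosa_separable


if __name__ == "__main__":
    shape, counts, separable = summarize_iris()
    print("shape:", shape)
    print("class counts:", counts)
    print("setosa separable by petal length:", separable)
```

A single-feature threshold is used instead of a trained classifier so that the check is deterministic and does not depend on optimizer settings.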
Notes
1. See the TensorFlow documentation for more details.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
de Paiva Rocha Filho, I. et al. (2021). Iris-CV: Classifying Iris Flowers Is Not as Easy as You Thought. In: Britto, A., Valdivia Delgado, K. (eds) Intelligent Systems. BRACIS 2021. Lecture Notes in Computer Science, vol 13074. Springer, Cham. https://doi.org/10.1007/978-3-030-91699-2_18
DOI: https://doi.org/10.1007/978-3-030-91699-2_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91698-5
Online ISBN: 978-3-030-91699-2
eBook Packages: Computer Science, Computer Science (R0)