Abstract
Over the years, with the growing availability of paired-label datasets, supervised machine learning has become an important part of many problem-solving processes. Active learning gains importance when a large amount of data is freely available but expert manual labels are scarce. This paper proposes an active learning algorithm for the selective choice of training samples in remote sensing image scene classification. The classifier ranks the unlabeled pixels according to predefined heuristics and automatically selects those considered most valuable for improving the model; the expert then manually labels the selected pixels, and the process is repeated. Starting from a small, non-optimal training set, the system builds the optimal set of samples needed to achieve a predefined classification accuracy. The experimental findings demonstrate that, with the proposed methodology, only 0.02% of the total training samples are required for Sentinel-2 image scene classification to reach the same accuracy as the complete training data set. The advantages of the proposed method are highlighted by a comparison with a state-of-the-art active learning method, entropy sampling.
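The ranking step can be illustrated with the entropy sampling baseline mentioned above. The sketch below is not the proposed algorithm, whose heuristics are not given in the abstract. It only shows how a pool of unlabeled pixels might be ranked by the Shannon entropy of their predicted class probabilities, so that the most uncertain ones are sent to the expert. The function names, the batch size, and the probability values are hypothetical and chosen purely for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    # Terms with zero probability contribute nothing and are skipped.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(prob_rows, batch_size):
    """Return indices of the batch_size most uncertain unlabeled samples."""
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: entropy(prob_rows[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:batch_size]


# Hypothetical class probabilities predicted for four unlabeled pixels.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]

# Indices of the pixels that would be handed to the expert for labeling.
queries = select_queries(pool_probs, batch_size=2)
print(queries)
```

In a full loop, the expert's labels for the selected pixels would be added to the training set, the classifier retrained, and the ranking repeated until the target accuracy is reached.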
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Raiyani, K., Gonçalves, T., Rato, L. (2022). Abbreviating Labelling Cost for Sentinel-2 Image Scene Classification Through Active Learning. In: Pinho, A.J., Georgieva, P., Teixeira, L.F., Sánchez, J.A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2022. Lecture Notes in Computer Science, vol 13256. Springer, Cham. https://doi.org/10.1007/978-3-031-04881-4_24
DOI: https://doi.org/10.1007/978-3-031-04881-4_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04880-7
Online ISBN: 978-3-031-04881-4
eBook Packages: Computer Science (R0)