Abbreviating Labelling Cost for Sentinel-2 Image Scene Classification Through Active Learning

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2022)

Abstract

Over the years, the growing availability of labeled datasets has made supervised machine learning an important part of many problem-solving processes. Active learning gains importance when a large amount of data is freely available but expert manual labels are scarce. This paper proposes an active learning algorithm for selecting training samples in remote sensing image scene classification. The classifier ranks the unlabeled pixels according to predefined heuristics and automatically selects those considered most valuable for improving the model; the expert then manually labels the selected pixels, and the process is repeated. Starting from a small, non-optimal training set, the system builds an optimal set of samples that achieves a predefined classification accuracy. The experimental findings demonstrate that, with the proposed methodology, only 0.02% of the total training samples are required for Sentinel-2 image scene classification to reach the same accuracy as the complete training data set. The advantages of the proposed method are highlighted by a comparison with the state-of-the-art active learning method known as entropy sampling.
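The rank, select, label, and repeat cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration of the entropy sampling baseline only, not the heuristics proposed in the paper, which the abstract does not spell out. The names `fit`, `predict_proba`, and `oracle` are hypothetical callables standing in for the classifier and the human expert, and a fixed number of rounds replaces the paper's accuracy-based stopping criterion.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_sampling(pool_probs: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Indices of the batch_size pool samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: shannon_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


def active_learning_loop(
    labeled: List[Tuple[Any, Any]],
    pool: List[Any],
    fit: Callable[[List[Tuple[Any, Any]]], Any],      # hypothetical: trains a model
    predict_proba: Callable[[Any, Any], Sequence[float]],  # hypothetical: class probabilities
    oracle: Callable[[Any], Any],                     # hypothetical: the expert's label
    batch_size: int,
    rounds: int,
) -> Tuple[Any, List[Tuple[Any, Any]]]:
    """Grow the labeled set by querying the most uncertain pool samples."""
    labeled = list(labeled)
    pool = list(pool)
    model = fit(labeled)
    for _ in range(rounds):  # assumption: fixed budget instead of a target accuracy
        if not pool:
            break
        probs = [predict_proba(model, x) for x in pool]
        picked = entropy_sampling(probs, batch_size)
        # The expert labels only the selected samples.
        labeled += [(pool[i], oracle(pool[i])) for i in picked]
        chosen = set(picked)
        pool = [x for i, x in enumerate(pool) if i not in chosen]
        model = fit(labeled)  # retrain on the enlarged labeled set
    return model, labeled
```

In practice the three callables would wrap a real classifier and a labeling interface; replacing the fixed `rounds` budget with a check against a held-out accuracy target would bring the sketch closer to the stopping rule the abstract mentions.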

Notes

  1. https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html

  2. https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

Corresponding author

Correspondence to Kashyap Raiyani.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Raiyani, K., Gonçalves, T., Rato, L. (2022). Abbreviating Labelling Cost for Sentinel-2 Image Scene Classification Through Active Learning. In: Pinho, A.J., Georgieva, P., Teixeira, L.F., Sánchez, J.A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2022. Lecture Notes in Computer Science, vol 13256. Springer, Cham. https://doi.org/10.1007/978-3-031-04881-4_24

  • DOI: https://doi.org/10.1007/978-3-031-04881-4_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-04880-7

  • Online ISBN: 978-3-031-04881-4

  • eBook Packages: Computer Science, Computer Science (R0)
