Research article
DOI: 10.1145/3388818.3389155

An End-to-end System for Pests and Diseases Identification

Published: 18 May 2020

ABSTRACT

Traditional pest and disease identification methods do not scale to massive high-resolution remote sensing image data. We therefore need an efficient way to automatically learn representations from such data and to discover the relationships within it. This paper proposes an end-to-end, deep-learning-based system for pest and disease identification in massive high-resolution remote sensing data. To achieve good identification performance, this hierarchical model jointly learns the parameters of a neural network and the cluster assignments of its features. Our network, named ClusterNet, iteratively groups the features with the standard k-means clustering algorithm and uses the resulting assignments as supervision to update the weights of the network. Qualitatively, the user only needs to provide a remote sensing image of the target area, and the system automatically identifies pests and diseases; this is more accurate and convenient than traditional manual inspection. Quantitatively, the resulting model outperforms conventional convolutional neural networks on our pest and disease remote sensing dataset.
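The clustering-as-supervision loop the abstract describes (extract features, group them with k-means, treat the assignments as labels) follows the DeepCluster style of alternating optimization. Below is a minimal PyTorch sketch of such a loop, offered as an illustration rather than the authors' implementation: the `backbone`, `feat_dim`, the hyperparameters, and the per-epoch re-initialized linear head are all assumptions not specified in the abstract.

```python
# Sketch of a ClusterNet-style alternating training loop.
# Assumptions (not from the paper): backbone maps images to (B, feat_dim)
# feature vectors; dataset yields unlabeled image tensors; hyperparameters
# are illustrative defaults.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def train_clusternet(backbone, feat_dim, dataset, num_clusters=10,
                     epochs=20, batch_size=64, device="cpu"):
    backbone.to(device)
    # Fixed iteration order so pseudo-labels stay aligned with images.
    loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=False)
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):
        # 1) Extract a feature vector for every image with the current network.
        backbone.eval()
        with torch.no_grad():
            feats = torch.cat([backbone(x.to(device)).cpu() for x in loader])

        # 2) Group the features with standard k-means; the assignments
        #    become this epoch's supervision signal.
        pseudo = KMeans(n_clusters=num_clusters, n_init=10).fit_predict(feats.numpy())
        pseudo = torch.from_numpy(pseudo).long()

        # 3) Update the network on the pseudo-labels. The linear head is
        #    re-initialized each epoch because cluster ids are arbitrary.
        head = nn.Linear(feat_dim, num_clusters).to(device)
        opt = torch.optim.SGD(list(backbone.parameters()) + list(head.parameters()),
                              lr=0.01, momentum=0.9)
        backbone.train()
        for i, x in enumerate(loader):
            y = pseudo[i * batch_size:(i + 1) * batch_size].to(device)
            loss = criterion(head(backbone(x.to(device))), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return backbone
```

A real implementation would also need the precautions such schemes typically take, e.g. guarding against empty or degenerate clusters; those details are beyond what the abstract states.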

Published in

IVSP '20: Proceedings of the 2020 2nd International Conference on Image, Video and Signal Processing
March 2020, 168 pages
ISBN: 978-1-4503-7695-2
DOI: 10.1145/3388818
Publisher: Association for Computing Machinery, New York, NY, United States
Copyright © 2020 ACM
