
ViST: A Ubiquitous Model with Multimodal Fusion for Crop Growth Prediction

Published: 07 December 2023

Abstract

Crop growth prediction helps agricultural workers make accurate and well-reasoned decisions about farming activities. Existing crop growth prediction models focus on a single crop, training one model per crop. In this article, we develop a ubiquitous growth prediction model that serves multiple crops, with the aim of training a single model for all of them. To achieve this, we propose a ubiquitous vision and sensor transformer (ViST) model that predicts crop growth from both image and sensor data. In the proposed model, a cross-attention mechanism fuses the multimodal feature maps, reducing computational cost and balancing the interactive effects among features. To train the model, we combine the data from multiple crops to create a single ViST model. A sensor network system was established for data collection on a farm where rice, soybean, and maize are cultivated. Experimental results show that the proposed ViST model generalizes well across the multiple crops, demonstrating its ubiquitous capability for crop growth prediction.
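
To illustrate the fusion idea described in the abstract, the sketch below shows a single cross-attention block in which image tokens query sensor tokens and vice versa, so each modality's feature map is conditioned on the other before the two streams are pooled into a joint representation. This is a minimal PyTorch sketch under our own assumptions about token shapes and module structure; it is not the authors' ViST implementation, and all names and dimensions are illustrative.

```python
# Minimal sketch of cross-attention fusion between an image token
# sequence and a sensor token sequence. Module names, dimensions, and
# the single-block structure are illustrative assumptions, not the
# authors' ViST implementation.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        # One attention module per direction: image queries attend to
        # sensor keys/values, and sensor queries attend to image keys/values.
        self.img_to_sen = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.sen_to_img = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_img = nn.LayerNorm(dim)
        self.norm_sen = nn.LayerNorm(dim)

    def forward(self, img: torch.Tensor, sen: torch.Tensor) -> torch.Tensor:
        # img: (batch, n_img_tokens, dim); sen: (batch, n_sen_tokens, dim)
        img_ctx, _ = self.img_to_sen(query=img, key=sen, value=sen)
        sen_ctx, _ = self.sen_to_img(query=sen, key=img, value=img)
        img = self.norm_img(img + img_ctx)  # residual connection + norm
        sen = self.norm_sen(sen + sen_ctx)
        # Mean-pool each stream and concatenate into one joint vector
        # that a prediction head could consume.
        return torch.cat([img.mean(dim=1), sen.mean(dim=1)], dim=-1)


# Example: fuse 196 image patch tokens with 24 sensor-reading tokens.
fusion = CrossAttentionFusion(dim=256, num_heads=4)
joint = fusion(torch.randn(2, 196, 256), torch.randn(2, 24, 256))
print(joint.shape)  # torch.Size([2, 512])
```

One common motivation for this kind of design, echoed in the abstract's cost claim, is that attending only across modalities avoids full self-attention over the concatenated token sequence while still letting each modality's features modulate the other's.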


Cited By

  • (2024) Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 229–239. DOI: 10.1145/3626772.3657727. Online publication date: 10 July 2024.

Published In

ACM Transactions on Sensor Networks, Volume 20, Issue 1
January 2024, 717 pages
EISSN: 1550-4867
DOI: 10.1145/3618078

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 07 December 2023
Online AM: 28 October 2023
Accepted: 05 October 2023
Revised: 21 September 2023
Received: 28 December 2022
Published in TOSN Volume 20, Issue 1

Author Tags

1. Crop growth prediction
2. ubiquitous model
3. multimodal learning
4. transformer module
5. cross-attention mechanism

Qualifiers

• Research-article

Funding Sources

• New Generation Artificial Intelligence Program
• Heilongjiang NSF
• Fundamental Research Funds
