ABSTRACT
DNA sequences carry a large amount of genetic information, and many phenotypic traits of an organism can be inferred by analyzing its single nucleotide polymorphisms (SNPs). Experiments have shown that the 1000-grain weight of rapeseed is positively correlated with its oil yield. In this paper, the 1000-grain weight of rapeseed at maturity is predicted from its genetic data, with the aim of helping to control oil yield. To process the high-dimensional genetic data, the paper uses a deep learning method, the auto-encoder, for dimensionality reduction and compares it with traditional principal component analysis (PCA). Because the number of features in the genetic data is far larger than the number of samples, the low-dimensional data produced by the auto-encoder cannot fully preserve the useful information in the original genetic data, and the model is prone to the curse of dimensionality. To address these problems, the paper optimizes the auto-encoder network. Experimental results show that the proposed optimization reduces the mean absolute error (MAE) by 0.1344, to 0.3824, which means the predicted and measured 1000-grain weights of rapeseed differ by 0.3824 g on average.
Step 2: Thousand-grain weight prediction with a neural network consists of two parts: (1) a dimensionality-reduction neural network reduces the high-dimensional genetic data and, at the same time, extracts the deep relationships in the data; (2) linear regression predicts the 1000-grain weight from the new, reduced data.
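The two parts of Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the SNP matrix and trait values are synthetic, the auto-encoder is a single tanh hidden layer trained by plain gradient descent, and the layer size, learning rate, and helper names (`make_synthetic_snps`, `train_autoencoder`, `run_demo`) are hypothetical. A PCA projection is included only to mirror the comparison mentioned in the abstract.

```python
import numpy as np

def make_synthetic_snps(n_samples=60, n_snps=500, seed=0):
    """Hypothetical stand-in for real data: genotypes coded 0/1/2 and a toy trait."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)
    effects = np.zeros(n_snps)
    effects[:10] = rng.normal(0.0, 0.3, size=10)  # assume only 10 SNPs matter
    y = 4.0 + X @ effects + rng.normal(0.0, 0.1, size=n_samples)
    return X, y

def train_autoencoder(X, code_dim=8, epochs=300, lr=0.001, seed=0):
    """Part 1: one-hidden-layer auto-encoder (tanh encoder, linear decoder)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.01, size=(d, code_dim)); b1 = np.zeros(code_dim)
    W2 = rng.normal(0.0, 0.01, size=(code_dim, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)             # low-dimensional code
        err = H @ W2 + b2 - X                # reconstruction minus input
        losses.append(float(np.mean(err ** 2)))
        dR = 2.0 * err / n                   # gradient of the squared error
        dZ = (dR @ W2.T) * (1.0 - H ** 2)    # backpropagate through tanh
        W2 -= lr * (H.T @ dR); b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dZ); b1 -= lr * dZ.sum(axis=0)
    return (lambda A: np.tanh(A @ W1 + b1)), losses

def pca_fit(X, k):
    """Baseline: project onto the top-k principal axes of already-centered X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return lambda A: A @ Vt[:k].T

def fit_linear_regression(Z, y):
    """Part 2: ordinary least squares with an intercept on the reduced data."""
    A = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Znew: np.column_stack([np.ones(len(Znew)), Znew]) @ coef

def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return float(np.mean(np.abs(y_true - y_pred)))

def run_demo(code_dim=8):
    X, y = make_synthetic_snps()
    X_tr, X_te, y_tr, y_te = X[:45], X[45:], y[:45], y[45:]
    # Standardize with training statistics only, to avoid leaking test data.
    mu, sd = X_tr.mean(axis=0), X_tr.std(axis=0) + 1e-8
    X_tr, X_te = (X_tr - mu) / sd, (X_te - mu) / sd
    encode, losses = train_autoencoder(X_tr, code_dim=code_dim)
    project = pca_fit(X_tr, code_dim)
    out = {"losses": losses}
    for name, reduce in (("autoencoder", encode), ("pca", project)):
        predict = fit_linear_regression(reduce(X_tr), y_tr)
        out[name + "_mae"] = mae(y_te, predict(reduce(X_te)))
    return out

results = run_demo()
print("autoencoder MAE:", results["autoencoder_mae"])
print("PCA MAE:", results["pca_mae"])
```

With far more features than samples, as in the sketch (500 SNPs, 45 training samples), the reduced representation may lose trait-relevant information, which is the limitation the abstract describes. The numbers printed here come from synthetic data and say nothing about the paper's reported results.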
Index Terms
- Prediction of 1000-grain Weight of Rapeseed Based on Auto-encoder