https://scholars.lib.ntu.edu.tw/handle/123456789/450935
Title: | Training set determination for genomic selection | Authors: | Ou, Jen-Hsiang; Liao, Chen-Tuo |
Publication date: | 2019 | Source publication: | Theoretical and Applied Genetics | Abstract: | Key message: A new optimality criterion is proposed to determine a training set for genomic selection, which is derived from Pearson’s correlation between GEBVs and phenotypic values of a test set. R functions are provided to generate the optimal training set. Abstract: For a specified test set, we develop a highly efficient algorithm to determine an optimal subset from a large candidate set in which the individuals have been genotyped but not yet phenotyped. The chosen subset serves as a training set to be phenotyped, and a genomic selection (GS) model is then built from its phenotype and genotype data. In this study, we consider the additive-effects whole-genome regression model and adopt ridge regression estimation for marker effects in the GS model. The resulting GS model is then employed to predict genomic estimated breeding values (GEBVs) for the individuals of the test set, which have been genotyped only. We propose a new optimality criterion to determine the required training set, which is derived directly from Pearson’s correlation between GEBVs and phenotypic values of the test set. Pearson’s correlation is the standard measure of prediction accuracy for a GS model. Our proposed methods can be applied to data with varying degrees of population structure. All the R functions for implementing our training set determination algorithms are available from the R package TSDFGS. The algorithms are illustrated with two datasets that have strong (rice genome dataset) and mild (wheat genome dataset) population structures. Our methods are shown to be advantageous over existing ones, mainly because they fully use the genomic relationship between the test set and the training set by taking into account both the variance and bias for predicting GEBVs. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/450935 https://www2.scopus.com/inward/record.uri?eid=2-s2.0-85068777451&doi=10.1007%2fs00122-019-03387-0&partnerID=40&md5=5989e7a9fd9925176dfdb2ec43c2fec0 |
ISSN: | 0040-5752 | DOI: | 10.1007/s00122-019-03387-0 | SDG/Keywords: | Forecasting; Population statistics; Regression analysis; Testing; Additive effects; Determination algorithm; Optimal training; Optimality criteria; Population structures; Prediction accuracy; Regression model; Ridge regression algorithm; biological model; genetic selection; genetics; genomics; genotype; Oryza; phenotype; plant breeding; procedures; quantitative trait locus; wheat |
Appears in: | Department of Agronomy |
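
The workflow summarized in the abstract (fit ridge regression marker effects on a phenotyped training set, predict GEBVs for a genotyped test set, and score the prediction with Pearson's correlation) can be illustrated with a minimal sketch. This is not the authors' training set determination algorithm and does not use the TSDFGS package. The function names, the 0/1/2 marker coding, the simulated data, the penalty value, and the random choice of training individuals are all assumptions made for illustration.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_marker_effects(X, y, lam):
    """Ridge estimate of marker effects on column-centered genotypes:
    beta = (Xc'Xc + lam * I)^-1 Xc'(y - mean(y))."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_means = [sum(X[i][j] for i in range(n)) / n for j in range(p)]
    Xc = [[X[i][j] - col_means[j] for j in range(p)] for i in range(n)]
    yc = [v - y_mean for v in y]
    xtx = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    return y_mean, col_means, solve(xtx, xty)


def predict_gebv(X_new, intercept, col_means, beta):
    """Predicted values for genotyped individuals from fitted marker effects."""
    return [intercept + sum((x[j] - col_means[j]) * beta[j] for j in range(len(beta)))
            for x in X_new]


def pearson(u, v):
    """Pearson's correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


# Simulated data (assumption): markers coded 0/1/2, additive effects plus noise.
random.seed(0)
n_candidates, n_test, n_markers = 60, 20, 15
true_effects = [random.gauss(0, 1) for _ in range(n_markers)]


def simulate(n):
    X = [[random.choice([0, 1, 2]) for _ in range(n_markers)] for _ in range(n)]
    y = [sum(x[j] * true_effects[j] for j in range(n_markers)) + random.gauss(0, 1)
         for x in X]
    return X, y


X_cand, y_cand = simulate(n_candidates)
X_test, y_test = simulate(n_test)

# Placeholder for training set determination: a random subset of the candidates.
# Only these chosen individuals are treated as phenotyped.
train_idx = random.sample(range(n_candidates), 30)
X_train = [X_cand[i] for i in train_idx]
y_train = [y_cand[i] for i in train_idx]

intercept, col_means, beta = ridge_marker_effects(X_train, y_train, lam=1.0)
gebv = predict_gebv(X_test, intercept, col_means, beta)
accuracy = pearson(gebv, y_test)
print(f"Pearson correlation between GEBVs and test phenotypes: {accuracy:.3f}")
```

In this sketch, the random `train_idx` is the step that a training set determination method would replace with a subset chosen by an optimality criterion.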