Genomic Selection in Tomato Breeding - University of Florida 2018/BreedingTech/Genomic...F r e m o n...

Post on 15-Aug-2021


Genomic Selection in Tomato Breeding

David Francis, The Ohio State University

(francis.77”at”osu.edu)

Genome-Wide Approaches: Examples from the OSU Processing Program

Phenotype Data
• Distributions
• ANOVA
• Partitioning Variation (heritability)
• BLUPs

Structure
• Q Matrix (PCA)

Genotype Data
• Marker Matrix

Association Analysis to establish marker-trait linkage (Fixed effect)

Estimate Breeding Value (Random effect)

Kinship matrix

Marker analysis using “The Unified Mixed Model”

Y = μ + REPy + Qw + Mα + Zv + Error

• Zv: Kinship (marker or pedigree)
• Qw: Structure
• Mα: Marker Matrix (SNPs or GBS)
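The structure (Q) and kinship inputs to the unified mixed model can both be derived from the marker matrix. The following is a minimal sketch, assuming a -1/0/1 marker coding, a simple centred cross-product as the kinship estimate, and principal-component scores as Q. The function names and the simulated matrix are hypothetical and are not taken from the OSU analysis.

```python
import numpy as np

def kinship_from_markers(M):
    """Marker-based similarity among lines: centre each marker column,
    then average the cross-products over markers (one of several estimators)."""
    Mc = M - M.mean(axis=0)
    return Mc @ Mc.T / M.shape[1]

def structure_covariates(M, n_components=2):
    """Q matrix as the leading principal-component scores of the centred markers."""
    Mc = M - M.mean(axis=0)
    U, s, _ = np.linalg.svd(Mc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

# Hypothetical data: 10 lines x 50 SNPs coded -1/0/1
rng = np.random.default_rng(0)
M = rng.integers(-1, 2, size=(10, 50)).astype(float)

K = kinship_from_markers(M)      # lines x lines
Q = structure_covariates(M)      # lines x components
print(K.shape, Q.shape)
```

The number of principal components to keep is a modelling choice; two are used here only for illustration.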

Marker-Trait Association Model

y = μ + Mα + ε

Source        DF        Expected MS
Genotypes     N-1       σ² + bσ²(G)
Marker        1         σ² + b[σ²(GQTL) + 4r(1-c)g²] + n(1-2c)²g²
Gen(marker)   N-2       σ² + b[σ²(GQTL) + 4r(1-c)g²]
Error         N(b-1)    σ²

Where:
• b is the number of replicates
• c is the recombination fraction separating the marker from the QTL
• n is a coefficient related to the population size
• g is the genetic effect (in BC populations, additive and dominance effects are confounded)
• σ²(GQTL) is the part of the error variance that cannot be explained by the QTL

F-test = n(1-2c)²g²

Significance of marker-trait associations is based on the population size (n), the recombination distance between marker and QTL (c), and the genetic effect of the QTL (g²).
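To see how the marker term n(1-2c)²g² behaves, a small sketch can evaluate it over a range of recombination fractions. The function name and the input values (n = 100, g = 1) are assumptions chosen for illustration only.

```python
def marker_signal(n, c, g):
    """Marker term of the expected mean square: n * (1 - 2c)^2 * g^2."""
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction c must lie in [0, 0.5]")
    return n * (1.0 - 2.0 * c) ** 2 * g ** 2

# Hypothetical inputs: n = 100, g = 1, marker at increasing distance from the QTL
for c in (0.0, 0.1, 0.25, 0.5):
    print(f"c = {c:.2f}: signal = {marker_signal(100, c, 1.0):.1f}")
```

With these inputs the term falls from 100 at c = 0 to 64, 25 and 0 as c rises to 0.5, which is why an unlinked marker (c = 0.5) carries no information about the QTL regardless of population size.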

X1[1:5, 1:8]

    M1  M2  M3  M4  M5  M6  M7  M8
1    1   0   0   1  -1   0   0   1
2    0   0   0  -1   1   0   1   1
3    1   0   0   1  -1   0   1   1
4    1   1   1   0   0   0   0   1
5    0  -1  -1   0   0   0   0   1

[96 x 384] x [384 x 1] = [96 x 1]

[G x M] x [M] = Prediction for 96 Genotypes

• Predict performance
• Selection [keep those at K = 1 (~15% of the population)]
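The matrix product above can be sketched in a few lines. This is an illustration under stated assumptions: the marker matrix and the marker effects are simulated, and "K = 1" is interpreted here as keeping lines whose predicted value is at least one standard deviation above the mean, which under a normal distribution is roughly the top 16%.

```python
import numpy as np

rng = np.random.default_rng(1)
n_lines, n_markers = 96, 384

# Hypothetical inputs: genotypes coded -1/0/1 and simulated marker effects
X = rng.integers(-1, 2, size=(n_lines, n_markers)).astype(float)
alpha = rng.normal(0.0, 0.05, size=n_markers)

# [96 x 384] x [384 x 1] = [96 x 1]: one predicted value per genotype
gebv = X @ alpha

# Standardise the predictions and keep lines at or above +1 SD (assumed meaning of K = 1)
z = (gebv - gebv.mean()) / gebv.std()
selected = np.flatnonzero(z >= 1.0)

print(f"{len(selected)} of {n_lines} lines selected")
```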

First Example: Disease Resistance (Bacterial Spot)

Debora Menicos (Liabeuf)

Examine Contrasting approaches:

“Cornell School” - Many markers; impute missing marker data, optimize statistical model through lengthy analysis and simulation

“Minnesota School” - a few hundred markers well spaced across the genome; RR or Bayesian approaches work equally well (differences are slight)
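As an illustration of the "RR" (ridge regression) approach, the sketch below estimates all marker effects jointly with a single shrinkage parameter. It assumes simulated data and a hand-picked penalty (lam = 10); in practice the penalty would be estimated from variance components or chosen by cross-validation. The function name is hypothetical.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge regression on centred markers; returns intercept and marker effects."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    yc = y - y.mean()
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    alpha = np.linalg.solve(A, Xc.T @ yc)
    mu = y.mean() - x_mean @ alpha
    return mu, alpha

# Hypothetical training set: 60 lines, 200 markers (more markers than lines)
rng = np.random.default_rng(2)
X = rng.integers(-1, 2, size=(60, 200)).astype(float)
true_alpha = rng.normal(0.0, 0.1, size=200)
y = 50.0 + X @ true_alpha + rng.normal(0.0, 0.5, size=60)

mu, alpha = ridge_marker_effects(X, y, lam=10.0)
y_hat = mu + X @ alpha   # fitted values for the training lines
```

The penalty is what makes the problem solvable when markers outnumber lines; a larger lam shrinks all effects further toward zero.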

[Figure: linkage maps of Chromosome 1, Chromosome 2 and Chromosome 3, showing SNP marker names (mostly solcap_snp_sl_* identifiers) and their map positions]

Optimized set(s) of 384 SNPs for processing and fresh-market germplasm based on PIC and distribution in the genome
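For a biallelic SNP, polymorphism information content (PIC) depends only on the allele frequency. The sketch below uses the standard PIC formula and ranks markers by it. The marker names and frequencies are hypothetical, and the genome-distribution criterion mentioned above is not modelled.

```python
def pic_biallelic(p):
    """PIC for a biallelic marker with allele frequencies p and 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q

# Hypothetical allele frequencies for three markers
allele_freqs = {"snp_A": 0.50, "snp_B": 0.20, "snp_C": 0.05}

# Rank markers from most to least informative
ranked = sorted(allele_freqs, key=lambda m: pic_biallelic(allele_freqs[m]), reverse=True)
print(ranked)
```

PIC for a biallelic marker peaks at 0.375 when both alleles are equally frequent, so markers close to fixation in the target germplasm rank last.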

Resistance to X. euvesicatoria (T1) and X. perforans (T3)

Race non-specific QTL

Race Specific

Currently: Testing to see if additive models can be improved by incorporating non-additive effects

+ = better prediction?

A note on marker numbers: GS models for bacterial spot resistance

Model                                          Location 1   Location 2   Across Locations
Phenotypic Selection                           -            -            -
384 markers, Random model                      0.81         0.36         0.6
15 linked markers, Random model                1.02         0.89         0.96
Mixed Model, Full marker set, Linked = fixed   1.11         0.91         1.01

(values: rg/rp)
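One way to read "Linked = fixed" is that markers known to be linked to resistance loci are fitted without shrinkage while the remaining markers are shrunk as random effects. The sketch below implements that reading as a ridge regression whose penalty is zero for the fixed markers. This is an assumption about the model, not the analysis behind the table, and the data, function name and penalty are hypothetical.

```python
import numpy as np

def partial_ridge(X, y, fixed_idx, lam):
    """Ridge regression that leaves the markers in fixed_idx unpenalised."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    penalty = np.full(X.shape[1], float(lam))
    penalty[list(fixed_idx)] = 0.0          # no shrinkage for "fixed" markers
    return np.linalg.solve(Xc.T @ Xc + np.diag(penalty), Xc.T @ yc)

# Hypothetical data: marker 0 has a large effect, the other 29 have none
rng = np.random.default_rng(3)
X = rng.integers(-1, 2, size=(80, 30)).astype(float)
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.1, size=80)

beta = partial_ridge(X, y, fixed_idx=[0], lam=1e6)
print(round(float(beta[0]), 2))
```

Treating a large-effect marker as fixed avoids shrinking its estimate toward zero along with the many small-effect markers, which is one plausible reason the mixed model row compares well with the random-only rows.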

Second Example: Yield and quality traits
• Predicting inbreds
• Predicting hybrids

Training Populations (Genotype and Phenotype)
• SolCAP (inbreds)
• Nested RIL (inbreds)

Hybrids
• Predict genotype from inbred data
• Predict performance using GW model(s) developed for inbreds
• Compare prediction with actual performance
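The hybrid steps above can be sketched as follows, assuming -1/0/1 marker coding and a purely additive model. The expected hybrid genotype is taken as the average of the two parental scores, and the marker effects are hypothetical values standing in for a model trained on inbreds. Dominance and heterosis are ignored in this sketch.

```python
def hybrid_genotype(parent_a, parent_b):
    """Expected hybrid marker scores: the mean of the two inbred parents' scores."""
    return [(a + b) / 2 for a, b in zip(parent_a, parent_b)]

def predict(genotype, effects, mean=0.0):
    """Additive prediction: overall mean plus the sum of score x marker effect."""
    return mean + sum(g * e for g, e in zip(genotype, effects))

# Hypothetical inbred parents scored at three markers
parent_a = [1, -1, 1]
parent_b = [1, 1, -1]
effects = [0.5, 0.2, -0.1]   # hypothetical effects from an inbred-trained model

f1 = hybrid_genotype(parent_a, parent_b)
print(f1, predict(f1, effects, mean=10.0))
```

Where the parents carry opposite alleles the hybrid score is 0 (heterozygous), so under this additive coding those markers contribute nothing to the hybrid prediction.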

Prediction in tomato breeding populations

1) Unstructured collection: 140 advanced inbred lines (SolCAP collection); 7,700 SNPs

2) A nested RIL: A x B; C x B; A x D (O x H; O x H; O x CA): 280 progeny; 384 SNPs

Augmented experimental designs (2 years, 2 locations)

Traits: 52 measured in total (yield, digital phenotyping and chemical measurements)

Reduced to the 22 most informative (h2, PCA and other methods)

• Yield (total and marketable)
• Color and color uniformity
• BRIX
• pH
• Vitamin C
• Fruit size and shape
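Heritability is one of the criteria named above for reducing the trait list. A minimal sketch of an entry-mean heritability filter follows. It assumes a balanced design with a single error term, which is a simplification of an augmented design, and the variance components, trait names and the 0.3 cut-off are hypothetical.

```python
def entry_mean_heritability(var_g, var_e, n_reps):
    """Heritability on an entry-mean basis: var_g / (var_g + var_e / n_reps)."""
    if n_reps <= 0:
        raise ValueError("n_reps must be positive")
    return var_g / (var_g + var_e / n_reps)

# Hypothetical (genetic variance, error variance) per trait
components = {"trait_1": (4.0, 2.0), "trait_2": (0.5, 6.0)}
n_reps = 2

h2 = {t: entry_mean_heritability(vg, ve, n_reps) for t, (vg, ve) in components.items()}
kept = [t for t, value in h2.items() if value >= 0.3]   # hypothetical threshold
print(h2, kept)
```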

[Figure: Proportion of No. 1 tomatoes (y-axis) vs. hue uniformity (x-axis)]

Predicting inbred-line performance from inbred-line data

Data from SolCAP (140 varieties, 7,700 markers)

[Figure panels: Yield; Fruit Size]

Predicting inbred-line performance from inbred-line data

Results from nested RIL (280 x 384 markers)

[Figure panels: Yield; Fruit Size]

Hybrids: How does predicted performance relate to actual performance?

Ripe_kg_(vs)_MktYield: P = 4.42E-05, R2 = 0.19958 (r = 0.44)

[Figure: scatter plot of yield at Fremont (y-axis) against prediction (x-axis); P = 0.03731 *, R2 = 0.06, r = 0.24]
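The r and R2 values reported for the hybrid comparison are the Pearson correlation between predicted and observed performance and its square. A small sketch of that calculation follows; the predicted and observed values are hypothetical and are not the Fremont data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical predictions and observed yields for four hybrids
predicted = [0.01, 0.02, 0.03, 0.04]
observed = [2.0, 1.0, 4.0, 3.0]

r = pearson_r(predicted, observed)
r2 = r * r   # proportion of variance in observed values explained by the prediction
print(round(r, 2), round(r2, 2))
```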

Selection    Estimate
Phenotype    37.6
GS           1.2
GS + PH      4.3
Checks       -8.2

[Figure: box plots of Yield, Fruit Size and BRIX for four groups: Ph, check, GS and Ph_GS]

Similar results for other traits: Fruit Size was modeled, BRIX was not

Current efforts
• Incorporate knowledge of linkage and gene action into models
• Begin to incorporate hybrid data into training population and use new models to predict hybrid performance
• Continue work on Multi-trait index

1) Whole-genome models have predictive capability for individual performance and hybrid performance

2) Use of existing knowledge of gene action and significant associations improves model performance

3) Models with 20-384 markers work well

4) Models are not a replacement; use them in off-season selection and as a supplement to breeder knowledge

Thank you for your time.

Acknowledgments

Collaborators, OSU

Debora Liabeuf

Eka Sari

Eduardo Bernal

Michael Dzakovich

Marcela Carvalho Andrade

Regis de Castro Carvalho

Troy Aldrich

Jihuen Sim

Caleb Orchard

Gabriel Abud

Elisabet Gas Pascal

Heather Merk

Sung-Chur Sim

Matt Robbins

Steve Schwartz

Rachel Kopec

Jessica Cooperstone

Luis Rodrigues-Saona

Sally Miller

Collaborators, CAU

Wencai Yang

Hui Wang

Collaborators, INRACe

Mathilde Causse

Collaborators, UIB

Hipolito Medrano

Pep Cifre

Josefina Bota

Miquel Angel Conesa

Collaborators, Industry

Cindy Lawley, Illumina

Martin Ganal, Trait Genetics

Hirzel Canning

Red Gold Canning

Collaborators, Cornell

Walter de Jong

Lucas Mueller

Martha Mutschler

Collaborators, UCD

Allen Van Deynze

Kevin Stoffel

Collaborators, MSU

David Douches

C Robin Buell

John Hamilton

Dan Zarka

Kelly Zarka

Collaborators, UFL

Sam Hutton

Jay Scott