Modelos Mistos e Componentes de Variância
Clarice G.B. Demétrio and Cristian Villegas

Session 1 - Review of Basic Concepts · Session 2 - Linear Mixed Model · Session 3 - Linear Mixed Model · Session 4 - Estimation

Linear Model

The classical linear model is defined by

Y = Xβ + ε

where

Y is an observable data (response variable) vector

β is a vector of unknown parameters

X is the design matrix (for factors and regressors)

ε is a vector of random errors and ε ∼ N(0, σ²I)

Then E(Y) = Xβ and Var(Y) = σ²I

The ordinary least-squares estimator (the same as the MLE) of β is

β̂ = (X′X)⁻¹X′Y

Disadvantages

too restrictive for many typical data sets

the error structure in real-world experiments is often more complex than Σ = σ²I
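As a quick illustration of the closed form β̂ = (X′X)⁻¹X′Y, the sketch below computes it directly in SAS PROC IML for a small made-up data set (the values and names are purely illustrative):

proc iml;
   /* hypothetical data: n = 5 observations, one regressor */
   y = {2, 3, 5, 4, 6};
   X = {1 1, 1 2, 1 3, 1 4, 1 5};            /* design matrix: intercept and x */
   beta_hat  = inv(X`*X) * X` * y;           /* ordinary least-squares estimate */
   resid     = y - X*beta_hat;
   sigma2_ls = ssq(resid)/(nrow(X)-ncol(X)); /* unbiased estimate of sigma^2 */
   print beta_hat sigma2_ls;
quit;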


Expectation and variance - properties

1. Expected value

Definition

The expected value or mean of a random variable Y, denoted by E(Y), is defined by

\[ E(Y) = \int_{-\infty}^{+\infty} y \, f_Y(y)\, dy. \]

Properties
Let X and Y be two random variables and a, b ∈ R be constants. Then

1. E(a) = a.
2. E(aX) = aE(X).
3. E(aX ± bY) = aE(X) ± bE(Y).
4. E(aX ± b) = aE(X) ± b.
5. E[(X − a)²] = E(X²) − 2aE(X) + a².
6. E(XY) = E(X)E(Y), for X and Y independent random variables.
7. E(∑_{i=1}^{n} Y_i) = ∑_{i=1}^{n} E(Y_i).


2. Variance

Definition

Let Y be a random variable and assume that µ = E(Y) exists. The variance of Y is the number denoted by Var(Y) and defined by

Var(Y) = E[(Y − µ)²] = E[Y − E(Y)]² = E(Y²) − [E(Y)]² ≥ 0

Note

The variance of a continuous random variable Y is calculated by

\[ Var(Y) = \int_{-\infty}^{+\infty} (y - \mu)^2 f_Y(y)\, dy \]

or

\[ Var(Y) = \int_{-\infty}^{+\infty} y^2 f_Y(y)\, dy - \left[ \int_{-\infty}^{+\infty} y\, f_Y(y)\, dy \right]^2 \]


Properties
Let X and Y be two random variables and a, b ∈ R be constants. Then

1. Var(aY + b) = a²Var(Y)

2. Var(a) = 0

3. Var(aY) = a²Var(Y)

4. Var(−Y) = Var(Y)

5. Var(X ± Y) = Var(X) + Var(Y), for X and Y independent random variables.

6. Var(∑_{i=1}^{n} a_i Y_i) = ∑_{i=1}^{n} a_i² Var(Y_i), for Y_i independent random variables.


3. Covariance

Definition

The covariance between Y and Z is defined by

Cov(Y, Z) = E(YZ) − E(Y)E(Z).

Properties

1. Cov(aY, bZ) = ab Cov(Y, Z)

2. ∑_{i=1}^{n} Cov(a_i Y_i, b_i Z_i) = ∑_{i=1}^{n} a_i b_i Cov(Y_i, Z_i)

3. Var(∑_{i=1}^{n} a_i Y_i) = ∑_{i=1}^{n} a_i² Var(Y_i) + 2 ∑_{i<i′} a_i a_{i′} Cov(Y_i, Y_{i′})


Explanatory variables

Two types of explanatory variables:

1. factors
↪→ interest is in attributing variability in y to the various categories of the factor.
Example: corn yields from two replicates of three varieties (A/B/C) in a completely randomized design

Y_ij = µ + τ_i + ε_ij,   i = 1, 2, 3;  j = 1, 2

↪→ In matrix notation, this model can be expressed as

\[
\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \\ \mathbf{y}_3 \end{bmatrix}
=
\begin{bmatrix} \mathbf{1}_2 \\ \mathbf{1}_2 \\ \mathbf{1}_2 \end{bmatrix}\mu
+
\begin{bmatrix} \mathbf{1}_2 & \mathbf{0}_2 & \mathbf{0}_2 \\ \mathbf{0}_2 & \mathbf{1}_2 & \mathbf{0}_2 \\ \mathbf{0}_2 & \mathbf{0}_2 & \mathbf{1}_2 \end{bmatrix}
\begin{bmatrix} \tau_1 \\ \tau_2 \\ \tau_3 \end{bmatrix}
+
\begin{bmatrix} \boldsymbol{\varepsilon}_1 \\ \boldsymbol{\varepsilon}_2 \\ \boldsymbol{\varepsilon}_3 \end{bmatrix}
\]

where y_i = [y_{i1}, y_{i2}]′ is the vector of observations of variety i; 1_2 and 0_2 are 2-dimensional column vectors of 1's and 0's, respectively; and ε_i = [ε_{i1}, ε_{i2}]′ is the vector of residuals associated with variety i.

↪→ parameter values give the impact of the factor's levels on the response variable
factors may be crossed or nested
factors may have main effects and interaction effects


2. regressors
↪→ interest is in attributing variability in Y to changes in the values of a continuous covariable.

Example: changes due to weight x

Y_i = β_0 + β_1 x_i + ε_i

↪→ In matrix notation, this model can be expressed as

\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
\]

↪→ parameter values give the impact of an increase in x on the response variable


Terminology: Multiple Linear Regression / ANOVA / ANCOVA

if matrix X contains only regressors, models are called regression models

if matrix X contains only factors, models are called Analysis of Variance (ANOVA) models (X is a matrix of 1's and 0's)

if matrix X contains both regressors and factors, models are called Analysis of Covariance (ANCOVA) models


Estimation

Let's assume a linear model:

Y = Xβ + ε

Parameters to be estimated: β and σ. In all the following, X is assumed to be of full rank: rank(X) = K.

Least-squares approach: minimize ||Y − Xβ||²

\[ \hat\beta_{ls} = (X'X)^{-1}X'Y \]

best linear unbiased estimator of β:

\[ \hat\beta_{ls} \sim N(\beta,\ \sigma^2 (X'X)^{-1}) \]

best quadratic unbiased estimator of σ²:

\[ \hat\sigma^2_{ls} = \frac{1}{n-K}(Y - X\hat\beta_{ls})'(Y - X\hat\beta_{ls}) \quad\text{and}\quad \hat\sigma^2_{ls} \sim \frac{\sigma^2}{n-K}\,\chi^2_{(n-K)} \]


Maximum likelihood approach

Likelihood

\[ L(\beta, \sigma; y) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(y_i - x_i'\beta)^2} \]

Log-likelihood

\[ \ell(\beta, \sigma; y) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta) \]

Maximum of the log-likelihood

Setting \partial\ell/\partial(\beta, \sigma) = 0 gives

\[ \hat\beta_{ml} = (X'X)^{-1}X'Y, \qquad \hat\sigma^2_{ml} = \frac{1}{n}(Y - X\hat\beta_{ml})'(Y - X\hat\beta_{ml}) \]


β̂_ls = β̂_ml, and both are unbiased:

E[β̂_ls] = E[β̂_ml] = E[(X′X)⁻¹X′Y] = β

but σ̂²_ls ≠ σ̂²_ml.

σ̂²_ls is unbiased:
it is calculated on the orthogonal complement of the column space of X;
it takes into account the difference between Y and its projection Xβ̂ onto the column space of X, and the loss of degrees of freedom due to the estimation of β.

σ̂²_ml is biased:
joint estimation of σ² and β;
it does not take into account the loss of degrees of freedom due to the estimation of β.

Note that

\[ E[(Y - X\hat\beta)'(Y - X\hat\beta)] = E\{Y'[I - X(X'X)^{-1}X']Y\} = (n - K)\sigma^2 \]

Then E(σ̂²_ls) = σ² and E(σ̂²_ml) = \frac{n-K}{n} σ².


Example 1: Linear Regression

y_i = β_0 + β_1 X_i + ε_i

\[
X^T X = \begin{bmatrix} 1 & 1 & \dots & 1 \\ X_1 & X_2 & \dots & X_n \end{bmatrix}
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
= \begin{bmatrix} n & \sum_{i=1}^{n} X_i \\ \sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2 \end{bmatrix},
\]

\[
(X^T X)^{-1} = \frac{1}{n\sum_{i=1}^{n} X_i^2 - (\sum_{i=1}^{n} X_i)^2}
\begin{bmatrix} \sum_{i=1}^{n} X_i^2 & -\sum_{i=1}^{n} X_i \\ -\sum_{i=1}^{n} X_i & n \end{bmatrix}
= \frac{1}{n\sum_{i=1}^{n} x_i^2}
\begin{bmatrix} \sum_{i=1}^{n} X_i^2 & -\sum_{i=1}^{n} X_i \\ -\sum_{i=1}^{n} X_i & n \end{bmatrix}
\]

where n∑_{i=1}^{n} X_i² − (∑_{i=1}^{n} X_i)² = n∑_{i=1}^{n} x_i². Also,

\[
X^T Y = \begin{bmatrix} 1 & 1 & \dots & 1 \\ X_1 & X_2 & \dots & X_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{bmatrix}.
\]


Then the normal equations are

\[
\begin{bmatrix} n & \sum_{i=1}^{n} X_i \\ \sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2 \end{bmatrix}
\begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{bmatrix}
\]

and the least-squares estimator \(\hat\theta = (X^T X)^{-1} X^T Y\) is given by

\[
\hat\theta = \begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \end{bmatrix}
= \frac{1}{n\sum_{i=1}^{n} x_i^2}
\begin{bmatrix} \sum_{i=1}^{n} X_i^2 & -\sum_{i=1}^{n} X_i \\ -\sum_{i=1}^{n} X_i & n \end{bmatrix}
\begin{bmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{bmatrix}
= \begin{bmatrix} \bar Y - \hat\beta_1 \bar X \\[6pt] \dfrac{n\sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n\sum_{i=1}^{n} x_i^2} \end{bmatrix}
= \begin{bmatrix} \bar Y - \hat\beta_1 \bar X \\[6pt] \dfrac{\sum_{i=1}^{n} x_i Y_i}{\sum_{i=1}^{n} x_i^2} \end{bmatrix}
\]

where x_i = X_i − X̄.
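A quick numeric check of these closed forms, using made-up data (X, Y) = (1, 2), (2, 3), (3, 5):

\[ \bar X = 2,\quad \bar Y = 10/3,\quad x = (-1, 0, 1),\quad \textstyle\sum x_i Y_i = -2 + 0 + 5 = 3,\quad \sum x_i^2 = 2, \]

so β̂₁ = 3/2 and β̂₀ = 10/3 − (3/2)(2) = 1/3.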


Example 2: Completely randomized design

Y_ij = µ + τ_i + ε_ij = µ_i + ε_ij

For the general case of a set of t treatments, suppose Y is ordered so that all observations for the:

1st treatment occur in the first r_1 rows,
2nd treatment occur in the next r_2 rows,
and so on, with the last treatment occurring in the last r_t rows.

\[
Y = \begin{bmatrix} Y_{11} \\ Y_{12} \\ \vdots \\ Y_{1r_1} \\ \vdots \\ Y_{tr_t} \end{bmatrix},\qquad
X_T = \begin{bmatrix}
1_{r_1} & 0_{r_1\times 1} & \dots & 0_{r_1\times 1} \\
0_{r_2\times 1} & 1_{r_2} & \dots & 0_{r_2\times 1} \\
\vdots & \vdots & \ddots & \vdots \\
0_{r_t\times 1} & 0_{r_t\times 1} & \dots & 1_{r_t}
\end{bmatrix},\qquad
\theta = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_t \end{bmatrix}
\]

where 1_{r_i} is the r_i × 1 column vector of ones and 0_{r_i} is the r_i × 1 column vector of zeroes (1 is only ever a vector here, but 0 may also denote a matrix).


From the theory of linear models (see Chapter XI of Chris Brien's notes), the OLS estimators of θ and Ψ_T = E(Y) = X_T θ are given by

\[ \hat\theta = (X_T' X_T)^{-1} X_T' Y \qquad\text{and}\qquad \hat\Psi_T = X_T\hat\theta = M_T Y = \bar T \]

where

\[
X_T' X_T = \begin{bmatrix} r_1 & 0 & \dots & 0 \\ 0 & r_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & r_t \end{bmatrix},
\qquad
M_T = X_T (X_T' X_T)^{-1} X_T' =
\begin{bmatrix}
\frac{1}{r_1} J_{r_1} & 0_{r_1\times r_2} & \dots & 0_{r_1\times r_t} \\
0_{r_2\times r_1} & \frac{1}{r_2} J_{r_2} & \dots & 0_{r_2\times r_t} \\
\vdots & \vdots & \ddots & \vdots \\
0_{r_t\times r_1} & 0_{r_t\times r_2} & \dots & \frac{1}{r_t} J_{r_t}
\end{bmatrix}
\]

It can be shown, by examining the OLS equations, that the estimators of the elements of θ and Ψ are the treatment means:

\[ \hat\theta = [\hat\mu_1, \hat\mu_2, \dots, \hat\mu_t]^T = [\bar T_1, \bar T_2, \dots, \bar T_t]^T, \qquad \bar T_i = \frac{\sum_{j=1}^{r_i} Y_{ij}}{r_i} \]


Note that T̄ = M_T Y is a vector whose first r_1 elements are equal to the mean of the Y's for the first treatment, whose next r_2 elements are equal to the mean of those for the second treatment, and so on.

M_T is called the treatment-mean operator, as it computes the treatment means from the vector to which it is applied and replaces each element of this vector with its treatment mean.

\[
\hat\Psi_T = X_T\hat\theta = X_T (X_T'X_T)^{-1} X_T' Y = M_T Y = \bar T =
\begin{bmatrix} \bar T_1 \\ \vdots \\ \bar T_1 \\ \vdots \\ \bar T_t \\ \vdots \\ \bar T_t \end{bmatrix}
= \begin{bmatrix} \bar T_1 1_{r_1} \\ \bar T_2 1_{r_2} \\ \vdots \\ \bar T_t 1_{r_t} \end{bmatrix}
\]

For the observed values y of Y, \(\bar t = M_T y\) is the estimate of Ψ_T.


Other types of restriction

∑_{i=1}^{t} τ_i = 0:

θ = [µ, τ_1, τ_2, …, τ_t]^T,   θ̂ = [Ȳ, T̄_1 − Ȳ, T̄_2 − Ȳ, …, T̄_t − Ȳ]^T

τ_1 = 0:

θ = [µ, τ_2, …, τ_t]^T,   θ̂ = [T̄_1, T̄_2 − T̄_1, …, T̄_t − T̄_1]^T

but Ψ̂_T will be the same in any case.


Sums of squares for the analysis of variance

LM theory: Y′Y = Y′HY + Y′(I − H)Y, where the fitted sum of squares decomposes as Y′HY = \frac{1}{n}Y′JY + α̂′X_T′Y.

From Chapter XII of Chris Brien's notes, an SSq is the sum of squares of the elements of a vector and can be written as the product of the transpose of a column vector with the original column vector.

For a completely randomized design, the sums of squares in the analysis of variance for Units, Treatments and Residual are given by the quadratic forms

Y′Q_U Y,   Y′Q_T Y   and   Y′Q_{URes} Y,

respectively, where Q_U = M_U − M_G, Q_T = M_T − M_G, Q_{URes} = M_U − M_T, M_U = I_n, M_G = \frac{1}{n}J_n and

\[
M_T = X_T(X_T'X_T)^{-1}X_T' =
\begin{bmatrix}
\frac{1}{r_1}J_{r_1} & 0_{r_1\times r_2} & \dots & 0_{r_1\times r_t} \\
0_{r_2\times r_1} & \frac{1}{r_2}J_{r_2} & \dots & 0_{r_2\times r_t} \\
\vdots & \vdots & \ddots & \vdots \\
0_{r_t\times r_1} & 0_{r_t\times r_2} & \dots & \frac{1}{r_t}J_{r_t}
\end{bmatrix}
\]


Degrees of freedom of the sums of squares for an ANOVA

Definition: The trace of a square matrix is the sum of its diagonal elements.

Definition: The degrees of freedom of a sum of squares is the rank of the idempotent matrix of its quadratic form; that is, the degrees of freedom of Y′AY are given by rank(A).

Lemma: For B idempotent, rank(B) = trace(B).

Lemma: Let c be a scalar and A, B and C be matrices. Then, when the appropriate operations are defined, we have

(i) trace(A) = trace(A′);

(ii) trace(cA) = c trace(A);

(iii) trace(A + B) = trace(A) + trace(B);

(iv) trace(AB) = trace(BA);

(v) trace(ABC) = trace(CAB) = trace(BCA);

(vi) trace(A ⊗ B) = trace(A) trace(B);

(vii) trace(A′A) = 0 if and only if A = 0.
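For example, these results give the ANOVA degrees of freedom for the CRD directly (a short sketch using the projection matrices defined above): Q_T = M_T − M_G is idempotent, so

\[ \text{rank}(Q_T) = \text{trace}(M_T) - \text{trace}(M_G) = \sum_{i=1}^{t}\frac{r_i}{r_i} - \frac{n}{n} = t - 1, \]

and similarly rank(Q_{URes}) = trace(I_n) − trace(M_T) = n − t.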


Expected mean squares

We have an ANOVA in which we use F (the ratio of MSqs) to decide between models.

But why is this ratio appropriate?

One way of answering this question is to look at what the MSqs measure.

We use the expected values of the MSqs, i.e. the E[MSq]s, to do this.

To derive the expected values, we note that the general form of a mean square is a quadratic form divided by its degrees of freedom, Y′QY/ν.


Expectation of quadratic forms

Definition: A quadratic form in a vector Y is a scalar function of Y of the form Y′AY, where A is called the matrix of the quadratic form.

Expectation: Let Y be an n × 1 vector of random variables with

E[Y] = Ψ and Var[Y] = V,

where Ψ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real values. Then

E(Y^T A Y) = trace(AV) + Ψ^T A Ψ
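As a small illustration (a sketch connecting this result to the earlier estimation slides): with V = σ²I, Ψ = Xβ and A = I − X(X′X)⁻¹X′,

\[ E[Y'(I - X(X'X)^{-1}X')Y] = \sigma^2\,\text{trace}(I - X(X'X)^{-1}X') + \beta'X'(I - X(X'X)^{-1}X')X\beta = (n - K)\sigma^2, \]

since trace(X(X′X)⁻¹X′) = trace((X′X)⁻¹X′X) = K and (I − X(X′X)⁻¹X′)X = 0; this is the result used above to show that σ̂²_ls is unbiased.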


Distribution of a quadratic form

Theorem: Let A be an n × n symmetric matrix of rank ν and Y be an n × 1 normally distributed random vector with E[AY] = 0, Var[Y] = V and E[Y′AY/ν] = λ. Then (1/λ)Y′AY follows a χ²-distribution with ν = rank(A) degrees of freedom if and only if A is idempotent.

Note: the mean and variance of a χ²-distribution with ν degrees of freedom are equal to ν and 2ν, respectively.


Cochran’s theorem (1934)

Theorem: Let Y be an n × 1 normally distributed random vector with E[Y] = Xθ and Var[Y] = V. Let Y′A_1Y, …, Y′A_hY be a collection of h quadratic forms where, for each i = 1, 2, …, h, A_i is symmetric, of rank ν_i, E[A_iY] = 0 and E[Y′A_iY/ν_i] = λ_i.

If any two of the following three statements are true:

1. all A_i are idempotent;

2. ∑_{i=1}^{h} A_i is idempotent;

3. A_iA_j = 0, i ≠ j;

then, for each i, (1/λ_i)Y′A_iY follows a χ²-distribution with ν_i degrees of freedom. Furthermore, Y′A_iY and Y′A_jY are independent for i ≠ j, and ∑_{i=1}^{h} ν_i = ν, where ν denotes the rank of ∑_{i=1}^{h} A_i.


Distribution of a ratio of independent χ2-distributions

Theorem: Let U_1 and U_2 be two random variables distributed as χ² with ν_1 and ν_2 degrees of freedom. Then, provided U_1 and U_2 are independent, the random variable

\[ W = \frac{U_1/\nu_1}{U_2/\nu_2} \]

is distributed as Snedecor's F with ν_1 and ν_2 degrees of freedom.

Note: two quadratic forms Y′A_iY and Y′A_jY are independent if A_iA_j = 0, i ≠ j.


It is possible to show (see Chapter XII of Chris Brien's notes) that for the completely randomized design:

- Y′Q_{URes}Y / σ² ∼ χ²_{n−t}

- Y′Q_TY / σ² ∼ χ²_{t−1}, under H_0

- Y′Q_{URes}Y / σ² and Y′Q_TY / σ² are independent

- F = Treatments MSq / Residual MSq ∼ F_{t−1, n−t}


Standard errors of sample variances

Consider a random sample Y_i, i = 1, 2, …, n, from a normal distribution with mean E(Y_i) = µ and variance Var(Y_i) = σ².

The sample mean Ȳ = ∑_i Y_i/n and the sample variance S² = ∑_i (Y_i − Ȳ)²/(n − 1) are unbiased estimators of µ and σ², respectively.

(n − 1)S²/σ² follows a χ²-distribution with (n − 1) df.

E(S²) = σ² and Var(S²) = 2σ⁴/(n − 1).

Let MS denote a mean square with ν df. If νMS/E(MS) ∼ χ²_ν, the variance of MS is Var(MS) = 2E²(MS)/ν. Hence

\[ Var(MS) = E(MS^2) - E^2(MS) = E(MS^2) - \frac{\nu}{2}\,Var(MS). \]

Thus (ν + 2)Var(MS)/2 = E(MS²), and an unbiased estimator of Var(MS) is given by

\[ \frac{2\,MS^2}{\nu + 2} \]

As an illustration, the estimator of the variance of the sample variance S² is \(\widehat{Var}(S^2) = 2S^4/(n + 1)\).


Linear combinations of χ² variables

Consider mean squares MS_i, i = 1, 2, …, k, with ν_i degrees of freedom, such that, independently, ν_iMS_i/E(MS_i) ∼ χ²_{ν_i}.

Estimators of variance components usually take the form MS = ∑_i a_iMS_i, where the a_i are constants.

Following Smith (1938), Satterthwaite (1946) considers νMS/E(MS) ∼ χ²_ν.

As a consequence, Var(MS) = 2E²(MS)/ν.

However, Var(MS) = ∑_i a_i² Var(MS_i) = 2∑_i [a_i² E²(MS_i)/ν_i].

Equating the two expressions for Var(MS),

\[ \nu = \frac{E^2(MS)}{\sum_i [a_i^2 E^2(MS_i)/\nu_i]} = \frac{\left[\sum_i a_i E(MS_i)\right]^2}{\sum_i [a_i^2 E^2(MS_i)/\nu_i]} \]

In practice, ν is obtained from \((\sum_i a_i MS_i)^2 / \sum_i (a_i^2 MS_i^2/\nu_i)\).
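As a worked example (with made-up mean squares), suppose a variance component in a balanced CRD is estimated as σ̂²_T = (MSqT − MSqRes)/r, i.e. a_1 = 1/r and a_2 = −1/r. With hypothetical values MSqT = 12 on ν_1 = 4 df, MSqRes = 3 on ν_2 = 15 df and r = 4:

\[ MS = \frac{12 - 3}{4} = 2.25, \qquad \nu \approx \frac{2.25^2}{\dfrac{(12/4)^2}{4} + \dfrac{(3/4)^2}{15}} = \frac{5.0625}{2.25 + 0.0375} \approx 2.2, \]

so the combined estimate behaves approximately like a χ² variable with about 2.2 effective degrees of freedom.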


Goodness-of-fit criteria

Adjusted R-square

\[ \bar R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - x_i'\hat\beta)^2/(n-K)}{\sum_{i=1}^{n}(y_i - \bar y)^2/(n-1)} \]

Akaike's Information Criterion

\[ AIC = -2\log L(\hat\beta_{ml}, \hat\sigma_{ml}; y) + 2K \]

Bayesian Information Criterion

\[ BIC = -2\log L(\hat\beta_{ml}, \hat\sigma_{ml}; y) + K\log(n) \]


SAS procedure

proc glm data = data;
   class x;                    * include only if x is a factor;
   model y = x;
   output out=Regr p=Predite r=Residu;
run;


Checking

Gaussian hypothesis

Graphical: histogram, QQ-plot

proc univariate data=Regr;
   var Residu;
   histogram Residu / normal;
   qqplot Residu / normal(mu=est sigma=est color=red L=1);
   inset mean std / cfill=blank format=5.2;
run;

Statistical test: Kolmogorov-Smirnov

proc univariate data=Regr normaltest;
   var Residu;
run;

Homoscedasticity hypothesis

Graphical: residuals against predicted values

proc gplot data=Regr;
   plot Residu*Predite / vref=0;
run;

Independence hypothesis

Difficult to test!


Variable selection in multiple regression

The main approaches

Forward selection, which involves starting with no variables in the model, trying out the variables one by one and including them if they are statistically significant.

Backward elimination, which involves starting with all candidate variables and testing them one by one for statistical significance, deleting any that are not significant.

Methods that are a combination of the above, testing at each stage for variables to be included or excluded.

SAS REG procedure

proc reg data = data;
   model Y = x / selection = adjrsq bic;
   model Y = x / selection = stepwise;
run;


Linear Mixed Models

Linear mixed-effects models have been widely used in the analysis of data where responses are clustered around some random effects, such that there is a natural dependence between observations in the same cluster.

For example, consider repeated measurements taken on each subject in longitudinal data, or observations taken on members of the same family in a genetic study.

They can easily accommodate covariances among observations.

They handle correlated data by incorporating random effects and estimating their associated variance components, to model variability over and above the residual error.

Because of the estimation procedures usually involved, mixed-model approaches can circumvent the problems associated with unbalanced and incomplete data.


Maize trial

Example

5 progenies of a population of maize progenies were investigated

the trial was conducted by completely randomizing 4 replicates of each progeny

the response variable was the weight of corn-cob (kg/10 m²)

Progeny   Replicates
   1      5.95  6.21  5.40  5.18
   2      5.07  6.71  5.46  4.98
   3      4.82  5.11  4.68  4.52
   4      3.87  4.16  4.11  4.84
   5      5.53  5.82  4.29  4.70

At crossing, genetic effects may reasonably be assumed to be normal random variables.
During the early stages of a selection programme, the nature of genotypic effects may still be regarded as random.
In general, the interest is in the heritability of a trait.
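A sketch of how these data might be analysed in SAS, treating progeny as a random effect (PROC MIXED is one option; the data step reproduces the table above, and the variable names are our own):

data maize;
   input progeny rep weight @@;   /* weight of corn-cob (kg/10 m2) */
   datalines;
1 1 5.95  1 2 6.21  1 3 5.40  1 4 5.18
2 1 5.07  2 2 6.71  2 3 5.46  2 4 4.98
3 1 4.82  3 2 5.11  3 3 4.68  3 4 4.52
4 1 3.87  4 2 4.16  4 3 4.11  4 4 4.84
5 1 5.53  5 2 5.82  5 3 4.29  5 4 4.70
;
run;

proc mixed data=maize method=reml;
   class progeny;
   model weight = ;        /* intercept only: progeny enters as random */
   random progeny;         /* variance component for progenies */
run;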


Penicillin yield (Brien, 2009)

Example

The effects of four treatments on the yield of penicillin are to be investigated. It is known that corn steep liquor, an important raw material in producing penicillin, is highly variable from one blending of it to another. To ensure that the results of the experiment apply to more than one blend, five blends (blocks) are to be used in the experiment. The trial was conducted using the same blend in four flasks and randomizing the four treatments to these four flasks.

interest, of course, in each particular treatment used
no interest in each blend, which depends heavily on the circumstances
the blend effect can be viewed as a sample from a random blend effect (levels are chosen at random from an infinite set of blend levels)
interest in estimating the variance of the blend effect as a source of random variation in the data
the four flasks with the same blend share something, which presumably violates the assumption of independence


[Layout: within each of the five blends (Blend 1, …, Blend 5), the four treatments are randomized to Flasks 1–4.]

           Treatment
Blend     A    B    C    D
  1      89   88   97   94
  2      84   77   92   79
  3      81   87   87   85
  4      87   92   89   84
  5      79   81   80   88
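A corresponding SAS sketch, assuming the table has been read into a data set penicillin with variables blend, treatment and yield (names are ours): treatment is fixed, blend is random.

proc mixed data=penicillin method=reml;
   class blend treatment;
   model yield = treatment / solution;  /* fixed treatment effects */
   random blend;                        /* blend variance component */
run;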


Calf birth weight

Example

In an animal breeding experiment, 20 unrelated cows were subjected to superovulation and artificial insemination. Each group of 4 cows was inseminated with a different sire, with a total of 5 unrelated sires. Out of each mating (combination of dam and sire), three calves were generated and their yearling weights were recorded.

no interest in each sire or dam, which depend heavily on the circumstances

the sire effect can be viewed as a sample from a random sire effect (levels are chosen at random from an infinite set of sire levels)

the dam effect can be viewed as a sample from a random dam effect (levels are chosen at random from an infinite set of dam levels)

interest in estimating the variances of the sire and dam effects as sources of random variation in the data

the three calves with the same parents share something, which presumably violates the assumption of independence


[Diagram: dams D1–D4 nested within sire S1, …, dams D17–D20 nested within sire S5; three calves per dam.]
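A SAS sketch for this nested structure, assuming a data set calves with variables sire, dam and weight (hypothetical names): both sire and dam-within-sire are random.

proc mixed data=calves method=reml;
   class sire dam;
   model weight = ;          /* overall mean only */
   random sire dam(sire);    /* sire and dam-within-sire variance components */
run;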


Fixed vs Random effects

Random effect: A factor will be designated as random if it is considered appropriate to use a probability distribution function to describe the distribution of effects associated with the population set of levels.

random effects influence only the variance of the response variable
infinite set of levels (only a finite subset present), and interest lies more in the variance induced by these levels than in the estimation of the levels themselves
examples: blends in the penicillin example, progenies in the maize trial

Fixed effect: A factor will be designated as fixed if it is considered appropriate to have the effects associated with the population set of levels for the factor differ in an arbitrary manner, rather than being distributed according to a regularly-shaped p.d.f.

fixed effects influence only the mean of the response variable
finite set of levels, and interest lies in the estimation of each particular level effect
example: treatments in the penicillin example


In practice

Random if

i. large number of population levels, and
ii. random behaviour;
iii. they occur in two contrasting kinds of circumstances:
   - observational studies or designed experiments with hierarchical structure (School/Class/Student; Sire/Dam/Calf)
   - designed experiments with different spatial or temporal scales (longitudinal studies)

Fixed if

i. small or large number of population levels, and
ii. systematic behaviour

↪→ Consequence: data collected within each level of a random-effect factor are linked to the same realization of a random variable. This introduces dependence among these data.


Type of Models

Fixed-effects model - involves only fixed effects
– to make inferences about those particular levels of the classification factor that were used in the experiment

Random-effects model - involves only random effects
– to make inferences about the population from which these levels were drawn

Mixed model - involves both fixed and random effects


Example

Consider a study involving observations on half-sib families of I unrelated sires.

If the interest is in comparing only the I sires, the following fixed model can be used to represent the data:

E(Y_ij) = µ + s_i

where Y_ij represents the phenotypic trait observation of progeny j, j = 1, …, r, in family i, i = 1, …, I; µ is a mean; and s_i is a fixed effect common to all animals having sire i.

If the I sires are considered as a sample from a population of sires, the following random model can be used to represent the data:

E(Y_ij | s_i) = µ + s_i

where S_i is a random effect. Two usual assumptions:

1. the s_i are independently and identically distributed;
2. the s_i have zero mean and the same variance σ²_s:

S_i ∼ i.i.d.(0, σ²_s)


In matrix notation, this model can be expressed as:

\[
\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \\ \vdots \\ \mathbf{y}_I \end{bmatrix}
=
\begin{bmatrix} \mathbf{1}_r \\ \mathbf{1}_r \\ \vdots \\ \mathbf{1}_r \end{bmatrix}\mu
+
\begin{bmatrix}
\mathbf{1}_r & \mathbf{0}_r & \dots & \mathbf{0}_r \\
\mathbf{0}_r & \mathbf{1}_r & \dots & \mathbf{0}_r \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{0}_r & \mathbf{0}_r & \dots & \mathbf{1}_r
\end{bmatrix}
\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_I \end{bmatrix}
+
\begin{bmatrix} \boldsymbol{\varepsilon}_1 \\ \boldsymbol{\varepsilon}_2 \\ \vdots \\ \boldsymbol{\varepsilon}_I \end{bmatrix}
\]

where y_i = [y_{i1}, y_{i2}, …, y_{ir}]′ represents the vector of observations on the progenies of sire i; 1_r and 0_r represent r-dimensional column vectors of 1's and 0's, respectively; and ε_i = [ε_{i1}, ε_{i2}, …, ε_{ir}]′ is the corresponding vector of residuals.


Simulation

Case 1: Consider the simple model y_ij = µ + s_i + e_ij, with 3 independent sires and 2 replicates:

fix µ = 50
draw a sample of 3 values for s_i from a N(0, σ²_s)
draw a sample of 6 values for e_ij from a N(0, σ²)

Case 2: We could have a more complex covariance structure for sires (for example, Aσ²_s, where A could be the parental relationship matrix). The simulation could be done using the Cholesky decomposition of A, i.e. A = DD′. Then we could draw a vector z of dimension 3 from a N(0, 1) distribution (each element obtained independently), and multiply z by D and by the square root of σ²_s; i.e. the vector of sire effects is given by s = Dzσ_s.
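A data-step sketch of Case 1 in SAS (σ_s and σ_e are hypothetical values chosen only for illustration):

data sim;
   call streaminit(123);                 /* fix the seed for reproducibility */
   mu = 50; sigma_s = 2; sigma_e = 1;    /* mu given; variance components hypothetical */
   do sire = 1 to 3;
      s = rand('normal', 0, sigma_s);    /* sire effect ~ N(0, sigma_s^2) */
      do rep = 1 to 2;
         y = mu + s + rand('normal', 0, sigma_e);
         output;
      end;
   end;
   keep sire rep y;
run;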


Advantages of Linear Mixed Models

flexibility of mixed models for grouped or correlated observations

the models can be used for related individuals (as in animal and plant breeding), longitudinal data, spatial statistics, etc.

generalized linear models with random effects, as implemented, for example, in GLIMMIX of SAS

non-linear mixed models (NLINMIX of SAS, for example), for growth curves


Linear Mixed Model

Y = Xβ + Zu + ε

Y is an observable data vector

β is a vector of unknown parameters

u is a vector of unobservable random variables

X and Z are design matrices for the fixed and random effects

ε is a vector of random errors

Generally, it is assumed that u and ε are independent of each other and normally distributed with zero-mean vectors and variance-covariance matrices G and Σ, respectively, i.e.

\[
\begin{bmatrix} \mathbf{u} \\ \boldsymbol{\varepsilon} \end{bmatrix}
\sim N\!\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix},
\begin{bmatrix} \mathbf{G} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma} \end{bmatrix} \right)
\]

Inference for mixed-effects models involves the estimation of fixed effects, the prediction of random effects, and the estimation of variance and covariance components, which are briefly discussed next.


Linear Mixed Models

Recall that the general linear mixed model is

Y = Xβ + Zu + ε,  u ∼ N(0, G),  ε ∼ N(0, Σ),  with u and ε independent.

Then,

E(Y | u) = Xβ + Zu and Var(Y | u) = Σ

E(Y) = E[E(Y | u)] = E(Xβ + Zu) = Xβ

Var(Y) = Var[E(Y | u)] + E[Var(Y | u)] = Var(Xβ + Zu) + E(Σ) = ZGZ′ + Σ

The implied marginal model is Y ∼ N(Xβ, V), where V = ZGZ′ + Σ.

Note that inferences based on the marginal model do not explicitly assume the presence of random effects representing the natural heterogeneity between subjects (as in the case of longitudinal data).


Some properties of the direct product of matrices

If A and B are square matrices of order r and c, respectively,

\[
A \otimes B = \begin{bmatrix} a_{11}B & \dots & a_{1r}B \\ \vdots & \ddots & \vdots \\ a_{r1}B & \dots & a_{rr}B \end{bmatrix}
\]

where ⊗ is called the direct (Kronecker) product operator.

In general, A ⊗ B ≠ B ⊗ A.

If u and v are vectors, then u′ ⊗ v = v ⊗ u′ = vu′.

If D is an n × n diagonal matrix and A is any matrix, then D ⊗ A = d_{11}A ⊕ d_{22}A ⊕ … ⊕ d_{nn}A.

If the matrix dimensions are compatible:

(A ⊗ B)(C ⊗ D) = AC ⊗ BD

(α_A A ⊗ α_B B) = α_A α_B (A ⊗ B)

(A ⊗ B)^T = A^T ⊗ B^T

(A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹

rank(A ⊗ B) = rank(A) rank(B)

tr(A ⊗ B) = tr(A) tr(B)

det(A ⊗ B) = det(A)^c det(B)^r (for A of order r and B of order c)


Completely Randomized Design (CRD)

Let's suppose a CRD with treatment as a random effect and with the same number of replicates (r) per treatment. The model is

Y_ij = µ + τ_i + ε_ij,

where i = 1, 2, …, t, j = 1, 2, …, r, µ is a constant, and τ_i and ε_ij are random:

τ_i ∼ N(0, σ²_T) and ε_ij ∼ N(0, σ²)

τ_i and ε_ij, τ_i and τ_{i′}, ε_ij and ε_{i′j′} (j ≠ j′ and/or i ≠ i′) are independent.

Then

Var(Y_ij) = Var(τ_i + ε_ij) = σ² + σ²_T

Cov(Y_ij, Y_ij′) = Cov(τ_i + ε_ij, τ_i + ε_ij′) = σ²_T (observations from the same treatment)

Cov(Y_ij, Y_i′j) = Cov(τ_i + ε_ij, τ_i′ + ε_i′j) = 0 (observations from different treatments)


The variance matrices of the observations under the fixed and random models when r = 2, t = 3, for example, are

i) τ_i fixed

\[
Y = \begin{bmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \\ y_{31} \\ y_{32} \end{bmatrix},\qquad
Var(Y) = \Sigma = \sigma^2 I_6 =
\begin{bmatrix}
\sigma^2 & 0 & 0 & 0 & 0 & 0 \\
0 & \sigma^2 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^2 & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma^2 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma^2 & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma^2
\end{bmatrix}
\]

ii) τ_i random

\[
Var(Y) = ZGZ' + \Sigma =
\begin{bmatrix}
\sigma^2+\sigma^2_T & \sigma^2_T & 0 & 0 & 0 & 0 \\
\sigma^2_T & \sigma^2+\sigma^2_T & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^2+\sigma^2_T & \sigma^2_T & 0 & 0 \\
0 & 0 & \sigma^2_T & \sigma^2+\sigma^2_T & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma^2+\sigma^2_T & \sigma^2_T \\
0 & 0 & 0 & 0 & \sigma^2_T & \sigma^2+\sigma^2_T
\end{bmatrix}
\]

In this case:

\[
Z = \begin{bmatrix} 1_2 & 0_{2\times 1} & 0_{2\times 1} \\ 0_{2\times 1} & 1_2 & 0_{2\times 1} \\ 0_{2\times 1} & 0_{2\times 1} & 1_2 \end{bmatrix},\qquad
G = \sigma^2_T I_3 \quad\text{and}\quad \Sigma = \sigma^2 I_6
\]
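Using the direct-product notation introduced above, this block-diagonal structure can be written compactly (a small consistency check rather than new material):

\[ Z = I_3 \otimes 1_2, \qquad ZGZ' = (I_3 \otimes 1_2)(\sigma^2_T I_3)(I_3 \otimes 1_2') = \sigma^2_T\,(I_3 \otimes J_2), \qquad Var(Y) = I_3 \otimes (\sigma^2 I_2 + \sigma^2_T J_2). \]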


Expected mean squares for an ANOVA – CRD

Let's suppose a CRD with treatment as a random effect, but with a different number of replicates (r_i) per treatment. The model is

Y_ij = µ + τ_i + ε_ij,

where i = 1, 2, …, t, j = 1, 2, …, r_i, µ is a constant, and τ_i and ε_ij are random. The ANOVA table is

Source        df       SSq              MSq                       F
Units         n − 1    Y′Q_U Y
Treatments    t − 1    Y′Q_T Y          Y′Q_T Y / (t − 1)         MSqT / MSqRes
Residual      n − t    Y′Q_{URes} Y     Y′Q_{URes} Y / (n − t)

where
M_U = I_n, X_G = 1_n, M_G = X_G(X_G^T X_G)⁻¹X_G^T = n⁻¹J_n,
Q_T = M_T − M_G, Q_U = M_U − M_G, Q_{URes} = M_U − M_T.


\[
X_T = \begin{bmatrix}
1_{r_1} & 0_{r_1\times 1} & \dots & 0_{r_1\times 1} \\
0_{r_2\times 1} & 1_{r_2} & \dots & 0_{r_2\times 1} \\
\vdots & \vdots & \ddots & \vdots \\
0_{r_t\times 1} & 0_{r_t\times 1} & \dots & 1_{r_t}
\end{bmatrix},\qquad
M_T = X_T(X_T'X_T)^{-1}X_T' =
\begin{bmatrix}
\frac{1}{r_1}J_{r_1} & 0_{r_1\times r_2} & \dots & 0_{r_1\times r_t} \\
0_{r_2\times r_1} & \frac{1}{r_2}J_{r_2} & \dots & 0_{r_2\times r_t} \\
\vdots & \vdots & \ddots & \vdots \\
0_{r_t\times r_1} & 0_{r_t\times r_2} & \dots & \frac{1}{r_t}J_{r_t}
\end{bmatrix}
\]

Then,

\[ SSqT = Y'Q_T Y = \sum_{i=1}^{t}\frac{T_i^2}{r_i} - C, \qquad C = \frac{(\sum_{i,j} Y_{ij})^2}{n} \]

\[ SSqUnits = \sum_{i,j} Y_{ij}^2 - C, \qquad SSqRes = SSqUnits - SSqT \]

When r_1 = r_2 = … = r_t = r:

\[ M_G = X_G(X_G^T X_G)^{-1}X_G^T = n^{-1} J_t \otimes J_r = n^{-1} J_n \]

\[ X_T = I_t \otimes 1_r, \qquad M_T = X_T(X_T^T X_T)^{-1}X_T^T = r^{-1} I_t \otimes J_r \]

\[ SSqT = Y'Q_T Y = \frac{1}{r}\sum_{i=1}^{t} T_i^2 - C \]


Assuming that

τ_i ∼ N(0, σ²_T) and ε_ij ∼ N(0, σ²), and

τ_i and ε_ij, τ_i and τ_{i′}, ε_ij and ε_{i′j′} (j ≠ j′ and/or i ≠ i′) are independent:

i) E(SSqUnits)

E(SSqUnits) = ∑_{i,j} E(Y_ij²) − E(C)

\[ E(Y_{ij}^2) = E(\mu^2) + E(\tau_i^2) + E(\varepsilon_{ij}^2) + E(dp) = \mu^2 + \sigma^2_T + \sigma^2 \]

(dp denotes the cross-product terms, which have zero expectation)

\[ E\Big(\sum_{i,j} Y_{ij}^2\Big) = n\mu^2 + n\sigma^2_T + n\sigma^2 \]

\[ \sum_{i,j} Y_{ij} = n\mu + \sum_i r_i\tau_i + \sum_{i,j}\varepsilon_{ij} \]

\[ E\Big[\Big(\sum_{i,j} Y_{ij}\Big)^2\Big] = n^2\mu^2 + E\Big(\sum_i r_i\tau_i\Big)^2 + E\Big(\sum_{i,j}\varepsilon_{ij}\Big)^2 + E(dp) = n^2\mu^2 + \sum_i r_i^2\,\sigma^2_T + n\sigma^2 \]

\[ E(C) = \frac{E\big[(\sum_{i,j} Y_{ij})^2\big]}{n} = n\mu^2 + \frac{\sum_i r_i^2}{n}\,\sigma^2_T + \sigma^2 \]

\[ E(SSqUnits) = \Big(n - \frac{\sum_i r_i^2}{n}\Big)\sigma^2_T + (n-1)\sigma^2 \]


ii) E(SSqT)

\[ E(SSqT) = \sum_{i=1}^{t} E\Big(\frac{T_i^2}{r_i}\Big) - E(C) \]

\[ T_i = \sum_j(\mu + \tau_i + \varepsilon_{ij}) = r_i\mu + r_i\tau_i + \sum_j\varepsilon_{ij} \]

\[ T_i^2 = r_i^2\mu^2 + r_i^2\tau_i^2 + \Big(\sum_j\varepsilon_{ij}\Big)^2 + dp \]

\[ \frac{T_i^2}{r_i} = r_i\mu^2 + r_i\tau_i^2 + \frac{(\sum_j\varepsilon_{ij})^2}{r_i} + \frac{dp}{r_i} \]

\[ E\sum_{i=1}^{t}\Big(\frac{T_i^2}{r_i}\Big) = \sum_{i=1}^{t}(r_i\mu^2 + r_i\sigma^2_T + \sigma^2) = n\mu^2 + n\sigma^2_T + t\sigma^2 \]

\[ E(SSqT) = n\mu^2 + n\sigma^2_T + t\sigma^2 - n\mu^2 - \frac{\sum_i r_i^2}{n}\sigma^2_T - \sigma^2 = \Big(n - \frac{\sum_i r_i^2}{n}\Big)\sigma^2_T + (t-1)\sigma^2 \]

\[ E(MSqT) = \frac{1}{t-1}\Big(n - \frac{\sum_i r_i^2}{n}\Big)\sigma^2_T + \sigma^2 \]

When r_1 = r_2 = … = r_t = r:

\[ E(MSqT) = r\sigma^2_T + \sigma^2 \]


iii) E(SSqRes)

SSqRes = SSqUnits − SSqT

E(SSqRes) = E(SSqUnits) − E(SSqT) = (n − t)σ²

E(MSqRes) = σ²

Exercise: Show that for a fixed-effects CRD

E(MSqT) = q_T(Ψ) + σ² and E(MSqRes) = σ²,

where \( q_T(\Psi) = \dfrac{\sum_{i=1}^{t} r_i(\tau_i - \bar\tau_.)^2}{t-1} \).


The expected mean squares under the fixed and random models are given in the following table:

Source        df       SSq                MSq (s²)                   E[MSq] (fixed)    E[MSq] (random)
Units         n − 1    Y^T Q_U Y
Treatments    t − 1    Y^T Q_T Y          Y^T Q_T Y / (t − 1)        σ² + q_T(Ψ)       σ² + k₁σ²_T
Residual      n − t    Y^T Q_{URes} Y     Y^T Q_{URes} Y / (n − t)   σ²                σ²

where

\[ q_T(\Psi) = \frac{\Psi^T Q_T \Psi}{t-1} = \frac{\sum_{i=1}^{t} r_i(\tau_i - \bar\tau_.)^2}{t-1}, \qquad k_1 = \frac{1}{t-1}\Big(n - \frac{\sum_i r_i^2}{n}\Big) \]

M_U = I_n, M_G = n⁻¹J_n, Q_T = M_T − M_G, Q_U = M_U − M_G, Q_{URes} = M_U − M_T.

σ² and σ²_T are called components of variance.
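Equating the observed mean squares to their expectations under the random model gives the ANOVA (method-of-moments) estimators of the variance components implied by this table:

\[ \hat\sigma^2 = MSqRes, \qquad \hat\sigma^2_T = \frac{MSqT - MSqRes}{k_1}, \]

with k₁ = r in the balanced case r₁ = … = r_t = r.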


Expected mean squares for an ANOVA, using matrix notation

Recall that the general linear mixed model is

Y = Xβ + Zu + ε,  u ∼ N(0, G),  ε ∼ N(0, Σ),  with u and ε independent. Then E(Y) = Xβ and V = ZGZ^T + Σ.

Expected mean squares for an ANOVA

Theorem: Let Y be an n × 1 vector of random variables with E[Y] = µ and Var[Y] = V, where µ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real numbers. Then

E(Y^T A Y) = tr(AV) + µ^T A µ


i) Assuming a fixed CRD model (fixed effect for treatment), we have

Y = X_G µ + X_T τ + ε

with τ fixed and ε ∼ N(0, I_n σ²); that is, E(τ) = τ, G = Var(τ) = 0_{t×t}, E(ε) = 0 and Σ = Var(ε) = I_n σ². Then,

E(Y) = µ = Xβ = X_G µ + X_T τ and V = ZGZ′ + Σ = I_n σ²

Remember that
SSqT = Y^T Q_T Y = Y^T(M_T − M_G)Y and
SSqRes = Y^T Q_{URes} Y = Y^T(M_U − M_T)Y,
where M_U = I_n, X_G = 1_n, M_G = n⁻¹J_n,

\[
X_T = \begin{bmatrix}
1_{r_1} & 0_{r_1\times 1} & \dots & 0_{r_1\times 1} \\
0_{r_2\times 1} & 1_{r_2} & \dots & 0_{r_2\times 1} \\
\vdots & \vdots & \ddots & \vdots \\
0_{r_t\times 1} & 0_{r_t\times 1} & \dots & 1_{r_t}
\end{bmatrix},\qquad
M_T = \begin{bmatrix}
\frac{1}{r_1}J_{r_1} & 0_{r_1\times r_2} & \dots & 0_{r_1\times r_t} \\
0_{r_2\times r_1} & \frac{1}{r_2}J_{r_2} & \dots & 0_{r_2\times r_t} \\
\vdots & \vdots & \ddots & \vdots \\
0_{r_t\times r_1} & 0_{r_t\times r_2} & \dots & \frac{1}{r_t}J_{r_t}
\end{bmatrix}
\]


Then,
i.1) E(SSqT) = E(YᵀQ_T Y) = E(YᵀM_T Y) − E(YᵀM_G Y)

but

E(YᵀM_T Y) = tr(M_T I_n)σ² + µᵀM_T µ
           = tσ² + µ²X_Gᵀ M_T X_G + 2µ X_Gᵀ X_T τ + τᵀ X_Tᵀ X_T τ
           = tσ² + nµ² + 2µ X_Gᵀ X_T τ + ∑_{i=1}^{t} r_i τ_i²

E(YᵀM_G Y) = tr(M_G I_n)σ² + µᵀM_G µ
           = σ² + nµ² + 2µ X_Gᵀ X_T τ + τᵀ X_Tᵀ M_G X_T τ
           = σ² + nµ² + 2µ X_Gᵀ X_T τ + (1/n)(∑_{i=1}^{t} r_i τ_i)²

E(SSqT) = (t − 1)σ² + ∑_{i=1}^{t} r_i τ_i² − (1/n)(∑_{i=1}^{t} r_i τ_i)²
        = (t − 1)σ² + ∑_{i=1}^{t} r_i (τ_i − τ̄.)²


and

E(MSqT) = σ² + [1/(t − 1)] ∑_{i=1}^{t} r_i (τ_i − τ̄.)²

i.2) E(SSqRes) = E(YᵀQ_URes Y) = E(YᵀM_U Y) − E(YᵀM_T Y)

but

E(YᵀM_U Y) = tr(M_U I_n)σ² + µᵀM_U µ
           = nσ² + µᵀM_U µ = nσ² + (X_G µ + X_T τ)ᵀ(X_G µ + X_T τ)
           = nσ² + nµ² + 2µ X_Gᵀ X_T τ + ∑_{i=1}^{t} r_i τ_i²

E(SSqRes) = (n − t)σ²  and  E(MSqRes) = σ²


ii) For a random effect for treatment (random CRD model), we have

Y = X_G µ + Z_T τ + ε

with τ ∼ N(0, I_t σ²_T) and ε ∼ N(0, I_n σ²), that is,

E(τ) = 0, G = Var(τ) = I_t σ²_T,

E(ε) = 0 and Σ = Var(ε) = I_n σ².

Then,

E(Y) = µ = Xβ = X_G µ  and  V = Z_T I_t Z_Tᵀ σ²_T + Σ = Z_T Z_Tᵀ σ²_T + I_n σ²

Remember that
SSqT = YᵀQ_T Y = Yᵀ(M_T − M_G)Y  and  SSqRes = YᵀQ_URes Y = Yᵀ(M_U − M_T)Y
where
M_U = I_n, X_G = 1_n, M_G = n⁻¹J_n

Z_T =
[ 1_{r1}     0_{r1×1}   ...  0_{r1×1} ]
[ 0_{r2×1}   1_{r2}     ...  0_{r2×1} ]
[ ...        ...        ...  ...      ]
[ 0_{rt×1}   0_{rt×1}   ...  1_{rt}   ]

M_T =
[ (1/r1)J_{r1}   0_{r1×r2}      ...  0_{r1×rt}    ]
[ 0_{r2×r1}      (1/r2)J_{r2}   ...  0_{r2×rt}    ]
[ ...            ...            ...  ...          ]
[ 0_{rt×r1}      0_{rt×r2}      ...  (1/rt)J_{rt} ]


Then,
ii.1) E(SSqT) = E(YᵀQ_T Y) = E(YᵀM_T Y) − E(YᵀM_G Y)
but

E(YᵀM_T Y) = tr[M_T (Z_T Z_Tᵀ σ²_T + I_n σ²)] + µᵀM_T µ
           = tr(M_T Z_T Z_Tᵀ)σ²_T + tr(M_T)σ² + µ²X_Gᵀ M_T X_G
           = tr(Z_T Z_Tᵀ)σ²_T + tr(I_t)σ² + nµ²
           = nσ²_T + tσ² + nµ²

E(YᵀM_G Y) = tr[M_G (Z_T Z_Tᵀ σ²_T + I_n σ²)] + µᵀM_G µ
           = tr(M_G Z_T Z_Tᵀ)σ²_T + tr(M_G)σ² + µ²X_Gᵀ M_G X_G
           = (1/n) tr(J_n Z_T Z_Tᵀ)σ²_T + σ² + µ²X_Gᵀ X_G
           = (1/n) ∑_{i=1}^{t} r_i² σ²_T + σ² + nµ²

E(SSqT) = (n − (1/n)∑_{i=1}^{t} r_i²) σ²_T + (t − 1)σ²


and

E(MSqT) = σ² + [1/(t − 1)] (n − (1/n)∑_{i=1}^{t} r_i²) σ²_T

ii.2) E(SSqRes) = E(Y′Q_URes Y) = E(Y′M_U Y) − E(Y′M_T Y)

but

E(YᵀM_U Y) = tr[M_U (Z_T Z_Tᵀ σ²_T + I_n σ²)] + µᵀM_U µ
           = tr(M_U Z_T Z_Tᵀ)σ²_T + tr(M_U)σ² + µ²X_Gᵀ X_G
           = nσ²_T + nσ² + nµ²

E(SSqRes) = (n − t)σ²  and  E(MSqRes) = σ²


[Hasse diagrams for the maize example (5 progenies, 4 replicates, 20 plots): treatment structure Mean → Progenies (5 levels, 4 df; projectors M_G and M_Pr − M_G; contribution 4σ²_T to E[MSq]) and unit structure Mean → Plots (20 levels, 19 df; projectors M_G and M_Pl − M_G; contribution σ²).]


Estimation of Fixed Effects

Recall that the general linear mixed model equals

Y = Xβ + Zu + ε

with U ∼ N(0, G) and ε ∼ N(0, Σ) independent

Then,
E(Y|u) = Xβ + Zu and Var(Y|u) = Σ
E(Y) = E(Xβ + ZU) = Xβ
Var(Y) = Var(Xβ + ZU) + E(Σ) = ZGZ′ + Σ = V
and the marginal model is Y ∼ N(Xβ, V)

Notation
β: vector of fixed effects (as before)
α: vector of all variance components in G and Σ
θ = (β′, α′)′: vector of all parameters in the marginal model

Marginal likelihood function:

L_ML(θ) = (2π)^{−n/2} |V(α)|^{−1/2} exp[−(1/2)(Y − Xβ)′V⁻¹(α)(Y − Xβ)]

If α were known, the MLE of β would equal

β̂(α) = (X′V⁻¹X)⁻¹X′V⁻¹Y ∼ N(β, (X′V⁻¹X)⁻¹)
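As an aside (not from the original slides; function and argument names below are invented for illustration), this generalized least-squares expression translates almost literally into code once V(α) is available:

## Illustrative sketch: beta_hat(alpha) = (X'V^{-1}X)^{-1} X'V^{-1} Y for a known V
gls_beta <- function(X, y, V) {
  Vi   <- solve(V)                        # V^{-1} (adequate for small examples)
  XtVi <- t(X) %*% Vi
  beta <- solve(XtVi %*% X, XtVi %*% y)   # (X'V^{-1}X)^{-1} X'V^{-1} y
  list(beta = beta, vcov = solve(XtVi %*% X))   # vcov = (X'V^{-1}X)^{-1}
}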


Estimation of Fixed Effects

As G and Σ are generally unknown, an estimate V̂ of V is used instead, such that the estimator becomes β̂(α̂) = (X′V̂⁻¹X)⁻¹X′V̂⁻¹Y.

The variance-covariance matrix of β̂ is approximated by (X′V̂⁻¹X)⁻¹.

Note: (X′V̂⁻¹X)⁻¹ is biased downwards as a consequence of ignoring the variability introduced by working with estimates of the (co)variance components instead of their true (unknown) parameter values.

Approximate confidence regions and test statistics for estimable functions of the type K′β can be obtained by using the result:

(K′β̂)′[K′(X′V̂⁻¹X)⁻K]⁻¹(K′β̂) / rank(K) ≈ F[ϕ_N, ϕ_D]

where F[ϕ_N, ϕ_D] refers to an F-distribution with ϕ_N = rank(K) degrees of freedom for the numerator, and ϕ_D degrees of freedom for the denominator, which is generally calculated from the data using, for example, Satterthwaite's approach


Matrix review

X ∼ N_k(µ, Σ)

Consider the partitions:

X = [X1; X2],  µ = [µ1; µ2]  and  Σ = [Σ11  Σ12; Σ21  Σ22],

X1 ∼ N(µ1, Σ11) and X2 ∼ N(µ2, Σ22) (marginal distributions)

and

X1|X2 ∼ N(µ1.2, Σ11.2) and X2|X1 ∼ N(µ2.1, Σ22.1) (conditional distributions),

where

µ1.2 = µ1 + Σ12Σ22⁻¹(X2 − µ2),  Σ11.2 = Σ11 − Σ12Σ22⁻¹Σ21

and

µ2.1 = µ2 + Σ21Σ11⁻¹(X1 − µ1),  Σ22.1 = Σ22 − Σ21Σ11⁻¹Σ12.


Prediction of Random Effects

In addition to the estimation of fixed effects, very often in genetics, for example, interest is also in the prediction of the random effects.

In linear (Gaussian) models such predictions are given by the conditional expectation of U given the data, i.e. E[U|y].

Given the model specifications, the joint distribution of Y and U is:

[Y; U] ∼ N( [Xβ; 0], [V  ZG; GZ′  G] )

From the properties of the multivariate normal distribution, we have

E[U|y] = E[U] + Cov[U, Y′] Var⁻¹[Y] (y − E[Y])
       = GZ′V⁻¹(y − Xβ) = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ)

The fixed effects β are typically replaced by their estimates, so that predictions are made based on the following expression:

û = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ̂)
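A direct transcription of this prediction formula (an illustrative sketch only; the argument names are invented, and in practice G, Σ and β̂ would come from a fitted model):

## Sketch: u_hat = G Z'(Z G Z' + Sigma)^{-1} (y - X beta_hat)
blup_u <- function(y, X, Z, G, Sigma, beta_hat) {
  V <- Z %*% G %*% t(Z) + Sigma
  G %*% t(Z) %*% solve(V, y - X %*% beta_hat)
}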


Mixed Model Equations

The solutions β̂ and û discussed before require V⁻¹. As V can be of huge dimensions, especially in plant and animal breeding applications, its inverse is generally computationally demanding if not infeasible.

However, Henderson (1950) presented the mixed model equations (MME) to estimate β and u simultaneously, without the need to invert V. The MME were derived by maximizing (with respect to β and u) the joint density of Y and U, [f(y, u|β, G, Σ) = f(y|u, β, Σ) f(u|G)], expressed as:

f(y, u|β, G, Σ) ∝ |Σ|^{−1/2}|G|^{−1/2} exp[−(1/2)(y − Xβ − Zu)′Σ⁻¹(y − Xβ − Zu) − (1/2)u′G⁻¹u]

The logarithm of this function gives (apart from constants and a factor of −2):

ℓ ∝ log|Σ| + log|G| + (y − Xβ − Zu)′Σ⁻¹(y − Xβ − Zu) + u′G⁻¹u
  = log|Σ| + log|G| + y′Σ⁻¹y − 2y′Σ⁻¹Xβ − 2y′Σ⁻¹Zu
    + β′X′Σ⁻¹Xβ + 2β′X′Σ⁻¹Zu + u′Z′Σ⁻¹Zu + u′G⁻¹u


Mixed Model Equations

The derivatives of ℓ with respect to β and u are:

[∂ℓ/∂β; ∂ℓ/∂u] = [X′Σ⁻¹y − X′Σ⁻¹Xβ − X′Σ⁻¹Zu;
                  Z′Σ⁻¹y − Z′Σ⁻¹Xβ − Z′Σ⁻¹Zu − G⁻¹u]

Equating them to zero gives the following system:

[X′Σ⁻¹Xβ̂ + X′Σ⁻¹Zû;  Z′Σ⁻¹Xβ̂ + Z′Σ⁻¹Zû + G⁻¹û] = [X′Σ⁻¹y;  Z′Σ⁻¹y]

which can be expressed as:

[X′Σ⁻¹X    X′Σ⁻¹Z      ] [β̂]   [X′Σ⁻¹y]
[Z′Σ⁻¹X    Z′Σ⁻¹Z + G⁻¹] [û] = [Z′Σ⁻¹y]

known as the mixed model equations (MME).
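For small problems the MME can be assembled and solved directly. The sketch below is illustrative only (names are invented); it builds the coefficient matrix and right-hand side above and returns the joint solution, which should agree with the GLS and BLUP expressions of the previous slides.

## Illustrative sketch: assemble and solve Henderson's MME for small X, Z, G, Sigma
solve_mme <- function(y, X, Z, G, Sigma) {
  Si  <- solve(Sigma)
  C   <- rbind(cbind(t(X) %*% Si %*% X, t(X) %*% Si %*% Z),
               cbind(t(Z) %*% Si %*% X, t(Z) %*% Si %*% Z + solve(G)))
  rhs <- rbind(t(X) %*% Si %*% y, t(Z) %*% Si %*% y)
  sol <- solve(C, rhs)
  p   <- ncol(X)
  list(beta = sol[1:p, , drop = FALSE], u = sol[-(1:p), , drop = FALSE])
}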


BLUE and BLUP

Using the second part of the MME, we have that:

Z′Σ⁻¹Xβ̂ + (Z′Σ⁻¹Z + G⁻¹)û = Z′Σ⁻¹y

so that
û = (Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹(y − Xβ̂)

It can be shown that this expression is equivalent to û = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ̂) and, more importantly, that û is the best linear unbiased predictor (BLUP) of u.

Using this result in the first part of the MME, we have that:

X′Σ⁻¹Xβ̂ + X′Σ⁻¹Zû = X′Σ⁻¹y

X′Σ⁻¹Xβ̂ + X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹(y − Xβ̂) = X′Σ⁻¹y

β̂ = {X′[Σ⁻¹ − Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹]X}⁻¹ X′[Σ⁻¹ − Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹]Y

Similarly, it is shown that this expression is equivalent to β̂ = (X′V⁻¹X)⁻¹X′V⁻¹Y, which is the best linear unbiased estimator (BLUE) of β


BLUE and BLUP

It is important to note that β̂ and û require knowledge of G and Σ.

These matrices, however, are rarely known.

This is a problem without an exact solution using classical methods.

The practical approach is to replace G and Σ by their estimates (Ĝ and Σ̂) in the MME.

Note that if G and Σ are known, the variance-covariance matrix of the BLUE and BLUP is:

Var[β̂; û] = [X′Σ⁻¹X    X′Σ⁻¹Z      ]⁻¹
            [Z′Σ⁻¹X    Z′Σ⁻¹Z + G⁻¹]


BLUE and BLUP

If G and Σ are unknown and their values are replaced in the MME by some sort of point estimates Ĝ and Σ̂, the new solutions β̂ and û of the system:

[X′Σ̂⁻¹X    X′Σ̂⁻¹Z      ] [β̂]   [X′Σ̂⁻¹y]
[Z′Σ̂⁻¹X    Z′Σ̂⁻¹Z + Ĝ⁻¹] [û] = [Z′Σ̂⁻¹y]

are no longer BLUE and BLUP solutions, as they are not even linear functions of the data y.

It can also be shown that, generally:

Var[β̂; û] > [X′Σ̂⁻¹X    X′Σ̂⁻¹Z      ]⁻¹
            [Z′Σ̂⁻¹X    Z′Σ̂⁻¹Z + Ĝ⁻¹]


Inverse of a nonsingular partitioned matrix

Let A be a nonsingular partitioned matrix and A⁻¹ its inverse as follows

A = [A11  A12]        A⁻¹ = [A¹¹  A¹²] = [Var(β̂)      Cov(β̂, û)]
    [A21  A22]               [A²¹  A²²]   [Cov(û, β̂)   Var(û)   ]

where
A¹¹ = (A11 − A12 A22⁻¹ A21)⁻¹

A¹² = (A²¹)ᵀ = −(A11 − A12 A22⁻¹ A21)⁻¹ A12 A22⁻¹

A²² = A22⁻¹ + A22⁻¹ A21 (A11 − A12 A22⁻¹ A21)⁻¹ A12 A22⁻¹
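These formulae are easy to verify numerically on a random nonsingular matrix (illustrative sketch, invented names, symmetric positive-definite case):

## Sketch: check the partitioned-inverse formulae on a random symmetric p.d. matrix
set.seed(2)
p <- 2; q <- 3
M <- crossprod(matrix(rnorm((p + q)^2), p + q)) + diag(p + q)
A11 <- M[1:p, 1:p];            A12 <- M[1:p, (p+1):(p+q)]
A21 <- M[(p+1):(p+q), 1:p];    A22 <- M[(p+1):(p+q), (p+1):(p+q)]
S    <- A11 - A12 %*% solve(A22) %*% A21          # Schur complement of A22
A11s <- solve(S)                                  # block A^{11}
A12s <- -A11s %*% A12 %*% solve(A22)              # block A^{12}
A22s <- solve(A22) + solve(A22) %*% A21 %*% A11s %*% A12 %*% solve(A22)  # block A^{22}
max(abs(rbind(cbind(A11s, A12s), cbind(t(A12s), A22s)) - solve(M)))      # ~ 0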


Example

Considering the completely randomized design with random treatment effect and r = 2, t = 3, then β = µ, X = 1_6, G = σ²_T I_3, Σ = σ²I_6

Z = [1_2       0_{2×1}   0_{2×1}]        V = [J_{2×2}   0_{2×2}   0_{2×2}]
    [0_{2×1}   1_2       0_{2×1}]            [0_{2×2}   J_{2×2}   0_{2×2}] σ²_T + σ²I_6
    [0_{2×1}   0_{2×1}   1_2    ]            [0_{2×2}   0_{2×2}   J_{2×2}]

Then
µ̂ = ȳ (Exercise!)

û = (Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹(y − Xβ̂)

  = (Z′Z/σ² + I_3/σ²_T)⁻¹ (Z′/σ²)(y − 1_6 µ̂) = [(r/σ²)I_3 + (1/σ²_T)I_3]⁻¹ (Z′/σ²)(y − 1_6 µ̂)

  = [(rσ²_T + σ²)/(σ²σ²_T)]⁻¹ (Z′/σ²)(y − 1_6 µ̂) = [σ²_T/(σ²_T + σ²/r)] (Z′/r)(y − 1_6 µ̂)

û_i = BLUP_i = [σ²_T/(σ²_T + σ²/r)] (ȳ_i − µ̂) = (shrinkage factor) BLUE_i
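For this toy design the closed form can be checked against the general matrix expression. The sketch below is illustrative only (the data and variance values are arbitrary, not course data):

## Sketch: BLUPs for the CRD with r = 2, t = 3 (illustrative values)
r <- 2; t <- 3; sig2 <- 1; sig2T <- 0.5          # arbitrary variance components
y <- c(5.1, 4.9, 6.2, 6.0, 5.4, 5.6)             # arbitrary data, r observations per treatment
Z <- kronecker(diag(t), matrix(1, r, 1))         # 6 x 3 incidence matrix
G <- sig2T * diag(t); Sigma <- sig2 * diag(r * t)
mu_hat <- mean(y)
u_matrix <- G %*% t(Z) %*% solve(Z %*% G %*% t(Z) + Sigma, y - mu_hat)
u_closed <- sig2T / (sig2T + sig2 / r) * (tapply(y, rep(1:t, each = r), mean) - mu_hat)
cbind(matrix_form = as.vector(u_matrix), closed_form = as.vector(u_closed))  # the two agree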


The EBLUP for u_i is given by

(1 − MSqRes/MSqT)(ȳ_i − ȳ)

The BLUP for µ_i = µ + u_i is given by

µ̃_i = ȳ + [σ²_T/(σ²_T + σ²/r)](ȳ_i − ȳ) = ȳ_i − [σ²/(rσ²_T + σ²)](ȳ_i − ȳ)

and substituting σ²_T and σ² by their estimates we have the EBLUP for µ_i

ȳ_i − (MSqRes/MSqT)(ȳ_i − ȳ)

The relationship between the shrunk or adjusted means (EBLUPs) and the unadjusted means (BLUEs) can also be illustrated by a scatter diagram. The shrinkage towards the overall mean is indicated by the fact that the points representing treatments that have an estimated mean above µ̂ = ȳ lie below the line

adjusted mean = unadjusted mean

whereas those representing treatments with an estimated mean below µ̂ = ȳ lie above the line.


A11 = X′Σ⁻¹X,  A12 = X′Σ⁻¹Z,  A22 = Z′Σ⁻¹Z + G⁻¹

Var(Ȳ) = [X′Σ⁻¹X − X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹X]⁻¹

        = [X′X/σ² − (X′Z/σ²)(Z′Z/σ² + I/σ²_T)⁻¹(Z′X/σ²)]⁻¹

        = [n/σ² − (X′Z/σ²) (σ²σ²_T/(rσ²_T + σ²)) (Z′X/σ²)]⁻¹

        = (σ²/n) [σ²/(rσ²_T + σ²)]⁻¹

        = (rσ²_T + σ²)/n

and an estimate of Var(Ȳ) is given by

V̂ar(Ȳ) = MSqT/n


Var(û) = (Z′Σ⁻¹Z + G⁻¹)⁻¹
       + (Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹X [X′Σ⁻¹X − X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹X]⁻¹ X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹

       = σ²σ²_T/(rσ²_T + σ²)

and an estimate of Var(û) is given by

V̂ar(û) = (1 − MSqRes/MSqT) (MSqRes/r)


Estimation methods for the variance components

Recall that α is the vector of all variance components in G and Σ

In most cases, α is not known, and needs to be replaced by an estimate α̂

Three frequently used estimation methods for α:

Moment method or ANOVA Method (MM)

Maximum likelihood method (ML)

Restricted maximum likelihood method (REML)


Estimation methods for the variance components

Anova Estimation

Fit the model by assuming that the random effects in the model are fixed effects. Obtain the corresponding ANOVA table.

Compute the expected mean squares of the observed mean squares in the ANOVA table under the true assumptions about the u's and ε.

Equate the observed mean squares to their expected mean squares and solve the resulting system of equations for each of the variance components.

Use the resulting solutions as the estimates of the variance components (see the sketch below).
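In R these steps reduce to a few lines for a balanced one-way layout. The sketch below is illustrative only (the data frame and column names are assumed, not course code):

## Illustrative sketch: ANOVA (moment) estimators for a one-way random model
## 'dat' is assumed to have a factor 'group' (t levels, r replicates) and a response 'y'
mm_oneway <- function(dat) {
  fit <- lm(y ~ group, data = dat)            # step 1: treat the random factor as fixed
  ms  <- anova(fit)$"Mean Sq"                 # step 2: observed mean squares
  r   <- nrow(dat) / nlevels(dat$group)
  c(sigma2       = ms[2],                     # E(MSqRes) = sigma^2
    sigma2_group = (ms[1] - ms[2]) / r)       # E(MSqT)   = sigma^2 + r*sigma^2_group
}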


Estimation methods for the variance components

Example

Consider the data set below, related to observations of half-sib families of t unrelated sires.

                Sire
    1       2      . . .   t
  y11     y21      . . .   yt1
  y12     y22      . . .   yt2
  ...     ...      . . .   ...
  y1r1    y2r2     . . .   ytrt

The following model can be used to represent these data:

yij = µ + si + εij

where yij represents the phenotypic trait observation of progeny j (j = 1, 2, . . . , ri) in family i, µ is a mean, si is an effect common to all animals having sire i, and εij is a residual term.


Estimation methods for the variance components

Example

The sire effect si is equivalent to the transmitting ability (which is equal to one-half of the additive genetic value) of sire i, as one-half of its genes are (randomly) transmitted to each of its ri progeny.

The residual terms εij refer to additional genetic effects (such as the effect of dams) and environmental components.

It is assumed that si ∼ N(0, σ²_s) and εij ∼ N(0, σ²)

The expectation and variance of Yij are

E(Yij) = µ and Var(Yij) = σ²_s + σ²
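As a quick illustration of this model (a simulated example, not data from the course; all numbers below are made up), balanced half-sib data can be generated and the two components recovered with the ANOVA estimators derived on the next slides:

## Illustrative simulation of the sire model y_ij = mu + s_i + e_ij (balanced case)
set.seed(3)
t <- 50; r <- 10; mu <- 100; sig2_s <- 4; sig2 <- 16
sire <- factor(rep(1:t, each = r))
y    <- mu + rep(rnorm(t, 0, sqrt(sig2_s)), each = r) + rnorm(t * r, 0, sqrt(sig2))
ms   <- anova(lm(y ~ sire))$"Mean Sq"
c(sigma2_hat = ms[2], sigma2_s_hat = (ms[1] - ms[2]) / r)   # should be close to 16 and 4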


Estimation methods for the variance components

Example

The ANOVA table with expected mean squares is

Source     df      SSq     MSq     E[MSq]
Units      n − 1
Sire       t − 1   SSSq    SMSq    σ² + kσ²_s
Residual   n − t   RSSq    RMSq    σ²

where k = [1/(t − 1)](n − (1/n)∑_{i=1}^{t} r_i²).

The ANOVA (MM) estimators for σ² and σ²_s are

σ̂² = RSSq/(n − t)   and   σ̂²_s = (SMSq − RMSq)/k = (1/k)[SMSq − σ̂²]

In the specific case of balanced data, i.e. the same progeny size for all sires, ri = r = n/t and the ANOVA estimators become:

σ̂² = RMSq = RSSq/[t(r − 1)]   and   σ̂²_s = (SMSq − RMSq)/r = (1/r)[SSSq/(t − 1) − σ̂²]


Estimation methods for the variance components

Anova Estimation – Advantages

In general, the ANOVA approach works well for simple models (such as a one-way structure) or balanced data (such as data from designed experiments with no missing data).

The estimators of the variance components are unbiased.

One can often approximate the degrees of freedom corresponding to the estimated standard errors of estimators of estimable functions of the fixed effects by using Satterthwaite's method. For the sire example

σ̂²_s = (SMSq − RMSq)/k

with ns degrees of freedom given by

ns = (SMSq − RMSq)² / [(SMSq)²/(t − 1) + (RMSq)²/(n − t)]

SAS and R can produce the necessary information to perform these analyses.
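A minimal sketch of this calculation (illustrative only; SMSq and RMSq would be read off the ANOVA table of the previous slide, and the values in the comment come from the maize example discussed later):

## Illustrative sketch: Satterthwaite df for sigma2_s_hat = (SMSq - RMSq)/k
satterthwaite_df <- function(SMSq, RMSq, t, n) {
  (SMSq - RMSq)^2 / (SMSq^2 / (t - 1) + RMSq^2 / (n - t))
}
# e.g. satterthwaite_df(SMSq = 1.3770, RMSq = 0.3212, t = 5, n = 20)   # about 2.3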


Estimation methods for the variance components

Anova Estimation – Disadvantages

It is not indicated for more complex models and data structures such as those generally found in plant and animal breeding or in longitudinal studies.

There is no unique way in which to form an ANOVA table when the data are not balanced.

The procedure can produce negative estimates of the variance components, which do not make sense.

If some of the expected mean squares of the random effects in the ANOVA table depend on fixed effects, the method cannot be applied. This problem can be avoided by placing all the fixed effects in the model first, followed by the random effects.


Estimation methods for the variance components

A number of methods have been proposed for estimating variance components in more complex scenarios, such as the expected mean squares approach of Henderson (1953) and the minimum norm quadratic unbiased estimation (Rao 1971a, 1971b), but maximum likelihood based methods are currently the most popular ones, especially the restricted (or residual) maximum likelihood (REML) approach, which attempts to correct for the well-known bias in the classical maximum likelihood (ML) estimation of variance components.

These two methods are briefly described next.


Maize trial

Example

5 progenies of a maize population were investigated

the trial was conducted using a completely randomized design with 4 replicates of each progeny

the response variable was the weight of corn-cob (kg/10 m²)

Progeny        Replicates
   1      5.95   6.21   5.40   5.18
   2      5.07   6.71   5.46   4.98
   3      4.82   5.11   4.68   4.52
   4      3.87   4.16   4.11   4.84
   5      5.53   5.82   4.29   4.70


Completely Randomized Design (CRD) with the same number of replicates – Expected mean squares for an ANOVA

Let Y be an n × 1 vector of random variables with E[Y] = µ and Var[Y] = V, where µ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real numbers. Then

E(YᵀAY) = tr(AV) + µᵀAµ

For a fixed CRD model
E(Y) = X_T τ  and  V = I_n σ²

For a random CRD model
E(Y) = 1_n µ  and  V = I_n σ² + rσ²_T M_T

where
X_T = I_t ⊗ 1_r,   M_T = X_T(X_Tᵀ X_T)⁻¹X_Tᵀ = r⁻¹ I_t ⊗ J_r


The expected mean squares under the fixed and random models are given in the following table

Source       df      SSq           MSq (s²)             E[MSq] (fixed)   E[MSq] (random)
Units        n − 1   Y′Q_U Y
Treatments   t − 1   Y′Q_T Y       Y′Q_T Y/(t − 1)      σ² + qT(Ψ)       σ² + rσ²_T
Residual     n − t   Y′Q_URes Y    Y′Q_URes Y/(n − t)   σ²               σ²

where qT(Ψ) = Ψ′Q_TΨ/(t − 1) = ∑_{i=1}^{t} r(τ_i − τ̄.)²/(t − 1)

M_U = (I_t ⊗ I_r) = I_tr,   M_G = n⁻¹J_t ⊗ J_r = n⁻¹J_n

Q_T = M_T − M_G,  Q_U = M_U − M_G,  Q_URes = M_U − M_T

σ² and σ²_T are called components of variance

ANOVA estimators:

σ̂² = RMSq,   σ̂²_T = (TMSq − RMSq)/r


Maize trial

ANOVA table using R

Source     df   SSq      MSq      F        Prob
Plots      19
Progeny     4   5.5078   1.3770   4.2872   0.01644 *
Residual   15   4.8177   0.3212

          MM       REML     ML
σ̂²_P      0.2639   0.2639   0.1951
σ̂²        0.3212   0.3212   0.3212


τ̂_i (BLUE)   τ̃_i (BLUP)   µ̂_i (BLUE)   µ̃_i (BLUP)
   0.6145       0.4712       5.6850       5.5417
   0.4845       0.3715       5.5550       5.4420
  -0.2880      -0.2208       4.7825       4.8497
  -0.8255      -0.6329       4.2450       4.4375
   0.0145       0.0111       5.0850       5.0816

Var(τ̂_i) (BLUE) = Var(µ̂_i) (BLUE) = 0.0803

Var(τ̃_i) (BLUP) = Var(µ̃_i) (BLUP) = 0.0616

[Figure: scatter plot of the adjusted means (vertical axis) against the unadjusted means (horizontal axis), both on the scale 4.0 to 6.0, with the line of equality; the points are shrunk towards the overall mean.]


SAS program

data prog;

input Progeny Yield @@;

cards;

1 5.95 3 4.68

1 6.21 3 4.52

1 5.40 4 3.87

1 5.18 4 4.16

2 5.07 4 4.11

2 6.71 4 4.84

2 5.46 5 5.53

2 4.98 5 5.82

3 4.82 5 4.29

3 5.11 5 4.70

;

* Moment Method;

proc glm data=prog;

class Progeny;

model Yield = Progeny;

run;

* Restricted Maximum Likelihood Method;

proc mixed data=prog;

class Progeny;

model Yield = / solution ddfm=sat;

random Progeny / solution ;

run;

* Maximum Likelihood Method;

proc mixed data=prog method=ML;

class Progeny;

model Yield = / solution ddfm=sat;

random Progeny / solution ;

run;


R program

CRDMaize.dat <- data.frame(Plots = factor(c(1:20)),
                           Progeny = factor(rep(c(1:5), each=4)),
                           Yield = c(5.95,6.21,5.40,5.18,5.07,6.71,5.46,4.98,4.82,5.11,
                                     4.68,4.52,3.87,4.16,4.11,4.84,5.53,5.82,4.29,4.70))
CRDMaize.dat
#attach(CRDMaize.dat)

summary(aov(Yield ~ Progeny + Error(Plots), CRDMaize.dat))
summary(aov(Yield ~ Progeny, CRDMaize.dat))
(length(levels(CRDMaize.dat$Progeny)))           # number of levels of Progenies
CRDMaize.lm <- lm(Yield ~ Progeny, CRDMaize.dat)
anova(CRDMaize.lm)
(df_res  <- df.residual(CRDMaize.lm))
(MSq_T   <- anova(CRDMaize.lm)$"Mean Sq"[1])
(MSq_Res <- anova(CRDMaize.lm)$"Mean Sq"[2])

## estimate of sigma2 and confidence interval for sigma2
(sigma2 <- anova(CRDMaize.lm)$"Mean Sq"[2])
(summary(CRDMaize.lm)$sigma)^2
(Var_sigma2 <- 2*MSq_Res^2/17)                   # = 2*MSq_Res^2/(df_res + 2)
(lim_inf <- df_res*sigma2/qchisq(0.975, df_res))
(lim_sup <- df_res*sigma2/qchisq(0.025, df_res))


R program

## estimate of sigma2_T and confidence interval for sigma2_T
(sigma2_T <- (MSq_T - MSq_Res)/4)                               # Moment Method
(Var_sigma2_T <- 2/4*(MSq_T^2/(20+4) + MSq_Res^2/(4*(df_res+2))))
(sd_sigma2_T <- sqrt(Var_sigma2_T))
(nu_T <- (MSq_T - MSq_Res)^2/(MSq_T^2/4 + MSq_Res^2/df_res))    # by Satterthwaite
(lim_inf <- nu_T*sigma2_T/qchisq(0.975, nu_T))
(lim_sup <- nu_T*sigma2_T/qchisq(0.025, nu_T))

# estimate of mean, variance, confidence interval
(ybar <- mean(CRDMaize.dat$Yield))
(Var_ybar <- MSq_T/20)
(sd_ybar <- sqrt(Var_ybar))
(t_Var_ybar <- ybar/sd_ybar)
ybar - qt(1-0.025, 4)*sd_ybar; ybar + qt(1-0.025, 4)*sd_ybar

## BLUE, BLUP (step by step)
summary(lm(Yield ~ Progeny - 1, CRDMaize.dat))   # shows the BLUE's for mu_i
sqrt(MSq_Res/4)                                  # standard error of the BLUE's for mu_i
mean_T <- tapply(CRDMaize.dat$Yield, CRDMaize.dat$Progeny, mean)
tau_BLUE <- mean_T - ybar
tau_BLUP <- tau_BLUE*sigma2_T/(sigma2_T + sigma2/4)
mean_T_BLUP <- ybar + tau_BLUP
(var_tau_BLUP <- sigma2_T*sigma2/(4*sigma2_T + sigma2))
sqrt(var_tau_BLUP)
data.frame(tau_BLUE, tau_BLUP, mean_T, mean_T_BLUP)
plot(mean_T, mean_T_BLUP, pch='*', xlim=c(4,6), ylim=c(4,6),
     xlab='Unadjusted means', ylab='Adjusted means')
abline(0, 1)


R program

require(nlme)
# Restricted Maximum Likelihood Method
CRDMaize.reml <- lme(Yield ~ 1, random = ~1|Progeny, CRDMaize.dat, method="REML")
summary(CRDMaize.reml)
VarCorr(CRDMaize.reml)
VarCorr(CRDMaize.reml)[1]
VarCorr(CRDMaize.reml)[2]
VarCorr(CRDMaize.reml)[3]
VarCorr(CRDMaize.reml)[4]
(summary(CRDMaize.reml)$sigma)^2
random.effects(CRDMaize.reml)    ## tau EBLUP
CRDMaize.reml$coef               ## mean EBLUP
coef(CRDMaize.reml)

# Maximum Likelihood Method
CRDMaize.ml <- update(CRDMaize.reml, method="ML")
#CRDMaize.ml <- lme(Yield ~ 1, random = ~1|Progeny, CRDMaize.dat, method="ML")
summary(CRDMaize.ml, corr = F)
VarCorr(CRDMaize.ml)
(summary(CRDMaize.ml)$sigma)^2
random.effects(CRDMaize.ml)
coef(CRDMaize.ml)


R program

## Restricted Maximum Likelihood Method, using library lme4
library(lme4)
CRDMaize.lmer <- lmer(Yield ~ 1 + (1|Progeny), CRDMaize.dat, REML=TRUE)
summary(CRDMaize.lmer)
summary(CRDMaize.lmer)@coefs              # @coefs and @REmat are slots of older lme4 versions
data.frame(summary(CRDMaize.lmer)@REmat)

# Restricted Maximum Likelihood Method using ASReml-R
require(asreml)
CRDMaize.asreml <- asreml(Yield ~ 1, random=~Progeny, data=CRDMaize.dat)
summary(CRDMaize.asreml)
summary(CRDMaize.asreml)$varcomp


Software

R functions

lm() – classical linear model

aov() – analysis of variance model

glm() – generalized linear model

gls() – generalized least squares model

gee() – generalized estimating equations (package gee)

lme() – linear mixed models (package nlme)

nlme() – non-linear mixed model (package nlme)

nls() – non-linear regression model (package nls)

lmer() – linear mixed models (package lme4)

ASReml – [email protected], http://www.vsni.co.uk/products/asreml
ASReml forum: www.vsni.co.uk/forum
Cookbook: http://uncronopio.org/ASReml


Differences between lme4 and nlme

(B. Venables, 2010, personal communication)

1 With nlme the fixed and random parts of the model are specified using two formulae; in lme4 they are specified in the one formula, with the random parts "added on" to the fixed parts.

2 With nlme you have no generalized linear mixed model fitter, though glmmPQL in the MASS library can be used for some GLMMs, and it uses the nlme library. lme4 has a GLMM fitter built in. It allows you to specify families in the glm sense, but not all glm families are supported, yet.

3 nlme offers non-linear mixed effect models; lme4 does not and never will.

4 The nlme package allows you to specify variance heterogeneity and correlation patterns; the only way to do this within lme4 is to use a glm family, which is often not what you want to do.

5 The nlme package has a gls function for "generalized least squares". This allows you to make use of the variance heterogeneity and correlation patterns feature even if the model does not contain any random effects. This is handy.


Differences between lme4 and nlme

(B. Venables, 2010, personal communication, cont.)

6 (Probably the most important difference.) nlme is hard to use with crossed random effects, but is very well developed for nested random effects. lme4 is the opposite: it handles crossed random effects well, and using it with nested random effects is still simple enough, but a bit more work than with nlme.

7 nlme uses an older algorithm which struggles for large data sets. lme4 uses a newer algorithm and can handle quite large data sets very quickly. (I think SAS PROC MIXED, though, will handle even bigger ones.)

8 A drawback of lme4 is that, at this stage, it is relatively under-developed. Some important things are missing.

9 ASREML is wonderful, but it only handles a relatively small set of models (though the most important set, of course)


(C. Brien, 2010, personal communication)

1 ASREML does a wide range of heterogeneous variances and correlations for nested and crossed random effects, although probably not the full range of heterogeneous, nested models that nlme does. ASREML also does GLMMs, similar to glmmPQL. It does not do the non-linear models.

2 ASREML is good for experiments and lme4/nlme are good for large surveys, because that is what they were developed for


Software

SAS procedures

PROC GLM – general linear model

PROC MIXED – linear mixed model

PROC GENMOD – generalized linear model

PROC GLIMMIX

PROC NLMIXED – non-linear mixed model


Basic SAS code

1/ proc mixed data=variety.eval;        call the procedure and declare the data set
2/ class block type dose;               define block, type, dose as factors
3/ model y = type|dose;                 define the fixed effects in the model
4/ random block block*dose;             declare the random effects
5/ ods select Tests3 CovParms; run;     output the type 3 tests and the covariance parameters


1/ proc mixed statement <options>;

DATA= SAS data set. Name of the SAS data set to be used by PROC MIXED. The default is the most recently created data set.

METHOD=
  REML (default method)
  ML

COVTEST requests asymptotic standard errors and Wald Z-tests for the variance-covariance structure parameter estimates.


3/ MODEL statement <option>;

describes the linear relation between Y and the fixed covariates

S or Solution: requests the fixed effects solution in the output;

DDFM= method to compute approximate degrees of freedom

  CONTAIN (default)
  RES
  KR
  SATTERTH

outpred=Names1: the output data set Names1 contains the predicted values Xβ̂ + Zû, standard errors, ...

outpredm=Names2: the output data set Names2 contains the predicted values Xβ̂, standard errors, ...

4/ Random statement

random block / Solution;

↪→ BLUPs and t-tests


CRD – Variance of the variance component estimators

From Session 1: Let MS denote a mean square with ν df. If νMS/E(MS) ∼ χ²_ν, the variance of MS is

Var(MS) = 2E²(MS)/ν.

Hence,

Var(MS) = E(MS²) − E²(MS) = E(MS²) − (ν/2)Var(MS).

Thus, (ν + 2)Var(MS)/2 = E(MS²), and an unbiased estimator of Var(MS) is given by

2MS²/(ν + 2)

Then

Var(σ̂²) = Var(MSqRes) = 2σ⁴/(n − t)

and

V̂ar(σ̂²) = V̂ar(MSqRes) = 2MSqRes²/(n − t + 2)

Maize example: σ̂² = 0.3212 and V̂ar(σ̂²) = 2(0.3212²)/(15 + 2) = 0.0121


We saw that

σ̂²_T = (MSqT − MSqRes)/r = ∑_i (ȳ_i − ȳ)²/(t − 1) − s²/r

Since MSqT and MSqRes are independent, the two terms of σ̂²_T are distributed independently. Furthermore,

(t − 1)MSqT/(σ² + rσ²_T) ∼ χ²_{t−1}   and   (n − t)MSqRes/σ² ∼ χ²_{n−t}

From these results

Var(σ̂²_T) = (1/r²)[Var(MSqT) + Var(MSqRes)] = (2/r²)[(rσ²_T + σ²)²/(t − 1) + σ⁴/(n − t)]

An unbiased estimator of this variance is given by

V̂ar(σ̂²_T) = (2/r²)[MSqT²/(t − 1 + 2) + MSqRes²/(n − t + 2)] = (2/r)[MSqT²/(n + r) + MSqRes²/(r(n − t + 2))]

Maize example: σ̂²_P = 0.2639 and

V̂ar(σ̂²_P) = (2/4²)[1.3770²/(4 + 2) + 0.3212²/(15 + 2)] = 0.0403


Confidence interval for σ2

From (n − t)MSqRes/σ² ∼ χ²_{n−t} we obtain

P(χ²_{n−t; α/2} < (n − t)MSqRes/σ² < χ²_{n−t; 1−α/2}) = 1 − α

or equivalently

P((n − t)MSqRes/χ²_{n−t; 1−α/2} < σ² < (n − t)MSqRes/χ²_{n−t; α/2}) = 1 − α

Then a 100(1 − α)% confidence interval for σ² is

[(n − t)MSqRes/χ²_{n−t; 1−α/2} ; (n − t)MSqRes/χ²_{n−t; α/2}]

Maize example: a 95% confidence interval for σ² is

[15 × 0.3212/27.4884 ; 15 × 0.3212/6.2621] = [0.1753; 0.7693]


Confidence interval for σ2T

To get the confidence interval for σ²_T we first need to determine the number of degrees of freedom associated with σ̂²_T, by Satterthwaite's method.

As σ̂²_T = (MSqT − MSqRes)/r, from Session 1,

ν_T = (∑_i a_i MS_i)² / ∑_i (a_i² MS_i²/ν_i) = (MSqT − MSqRes)² / [MSqT²/(t − 1) + MSqRes²/(n − t)]

Then a 100(1 − α)% confidence interval for σ²_T is

[ν_T σ̂²_T / χ²_{ν_T; 1−α/2} ; ν_T σ̂²_T / χ²_{ν_T; α/2}]

Maize example:

ν_T = (1.37695 − 0.32118)² / (1.37695²/4 + 0.32118²/15) = 2.32

and

[2.32 × 0.2639/8.0308 ; 2.32 × 0.2639/0.0903] = [0.07618; 6.7714]


Inference regarding the mean

It is easy to show that the sample mean Ȳ = ∑_{i,j} Y_ij / n is an unbiased estimator for µ and has variance

Var(Ȳ) = (1/t)(σ²_T + σ²/r) = (1/n)(rσ²_T + σ²)

An unbiased estimator of this variance is

V̂ar(Ȳ) = MSqT/n

The hypothesis H0: µ = µ0 can be tested using

t_{t−1} = (ȳ − µ0)/√V̂ar(Ȳ)

which follows a Student's t-distribution with (t − 1) d.f. The interval for µ with 100(1 − α)% confidence has limits

CI(µ): [ȳ − t_{t−1; 1−α/2} √V̂ar(Ȳ) ; ȳ + t_{t−1; 1−α/2} √V̂ar(Ȳ)]

Maize example: ȳ = 5.07, V̂ar(Ȳ) = 1.3770/20 = 0.0688, t = (5.0705 − 0)/√0.0688 = 19.32 and the CI(µ): [4.34, 5.80].


Expected mean squares for an ANOVA – CRD with subsampling

The model for a CRD with subsampling (k subsamples per plot) with treatment random is

Y_ijk = µ + τ_i + ε_ij + ε_ijk,

where i = 1, . . . , t, j = 1, . . . , r, k = 1, . . . , k, µ is a constant, and τ_i, ε_ij and ε_ijk are random. The ANOVA table is

Source                          df           SSq           MSq                       F
Plots                           rt − 1       YᵀQ_P Y
Treatments                      t − 1        YᵀQ_T Y       YᵀQ_T Y/(t − 1)           MSqT/MSqRes
Residual                        t(r − 1)     YᵀQ_URes Y    YᵀQ_URes Y/[t(r − 1)]     MSqRes/MSqW
Between samples within plots    rt(k − 1)    YᵀQ_UW Y      YᵀQ_UW Y/[rt(k − 1)]

M_U = I_n, X_G = 1_n, M_G = X_G(X_Gᵀ X_G)⁻¹X_Gᵀ = n⁻¹J_n

Q_T = M_T − M_G,  Q_U = M_U − M_G,  Q_URes = M_U − M_T


Then,

SSqT = YᵀQ_T Y = (1/(rk)) ∑_{i=1}^{t} T_i² − C,   C = (∑_{i,j,k} Y_ijk)²/n

SSqPlots = (1/k) ∑_{i,j} Y_{ij.}² − C,   SSqRes = SSqPlots − SSqT

SSqWithin = ∑_{i,j,k} Y_ijk² − C − SSqPlots

Assuming that
Y_ijk = µ + τ_i + ε_ij + ε_ijk
where τ_i ∼ N(0, σ²_T), ε_ij ∼ N(0, σ²_P) and ε_ijk ∼ N(0, σ²_PS). Then

E(τ_i) = 0, Var(τ_i) = E(τ_i²) = σ²_T,

E(ε_ij) = 0, Var(ε_ij) = E(ε_ij²) = σ²_P,

E(ε_ijk) = 0, Var(ε_ijk) = E(ε_ijk²) = σ²_PS


i) E(SSqUnits)

E(SSqUnits) = ∑_{i,j,k} E(Y_ijk²) − E(C)

E(Y_ijk²) = E(µ²) + E(τ_i²) + E(ε_ij²) + E(ε_ijk²) + E(dp) = µ² + σ²_T + σ²_P + σ²_PS

(here dp denotes the cross-product terms of the expansion, whose expectations are zero)

∑_{i,j,k} E(Y_ijk²) = nµ² + nσ²_T + nσ²_P + nσ²_PS

C = (1/(trk)) (∑_{i,j,k} Y_ijk)² = (1/(trk)) [∑_{i,j,k} (µ + τ_i + ε_ij + ε_ijk)]²

  = (1/(trk)) [trkµ + rk∑_i τ_i + k∑_{i,j} ε_ij + ∑_{i,j,k} ε_ijk]²

  = (1/(trk)) [(trkµ)² + (rk)²(∑_i τ_i)² + k²(∑_{i,j} ε_ij)² + (∑_{i,j,k} ε_ijk)² + dp]

E(C) = trkµ² + (rk/t) E[(∑_i τ_i)²] + (k/(tr)) E[(∑_{i,j} ε_ij)²] + (1/(rtk)) E[(∑_{i,j,k} ε_ijk)²] + (1/(trk)) E(dp)


But

E[(∑_i τ_i)²] = ∑_i E(τ_i²) + ∑ E(dp) = tσ²_T

E[(∑_{i,j} ε_ij)²] = ∑_{i,j} E(ε_ij²) + ∑ E(dp) = trσ²_P

E[(∑_{i,j,k} ε_ijk)²] = ∑_{i,j,k} E(ε_ijk²) + ∑_{i,j,k} E(dp) = trkσ²_PS

E(dp) = 0

Then

E(C) = trkµ² + (rk/t) tσ²_T + (k/(tr)) trσ²_P + (1/(trk)) trkσ²_PS
     = trkµ² + rkσ²_T + kσ²_P + σ²_PS

and

E(SSqUnits) = (n − rk)σ²_T + (n − k)σ²_P + (n − 1)σ²_PS
            = rk(t − 1)σ²_T + k(tr − 1)σ²_P + (n − 1)σ²_PS


ii) E(SSqPlots)

E(SSqPlots) = (1/k) ∑_{i,j} E(P_ij²) − E(C)

P_ij = ∑_k Y_ijk = ∑_k (µ + τ_i + ε_ij + ε_ijk) = kµ + kτ_i + kε_ij + ∑_k ε_ijk

P_ij² = [∑_k (µ + τ_i + ε_ij + ε_ijk)]² = k²µ² + k²τ_i² + k²ε_ij² + (∑_k ε_ijk)² + dp

E(P_ij²) = k²µ² + k²E(τ_i²) + k²E(ε_ij²) + E[(∑_k ε_ijk)²] + E(dp)
         = k²µ² + k²σ²_T + k²σ²_P + kσ²_PS

E(SSqPlots) = (1/k) ∑_{i,j} (k²µ² + k²σ²_T + k²σ²_P + kσ²_PS) − E(C)
            = (1/k) (trk²µ² + trk²σ²_T + trk²σ²_P + trkσ²_PS) − E(C)
            = trkµ² + trkσ²_T + trkσ²_P + trσ²_PS − E(C)
            = rk(t − 1)σ²_T + k(tr − 1)σ²_P + (tr − 1)σ²_PS


iii) E(SSqT) and E(MSqT)

SSqT = (1/(rk)) ∑_{i=1}^{t} T_i² − C

T_i² = (rkµ + rkτ_i + k∑_j ε_ij + ∑_{j,k} ε_ijk)²

     = (rkµ)² + (rkτ_i)² + (k∑_j ε_ij)² + (∑_{j,k} ε_ijk)²
       + 2(rkµ)(rkτ_i) + 2(rkµ)(k∑_j ε_ij) + 2(rkµ)(∑_{j,k} ε_ijk)
       + 2(rkτ_i)(k∑_j ε_ij) + 2(rkτ_i)(∑_{j,k} ε_ijk) + 2(k∑_{j=1}^{r} ε_ij)(∑_{j,k} ε_ijk)

E(T_i²) = r²k²µ² + r²k²E(τ_i²) + k²E[(∑_j ε_ij)²] + E[(∑_{j,k} ε_ijk)²]
        = r²k²µ² + r²k²σ²_T + rk²σ²_P + rkσ²_PS


(1/(rk)) ∑_{i=1}^{t} E(T_i²) = trkµ² + trkσ²_T + tkσ²_P + tσ²_PS

E(SSqT) = (1/(rk)) ∑_{i=1}^{t} E(T_i²) − E(C) = rk(t − 1)σ²_T + k(t − 1)σ²_P + (t − 1)σ²_PS

and

E(MSqT) = E(SSqT)/(t − 1) = rkσ²_T + kσ²_P + σ²_PS

iv) E(SSqRes) and E(MSqRes)

SSqRes = SSqPlots − SSqT

E(SSqRes) = kt(r − 1)σ²_P + t(r − 1)σ²_PS   and   E(MSqRes) = E(SSqRes)/[t(r − 1)] = kσ²_P + σ²_PS

v) E(SSqWithin) and E(MSqWithin)

SSqWithin = SSqUnits − SSqPlots

E(SSqWithin) = tr(k − 1)σ²_PS   and   E(MSqWithin) = E(SSqWithin)/[tr(k − 1)] = σ²_PS


Source       df         SSq          MSq                     E(MSq), T fixed        E(MSq), T random
Plots        rt − 1     Y′QU Y
  Treat.     t − 1      Y′QT Y       Y′QT Y/(t − 1)          σ² + kσ²_P + qT(Ψ)     σ² + kσ²_P + rkσ²_T
  Res.       t(r − 1)   Y′QURes Y    Y′QURes Y/[t(r − 1)]    σ² + kσ²_P             σ² + kσ²_P
S[Plots]     rt(k − 1)  Y′QUW Y      Y′QUW Y/[rt(k − 1)]     σ²                     σ²

qT(Ψ) = Ψ′QTΨ/(t − 1) = Σ_{i=1}^t rk(τi − τ̄.)²/(t − 1)


Wood shearing strength

Example

The effects of six treatments (a 2 × 3 set of factorial treatments formed from two types of resin and three wood blade densities) on the shearing strength are to be investigated. The two types of resin were APM (resin of high molecular weight) and BPM (resin of low molecular weight), and the three wood blade densities were VH (Very Hard), H (Hard) and S (Soft). The trial was conducted using a completely randomized design with three wood panels for each treatment, and the shearing strength (kgf/cm²) of five test bodies from each panel was measured.

interest, of course, in each particular treatment used

no interest in each particular panel, which depends strongly on the circumstances

no interest in each particular test body, which depends strongly on the circumstances

interest in estimating the variance of the panel effect as a source of random variation in the data

the five test bodies from the same panel share something, which presumably violates the assumption of independence


Wood    Test          APM                       BPM
Blade   Body    VH       H        S       VH      H        S
1       1     10.620   6.251    9.982   18.23   9.553   11.390
1       2     16.840   7.825   12.510   20.24  10.140   12.630
1       3     11.120   8.606   13.650   18.92  10.900   10.650
1       4      7.407   9.421   10.020   22.67   9.762    9.652
1       5     14.400   6.405    7.154   24.92  10.250    6.306
2       1     21.890  21.580   12.620   21.16  17.630   14.370
2       2     20.770  20.060   12.990   10.82  20.700   13.060
2       3     18.670  15.830   12.430   20.36  16.080   12.130
2       4     16.160  16.120   14.250   20.33  14.770   14.470
2       5     18.780  16.200   11.820   14.16  19.270   13.750
3       1     18.710  13.550    7.385   14.40  11.400   12.390
3       2     23.460  17.070    6.075   15.85  15.860   12.370
3       3     16.650  14.210   12.890   18.37  11.270   13.760
3       4     20.820  17.920   12.220   13.95  13.370   11.380
3       5     16.240  18.670    7.781   16.04  13.920   13.210


Source                               df    SSq     MSq      F
Panels                               17
  Treatments                          5   556.5   111.30   1.77
  Residual                           12   754.0    62.83
Between test bodies within panels    72   428.5     5.95

ANOVA estimators:

σ²_PS = MSqWithin = 5.95,   σ²_P = (MSqRes − MSqWithin)/k = (62.83 − 5.95)/5 = 11.38

          MM      REML     ML
σ²_P     11.38   11.38    7.19
σ²_PS     5.95    5.95    5.95


R program

CRDk_wood.dat <- read.csv2("Wood_CRD.csv", h=T)
CRDk_wood.dat$Panels <- factor(rep(1:18, each=5))
CRDk_wood.dat$Treat <- gl(6, 15)
CRDk_wood.dat$Panel <- factor(CRDk_wood.dat$Panel)
CRDk_wood.dat$TB <- factor(CRDk_wood.dat$TB)

CRDk_wood.aov <- aov(Strength ~ Treat + Error(Panels/TB), CRDk_wood.dat)  # fit stored for use below
summary(CRDk_wood.aov)
(summary(CRDk_wood.aov)[[1]])[[1]]$"Mean Sq"[2]
(summary(CRDk_wood.aov)[[2]])[[1]]$"Mean Sq"[1]
CRDk_wood.lm <- lm(Strength ~ Treat + Panels/TB, CRDk_wood.dat)
(MSq_Res <- anova(CRDk_wood.lm)$"Mean Sq"[2])
(MSq_Within <- anova(CRDk_wood.lm)$"Mean Sq"[3])

## components of variance - ANOVA method
(sigma2_PS <- MSq_Within)
(sigma2_P <- (MSq_Res - MSq_Within)/5)   # k = 5 test bodies per panel

## REML using library nlme
library(nlme)
CRDk_wood.lme <- lme(Strength ~ Treat, random = ~ 1|Panels, data=CRDk_wood.dat, method="REML")
VarCorr(CRDk_wood.lme)
summary(CRDk_wood.lme)
## ML
summary(lme(Strength ~ Treat, random = ~ 1|Panels, data=CRDk_wood.dat, method="ML"))
2.680866^2; 2.439458^2
VarCorr(lme(Strength ~ Treat, random = ~ 1|Panels, data=CRDk_wood.dat, method="ML"))

## REML using library lme4 (old lme4 S4 accessors)
library(lme4)
CRDk_wood.lmer <- lmer(Strength ~ Treat + (1|Panels), data=CRDk_wood.dat, REML=TRUE)
summary(CRDk_wood.lmer)
summary(CRDk_wood.lmer)@coefs
data.frame(summary(CRDk_wood.lmer)@REmat)
## ML
lmer(Strength ~ Treat + (1|Panels), data=CRDk_wood.dat, REML=FALSE)


Randomized Complete Block Design (RCBD)

Consider a randomized complete block design,

Yjk = µ + βj + τk + εjk ,   j = 1, …, r,   k = 1, …, t

1. Fixed model with βj and τk as fixed effects and εjk random

βj as fixed ⇒ E(βj) = βj, E(β²j) = β²j and Var(βj) = 0
τk as fixed ⇒ E(τk) = τk, E(τ²k) = τ²k and Var(τk) = 0
εjk ∼ N(0, σ²) ⇒ E(εjk) = 0 and Var(εjk) = E(ε²jk) = σ²
εjk and εj′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then
Var(Yjk) = Var(µ + βj + τk + εjk) = Var(εjk) = σ²
Cov(Yjk, Yjk′) = Cov(µ + βj + τk + εjk, µ + βj + τk′ + εjk′) = Cov(εjk, εjk′) = 0 (observations from the same block and different treatments)
Cov(Yjk, Yj′k) = Cov(µ + βj + τk + εjk, µ + βj′ + τk + εj′k) = Cov(εjk, εj′k) = 0 (observations from different blocks and the same treatment)
Cov(Yjk, Yj′k′) = Cov(µ + βj + τk + εjk, µ + βj′ + τk′ + εj′k′) = Cov(εjk, εj′k′) = 0 (observations from different blocks and different treatments)


2. Mixed model with βj as random effect, τk as fixed effect and εjk as random

βj ∼ N(0, σ²_B) ⇒ E(βj) = 0 and Var(βj) = E(β²j) = σ²_B
τk as fixed ⇒ E(τk) = τk, E(τ²k) = τ²k and Var(τk) = 0
εjk ∼ N(0, σ²) ⇒ E(εjk) = 0 and Var(εjk) = E(ε²jk) = σ²
βj and εjk, βj and βj′ (j ≠ j′), εjk and εj′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then

Var(Yjk) = Var(µ + βj + τk + εjk) = Var(βj + εjk) = σ² + σ²_B
Cov(Yjk, Yjk′) = Cov(βj + εjk, βj + εjk′) = σ²_B (observations from the same block and different treatments)
Cov(Yjk, Yj′k) = Cov(βj + εjk, βj′ + εj′k) = 0 (observations from different blocks and the same treatment)
Cov(Yjk, Yj′k′) = Cov(βj + εjk, βj′ + εj′k′) = 0 (observations from different blocks and different treatments)


3. Mixed model with βj as fixed effect, τk as random effect and εjk as random

βj as fixed ⇒ E(βj) = βj, E(β²j) = β²j and Var(βj) = 0
τk ∼ N(0, σ²_T) ⇒ E(τk) = 0 and Var(τk) = E(τ²k) = σ²_T
εjk ∼ N(0, σ²) ⇒ E(εjk) = 0 and Var(εjk) = E(ε²jk) = σ²
τk and εjk, τk and τk′ (k ≠ k′), εjk and εj′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then

Var(Yjk) = Var(βj + τk + εjk) = σ² + σ²_T
Cov(Yjk, Yjk′) = Cov(τk + εjk, τk′ + εjk′) = 0 (observations from the same block and different treatments)
Cov(Yjk, Yj′k) = Cov(τk + εjk, τk + εj′k) = σ²_T (observations from different blocks and the same treatment)
Cov(Yjk, Yj′k′) = Cov(τk + εjk, τk′ + εj′k′) = 0 (observations from different blocks and different treatments)


4. Random model with βj as random effect, τk as random effect and εjk as random

βj ∼ N(0, σ²_B) ⇒ E(βj) = 0 and Var(βj) = E(β²j) = σ²_B
τk ∼ N(0, σ²_T) ⇒ E(τk) = 0 and Var(τk) = E(τ²k) = σ²_T
εjk ∼ N(0, σ²) ⇒ E(εjk) = 0 and Var(εjk) = E(ε²jk) = σ²
τk and εjk, τk and τk′ (k ≠ k′), βj and βj′ (j ≠ j′), εjk and εj′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then

Var(Yjk) = Var(βj + τk + εjk) = σ² + σ²_B + σ²_T
Cov(Yjk, Yjk′) = Cov(βj + τk + εjk, βj + τk′ + εjk′) = σ²_B (observations from the same block and different treatments)
Cov(Yjk, Yj′k) = Cov(βj + τk + εjk, βj′ + τk + εj′k) = σ²_T (observations from different blocks and the same treatment)
Cov(Yjk, Yj′k′) = Cov(βj + τk + εjk, βj′ + τk′ + εj′k′) = 0 (observations from different blocks and different treatments)


Variance-covariance matrix - Example

Suppose a RCBD with r = 2 and t = 3, that is,

y = [y11 y12 y13 y21 y22 y23]ᵀ

1. Fixed model with βj and τk as fixed effects and εjk random. In this case, in matrix notation:

Y = XGµ + XBβ + XTτ + ε

E(Y) = XGµ + XBβ + XTτ and Var(Y) = Σ = σ² I6

XG = 1_6,   XB = [1_3 0_3; 0_3 1_3],   XT = [I_3; I_3]

Var(Y) = σ² I6, the 6 × 6 diagonal matrix with σ² in every diagonal position and zeros elsewhere,


2. Mixed model with βj as random effect, τk as fixed effect and εjk as random. In this case, in matrix notation:

Y = XGµ + XTτ + ZBβ + ε

E(Y) = XGµ + XTτ and Var(Y) = ZGZ′ + Σ = σ²_B ZB ZB′ + σ² I6

XG = 1_6,   XT = [I_3; I_3],   ZB = [1_3 0_3; 0_3 1_3],   G = σ²_B I2,   Σ = σ² I6

Var(Y) =
[ σ²+σ²_B   σ²_B      σ²_B      0         0         0
  σ²_B      σ²+σ²_B   σ²_B      0         0         0
  σ²_B      σ²_B      σ²+σ²_B   0         0         0
  0         0         0         σ²+σ²_B   σ²_B      σ²_B
  0         0         0         σ²_B      σ²+σ²_B   σ²_B
  0         0         0         σ²_B      σ²_B      σ²+σ²_B ]


3. Mixed model with βj as fixed effect, τk as random effect and εjk as random. In this case, in matrix notation:

Y = XGµ + XBβ + ZTτ + ε

E(Y) = XGµ + XBβ and Var(Y) = ZGZ′ + Σ = σ²_T ZT ZT′ + σ² I6

XG = 1_6,   XB = [1_3 0_3; 0_3 1_3],   ZT = [I_3; I_3],   G = σ²_T I3,   Σ = σ² I6

Var(Y) =
[ σ²+σ²_T   0         0         σ²_T      0         0
  0         σ²+σ²_T   0         0         σ²_T      0
  0         0         σ²+σ²_T   0         0         σ²_T
  σ²_T      0         0         σ²+σ²_T   0         0
  0         σ²_T      0         0         σ²+σ²_T   0
  0         0         σ²_T      0         0         σ²+σ²_T ]


4. Random model with βj as random effect, τk as random effect and εjk as random. In this case, in matrix notation:

Y = XGµ + ZBβ + ZTτ + ε

E(Y) = XGµ and Var(Y) = ZGZ′ + Σ = σ²_B ZB ZB′ + σ²_T ZT ZT′ + σ² I6

XG = 1_6,   ZB = [1_3 0_3; 0_3 1_3],   ZT = [I_3; I_3],   GB = σ²_B I2,   GT = σ²_T I3,   Σ = σ² I6

Writing σ²_0 = σ² + σ²_B + σ²_T for the diagonal entries,

Var(Y) =
[ σ²_0   σ²_B   σ²_B   σ²_T   0      0
  σ²_B   σ²_0   σ²_B   0      σ²_T   0
  σ²_B   σ²_B   σ²_0   0      0      σ²_T
  σ²_T   0      0      σ²_0   σ²_B   σ²_B
  0      σ²_T   0      σ²_B   σ²_0   σ²_B
  0      0      σ²_T   σ²_B   σ²_B   σ²_0 ]
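As an illustration (a sketch, not part of the original notes), the matrix above can be assembled in R from ZB ZB′ = I_r ⊗ J_t and ZT ZT′ = J_r ⊗ I_t; the numerical values of the variance components below are arbitrary placeholders.

r <- 2; t <- 3
sigma2 <- 1; sigma2_B <- 0.5; sigma2_T <- 2   # illustrative values only
ZB <- kronecker(diag(r), rep(1, t))           # Z_B = I_r (x) 1_t
ZT <- kronecker(rep(1, r), diag(t))           # Z_T = 1_r (x) I_t
V  <- sigma2_B * ZB %*% t(ZB) + sigma2_T * ZT %*% t(ZT) + sigma2 * diag(r * t)
V                                             # reproduces the pattern displayed above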


Source          df              SSq         E(MSq), Fixed   E(MSq), B random   E(MSq), T random   E(MSq), Random
Blocks          r − 1           Y′QB Y      σ² + qB(Ψ)      σ² + tσ²_B         σ² + qB(Ψ)         σ² + tσ²_B
Units[Blocks]   r(t − 1)        Y′QU Y
  Treatments    t − 1           Y′QT Y      σ² + qT(Ψ)      σ² + qT(Ψ)         σ² + rσ²_T         σ² + rσ²_T
  Residual      (r − 1)(t − 1)  Y′QURes Y   σ²              σ²                 σ²                 σ²

where
XG = 1_n,   XB = Ir ⊗ 1_t,   XT = 1_r ⊗ It
MU = Ir ⊗ It = I_rt,   MG = n⁻¹ Jr ⊗ Jt = n⁻¹ Jn
MB = XB(XB′XB)⁻¹XB′ = t⁻¹ Ir ⊗ Jt
MT = XT(XT′XT)⁻¹XT′ = r⁻¹ Jr ⊗ It
QB = MB − MG,   QT = MT − MG,   QU = MU − MG
QURes = MU − MB − MT + MG

qB(Ψ) = Ψ′QBΨ/(r − 1) = Σ_{j=1}^r t(βj − β̄.)²/(r − 1)

qT(Ψ) = Ψ′QTΨ/(t − 1) = Σ_{k=1}^t r(τk − τ̄.)²/(t − 1)


Expected mean squares for an ANOVA – RCBD

Assuming that

Yjk = µ+ βj + τk + εjk

where τk is a fixed effect, βj ∼ N(0, σ²_B) and εjk ∼ N(0, σ²). Then

E(τk) = τk,   E(τ²k) = τ²k,   Var(τk) = 0
E(βj) = 0,   Var(βj) = E(β²j) = σ²_B
E(εjk) = 0,   Var(εjk) = E(ε²jk) = σ²

i) E(SSqB) and E(MSqB)

SSqB = (1/t) Σ_j B²j − C,   C = (1/rt) (Σ_{j,k} Yjk)²

Bj = Σ_k Yjk = Σ_k (µ + βj + τk + εjk) = tµ + tβj + Σ_k τk + Σ_k εjk


B²j = [Σ_k (µ + βj + τk + εjk)]² = (tµ + tβj + Σ_k τk + Σ_k εjk)²
    = (tµ)² + (tβj)² + (Σ_k τk)² + (Σ_k εjk)² + 2t²µβj + 2tµ Σ_k τk + 2tµ Σ_k εjk + 2tβj Σ_k τk + 2tβj Σ_k εjk + 2 Σ_k τk Σ_k εjk

E(B²j) = t²µ² + t² E(β²j) + (Σ_k τk)² + E[(Σ_k εjk)²] + 2tµ Σ_k τk
       = t²µ² + t²σ²_B + (Σ_k τk)² + tσ² + 2tµ Σ_k τk

(1/t) Σ_{j=1}^r E(B²j) = (1/t)[rt²µ² + rt²σ²_B + r(Σ_k τk)² + rtσ² + 2rtµ Σ_k τk]
                       = rtµ² + rtσ²_B + (r/t)(Σ_k τk)² + rσ² + 2rµ Σ_k τk


Σ_{j,k} Yjk = Σ_{j,k} (µ + βj + τk + εjk) = rtµ + t Σ_j βj + r Σ_k τk + Σ_{j,k} εjk

(Σ_{j,k} Yjk)² = r²t²µ² + t²(Σ_j βj)² + r²(Σ_k τk)² + (Σ_{j,k} εjk)²
               + 2rt²µ Σ_j βj + 2r²tµ Σ_k τk + 2rtµ Σ_{j,k} εjk + 2tr Σ_j βj Σ_k τk + 2t Σ_j βj Σ_{j,k} εjk + 2r Σ_k τk Σ_{j,k} εjk

(1/rt) E[(Σ_{j,k} Yjk)²] = (1/rt)[r²t²µ² + t²rσ²_B + r²(Σ_k τk)² + rtσ² + 2r²tµ Σ_k τk]

E(C) = rtµ² + tσ²_B + (r/t)(Σ_k τk)² + σ² + 2rµ Σ_k τk

E(SSqB) = t(r − 1)σ²_B + (r − 1)σ²   and   E(MSqB) = E(SSqB)/(r − 1) = tσ²_B + σ²
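The algebra above can be checked numerically. The following simulation sketch (not in the original notes) generates data from the mixed RCBD model with assumed values of σ²_B and σ², and compares the average MSqB over many replicates with tσ²_B + σ².

set.seed(123)
r <- 5; t <- 4                       # assumed numbers of blocks and treatments
sigma2_B <- 10; sigma2 <- 4          # assumed variance components
tau <- c(0, 1, 2, 3)                 # arbitrary fixed treatment effects
msqb <- replicate(5000, {
  beta <- rnorm(r, 0, sqrt(sigma2_B))
  y    <- outer(beta, tau, "+") + matrix(rnorm(r * t, 0, sqrt(sigma2)), r, t)
  B    <- rowSums(y)                 # block totals
  C    <- sum(y)^2 / (r * t)         # correction term
  (sum(B^2) / t - C) / (r - 1)       # MSqB
})
mean(msqb)                           # close to t*sigma2_B + sigma2 = 44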


ii) E(SSqT) and E(MSqT)

SSqT = (1/r) Σ_{k=1}^t T²k − C

Tk = Σ_j (µ + βj + τk + εjk) = rµ + Σ_j βj + rτk + Σ_j εjk

T²k = (rµ + Σ_j βj + rτk + Σ_j εjk)²
    = r²µ² + (Σ_j βj)² + r²τ²k + (Σ_j εjk)² + 2rµ Σ_j βj + 2r²µτk + 2rµ Σ_j εjk + 2rτk Σ_j βj + 2 Σ_j βj Σ_j εjk + 2rτk Σ_j εjk

E(T²k) = r²µ² + rσ²_B + r²τ²k + rσ² + 2r²µτk

(1/r) Σ_{k=1}^t E(T²k) = trµ² + tσ²_B + r Σ_{k=1}^t τ²k + tσ² + 2rµ Σ_{k=1}^t τk


E(SSqT) = (1/r) Σ_{k=1}^t E(T²k) − E(C) = (t − 1)σ² + r Σ_{k=1}^t τ²k − (r/t)(Σ_k τk)²

and

E(MSqT) = E(SSqT)/(t − 1) = σ² + [r/(t − 1)] [Σ_{k=1}^t τ²k − (1/t)(Σ_k τk)²] = σ² + qT(Ψ)

where qT(Ψ) = Ψ′QTΨ/(t − 1) = Σ_{k=1}^t r(τk − τ̄.)²/(t − 1)


iv) E(SSqRes) and E(MSqRes)

E(SSqRes) = E(SSqTotal) − E(SSqT) − E(SSqB) = Σ_{j,k} E(Y²jk) − (1/r) Σ_{k=1}^t E(T²k) − E(SSqB)

Y²jk = (µ + βj + τk + εjk)²
     = µ² + β²j + τ²k + ε²jk + 2µβj + 2µτk + 2µεjk + 2βjτk + 2βjεjk + 2τkεjk

E(Y²jk) = µ² + σ²_B + τ²k + σ² + 2µτk

Σ_{j,k} E(Y²jk) = rtµ² + trσ²_B + r Σ_k τ²k + rtσ² + 2rµ Σ_k τk

E(SSqRes) = (t − 1)(r − 1)σ²

and

E(MSqRes) = E(SSqRes)/[(t − 1)(r − 1)] = σ²


Penicillin yield (Brien, 2009)

Example

The effects of four treatments on the yield of penicillin are to be investigated. It is known that corn steep liquor, an important raw material in producing penicillin, is highly variable from one blending to another. To ensure that the results of the experiment apply to more than one blend, five blends (blocks) are used in the experiment. The trial was conducted using the same blend in four flasks and randomizing the four treatments to these four flasks.

interest, of course, in each particular treatment used
no interest in each particular blend, which depends strongly on the circumstances
the blend effect can be viewed as a sample from a random blend effect (levels are chosen at random from an infinite set of blend levels)
interest in estimating the variance of the blend effect as a source of random variation in the data
the four flasks with the same blend share something, which presumably violates the assumption of independence


(Layout: each of the five blends, Blend 1, …, Blend 5, contains four flasks, Flask 1 to Flask 4, to which the four treatments are randomized.)

           Treatment
Blend    A    B    C    D
1       89   88   97   94
2       84   77   92   79
3       81   87   87   85
4       87   92   89   84
5       79   81   80   88


Penicillin yield

ANOVA table using R

Source df SSq MSq F Prob

Blend 4 264.0 66.0 1.97 0.15

Plots[Blocks] 15

Treat 3 70.0 23.3 1.24 0.34

Residual 12 226.0 18.8

ANOVA estimators:

σ² = MSqRes = 18.8,   σ²_B = (MSqB − MSqRes)/t = (66.0 − 18.8)/4 = 11.8

         MM     REML     ML
σ²_B    11.8    11.8    9.4
σ²      18.8    18.8   15.1


SAS program

data pen;

input Blend Treat$ Yield @@;

cards;

1 A 89 3 C 87

1 B 88 3 D 85

1 C 97 4 A 87

1 D 94 4 B 92

2 A 84 4 C 89

2 B 77 4 D 84

2 C 92 5 A 79

2 D 79 5 B 81

3 A 81 5 C 80

3 B 87 5 D 88

;

* Moment Method;

proc glm data=pen;

class Blend Treat;

model Yield = Blend Treat;

run;

* Restricted Maximum Likelihood Method;

proc mixed data=pen;

class Blend Treat;

model Yield = Treat / solution ddfm=sat;

random Blend / solution ;

run;

* Maximum Likelihood Method;

proc mixed data=pen method=ML;

class Blend Treat;

model Yield = Treat / solution ddfm=sat;

random Blend / solution ;

run;


R program

# set up data.frame with factors Blend, Flask and Treat and response variable Yield
RCBDPen.dat <- data.frame(Blend = factor(rep(c(1,2,3,4,5), times=c(4,4,4,4,4))),
                          Flask = factor(rep(c(1,2,3,4), times=5)),
                          Treat = factor(rep(c("A","B","C","D"), times=5)))
RCBDPen.dat$Yield <- c(89,88,97,94,84,77,92,79,81,87,87,85,87,92,89,84,79,81,80,88)
RCBDPen.dat
# attach(RCBDPen.dat)

# Moment Method
RCBDPen.lm <- lm(Yield ~ Blend + Treat, RCBDPen.dat)
anova(RCBDPen.lm)
(66.000 - 18.833)/4
(anova(RCBDPen.lm)$"Mean Sq"[1] - anova(RCBDPen.lm)$"Mean Sq"[3])/nlevels(RCBDPen.dat$Treat)  # divide by t = number of treatments
anova(lm(Yield ~ 1, RCBDPen.dat))  # to get the Total SS

require(nlme)
# Restricted Maximum Likelihood Method
RCBD.reml <- lme(Yield ~ Treat, random = ~1|Blend, RCBDPen.dat, method="REML")
summary(RCBD.reml, corr = F)
VarCorr(RCBD.reml)
# Maximum Likelihood Method
RCBD.ml <- lme(Yield ~ Treat, random = ~1|Blend, RCBDPen.dat, method="ML")
summary(RCBD.ml, corr = F)
VarCorr(RCBD.ml)

# Restricted Maximum Likelihood Method using ASReml-R
require(asreml)
RCBD.asreml <- asreml(Yield ~ Treat, random = ~Blend, data = RCBDPen.dat)
summary(RCBD.asreml)
summary(RCBD.asreml)$varcomp


Decomposition table for the penicillin RCBD (sources, numbers of levels and degrees of freedom):
unrandomized tier: Mean (1 level, 1 df); Blend (5 levels, 4 df, B); Blend∧Flask (20 levels, 15 df, F[B])
randomized tier: Mean (1 level, 1 df); Treat (4 levels, 3 df, T)


The same decomposition in terms of projection matrices:
unrandomized tier: Mean (MG); Blend (MB, with MB − MG for B); Blend∧Flask (MBF, with MBF − MB for F[B])
randomized tier: Mean (MG); Treat (MT, with MT − MG for T)


The same decomposition in terms of the contributions to the expected mean squares:
unrandomized tier: Mean (1); Blend (4σ²_B, giving σ² + 4σ²_B for B); Blend∧Flask (σ² for F[B])
randomized tier: Mean (1); Treat (qT(Ψ) for T)


Randomized Complete Block Design (RCBD) with subsampling

In breeding experiments, when families or clones are evaluated, the measurements are made at the individual level within the plots in order to estimate the variability within the plot.

For the evaluation of clones (same genes), as in sugar-cane, potato or manioc, the variation within the plot is due only to the environment.

The same is true for homozygous lines.

For segregant families of plants, the phenotypic variation within the plots is due to two components, one genetic and one environmental; that is, the phenotypic variance within plots (σ²_W) is equal to the environmental variance within the plots (σ²_E) plus the genetic variance within the families (σ²_G).

This type of information allows the geneticist to obtain estimates of genetic parameters such as the heritability and the expected gain with selection.


Eucalyptus data (Ramalho et al., 2013)

Example

For the evaluation of progenies of Eucalyptus camaldulensis a RCBD was performed using 10 progenies as treatments and three blocks. The response variable was the wood volume (m³ × 10⁻⁴) of six trees per plot.

              Block I (trees I-VI)           Block II (trees I-VI)          Block III (trees I-VI)
Progeny    I   II  III   IV    V   VI     I   II  III   IV    V   VI     I   II  III   IV    V   VI    Means
 1        55   96  212  289  140  142   218  162  106  124  119  155   105   38  124  119   59   58   128.94
 2       124  230  108  111   46  111   146  138  194  236  214  116   218  207   63  146  212  192   156.22
 3        42  134  229  246  166  175   181  262  150  290  112  133   239  195  195  146  169  117   176.72
 4        99   75  175   64  106  192   388  207  339  256  124  282   320  356   77  367  273  160   214.44
 5       201  131   33  236  195  273   124  225  206  147  281   90   155  368  285  210  142  111   189.61
 6       109  131  124  256  110   94   223  108   69   59  129   54   119   92  218  106   70  257   129.33
 7       138   62   27  132  138  100   214  290   60  175  106   80   131   34  166   51   49   24   109.83
 8        37   70   38  157   84  142   181   48  194  134  108   91    81  250   91  295  175   30   122.56
 9       126  106  210   61  190   86   126   98   41   29  274   54   168  210  256   90  106  142   131.83
10       104  136  140  137  111  358    48   62   68   68   20  157    67  134  157  108   33  194   116.78

Note that the number of possible descendants of every plant is enormous and in general only a sample of them is evaluated (a random effect).

There are three possible types of analysis: at the individual level, at the mean level or at the total level.


Randomized Complete Block Design (RCBD) with subsampling

ANOVA at the individual level:
Consider a randomized complete block design with subsampling,

Yijk = µ + βi + τj + εij + εijk ,   i = 1, …, r,   j = 1, …, t,   k = 1, …, s

where µ is a constant, βi is the effect of the i-th block (fixed), τj is the effect of the j-th treatment (random), εij is the experimental error at the plot level and εijk is the effect of the individual (e.g. plant) k within plot ij. The ANOVA table is

Source                    df              SSq         MSq                          E(MSq)                   F
Blocks                    r − 1           Y′QB Y                                   σ²_W + sσ²_e + qB(Ψ)
Plots[Blocks]             r(t − 1)        Y′QP Y
  Treatments              t − 1           Y′QT Y      Y′QT Y/(t − 1)               σ²_W + sσ²_e + rsσ²_T    MSqT/MSqRes
  Residual                (r − 1)(t − 1)  Y′QURes Y   Y′QURes Y/[(r − 1)(t − 1)]   σ²_W + sσ²_e             MSqRes/MSqW
Samples[Blocks ∧ Plots]   rt(s − 1)       Y′QUW Y     Y′QUW Y/[rt(s − 1)]          σ²_W


Then,

SSqB = (1/ts) Σ_i B²i − C,   C = (Σ_{i,j,k} Yijk)²/n

SSqT = (1/rs) Σ_{j=1}^t T²j − C

SSqPlots = (1/s) Σ_{i,j} Y²ij. − C

SSqPlots[Blocks] = (1/s) Σ_{i,j} Y²ij. − C − SSqB = (1/s) Σ_{i,j} Y²ij. − (1/ts) Σ_i B²i

SSqRes = SSqPlots[Blocks] − SSqT

SSqWithin = Σ_{i,j,k} Y²ijk − (1/s) Σ_{i,j} Y²ij.


Y = XGµ + XBβ + ZTτ + Zeε + ε

E(Y) = XGµ + XBβ and Var(Y) = ZGZ′ + Σ = σ²_T ZT ZT′ + σ²_e Ze Ze′ + σ²_W I_rst

y = [y111 … y11s … y1t1 … y1ts … yr11 … yr1s … yrt1 … yrts]ᵀ

XG = 1_rst = 1_r ⊗ 1_t ⊗ 1_s
XB = I_r ⊗ 1_t ⊗ 1_s   (block diagonal, one column of ones of length ts per block)
ZT = 1_r ⊗ I_t ⊗ 1_s   (r stacked copies of I_t ⊗ 1_s, one column per treatment)
Ze = I_r ⊗ I_t ⊗ 1_s   (block diagonal, one column of ones of length s per plot)

GT = σ²_T I_t,   Ge = σ²_e I_rt,   Σ = σ²_W I_rst

Var(Y) = σ²_T Jr ⊗ It ⊗ Js + σ²_e Ir ⊗ It ⊗ Js + σ²_W I_rst


Assuming r = 4, t = 3 and s = 2,

Var(Y) = σ²_T J4 ⊗ I3 ⊗ J2 + σ²_e I4 ⊗ I3 ⊗ J2 + σ²_W I24

Written as a 12 × 12 array of 2 × 2 blocks,

Var(Y) =
[ V1 V2 V2 V3 V2 V2 V3 V2 V2 V3 V2 V2
  V2 V1 V2 V2 V3 V2 V2 V3 V2 V2 V3 V2
  V2 V2 V1 V2 V2 V3 V2 V2 V3 V2 V2 V3
  V3 V2 V2 V1 V2 V2 V3 V2 V2 V3 V2 V2
  V2 V3 V2 V2 V1 V2 V2 V3 V2 V2 V3 V2
  V2 V2 V3 V2 V2 V1 V2 V2 V3 V2 V2 V3
  V3 V2 V2 V3 V2 V2 V1 V2 V2 V3 V2 V2
  V2 V3 V2 V2 V3 V2 V2 V1 V2 V2 V3 V2
  V2 V2 V3 V2 V2 V3 V2 V2 V1 V2 V2 V3
  V3 V2 V2 V3 V2 V2 V3 V2 V2 V1 V2 V2
  V2 V3 V2 V2 V3 V2 V2 V3 V2 V2 V1 V2
  V2 V2 V3 V2 V2 V3 V2 V2 V3 V2 V2 V1 ]

where

V1 = [ σ²_W + σ²_e + σ²_T    σ²_e + σ²_T
       σ²_e + σ²_T           σ²_W + σ²_e + σ²_T ]

V2 = [ 0  0
       0  0 ]

V3 = [ σ²_T  σ²_T
       σ²_T  σ²_T ]


Eucalyptus data

Source                    df      SSq         MSq        F
Blocks                     2    12987.88     6493.94
Plots[Blocks]             27
  Progenies                9   199609.12    22178.79    2.23
  Residual                18   178969.57     9942.75
Trees[Blocks ∧ Plots]    150   764157.50     5094.38

ANOVA estimators:

σ²_W = MSqWithin = 5094.38
σ²_e = (MSqRes − MSqWithin)/s = (9942.75 − 5094.38)/6 = 808.06
σ²_P = (MSqP − MSqRes)/(rs) = (22178.79 − 9942.75)/(3 × 6) = 679.78

Note that

σ²_W is the estimate of the phenotypic variance within plots, that is, the variance between trees within plots from the same family. Because there is genetic variation between trees from a family of half-sibs, σ²_W = σ²_G + σ²_E.

σ²_W/σ²_e = 5094.38/808.06 = 6.30, which shows that the phenotypic variation between plants within the plots is 6.3 times bigger than the error variance.

σ²_P is the estimate of the genetic variance between families of half-sibs.


           MM        REML       ML
σ²_P      679.78    679.78    611.80
σ²_e      808.06    808.06    642.35
σ²_W     5094.38   5094.38   5094.38


The variance of σ²_P is given by

Var(σ²_P) = (2/r²s²) [ MSq²_P/(t − 1 + 2) + MSq²_Res/((t − 1)(r − 1) + 2) ]
          = (2/(3² 6²)) [ 22178.79²/(9 + 2) + 9942.75²/(18 + 2) ] = 306549.3

Another important estimate is the phenotypic variance between the means of the families (σ²_P̄), which can be obtained in three ways:

i) Using the variance of the means of the progenies

σ²_P̄ = (1/9) [128.94² + 156.22² + … + 116.78² − (128.94 + … + 116.78)²/10] = 1232.16

ii) Using MSqP, that is,

σ²_P̄ = MSqP/(rs) = 22178.79/(3 × 6) = 1232.16

iii) Using the components of variance

σ²_P̄ = MSqP/(rs) = (σ²_W + sσ²_e + rsσ²_P)/(rs) = 5094.38/(3 × 6) + 808.06/3 + 679.78 = 1232.16


The last expression helps the geneticist to study ways of reducing the phenotypic variance between means of progenies.

One way is to improve the experimental precision and, as a consequence, decrease the estimates of σ²_W and σ²_e.

Another option is to increase the number r of replicates or the number s of plants per plot.

It is possible to play with r and s to see which is better: to increase the number of replicates or the number of plants per plot (a short sketch follows below):

r = 4 and s = 6 ⇒ σ²_P̄ = 5094.38/(4 × 6) + 808.06/4 + 679.78 = 1094.06

r = 3 and s = 8 ⇒ σ²_P̄ = 5094.38/(3 × 8) + 808.06/3 + 679.78 = 1161.40

which shows that the better option is to increase the number of replicates instead of the number of plants per plot.
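A small sketch of this comparison in R (not in the original notes), using the variance-component estimates obtained above:

var_family_mean <- function(r, s, s2W = 5094.38, s2e = 808.06, s2P = 679.78)
  s2W/(r*s) + s2e/r + s2P            # phenotypic variance between family means
var_family_mean(3, 6)                # current design: about 1232.2
var_family_mean(4, 6)                # more replicates: about 1094.1
var_family_mean(3, 8)                # more plants per plot: about 1161.4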


τi (BLUE)      τi (EBLUP)     µi (BLUE)    µi (EBLUP)
−18.683333     −10.307910     128.9444     125.5
  8.594444       4.741700     156.2222     140.6
 29.094444      16.051895     176.7222     151.9
 66.816667      36.863881     214.4444     172.7
 41.983333      23.162913     189.6111     159.0
−18.294444     −10.093353     129.3333     125.7
−37.794444     −20.851832     109.8333     115.0
−25.072222     −13.832768     122.5556     122.0
−15.794444      −8.714061     131.8333     127.1
−30.850000     −17.020465     116.7778     118.8

τ_BLUE = ȳ_P − ȳ   (progeny mean minus grand mean)

τ_EBLUP = τ_BLUE · σ²_P / (σ²_P + σ²_e/r + σ²_W/(rs))


ANOVA for totals:

Source           df       SSq           MSq         F
Blocks            2      77927.27      38963.63
Plots[Blocks]    27
  Progenies       9    1197654.70     133072.74    2.23
  Residual       18    1073817.40      59656.57

ANOVA for means:

Source           df      SSq        MSq       F
Blocks            2     2164.57    1082.28
Plots[Blocks]    27
  Progenies       9    33267.29    3696.37    2.23
  Residual       18    29828.37    1657.13


Heritability coefficient

The amount of genetic variation among the individuals of a species of crop or domesticated animal can be compared with the amount of variation due to non-genetic causes in a ratio called the heritability.

The heritability of a trait is defined as

h² = σ²_G / σ²_Ph

where σ²_G is the genetic component of variance, i.e. the part of the variation in the organism's phenotype (its observable traits) that is due to genetic effects; σ²_Ph is the phenotypic variance, i.e. the variance due to the combined effects of genotype and environment.

For the Eucalyptus example,

h² = σ²_P / (σ²_W/(rs) + σ²_e/r + σ²_P) = 679.78 / (5094.38/(3 × 6) + 808.06/3 + 679.78) = 0.5517


R program

rm(list=ls(all=TRUE))
RCBDk_Eucaliptus.dat <- read.table("RCBDkEucaliptus.csv", sep=";", h=T)
names(RCBDk_Eucaliptus.dat)
head(RCBDk_Eucaliptus.dat)
str(RCBDk_Eucaliptus.dat)

RCBDk_Eucaliptus.dat$block <- factor(RCBDk_Eucaliptus.dat$block)
RCBDk_Eucaliptus.dat$progeny <- factor(RCBDk_Eucaliptus.dat$progeny)
RCBDk_Eucaliptus.dat$plots <- factor(rep(1:10, times=18))
str(RCBDk_Eucaliptus.dat)

## ANOVA
RCBDk_Eucaliptus.aov <- aov(volume ~ progeny + Error(block/plots), RCBDk_Eucaliptus.dat)
summary(RCBDk_Eucaliptus.aov)

(MSqB <- (summary(RCBDk_Eucaliptus.aov)[[1]])[[1]]$"Mean Sq"[1])
(MSqP <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Mean Sq"[1])
(MSqRes <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Mean Sq"[2])
(MSqWithin <- (summary(RCBDk_Eucaliptus.aov)[[3]])[[1]]$"Mean Sq"[1])
(dfB <- (summary(RCBDk_Eucaliptus.aov)[[1]])[[1]]$"Df"[1])
(dfP <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Df"[1])
(dfRes <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Df"[2])
(dfW <- (summary(RCBDk_Eucaliptus.aov)[[3]])[[1]]$"Df"[1])

## Components of variance - Moment method
(sigma2_W <- MSqWithin)
(sigma2_Res <- (MSqRes - MSqWithin)/6)
(sigma2_P <- (MSqP - MSqRes)/(3*6))
(sigma2_W/sigma2_Res)
(h2 <- sigma2_P/(sigma2_P + sigma2_Res/3 + sigma2_W/18))
## Estimate of Var(sigma2_P)
(Var_sigma2_P <- 2/(3^2*6^2)*(MSqP^2/(dfP+2) + MSqRes^2/(dfRes+2)))
sqrt(Var_sigma2_P)

## BLUE and EBLUP for tau - calculating step by step
(ybar <- mean(RCBDk_Eucaliptus.dat$volume))
(mean_P <- tapply(RCBDk_Eucaliptus.dat$volume, RCBDk_Eucaliptus.dat$progeny, mean))
mean(mean_P)  ## mean of the Progeny means
(mean_B <- tapply(RCBDk_Eucaliptus.dat$volume, RCBDk_Eucaliptus.dat$block, mean))
mean(mean_B)
(tau_BLUE <- mean_P - ybar)
tau_EBLUPc <- tau_BLUE*sigma2_P/(sigma2_P + sigma2_Res/3 + sigma2_W/18)

## REML using library lme4 (old lme4 S4 accessors)
library(lme4)
RCBDk_Eucaliptus.REML <- lmer(volume ~ block + (1|progeny) + (1|block:plots),
                              data=RCBDk_Eucaliptus.dat, REML=TRUE)
summary(RCBDk_Eucaliptus.REML)
summary(RCBDk_Eucaliptus.REML)@coefs
data.frame(summary(RCBDk_Eucaliptus.REML)@REmat)

## EBLUP for tau and shrunk means
tau_EBLUP <- ranef(RCBDk_Eucaliptus.REML)[[2]]
round(sum(tau_EBLUP), 2)
mm <- model.matrix(terms(RCBDk_Eucaliptus.REML), RCBDk_Eucaliptus.dat)
RCBDk_Eucaliptus.dat$distance <- mm %*% fixef(RCBDk_Eucaliptus.REML)
mu_EBLUP <- RCBDk_Eucaliptus.dat$distance + tau_EBLUP
(Blup <- data.frame(round(tau_BLUE,1), round(tau_EBLUP,1), round(mean_P,1), round(mu_EBLUP,1)))
plot(Blup[[3]], Blup[[4]], pch='*', xlim=c(100,200), ylim=c(100,200),
     xlab='Unadjusted means', ylab='Shrunk means')
abline(0,1)

## ML
RCBDk_Eucaliptus.ML <- lmer(volume ~ block + (1|progeny) + (1|block:plots),
                            data=RCBDk_Eucaliptus.dat, REML=FALSE)
summary(RCBDk_Eucaliptus.ML)
summary(RCBDk_Eucaliptus.ML)@coefs
data.frame(summary(RCBDk_Eucaliptus.ML)@REmat)


Randomized Incomplete Block Design

In many situations the number of treatments is large and, given the heterogeneity of the experimental conditions, there is a need to use blocks.

However, blocks with too many plots could also become heterogeneous.

In breeding experiments, for example, it is common to have 100 or more cultivars of corn to evaluate.

In other situations, there is not enough material to use.

In biological work on animals, for example, it will be desirable, if at all possible, to compare several treatments within litters, but the size of the litter will depend on the particular species and will often be such that it is impossible to include all the treatments within a litter.


The randomized incomplete block design can be of three types:

Balanced - here are included the "balanced incomplete block designs (BIBD)" and the "balanced lattice squares"

Partially balanced - here are included the "lattice squares" and the "partially balanced incomplete block designs (PBIBD)"

Unbalanced


Definition: A balanced incomplete block design (BIBD) is one in which each of the t treatments is replicated r times and occurs at most once in each of the b blocks, each block containing k plots, and the arrangement of treatments in blocks is such that every pair of treatments occurs together in the same number (λ) of blocks. (Brien, 2010)

The first condition means that the total number of units is tr = bk.

The second condition implies that the total number of plots with other treatments in the blocks in which a given treatment occurs is λ(t − 1) = r(k − 1).

A BIBD cannot exist if these first two conditions are not met.

However, both of these conditions being satisfied does not imply that a BIBD must exist. For example, a BIBD does not exist for t = 15, k = 5, b = 21, r = 7 and λ = 2, even though both conditions are satisfied.

Such designs are not orthogonal; however, they are balanced.

They are not orthogonal because treatments are confounded with both blocks and plots within blocks.

They are balanced because all comparisons between treatments are confounded with blocks to the same extent, as they are with plots within blocks.


It can be shown that for a BIBD the proportion of the information within blocks is e2 = tλ/(kr) and between blocks is e1 = 1 − e2.
These proportions are called the canonical efficiency factors, which are always values between zero and one and sum to one for a particular randomized term, in this case Treatments.
It is desirable that e2 be as close to one as possible; this implies that as much of the information as possible is confounded with plots, which are less variable than blocks.
Designs can be obtained from Cochran and Cox (1957) and Box, Hunter and Hunter (2005) or can be generated as follows (a small check is sketched below).
Suppose t = 4, k = 3, b = 4 ⇒ r = 3 and λ = 2

         Blocks
  I    II    III    IV
  A    A     A      B
  B    B     C      C
  C    D     D      D

e2 = (4 × 2)/(3 × 3) = 0.8889 and e1 = 1 − 0.8889 = 0.1111; that is, 88.89% of the information about treatments is within blocks (on plots within blocks).
Randomization: the treatment combinations are randomized to the blocks and the treatments in a block are randomized to the plots (dae).
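The two necessary conditions and the canonical efficiency factors can be checked with a short R function; this sketch is added here for illustration and is not part of the original slides.

bibd_check <- function(t, k, b) {
  r      <- b * k / t                 # from tr = bk
  lambda <- r * (k - 1) / (t - 1)     # from lambda(t - 1) = r(k - 1)
  e2     <- t * lambda / (k * r)      # intrablock (canonical) efficiency factor
  c(r = r, lambda = lambda, e1 = 1 - e2, e2 = e2)
}
bibd_check(t = 4,  k = 3, b = 4)      # r = 3, lambda = 2, e2 = 0.8889
bibd_check(t = 15, k = 5, b = 21)     # r = 7, lambda = 2: conditions hold, yet no BIBD exists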


ANOVA table

Source           df               E(MSq), Fixed     E(MSq), B random           E(MSq), T random   E(MSq), Random
Blocks           b − 1
  Treatments     t − 1            σ² + e1 qTB(Ψ)    σ² + kσ²_B + e1 qTB(Ψ)     σ² + e1 rσ²_T      σ² + kσ²_B + e1 rσ²_T
  Residual       b − t            σ² + qB(Ψ)        σ² + kσ²_B                 σ² + qB(Ψ)         σ² + kσ²_B
Units[Blocks]    b(k − 1)
  Treatments     t − 1            σ² + e2 qT(Ψ)     σ² + e2 qT(Ψ)              σ² + e2 rσ²_T      σ² + e2 rσ²_T
  Residual       bk − b − t + 1   σ²                σ²                         σ²                 σ²
Total            bk − 1

Note that there are two Treatment lines in the analysis, the first being referred to as the "interblock" Treatment line and the second as the "intrablock" Treatment line.
Generally, one tries to have e2 as close to one as possible and to base conclusions on the intrablock Treatment effects.
Because, when Blocks are fixed, qTB involves both β's and τ's, it is not possible to separately test for treatment differences between the blocks in this case – the intrablock test for treatments will be the only test for treatments that can be performed here.
Thus it is preferable to designate Blocks as random, if it is appropriate.


Barley data – unbalanced

Example

The data here are from a field trial of barley breeding lines (Galwey, page 87). The lines studied were derived from a cross between two parent varieties, "Chebec" and "Harrington". They were "double haploid" lines, which means they were obtained by a laboratory technique that ensures that all plants within the same breeding line are genetically identical, so that the line will breed true. This feature improves the precision with which genetic variation among the lines can be estimated. The trial considered here was arranged in two randomized blocks. Within each block, each line occupied a single rectangular field plot. All lines were present in Block I but, due to limited seed stocks, some were absent in Block II. The grain yield (g/m²) was measured in each field plot.


Barley data

        Blocks            Blocks            Blocks            Blocks
Lines    I    II   Lines   I     II   Lines   I     II   Lines   I     II
  1    718   591    22    341    NA    43    678   837    64    648   682
  2    483    NA    23    606   818    44    518   873    65    819   713
  3    873    NA    24    671   463    45    520   576    66    688   846
  4    719    NA    25    429    NA    46    724   627    67    407   703
  5    799    NA    26    580   639    47    192    NA    68    326   385
  6    850   755    27    732   762    48    786   645    69    467   379
  7    907   820    28    680    NA    49    831   823    70    996   905
  8    636   300    29    606   932    50    721   886    71    596   570
  9    775   587    30    353    NA    51    693   746    72    166   259
 10    765   757    31    167   226    52    603    NA    73    355    NA
 11    645   517    32    669   837    53    559    NA    74    489   551
 12    437    NA    33    770   847    54    809   859    75    617    NA
 13    541   475    34    673   639    55    555    NA    76    344    NA
 14    911   935    35    374   555    56    523   436    77    358    NA
 15    600    NA    36    800  1055    57    182    NA    78    260   260
 16    211   240    37    895   553    58    522   435    79    318   439
 17    552   959    38    641   541    59    553   635    80    488   478
 18    366   265    39    146   213    60    573   285    81    316   304
 19    623   424    40    411   568    61    612   472    82    251    NA
 20    632    NA    41    322   538    62    730    NA    83    280    NA
 21    515    NA    42    793   553    63    563   756


Barley data

The model for these data is

Yjk = µ + βj + τk + εjk ,   j = 1, …, rk ,   k = 1, …, t

where Yjk is the grain yield of the plot in the j-th block sown with the k-th breeding line; µ is the grand mean of the grain yield; βj is the effect of the j-th block; and τk is the effect of the k-th breeding line.

It is natural in this case to consider block as a random effect, that is, βj ∼ N(0, σ²_B), with εjk ∼ N(0, σ²).

Note that the cross Chebec × Harrington could produce many lines besides those studied here, and the lines in this field trial may reasonably be considered as a random sample from this population of potential lines.

Thus it is reasonable to consider "line" as a random-effect term, that is, to assume that τk ∼ N(0, σ²_L).


Using the R aov function with Error, the ANOVA table is

Source            df    SSq         MSq         F      p-value
Blocks             1
  Treatments       1    58079.91    58079.91
Units[Blocks]    140
  Treatments      82    5343839     65168.77    4.86   < 0.01
  Residual        58    777747      13409.42
Total            141    6179665

The value 4.86 of F provides significant evidence against the null hypothesis H0: σ²_L = 0.

Using the R lme4 library, the estimates of the variance components are σ²_B = 8.2083 × 10⁻⁶, σ²_L = 30666.89 and σ² = 13225.75.

These estimates are similar, but not identical, to the ones obtained using GENSTAT (see Galwey, page 99).

The estimate of the variance component due to block is very small compared with the other components.

We may decide that the best estimate of this component is zero, σ²_B = 0, i.e. σ²_B has a degenerate distribution instead of a normal distribution.


            REML             ML
σ²_B        8.2083 × 10⁻⁶    1.4403 × 10⁻⁷
σ²_L        30666.89         30199.55
σ²          13225.75         13224.92
µ           572.47           572.51
sd(µ)       21.67            21.53

The estimate of the variance due to breeding lines is about double the residual variance.

The ML estimates of the variance components are smaller than the REML estimates.

Note that the estimates of the fixed parameter µ using ML and REML don't differ much.


The likelihood ratio test for σ²_L = 0 is obtained by fitting the full model and the reduced model from which the term "line" is omitted.

By comparing the deviances of both models, the contribution made by the term "line" to the fit of the model can be assessed, provided that the deviances were obtained from models with the same fixed-effect terms.

Using the R lme4 library, the deviances for the full and reduced models are, respectively, 1880.37 and 1919.67. The likelihood ratio statistic with 1 d.f. is

Dev_reduced model − Dev_full model = 1919.67 − 1880.37 = 39.30

Note that the R lme4 library uses the deviances from an ML estimation.

In a similar way, the likelihood ratio test for σ²_B = 0 is obtained by fitting the full model and the reduced model from which the term "block" is omitted. The likelihood ratio statistic with 1 d.f. is

Dev_reduced model − Dev_full model = 2 × 940.18508 − 2 × 940.18504 = 0.00008

These results are similar but not identical to GENSTAT (see Galwey, pages 101-104).


R results

> summary(RIBD_barley.REML)
Linear mixed model fit by REML
Formula: yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
   Data: RIBD_barley
      AIC      BIC     logLik  deviance  REMLdev
 1880.384 1892.207  -936.1919   1880.37 1872.384
Random effects:
 Groups   Name        Variance        Std.Dev.
 fline    (Intercept) 3.0666887e+04   1.7511964e+02
 fblock   (Intercept) 8.2083130e-06   2.8650154e-03
 Residual             1.3225750e+04   1.1500326e+02
Number of obs: 142, groups: fline, 83; fblock, 2
Fixed effects:
             Estimate Std. Error  t value
(Intercept) 572.47356   21.67072 26.41691

> summary(RIBD_barley.REML_L0)
Linear mixed model fit by REML
Formula: yield_g_m2 ~ 1 + (1 | fblock)
   Data: RIBD_barley
      AIC      BIC     logLik  deviance  REMLdev
 1918.107 1926.974  -956.0533  1919.673 1912.107
Random effects:
 Groups   Name        Variance        Std.Dev.
 fblock   (Intercept) 6.4216528e-05   8.0135216e-03
 Residual             4.3827416e+04   2.0934998e+02
Number of obs: 142, groups: fblock, 2
Fixed effects:
             Estimate Std. Error  t value
(Intercept) 581.57355   17.56834 33.10351


R results

> summary(RIBD_barley.ML)
Linear mixed model fit by maximum likelihood
Formula: yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
   Data: RIBD_barley
      AIC      BIC     logLik  deviance  REMLdev
 1888.368 1900.191  -940.1838  1880.368 1872.386
Random effects:
 Groups   Name        Variance        Std.Dev.
 fline    (Intercept) 3.0199552e+04   1.7378018e+02
 fblock   (Intercept) 1.4403497e-07   3.7951939e-04
 Residual             1.3224923e+04   1.1499967e+02
Number of obs: 142, groups: fline, 83; fblock, 2
Fixed effects:
             Estimate Std. Error  t value
(Intercept) 572.51138   21.53978 26.57926

> summary(RIBD_barley.ML_L0)
Linear mixed model fit by maximum likelihood
Formula: yield_g_m2 ~ 1 + (1 | fblock)
   Data: RIBD_barley
      AIC      BIC     logLik  deviance  REMLdev
 1925.673 1934.541  -959.8366  1919.673 1912.107
Random effects:
 Groups   Name        Variance   Std.Dev.
 fblock   (Intercept)     0.00    0.00000
 Residual             43518.77  208.61153
Number of obs: 142, groups: fblock, 2
Fixed effects:
             Estimate Std. Error  t value
(Intercept) 581.57352   17.50629 33.22083


R results

> anova(RIBD_barley.REML, RIBD_barley.REML_L0)
Data: RIBD_barley
Models:
RIBD_barley.REML_L0: yield_g_m2 ~ 1 + (1 | fblock)
RIBD_barley.REML:    yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
                    Df       AIC       BIC     logLik    Chisq Chi Df Pr(>Chisq)
RIBD_barley.REML_L0  3 1925.6731 1934.5406 -959.83656
RIBD_barley.REML     4 1888.3702 1900.1935 -940.18508 39.30295      1 3.6289e-10 ***

> anova(RIBD_barley.ML_L0, RIBD_barley.ML)
Data: RIBD_barley
Models:
RIBD_barley.ML_L0: yield_g_m2 ~ 1 + (1 | fblock)
RIBD_barley.ML:    yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
                  Df       AIC       BIC     logLik    Chisq Chi Df Pr(>Chisq)
RIBD_barley.ML_L0  3 1925.6731 1934.5406 -959.83655
RIBD_barley.ML     4 1888.3676 1900.1909 -940.18381 39.30548      1 3.6242e-10 ***


Heritability. The prediction of genetic advance under selection.

The heritability for the Barley data can be calculated by

h² = σ²_L / (σ²_L + σ²/r*) = 30666.89 / (30666.89 + 13225.75/1.55) = 0.7823

where r* is the number of replications per line.

One way to calculate r* is to use

r* = t / (Σ_{k=1}^{t} 1/r_k) = 83 / (24 × 1/1 + 59 × 1/2) = 1.55

similar to the value 1.63 given by Galwey, on page 106.
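As a small check, the same calculation can be sketched in R. The variance components are typed in from the REML fit above (accessor syntax differs between lme4 versions), and the factor fline is the one created in the R code later in these notes:

sigma2_L <- 30666.89                           # between-line variance component (REML)
sigma2   <- 13225.75                           # residual variance component (REML)
r_k      <- table(RIBD_barley$fline)           # 1 or 2 observations per line
r_star   <- length(r_k) / sum(1 / r_k)         # 83 / (24*1 + 59*0.5) = 1.55
h2       <- sigma2_L / (sigma2_L + sigma2 / r_star)
round(c(r_star = r_star, h2 = h2), 4)          # approximately 1.55 and 0.7823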

The heritability can be used to calculate the expected genetic advance under selection in a plant or animal breeding programme.

This is given by the formula

Gs = i σ_Ph h²

where i is an index of selection and σ_Ph is the phenotypic standard deviation.


The index is defined in relation to the standard normal model: that is, the distribution of a variable Z such that

Z ∼ N(0, 1)

It is the value of Z that corresponds to the fraction k of the population that is to be selected.


The best linear unbiased predictor or ’shrunk’ estimate

The adjustment to obtain the random-effect mean is made as follows. The true mean of the k-th breeding line is represented by

µ_k = µ + τ_k

In the table of means presented (using the R lme4 library), this value is estimated by

µ̂_k = (Σ_{j=1}^{r_k} y_jk) / r_k

where y_jk is the j-th observation of the k-th breeding line and r_k is the number of observations of the k-th breeding line. The overall mean of the population of breeding lines, µ, is estimated by

µ̂ = 572.5

Note that this is not quite the same as the mean of all observations (= 581.1) or the mean of the line means (= 569.1). Then

µ_k = µ + τ_k  ⇒  τ̂_k = µ̂_k − µ̂


An estimate of τ_k is given by

BLUE_k = τ̂_k = µ̂_k − µ̂

To allow for the expectation that high-yielding lines in the present trial will perform less well in a future trial – and that low-yielding lines will perform better – the BLUE is replaced by a "shrunk estimate" called the best linear unbiased predictor (BLUP):

BLUP_k = BLUE_k × shrinkage factor = (µ̂_k − µ̂) × σ²_L / (σ²_L + σ²/r_k)

This relationship, combined with the constraint

Σ_{k=1}^{t} BLUP_k = 0,

where t is the number of breeding lines, determines the value of µ̂ as well as those of the BLUPs.

A new estimate of the mean for the k-th breeding line is then given by

µ̂′_k = µ̂′ + BLUP_k
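A small numerical sketch of this shrinkage calculation in R, assuming the REML variance components given above (the object names follow the R code later in these notes, but the calculation here is only approximate, for illustration):

sigma2_L  <- 30666.89
sigma2    <- 13225.75
line_mean <- tapply(RIBD_barley$yield_g_m2, RIBD_barley$fline, mean)  # unadjusted line means
r_k       <- table(RIBD_barley$fline)                                 # 1 or 2 observations per line
mu_hat    <- 572.5                                                    # estimated population mean
shrink    <- sigma2_L / (sigma2_L + sigma2 / r_k)                     # shrinkage factor per line
blup      <- shrink * (line_mean - mu_hat)                            # approximate BLUPs
shrunk    <- mu_hat + blup                                            # shrunk line means
head(round(cbind(line_mean, blup, shrunk), 1))                        # compare with the table below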


Line    τ_k    µ_k   µ_k(BLUP) | Line    τ_k    µ_k   µ_k(BLUP) | Line    τ_k    µ_k   µ_k(BLUP)
  1    67.5  654.5   639.9     |  29   162.0  769.4   734.4     |  57  -272.8  182.0   299.6
  2   -62.3  483.3   510.2     |  30  -153.1  353.3   419.3     |  58   -77.3  478.5   495.2
  3   210.0  873.0   782.5     |  31  -309.3  196.4   263.1     |  59    17.7  594.0   590.2
  4   102.5  719.1   674.9     |  32   148.5  753.1   721.0     |  60  -118.2  428.8   454.3
  5   158.3  799.0   730.7     |  33   194.2  808.5   766.7     |  61   -25.1  542.0   547.4
  6   189.1  802.4   761.6     |  34    68.9  656.3   641.4     |  62   110.4  730.5   682.9
  7   239.6  863.7   812.1     |  35   -88.8  464.5   483.6     |  63    71.5  659.4   644.0
  8   -85.8  468.2   486.7     |  36   292.1  927.5   864.5     |  64    75.9  664.8   648.4
  9    89.2  681.0   661.7     |  37   124.6  724.0   697.1     |  65   159.3  766.1   731.7
 10   154.9  760.8   727.4     |  38    15.3  591.0   587.8     |  66   160.0  767.0   732.5
 11     6.9  580.8   579.4     |  39  -323.5  179.2   249.0     |  67   -14.4  554.9   558.0
 12   -94.9  436.6   477.5     |  40   -68.2  489.6   504.3     |  68  -178.3  355.7   394.2
 13   -52.9  508.1   519.5     |  41  -117.3  429.9   455.2     |  69  -122.9  423.1   449.6
 14   288.1  922.7   860.6     |  42    82.7  673.0   655.2     |  70   311.0  950.5   883.5
 15    19.3  600.1   591.7     |  43   152.3  757.6   724.8     |  71     8.4  582.7   580.9
 16  -285.5  225.4   286.9     |  44   101.1  695.3   673.6     |  72  -296.4  212.2   276.1
 17   150.6  755.5   723.1     |  45   -20.3  547.8   552.2     |  73  -151.7  355.3   420.7
 18  -211.1  315.8   361.3     |  46    84.9  675.6   657.3     |  74   -42.9  520.4   529.6
 19   -40.5  523.2   532.0     |  47  -265.5  192.4   306.9     |  75    31.3  617.3   603.8
 20    41.9  632.5   614.4     |  48   117.7  715.6   690.2     |  76  -159.3  344.5   413.2
 21   -40.4  514.6   532.1     |  49   209.3  826.9   781.8     |  77  -149.9  357.9   422.6
 22  -162.0  340.6   410.5     |  50   190.0  803.4   762.4     |  78  -256.9  260.1   315.5
 23   114.5  711.6   686.9     |  51   120.8  719.3   693.3     |  79  -159.6  378.4   412.8
 24    -4.3  567.2   568.2     |  52    21.4  603.0   593.8     |  80   -73.6  483.0   498.9
 25  -100.5  428.6   472.0     |  53    -9.3  559.1   563.2     |  81  -216.2  309.7   356.3
 26    30.7  609.8   603.2     |  54   215.5  834.4   787.9     |  82  -224.4  251.3   348.1
 27   143.7  747.1   716.2     |  55   -12.0  555.3   560.5     |  83  -204.0  280.4   368.4
 28    75.1  679.9   647.5     |  56   -76.8  479.1   495.7     |


R results

RIBD_barley <- read.table("barleyprogeny.dat", header=TRUE)
attach(RIBD_barley)
head(RIBD_barley)
str(RIBD_barley)
RIBD_barley$fline  <- factor(RIBD_barley$line)
RIBD_barley$fblock <- factor(RIBD_barley$block)
RIBD_barley$plot   <- factor(c(rep(1,83), rep(2,59)))
head(RIBD_barley)
str(RIBD_barley)

options(digits=10)
barley.aov <- aov(yield_g_m2 ~ fline + Error(fblock), data=RIBD_barley)
summary(barley.aov)
summary(aov(yield_g_m2 ~ 1, data=RIBD_barley))

## REML considering fblock and fline as random effects
## using library lme4
library(lme4)
RIBD_barley.REML <- lmer(yield_g_m2 ~ 1 + (1|fblock) + (1|fline),
                         data=RIBD_barley, REML=TRUE)
summary(RIBD_barley.REML)
summary(RIBD_barley.REML)@coefs
data.frame(summary(RIBD_barley.REML)@REmat)

## a likelihood ratio test for sigma2_L
RIBD_barley.REML_L0 <- lmer(yield_g_m2 ~ 1 + (1|fblock),
                            data=RIBD_barley, REML=TRUE)
summary(RIBD_barley.REML)
summary(RIBD_barley.REML_L0)
anova(RIBD_barley.REML, RIBD_barley.REML_L0)

## a likelihood ratio test for sigma2_B
(RIBD_barley.REML_B0 <- lmer(yield_g_m2 ~ 1 + (1|fline),
                             data=RIBD_barley, REML=TRUE))
anova(RIBD_barley.REML, RIBD_barley.REML_B0)

## ML considering fblock and fline as random effects
## using library lme4
(RIBD_barley.ML <- lmer(yield_g_m2 ~ 1 + (1|fblock) + (1|fline),
                        data=RIBD_barley, REML=FALSE))
(RIBD_barley.ML_L0 <- lmer(yield_g_m2 ~ 1 + (1|fblock),
                           data=RIBD_barley, REML=FALSE))
summary(RIBD_barley.ML)
summary(RIBD_barley.ML_L0)
anova(RIBD_barley.ML_L0, RIBD_barley.ML)

# means
unad_mean <- tapply(RIBD_barley$yield_g_m2, RIBD_barley$fline, mean)  ## unadjusted line means
mean(unad_mean)                 ## mean of the line means (569.1)
mean(RIBD_barley$yield_g_m2)    ## mean of all observations (581.1)

## tau_EBLUP, shrunk means
mm <- model.matrix(terms(RIBD_barley.REML), RIBD_barley)
RIBD_barley$distance <- mm %*% fixef(RIBD_barley.REML)
tau_EBLUP <- ranef(RIBD_barley.REML)[[1]]
mu_EBLUP  <- RIBD_barley$distance + tau_EBLUP
Blup <- data.frame(round(tau_EBLUP,1), round(unad_mean,1), round(mu_EBLUP,1))
plot(Blup[[2]], Blup[[3]], pch='*', xlim=c(0,1000), ylim=c(0,1000),
     xlab='Unadjusted means', ylab='Shrunk means')
abline(0,1)


Latin square designs (LS)

Sometimes we need more than one type of block. In general we call one sort of blocks "rows" and the other sort "columns".

Definition: A Latin square design is one in which

each treatment occurs once and only once in each row and each column,
so that the numbers of rows, columns and treatments are all equal.

Clearly, the total number of observations is n = t².

Suppose in a field trial moisture is varying across the field and stoniness down the field.

A Latin square can eliminate both sources of variability.


Sugarcane experiment

Suppose there are five different varieties of sugarcane to be compared, and suppose that moisture is varying across the field and stoniness down the field.

A Latin square design for this would be as follows:

5 × 5 Latin Square

                 Column
           1   2   3   4   5
      I    A   B   C   D   E     Less stony end of field
     II    C   D   E   A   B               ⇓
Row III    E   A   B   C   D               ⇓
     IV    B   C   D   E   A               ⇓
      V    D   E   A   B   C     Stonier end of field

     Less moisture  ⇒⇒⇒  More moisture

Varieties: A, B, C, D, E


Even if one has not identified trends in two directions, a Latin square may be employed to guard against the problem of putting the blocks in the wrong direction.

Latin squares may also be used when there are two different kinds of blocking variables, for example, animals and times.

The general principle is that one is interested in maximizing row and column differences so as to minimize the amount of uncontrolled variation affecting treatment comparisons.

The major disadvantage with the Latin square is that you are restricted to having the number of replicates equal to the number of treatments.


Several fundamentally different Latin squares exist for a particular t:

for t = 4 there are three different squares. A collection of Latin squares for t = 3, 4, ..., 9 is given in Appendix 8A of Box, Hunter and Hunter.

To randomize these designs appropriately involves the following:

1. randomly select one of the designs for a value of t;
2. randomly permute the rows and then the columns;
3. randomly assign letters to treatments.

Note: there are no nested factors, as Rows and Columns are to be randomized independently.

Hence they are not nested (they are crossed).

Generally we will use R to obtain randomized layouts; a small sketch of these randomization steps is given below.

General instructions are given in Appendix B (Chris Brien's notes), Randomized layouts and sample size computations in R.
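As an illustration only (this is not the course's randomization code; the function name random_latin_square is made up for this sketch), the three steps above can be carried out in base R starting from a cyclic square:

random_latin_square <- function(t, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  square <- outer(1:t, 1:t, function(i, j) (i + j - 2) %% t + 1)  # a cyclic t x t Latin square
  square <- square[sample(t), sample(t)]                          # randomly permute rows, then columns
  matrix(LETTERS[sample(t)][square], t, t,                        # randomly assign letters to treatments
         dimnames = list(Row = 1:t, Column = 1:t))
}
random_latin_square(5, seed = 123)   # one randomized 5 x 5 layout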


Latin Square Design

Consider a Latin square design,

y_ij = µ + β_i + γ_j + τ_k(ij) + ε_ij

where µ is a constant; β_i is the effect of the i-th row; γ_j is the effect of the j-th column; τ_k(ij) is the effect of the k-th treatment, the one applied to plot (i,j); and ε_ij is the experimental error associated with the (i,j)-th plot. The ANOVA table is

Source          df               SSq              MSq                              F
Rows            t − 1            Y′Q_R Y          Y′Q_R Y / (t − 1)
Columns         t − 1            Y′Q_C Y          Y′Q_C Y / (t − 1)
Rows:Columns    (t − 1)²
  Treatments    t − 1            Y′Q_T Y          Y′Q_T Y / (t − 1)
  Residual      (t − 1)(t − 2)   Y′Q_RCRes Y      Y′Q_RCRes Y / [(t − 1)(t − 2)]
Total           t² − 1

where SSqR = (1/t) Σ_{i=1}^{t} R_i² − C,  C = (Σ_{i,j,k} Y_ijk)² / n,
SSqC = (1/t) Σ_{j=1}^{t} C_j² − C,  SSqT = (1/t) Σ_{k=1}^{t} T_k² − C
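For reference, a minimal sketch of this fixed-effects analysis in R, assuming a hypothetical data frame ls_dat with factors Row, Column, Treatment and response y (these names are illustrative, not objects from these notes):

ls_aov <- aov(y ~ Row + Column + Treatment, data = ls_dat)
summary(ls_aov)   # Rows, Columns, Treatments each on t-1 d.f.; Residual on (t-1)(t-2) d.f.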


Considering t=3:

          Column
Row    1   2   3
 1     A   B   C
 2     C   A   B
 3     B   C   A

In matrix notation, for a fixed model,

Y = X_G µ + X_R β + X_C γ + X_T τ + ε

[y11]   [1]       [1 0 0]       [1 0 0]       [1 0 0]
[y12]   [1]       [1 0 0]       [0 1 0]       [0 1 0]
[y13]   [1]       [1 0 0]       [0 0 1]       [0 0 1]
[y21]   [1]       [0 1 0]       [1 0 0]       [0 0 1]
[y22] = [1] µ  +  [0 1 0] β  +  [0 1 0] γ  +  [1 0 0] τ  +  ε
[y23]   [1]       [0 1 0]       [0 0 1]       [0 1 0]
[y31]   [1]       [0 0 1]       [1 0 0]       [0 1 0]
[y32]   [1]       [0 0 1]       [0 1 0]       [0 0 1]
[y33]   [1]       [0 0 1]       [0 0 1]       [1 0 0]


In general we can have at least five types of models:

1. Fixed model with β_i, γ_j and τ_k as fixed effects and ε_ij ∼ N(0, σ²);
2. Mixed model with γ_j and τ_k as fixed effects and β_i ∼ N(0, σ²_R) and ε_ij ∼ N(0, σ²);
3. Mixed model with β_i and γ_j as fixed effects and τ_k ∼ N(0, σ²_T) and ε_ij ∼ N(0, σ²);
4. Mixed model with β_i as a fixed effect and γ_j ∼ N(0, σ²_C), τ_k ∼ N(0, σ²_T) and ε_ij ∼ N(0, σ²);
5. A random model with β_i ∼ N(0, σ²_R), γ_j ∼ N(0, σ²_C), τ_k ∼ N(0, σ²_T) and ε_ij ∼ N(0, σ²).


1. Fixed model

β_i, γ_j and τ_k as fixed effects and ε_ij ∼ N(0, σ²)

Y = X_G µ + X_R β + X_C γ + X_T τ + ε

Source         d.f.             E(MSq)
Row            t − 1            σ² + q_R(ψ)
Column         t − 1            σ² + q_C(ψ)
Row#Column     (t − 1)²
  Treatment    t − 1            σ² + q_T(ψ)
  Residual     (t − 1)(t − 2)   σ²

Then

E(Y) = X_G µ + X_R β + X_C γ + X_T τ

Var(Y) = Σ = σ² I_{t²}


Var(Y) = Σ = σ² I_{t²} =

[σ²  0   0   0   0   0   0   0   0 ]
[0   σ²  0   0   0   0   0   0   0 ]
[0   0   σ²  0   0   0   0   0   0 ]
[0   0   0   σ²  0   0   0   0   0 ]
[0   0   0   0   σ²  0   0   0   0 ]
[0   0   0   0   0   σ²  0   0   0 ]
[0   0   0   0   0   0   σ²  0   0 ]
[0   0   0   0   0   0   0   σ²  0 ]
[0   0   0   0   0   0   0   0   σ²]


2. Mixed model

with γ_j and τ_k as fixed effects, β_i ∼ N(0, σ²_R) and ε_ij ∼ N(0, σ²)

Y = X_G µ + X_C γ + X_T τ + Z_R β + ε

Source         d.f.             E(MSq)
Row            t − 1            σ² + tσ²_R
Column         t − 1            σ² + q_C(ψ)
Row#Column     (t − 1)²
  Treatment    t − 1            σ² + q_T(ψ)
  Residual     (t − 1)(t − 2)   σ²

Then

E(Y) = X_G µ + X_C γ + X_T τ

Var(Y) = Z_R (σ²_R I_t) Z′_R + Σ = σ²_R Z_R Z′_R + σ² I_{t²}


Var(Y) = σ²_R Z_R Z′_R + σ² I_{t²}

        [1 1 1 0 0 0 0 0 0]        [1 0 0 0 0 0 0 0 0]
        [1 1 1 0 0 0 0 0 0]        [0 1 0 0 0 0 0 0 0]
        [1 1 1 0 0 0 0 0 0]        [0 0 1 0 0 0 0 0 0]
        [0 0 0 1 1 1 0 0 0]        [0 0 0 1 0 0 0 0 0]
= σ²_R  [0 0 0 1 1 1 0 0 0] + σ²   [0 0 0 0 1 0 0 0 0]
        [0 0 0 1 1 1 0 0 0]        [0 0 0 0 0 1 0 0 0]
        [0 0 0 0 0 0 1 1 1]        [0 0 0 0 0 0 1 0 0]
        [0 0 0 0 0 0 1 1 1]        [0 0 0 0 0 0 0 1 0]
        [0 0 0 0 0 0 1 1 1]        [0 0 0 0 0 0 0 0 1]

which is a 9 × 9 block-diagonal matrix with the 3 × 3 block

[σ² + σ²_R   σ²_R        σ²_R     ]
[σ²_R        σ² + σ²_R   σ²_R     ]
[σ²_R        σ²_R        σ² + σ²_R]

repeated down the diagonal (one block per row of the square) and zeros elsewhere.


3. Mixed model

β_i and γ_j as fixed effects, τ_k ∼ N(0, σ²_T) and ε_ij ∼ N(0, σ²)

Y = X_G µ + X_R β + X_C γ + Z_T τ + ε

Source         d.f.             E(MSq)
Row            t − 1            σ² + q_R(ψ)
Column         t − 1            σ² + q_C(ψ)
Row#Column     (t − 1)²
  Treatment    t − 1            σ² + tσ²_T
  Residual     (t − 1)(t − 2)   σ²

Then

E(Y) = X_G µ + X_R β + X_C γ

Var(Y) = Z_T (σ²_T I_t) Z′_T + Σ = σ²_T Z_T Z′_T + σ² I_{t²}


Var(Y) = σ²_T Z_T Z′_T + σ² I_{t²}

        [1 0 0 0 1 0 0 0 1]        [1 0 0 0 0 0 0 0 0]
        [0 1 0 0 0 1 1 0 0]        [0 1 0 0 0 0 0 0 0]
        [0 0 1 1 0 0 0 1 0]        [0 0 1 0 0 0 0 0 0]
        [0 0 1 1 0 0 0 1 0]        [0 0 0 1 0 0 0 0 0]
= σ²_T  [1 0 0 0 1 0 0 0 1] + σ²   [0 0 0 0 1 0 0 0 0]
        [0 1 0 0 0 1 1 0 0]        [0 0 0 0 0 1 0 0 0]
        [0 1 0 0 0 1 1 0 0]        [0 0 0 0 0 0 1 0 0]
        [0 0 1 1 0 0 0 1 0]        [0 0 0 0 0 0 0 1 0]
        [1 0 0 0 1 0 0 0 1]        [0 0 0 0 0 0 0 0 1]

  [V1   V2   V2′]
= [V2′  V1   V2 ],
  [V2   V2′  V1 ]

       [σ² + σ²_T   0           0        ]        [0      σ²_T   0   ]
V1 =   [0           σ² + σ²_T   0        ],  V2 =  [0      0      σ²_T]
       [0           0           σ² + σ²_T]        [σ²_T   0      0   ]


4. Mixed model

with β_i as a fixed effect, γ_j ∼ N(0, σ²_C), τ_k ∼ N(0, σ²_T) and ε_ij ∼ N(0, σ²)

Y = X_G µ + X_R β + Z_C γ + Z_T τ + ε

Source         d.f.             E(MSq)
Row            t − 1            σ² + q_R(ψ)
Column         t − 1            σ² + tσ²_C
Row#Column     (t − 1)²
  Treatment    t − 1            σ² + tσ²_T
  Residual     (t − 1)(t − 2)   σ²

Then

E(Y) = X_G µ + X_R β

Var(Y) = Z_C (σ²_C I_t) Z′_C + Z_T (σ²_T I_t) Z′_T + Σ = σ²_C Z_C Z′_C + σ²_T Z_T Z′_T + σ² I_{t²}


Var(Y) = σ²_C Z_C Z′_C + σ²_T Z_T Z′_T + σ² I_{t²}

        [1 0 0 1 0 0 1 0 0]        [1 0 0 0 1 0 0 0 1]
        [0 1 0 0 1 0 0 1 0]        [0 1 0 0 0 1 1 0 0]
        [0 0 1 0 0 1 0 0 1]        [0 0 1 1 0 0 0 1 0]
        [1 0 0 1 0 0 1 0 0]        [0 0 1 1 0 0 0 1 0]
= σ²_C  [0 1 0 0 1 0 0 1 0] + σ²_T [1 0 0 0 1 0 0 0 1] + σ² I₉
        [0 0 1 0 0 1 0 0 1]        [0 1 0 0 0 1 1 0 0]
        [1 0 0 1 0 0 1 0 0]        [0 1 0 0 0 1 1 0 0]
        [0 1 0 0 1 0 0 1 0]        [0 0 1 1 0 0 0 1 0]
        [0 0 1 0 0 1 0 0 1]        [1 0 0 0 1 0 0 0 1]

  [V1   V2   V2′]
= [V2′  V1   V2 ],
  [V2   V2′  V1 ]

       [σ² + σ²_C + σ²_T   0                  0               ]        [σ²_C   σ²_T   0   ]
V1 =   [0                  σ² + σ²_C + σ²_T   0               ],  V2 =  [0      σ²_C   σ²_T]
       [0                  0                  σ² + σ²_C + σ²_T]        [σ²_T   0      σ²_C]


In summary

                                                Models
Source       d.f.          1             2             3             4             5
Row          t − 1         σ² + q_R(ψ)   σ² + tσ²_R    σ² + q_R(ψ)   σ² + q_R(ψ)   σ² + tσ²_R
Column       t − 1         σ² + q_C(ψ)   σ² + q_C(ψ)   σ² + q_C(ψ)   σ² + tσ²_C    σ² + tσ²_C
Row#Column   (t − 1)²
 Treatment   t − 1         σ² + q_T(ψ)   σ² + q_T(ψ)   σ² + tσ²_T    σ² + tσ²_T    σ² + tσ²_T
 Residual    (t − 1)(t − 2) σ²           σ²            σ²            σ²            σ²
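As an illustration of how models 1 and 2 above might be fitted in R (a hedged sketch: ls_dat is a hypothetical data frame with factors Row, Column, Treatment and response y, not an object from these notes):

library(lme4)
m1 <- aov(y ~ Row + Column + Treatment, data = ls_dat)           # model 1: all effects fixed
m2 <- lmer(y ~ Column + Treatment + (1 | Row), data = ls_dat)    # model 2: rows random
summary(m1)
summary(m2)   # the Row variance component corresponds to sigma^2_R in the table above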


A Design of factorial experiments

Often there will be more than one factor of interest to the experimenter.

Definition: Experiments that involve more than one randomized or treatment factor are called factorial experiments.

In general, the number of treatments in a factorial experiment is the product of the numbers of levels of the treatment factors.

The disadvantage of this is that the number of treatments increases very quickly.

Given the number of treatments, the experiment could be laid out as

a Completely Randomized Design,
a Randomized Complete Block Design or
a Latin Square with that number of treatments.

The incomplete block designs, such as BIBDs or Youden Squares, are not suitable for factorial experiments.


a) Obtaining a layout for a factorial experiment in R

Layouts for factorial experiments can be obtained in R using the expressions for the chosen design when only a single factor is involved.

The difference with factorial experiments is that the several treatment factors are entered.

Their values can be generated using fac.gen:

fac.gen(generate, each=1, times=1, order="standard")

It is likely to be necessary to use either the each or times arguments to generate the replicate combinations.

The syntax of fac.gen and examples are given in Appendix B, Randomized layouts and sample size computations in R; a small sketch is also given below.

In Yates order, as opposed to standard order, the first factor changes fastest and the last slowest, whereas in standard order the first factor changes slowest and the last fastest.
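A minimal sketch of generating the treatment combinations for a 3 × 4 factorial with two replicates, assuming fac.gen() from Chris Brien's dae package behaves as described by the call above (the factor names A and B are illustrative):

library(dae)
trt <- fac.gen(generate = list(A = 3, B = 4), times = 2)   # 3 x 4 = 12 combinations, repeated twice
str(trt)   # a data frame with factors A (3 levels) and B (4 levels), 24 rows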


Summary of advantages of factorial experiments

To summarize, relative to one-factor-at-a-time experiments, factorial experiments have the advantages that:

1. if the factors interact, factorial experiments allow this to be detected and estimates of the interaction effect can be obtained, and
2. if the factors are independent, factorial experiments result in the estimation of the main effects with greater precision.


CRD Factorial

y_ijk = µ + α_i + β_j + τ_ij + ε_ijk

where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; α_i is the effect of the i-th level of factor A; β_j is the effect of the j-th level of factor B; τ_ij is the effect of the i-th level of A combined with the j-th level of B; ε_ijk is the experimental error associated with the (i,j,k)-th plot.

Considering a = 3, b = 4 and r = 2, a possible randomized layout is:

A2B1  A3B4  A1B1  A3B1
A2B4  A3B3  A1B3  A2B3
A2B2  A1B1  A3B2  A1B2
A1B2  A3B1  A2B2  A1B4
A2B1  A1B4  A3B2  A2B4
A1B3  A2B3  A3B3  A3B4
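A hedged sketch of the corresponding analysis in R, assuming a hypothetical data frame crd_dat with factors A and B and response y (names not from the notes):

summary(aov(y ~ A * B, data = crd_dat))   # main effects A and B plus the A:B interaction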


[Diagram: randomization for the CRD factorial. Unrandomized tier: Plots (abr plots, P with abr − 1 d.f.). Randomized tier: A (a − 1 d.f.), B (b − 1 d.f.) and A ∧ B giving A#B ((a − 1)(b − 1) d.f.). Corresponding contributions to the expected mean squares: σ²_p for Plots, and q_A(ψ), q_B(ψ), q_AB(ψ) for A, B and A#B.]


Source        d.f.             E(MSq)
Plots         rab − 1
  A           a − 1            σ²_p + q_A(ψ)
  B           b − 1            σ²_p + q_B(ψ)
  A#B         (a − 1)(b − 1)   σ²_p + q_AB(ψ)
  Residual    ab(r − 1)        σ²_p
Total         rab − 1


CRDB Factorial

yijk = µ+ γk + αi + βj + τij + εijk

where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; γ_k is the effect of the k-th block; α_i is the effect of the i-th level of factor A; β_j is the effect of the j-th level of factor B; τ_ij is the effect of the i-th level of A combined with the j-th level of B; ε_ijk is the experimental error associated with the (i,j,k)-th plot.

Block I:   A3B4  A1B2  A2B2  A1B3
           A2B4  A2B1  A1B1  A2B3
           A1B4  A3B1  A3B2  A3B3

Block II:  A2B3  A1B3  A3B1  A3B2
           A3B4  A2B1  A1B4  A2B2
           A1B1  A2B4  A1B2  A3B3


[Diagram: randomization for the factorial in randomized complete blocks. Unrandomized tier: Block (r blocks, Bl with r − 1 d.f.) and Plot ∧ Block (abr plots, P[Bl] with r(ab − 1) d.f.). Randomized tier: A (a − 1 d.f.), B (b − 1 d.f.) and A ∧ B giving A#B ((a − 1)(b − 1) d.f.). Corresponding contributions to the expected mean squares: σ²_PB + q_Bl(ψ) for Block, σ²_PB for Plot[Block], and q_A(ψ), q_B(ψ), q_AB(ψ) for A, B and A#B.]


Source           d.f.              E(MSq)
Block            r − 1             σ²_PB + q_Bl(ψ)
Plots[Blocks]    r(ab − 1)
  A              a − 1             σ²_PB + q_A(ψ)
  B              b − 1             σ²_PB + q_B(ψ)
  A#B            (a − 1)(b − 1)    σ²_PB + q_AB(ψ)
  Residual       (ab − 1)(r − 1)   σ²_PB
Total            rab − 1
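A hedged sketch of this analysis in R, assuming a hypothetical data frame rcbd_dat with factors Block, A, B and response y (names not from the notes):

summary(aov(y ~ Block + A * B, data = rcbd_dat))   # blocks fitted as a fixed term
## or, treating blocks as random:
## library(lme4); summary(lmer(y ~ A * B + (1 | Block), data = rcbd_dat))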


A Design of split-plot experiments

Designs in which some main effects are confounded with more variable units, such as large plots.

Their defining attribute is that there is randomization to two different physical entities, such that some main effects are randomized to the more variable entities.

Definition: The standard split-plot design is one in which two factors, say A and B with a and b levels respectively, are assigned as follows:

one of the factors, A say, is randomized according to an RCBD with say r blocks, and
each of its ra plots, called the main plots, is split into b subplots (or split-plots), and the levels of B are randomized independently to the subplots within each main plot.

Altogether the experiment involves n = rab subplots.

That is, the generic factor names for this design are Blocks, MainPlots, SubPlots, A and B.


Split-plot principle

A very flexible principle that can be used to generate a large number of different types of experiments.

For example, the main plots could be arranged in any of a CRD, RCBD, Latin square, BIBD or Youden Square;

each plot of the design is subdivided into subplots.

The subplots may utilize more complicated designs as well.

For example, the main plots may be arranged in an RCBD, each of which is subdivided in such a way as to allow a Latin Square to be placed in each main plot.

Also, subplots can be split into subsubplots and subsubplots into ...

Nor is one restricted to applying just one factor to each type of unit.

More than one factor can be randomized to main plots, more than one to subplots and so on.

The standard split-plot design is nearly the simplest possibility; only a CRD in the main plots would be simpler.


When to use a split-plot design

1. When the physical attributes of a factor require the use of larger units of experimental material than other factors.

For example, land preparation treatments usually require to be performed on larger areas of land than do the sowing of different varieties (due to the different pieces of equipment).
Temperature control for, say, storage purposes involves the use of relatively large chambers in which several samples can usually be stored.
Different processing runs are often of a minimum size such that their produce can be readily subdivided for the application of further treatments.
Also, some factors are relatively hard to change. For example, the temperature of a production operation is often difficult to change, so that it might be better to change it less often by making it a main-plot factor.


2. When it is desired to incorporate an additional factor into an experiment.

3. When it is expected that differences amongst the levels of certain factors are larger than amongst those of other factors.

The levels of the factors with larger differences are randomized to main plots.
One effect of this may be to increase the precision of comparisons between the levels of the other factors.

4. When it is desired to ensure greater precision for some factors than others.

Irrespective of the size of the differences between the main-plot treatment factors, it is desired to increase the precision of some factors by assigning them to subplots.
One may be less interested in main effects of some factors. A particular example of such factors is "noise" factors.


Notes

Note that the last two of these situations are utilising the anticipated greater variability of main plots relative to subplots.

That is, we are expecting the larger units to be more variable than the smaller units.
This will be expressed in the models and E[MSq]s for these experiments.

In describing the type of study, you need to identify the main plot and subplot design.


Ravioli data

Here we will illustrate the design with data from an evaluation of four commercial brands of ravioli by nine trained assessors (Guillermo Hough, DESA-ISETA, Argentina). The purpose of the study was to identify differences in taste and texture between the brands. Knowledge of such differences is of great commercial importance to food manufacturers, but difficult to obtain: these sensory characteristics must ultimately be assessed by the subjective impressions of a human observer, which vary among individuals, and over occasions in the same individual. However, if the subjective assessment of some aspect of taste or texture (such as saltiness or gumminess) is consistent, for a particular brand, among individuals and over occasions – that is, if the perceived differences between brands are statistically significant – it is safe to conclude that these differences are real. Differences among assessors are of less interest. Different individuals may simply be using different parts of the assessment scale to describe the same sensations: who can say whether food tastes saltier to you than it does to me? However, if there are significant interactions between brand and assessor – for example, if the assessor ANA consistently perceives Brand A as saltier than Brand B, whereas GUI consistently ranks these brands in the opposite order – this is of interest to the investigator.


CRD Split-plot

yijk = µ+ αi + eik + βj + τij + εijk

where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; α_i is the effect of the i-th level of factor A; e_ik is the experimental error associated with the i-th level of A on the k-th main plot (the main-plot error); β_j is the effect of the j-th level of factor B; τ_ij is the effect of the i-th level of A combined with the j-th level of B; ε_ijk is the experimental error associated with the (i,j,k)-th sub-plot.

Considering a = 3, b = 4 and r = 2 (two main plots for each level of A, each split into four subplots):

A3:  B1  B2  B4  B1
     B4  B3  B3  B2

A1:  B3  B4  B1  B2
     B2  B1  B3  B4

A2:  B1  B4  B2  B4
     B2  B3  B3  B1


[Diagram: randomization for the CRD split-plot. Unrandomized tier: Plots (ar main plots, P with ar − 1 d.f.) and Subplot ∧ Plot (abr subplots, S[P] with ar(b − 1) d.f.). Randomized tier: A (a − 1 d.f.), B (b − 1 d.f.) and A ∧ B giving A#B ((a − 1)(b − 1) d.f.). Corresponding contributions to the expected mean squares: σ² + bσ²_p for Plots, σ² for Subplot[Plot], and q_A(ψ), q_B(ψ), q_AB(ψ) for A, B and A#B.]


Source            d.f.               E(MSq)
Plots             ra − 1
  A               a − 1              σ² + bσ²_P + q_A(ψ)
  Residual        a(r − 1)           σ² + bσ²_P
Subplots[Plots]   ra(b − 1)
  B               b − 1              σ² + q_B(ψ)
  A#B             (a − 1)(b − 1)     σ² + q_AB(ψ)
  Residual        a(b − 1)(r − 1)    σ²
Total             rab − 1
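A hedged sketch of this analysis in R, assuming a hypothetical data frame sp_dat with factors mainplot, A, B and response y (names not from the notes); the Error() term puts A in the main-plot stratum and B and A#B in the subplot stratum:

summary(aov(y ~ A * B + Error(mainplot), data = sp_dat))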


yijk = µ+ γk + αi + eik + βj + τij + εijk

where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; γ_k is the effect of the k-th block; α_i is the effect of the i-th level of factor A; e_ik is the experimental error associated with the i-th level of A in the k-th block (the main-plot error); β_j is the effect of the j-th level of factor B; τ_ij is the effect of the i-th level of A combined with the j-th level of B; ε_ijk is the experimental error associated with the (i,j,k)-th sub-plot.

Considering a = 3, b = 4 and r = 2:

Block I:   A3:  B1  B2  B4  B3
           A1:  B4  B3  B1  B2
           A2:  B3  B4  B1  B2

Block II:  A2:  B1  B3  B4  B2
           A1:  B2  B3  B1  B4
           A3:  B3  B4  B2  B1


[Diagram: unrandomized and randomized tiers for the split-plot in randomized complete blocks. Unrandomized tier: Block (r blocks, Bl with r − 1 d.f.), Plot ∧ Block (ar main plots, P[Bl] with r(a − 1) d.f.) and Subplot ∧ Plot ∧ Block (abr subplots, S[P ∧ Bl] with ar(b − 1) d.f.). Randomized tier: A (a − 1 d.f.), B (b − 1 d.f.) and A ∧ B giving A#B ((a − 1)(b − 1) d.f.).]


[Diagram (continued): the corresponding expected mean square contributions – σ² + bσ²_p + q_Bl(ψ) for Block, σ² + bσ²_p for Plot[Block], σ² for Subplot[Plot ∧ Block], and q_A(ψ), q_B(ψ), q_AB(ψ) for A, B and A#B.]


Source            d.f.               E(MSq)                   F
Blocks            r − 1              σ² + bσ²_P + q_Bl(ψ)
Plots             r(a − 1)
  A               a − 1              σ² + bσ²_P + q_A(ψ)      MSqA / MSqResA
  Residual A      (a − 1)(r − 1)     σ² + bσ²_P
Subplots[Plots]   ra(b − 1)
  B               b − 1              σ² + q_B(ψ)              MSqB / MSqResB
  A#B             (a − 1)(b − 1)     σ² + q_AB(ψ)             MSqA#B / MSqResB
  Residual B      a(b − 1)(r − 1)    σ²
Total             rab − 1


If β_j ∼ N(0, σ²_B), the hypotheses of interest are

H01: σ²_AB = 0
H02: σ²_B = 0
H03: µ_A1 = µ_A2 = ... = µ_Aa = 0

Source            d.f.               E(MSq)                              F
Blocks            r − 1              σ² + bσ²_P + q_Bl(ψ)
Plots             r(a − 1)
  A               a − 1              σ² + bσ²_P + rσ²_AB + q_A(ψ)        F* (under H03)
  Residual A      (a − 1)(r − 1)     σ² + bσ²_P
Subplots[Plots]   ra(b − 1)
  B               b − 1              σ² + rσ²_AB + raσ²_B                MSqB / MSqA#B (under H02)
  A#B             (a − 1)(b − 1)     σ² + rσ²_AB                         MSqA#B / MSqResB (under H01)
  Residual B      a(b − 1)(r − 1)    σ²
Total             rab − 1

F*: F = (MSqA + MSqResB) / (MSqResA + MSqA#B) ∼ F_{ν1,ν2} approximately, with the Satterthwaite degrees of freedom

ν1 = (MSqA + MSqResB)² / [ MSqA²/(a − 1) + MSqResB²/(a(b − 1)(r − 1)) ]

and

ν2 = (MSqResA + MSqA#B)² / [ MSqResA²/((r − 1)(a − 1)) + MSqA#B²/((a − 1)(b − 1)) ]
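A small numeric sketch of this approximate test in R; the mean squares below are made-up illustrative values only, not results from any data set in these notes:

MSqA <- 950; MSqResA <- 120; MSqAB <- 200; MSqResB <- 80   # illustrative values only
a <- 3; b <- 4; r <- 2
F_A <- (MSqA + MSqResB) / (MSqResA + MSqAB)
nu1 <- (MSqA + MSqResB)^2 / (MSqA^2/(a - 1) + MSqResB^2/(a*(b - 1)*(r - 1)))
nu2 <- (MSqResA + MSqAB)^2 / (MSqResA^2/((r - 1)*(a - 1)) + MSqAB^2/((a - 1)*(b - 1)))
pf(F_A, nu1, nu2, lower.tail = FALSE)   # approximate p-value for H03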


The brands of ravioli were cooked, served into small dishes and presented hot to the assessors. Three replicate evaluations were made, each being completed on a single day; hence each day comprised a block. There may have been uncontrolled and unobserved variation from day to day in the cooking and serving conditions – for example, the temperature of the room may have changed. On each day, the order in which the four brands were presented to the assessors was randomised. However, on any given day, all the assessors received the brands in the same order: for this type of product it is complicated to randomise the order of presentation among assessors. Hence each presentation of a brand comprised a main plot: the brand varied only between presentations, but the whole set of assessors received the brand within each presentation. Each serving, in a single dish, comprised a sub-plot. During each presentation, the servings were shuffled before being taken to the assessors; thus the assessors were informally randomised over the sub-plots within each main plot. (It would have been cumbersome to follow a formal randomisation at this stage: it was more important to get the servings to the assessors while they were still hot.) Each assessor gave the serving presented to him or her a numerical score for saltiness.


Hierarchical classification or nested classification model

Experimental designs with hierarchical classification are frequently used in agricultural, genetic, industrial, medical and other types of research.

Cochran (1939) described a sampling scheme for estimating wheat production: samples of farms were selected from six districts; at the next stage, samples of fields were selected from each of the selected farms; at the final stage, measurements on the yield of wheat were obtained from sample "paths" in each of the selected fields.

For demographic, political and socioeconomic studies, samples of geographic regions, counties, districts and towns are selected in succession.

Similar procedures are employed in geographical studies on rock formation, mineral deposition and soil erosion.

These types of designs are also used in studies related to water and atmospheric pollution, and in environmental and ecological studies.

Nested designs can be balanced or unbalanced, the classification factors can be fixed or random, and the variances can be equal or unequal.


Further illustrations (Rao, 1997)

A number of applications of nested designs have appeared in the literature; for example:

Fabric differences: Tippett (1931) presents an experiment for examining the properties of four fabrics. Three tests were performed on each of the fabrics, and each test was repeated four times.

Blood pressure measurements: Canner et al. (1991) used data from the Hypertension Prevention Trial, conducted in the U.S. during 1983-1986, and a nested model to examine errors made in measuring the blood pressure of individuals. The effects of the participants in the study, their visits and the duplicate measurements on each visit were included in the model, and all of them were assumed to be random.

Eye examination: Rosner (1982) used a nested model to study items such as "intraocular pressures in persons". The model consists of groups, individuals within the groups, and the measurements on both eyes. The groups were considered fixed, and the remaining two factors random. For some items, the measurement on one of the eyes was missing. For other items, the condition being examined by the ophthalmologist existed in only one of the eyes.


Asparagus clones: For patenting asparagus clones, a plant producer used estimates of the means and variances of their important characteristics. For future experimentation, estimates of the variance components of the cladophylls, "the tiny leaves located on the asparagus branches", were also obtained (Trout, 1985). The study was conducted by selecting, in stages, five stalks from a clone, two branches from each stalk, five nodes on each branch, and three cladophylls from each node. Variance components were obtained from the lengths of the 150 cladophylls at the nodes.

Experimental drugs: Patients with certain diseases are hospitalized and administered suitable medical treatments. Experimental drugs can be examined by administering them to the patients receiving each of the treatments. In a split-plot experiment, the treatments and the drugs are considered to be the main-plot and sub-plot treatments, respectively.

Spectral density: Jackson and Lawton (1969) examined the consequences of estimating a spectral density through a nested classification.


Textile production: Bainbridge (1965) suggests a staggered design for detecting sources of variation occurring in industrial production, and illustrates it through a chemical test on a specific textile. From a large number of machines, two were selected on each of forty-two days; the sample from one machine was tested by two analysts on different shifts, one of them obtaining duplicate measurements. The sample from the second machine was analysed only once, by one analyst. The data from this experiment were used to study the variation arising from (1) changes in the raw material over the days, (2) differences between the machines, (3) long-term tests on the different shifts, and (4) short-term tests through the duplicate measurements. This four-stage design is unbalanced.

Animal breeding: In several animal breeding experiments, each of a sample of sires is randomly mated to a sample of dams. The observations on the offspring are analysed through the model for a nested design.


Calf birth weight

Example

In an animal breeding experiment, 20 unrelated cows were subjected to superovulation and artificial insemination. Each group of 4 cows was inseminated with a different sire, with a total of 5 unrelated sires. Out of each mating (combination of dam and sire), three calves were generated and their yearling weights were recorded.

there is no interest in the individual sires or dams, which depend strongly on the circumstances

the sire effect can be viewed as a sample from a random sire effect (levels are chosen at random from an infinite set of sire levels)

the dam effect can be viewed as a sample from a random dam effect (levels are chosen at random from an infinite set of dam levels)

interest lies in estimating the variances of the sire and dam effects as sources of random variation in the data

the three calves with the same parents share something, which presumably violates the assumption of independence


[Diagram] Nested structure of the design: each of the five sires is mated to four dams (S1 with D1–D4, ..., S5 with D17–D20), and each dam produces three calves.


                Dam 1              Dam 2              Dam 3              Dam 4
Sire      C1    C2    C3     C1    C2    C3     C1    C2    C3     C1    C2    C3
1        30.1  31.1  34.6   29.2  30.8  31.6   32.0  32.6  32.7   33.3  40.2  36.7
2        32.3  36.7  40.1   35.6  34.3  41.1   34.1  30.8  39.3   39.9  36.7  38.7
3        39.8  36.5  38.9   37.5  38.6  36.8   39.0  39.8  38.6   36.7  37.6  38.9
4        40.8  42.0  45.0   42.7  43.9  46.7   44.5  46.0  47.0   43.9  45.0  48.0
5        41.9  43.2  45.3   45.3  44.0  47.1   45.3  44.8  45.3   46.0  47.2  48.0


Hierarchical classification model

The three stages of the illustration can be represented by the model

y_ijk = μ + α_i + β_(i)j + ε_(ij)k

where μ is the grand mean, α_i is the effect of the i-th sire, β_(i)j is the effect of the j-th dam inseminated by the i-th sire, and ε_(ij)k is the effect of the k-th calf born from the j-th dam mated to the i-th sire. Assuming

α_i ∼ N(0, σ²_s), β_(i)j ∼ N(0, σ²_d) and ε_(ij)k ∼ N(0, σ²),

with α_i, β_(i)j and ε_(ij)k mutually independent, α_i independent of α_i′ (i ≠ i′), β_(i)j independent of β_(i′)j′ (i ≠ i′ and/or j ≠ j′), and ε_(ij)k independent of ε_(i′j′)k′ (i ≠ i′, j ≠ j′ and/or k ≠ k′), then

Var(Y_ijk) = Var(μ + α_i + β_(i)j + ε_(ij)k) = σ² + σ²_s + σ²_d

Cov(Y_ijk, Y_ijk′) = Cov(μ + α_i + β_(i)j + ε_(ij)k, μ + α_i + β_(i)j + ε_(ij)k′) = σ²_s + σ²_d   (observations from the same sire and the same dam)

Cov(Y_ijk, Y_ij′k′) = Cov(μ + α_i + β_(i)j + ε_(ij)k, μ + α_i + β_(i)j′ + ε_(ij′)k′) = σ²_s   (observations from the same sire and different dams)

Cov(Y_ijk, Y_i′j′k′) = Cov(μ + α_i + β_(i)j + ε_(ij)k, μ + α_i′ + β_(i′)j′ + ε_(i′j′)k′) = 0   (observations from different sires)


Considering 2 sires/2 dams/2 calves,

Y = X_G μ + Z_1 α + Z_2 β + ε

E(Y) = X_G μ

Var(Y) = Z G Z′ + Σ = Z_1 G_1 Z_1′ + Z_2 G_2 Z_2′ + Σ

Var(Y) = σ²_s I_2 ⊗ J_4 + σ²_d I_4 ⊗ J_2 + σ² I_8 =
[ V        0_{4×4} ]
[ 0_{4×4}  V       ]

where

V =
[ σ²_s + σ²_d + σ²   σ²_s + σ²_d        σ²_s               σ²_s              ]
[ σ²_s + σ²_d        σ²_s + σ²_d + σ²   σ²_s               σ²_s              ]
[ σ²_s               σ²_s               σ²_s + σ²_d + σ²   σ²_s + σ²_d       ]
[ σ²_s               σ²_s               σ²_s + σ²_d        σ²_s + σ²_d + σ²  ]

In this case:

Z_1 = [ 1_4      0_{4×1} ] = I_2 ⊗ 1_2 ⊗ 1_2,
      [ 0_{4×1}  1_4     ]

Z_2 = [ 1_2      0_{2×1}  0_{2×1}  0_{2×1} ]
      [ 0_{2×1}  1_2      0_{2×1}  0_{2×1} ] = I_2 ⊗ I_2 ⊗ 1_2,
      [ 0_{2×1}  0_{2×1}  1_2      0_{2×1} ]
      [ 0_{2×1}  0_{2×1}  0_{2×1}  1_2     ]

G_1 = σ²_s I_2,   G_2 = σ²_d I_4   and   Σ = σ² I_8
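
As an informal numerical check (not part of the slides), this Var(Y) can be assembled in R with Kronecker products and compared with the Z G Z′ + Σ form; the variance-component values below are placeholders.

# Sketch: build Var(Y) for 2 sires / 2 dams per sire / 2 calves per dam from the
# Kronecker form above and verify it equals Z1 G1 Z1' + Z2 G2 Z2' + Sigma.
sigma2_s <- 28; sigma2_d <- 1.3; sigma2 <- 4.8     # placeholder component values
I2 <- diag(2); I4 <- diag(4); I8 <- diag(8)
J4 <- matrix(1, 4, 4); J2 <- matrix(1, 2, 2)

V_kron <- sigma2_s * kronecker(I2, J4) + sigma2_d * kronecker(I4, J2) + sigma2 * I8
V_kron[1:4, 1:4]                       # the 4 x 4 block V; the off-diagonal blocks are zero

Z1 <- kronecker(I2, matrix(1, 4, 1))   # 8 x 2 design matrix for sires
Z2 <- kronecker(I4, matrix(1, 2, 1))   # 8 x 4 design matrix for dams within sires
V_ZGZ <- sigma2_s * Z1 %*% t(Z1) + sigma2_d * Z2 %*% t(Z2) + sigma2 * I8
all.equal(V_kron, V_ZGZ)               # TRUE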


For Sires, Dams and Calves random (I sires / J dams / K calves), the analysis of variance table is

Source        df            SSq          E(MSq)
Sire          I − 1         Y′Q_S Y      σ² + Kσ²_d + JKσ²_s
Dam[Sire]     I(J − 1)      Y′Q_D Y      σ² + Kσ²_d
Residual      IJ(K − 1)     Y′Q_Res Y    σ²

ANOVA estimators:

σ̂² = RMSq,   σ̂²_d = (DSMSq − RMSq) / K,   σ̂²_s = (SMSq − DSMSq) / (JK)
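
A small helper implementing these estimators is sketched below (not part of the original material); the function and argument names are invented, and the mean squares in the example call are placeholders.

# Sketch: ANOVA (method-of-moments) estimators for the balanced nested design
# with I sires, J dams per sire and K calves per dam. Names are illustrative.
anova_components_balanced <- function(SMSq, DSMSq, RMSq, J, K) {
  c(sigma2   = RMSq,                       # residual (calf) variance
    sigma2_d = (DSMSq - RMSq) / K,         # dam-within-sire variance
    sigma2_s = (SMSq - DSMSq) / (J * K))   # sire variance
}
anova_components_balanced(SMSq = 120, DSMSq = 30, RMSq = 10, J = 4, K = 3)  # placeholder values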


[Hasse diagrams] Structure, degrees of freedom and expected mean squares for the fully random nested design:

Degrees of freedom:
Mean: 1
Sire, S: I levels, I − 1 d.f.
Dam ∧ Sire, D[S]: IJ levels, I(J − 1) d.f.
Calf ∧ Dam ∧ Sire, C[D ∧ S]: IJK levels, IJ(K − 1) d.f.

Expected mean squares:
Sire, S: σ² + Kσ²_d + JKσ²_s
Dam ∧ Sire, D[S]: σ² + Kσ²_d
Calf ∧ Dam ∧ Sire, C[D ∧ S]: σ²


Calf birth weight

ANOVA table using R or SAS

Source        df    SSq     MSq    F        Prob
Sire           4    1356    339    71.22    < 0.01
Dam[Sire]     15     129      9     1.81    0.068
Residual      40     190      4.8

              MM      REML    ML
σ²_s          28      27      22
σ²_d           1.3     1.2     1.2
σ²             4.8     4.8     4.8
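
As a quick check (not in the original slides), the method-of-moments column follows from the rounded mean squares above: σ̂² = 4.8, σ̂²_d = (9 − 4.8)/3 ≈ 1.4 and σ̂²_s = (339 − 9)/(4 × 3) = 27.5, which agree with the tabled 4.8, 1.3 and 28 up to the rounding of the mean squares.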


SAS program

data calf_b;
input weight sire dam @@;
cards;
30.1 1 1  39.3 2 3  43.9 4 2
31.1 1 1  39.9 2 4  46.7 4 2
34.6 1 1  36.7 2 4  44.5 4 3
29.2 1 2  38.7 2 4  46.0 4 3
30.8 1 2  39.8 3 1  47.0 4 3
31.6 1 2  36.5 3 1  43.9 4 4
32.0 1 3  38.9 3 1  45.0 4 4
32.6 1 3  37.5 3 2  48.0 4 4
32.7 1 3  38.6 3 2  41.9 5 1
33.3 1 4  36.8 3 2  43.2 5 1
40.2 1 4  39.0 3 3  45.3 5 1
36.7 1 4  39.8 3 3  45.3 5 2
32.3 2 1  38.6 3 3  44.0 5 2
36.7 2 1  36.7 3 4  47.1 5 2
40.1 2 1  37.6 3 4  45.3 5 3
35.6 2 2  38.9 3 4  44.8 5 3
34.3 2 2  40.8 4 1  45.3 5 3
41.1 2 2  42.0 4 1  46.0 5 4
34.1 2 3  45.0 4 1  47.2 5 4
30.8 2 3  42.7 4 2  48.0 5 4
;

* Moment Method;
proc glm data=calf_b;
class sire dam;
model weight = sire dam(sire);
random sire dam(sire)/test;
run;


SAS program (cont.)

* Restricted Maximum Likelihood Method;
proc mixed data=calf_b;
class sire dam;
model weight = / solution;
random sire dam / solution G;
run;

* Maximum Likelihood Method;
proc mixed data=calf_b method=ML;
class sire dam;
model weight = / solution ddfm=sat;
random sire dam / solution G;
run;


R program

sire <- factor(rep(c(1,2,3,4,5), times=c(rep(12,5)))); sire
dam <- factor(rep(rep(c(1:4), each=3), times=5))
weight <- c(30.1, 31.1, 34.6, 29.2, 30.8, 31.6, 32.0, 32.6, 32.7, 33.3, 40.2, 36.7,
            32.3, 36.7, 40.1, 35.6, 34.3, 41.1, 34.1, 30.8, 39.3, 39.9, 36.7, 38.7,
            39.8, 36.5, 38.9, 37.5, 38.6, 36.8, 39.0, 39.8, 38.6, 36.7, 37.6, 38.9,
            40.8, 42.0, 45.0, 42.7, 43.9, 46.7, 44.5, 46.0, 47.0, 43.9, 45.0, 48.0,
            41.9, 43.2, 45.3, 45.3, 44.0, 47.1, 45.3, 44.8, 45.3, 46.0, 47.2, 48.0)

calf.dat <- data.frame(weight, sire, dam)

# Moment Method
calf.lm <- lm(weight ~ sire/dam, calf.dat)
anova(calf.lm)
(MSSire = anova(calf.lm)$Mean[1])
(MSDamdSire = anova(calf.lm)$Mean[2])
(MSRes = anova(calf.lm)$Mean[3])
(MSSire - MSDamdSire)/(3*4)   # sire variance component estimate
(MSDamdSire - MSRes)/3        # dam-within-sire variance component estimate

library(nlme)
# Maximum Likelihood Method
calf.ml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="ML")
summary(calf.ml, corr = F)
(summary(calf.ml)$sigma)^2

# Restricted Maximum Likelihood Method
require(nlme)
calf.reml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="REML")
summary(calf.reml, corr = F)
(summary(calf.reml)$sigma)^2   # sigma^2
names(summary(calf.reml))


Unbalanced data

The experiment just described is not common. In general, experiments of this type are unbalanced, as in the example in the following figure.

[Diagram] Unbalanced nested structure: sire S1 with dams D1 and D2; sire S2 with dams D3, D4 and D5; sire S3 with dams D6 and D7.


            Dam 1          Dam 2          Dam 3
Sire       C1    C2       C1    C2       C1    C2    C3
1         32.0  33.5     55.0
2         36.0           34.5  35.0     48.0  49.5  50.0
3         32.5  31.5     58.0  57.0


For Sires, Dams and Calves random (I sires / n_i dams / m_ij calves), the analysis of variance table is

Source        df              SSq          E(MSq)
Sire          I − 1           Y′Q_S Y      σ² + K2 σ²_d + K3 σ²_s
Dam[Sire]     Σ_i n_i − I     Y′Q_D Y      σ² + K1 σ²_d
Residual      N − Σ_i n_i     Y′Q_Res Y    σ²

where

K1 = [ N − Σ_i Σ_j m_ij² / m_i· ] / ( Σ_i n_i − I ),

K2 = [ Σ_i Σ_j m_ij² (1/m_i· − 1/N) ] / ( I − 1 ),

K3 = ( N − Σ_i m_i·² / N ) / ( I − 1 ),

with N = Σ_i Σ_j m_ij and m_i· = Σ_j m_ij.

ANOVA estimators:

σ̂² = RMSq,   σ̂²_d = (DSMSq − RMSq) / K1,   σ̂²_s = (SMSq − K2 σ̂²_d − σ̂²) / K3
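
A generic helper for the coefficients K1, K2 and K3 is sketched below (not from the slides); the function name is invented, and the example call uses the calf-per-dam counts of the unbalanced example.

# Sketch: coefficients K1, K2, K3 for the unbalanced nested ANOVA.
# 'm' is a list with one numeric vector per sire giving the number of calves per dam.
nested_anova_coefs <- function(m) {
  N   <- sum(unlist(m))                       # total number of calves
  I   <- length(m)                            # number of sires
  n_i <- sapply(m, length)                    # dams per sire
  m_i <- sapply(m, sum)                       # calves per sire
  K1  <- (N - sum(sapply(m, function(x) sum(x^2) / sum(x)))) / (sum(n_i) - I)
  K2  <- sum(sapply(m, function(x) sum(x^2) * (1 / sum(x) - 1 / N))) / (I - 1)
  K3  <- (N - sum(m_i^2) / N) / (I - 1)
  c(K1 = K1, K2 = K2, K3 = K3)
}
# Unbalanced calf data: sire 1 has dams with 2 and 1 calves, and so on.
nested_anova_coefs(list(c(2, 1), c(1, 2, 3), c(2, 2)))   # K1 = 1.75, K2 = 1.96, K3 = 4.15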


Calf birth weight

ANOVA table using R or SAS

Source        df    SSq        MSq       F        Prob
Sire           2      37.25     18.63    25.30    < 0.01
Dam[Sire]      4    1275.33    318.83   433.13    < 0.01
Residual       6       4.42      0.74

K1 = 1.75, K2 = 1.96, K3 = 4.15

              MM         ML             REML
σ²_s        −81.53       0.00000012     0.00000023
σ²_d        181.77     104.29         121.75
σ²            0.74       0.74           0.74

Negative component of variance
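
Plugging the mean squares and coefficients above into the estimators (a check, not part of the slides): σ̂² = 0.74, σ̂²_d = (318.83 − 0.74)/1.75 = 181.77 and σ̂²_s = (18.63 − 0.74 − 1.96 × 181.77)/4.15 ≈ −81.5, which is the negative method-of-moments estimate in the table; the ML and REML estimates are constrained to be non-negative and come out essentially zero.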


SAS program

data calf_unb;

input sire dam weight;

cards;

1 1 32.0

1 1 33.5

1 2 55.0

2 1 36.0

2 2 34.5

2 2 35.0

2 3 48.0

2 3 49.5

2 3 50.0

3 1 32.5

3 1 31.5

3 2 58.0

3 2 57.0

;

* Moment Method;

proc glm data=calf_unb;

class sire dam;

model weight = sire dam(sire);

random sire dam(sire)/test;

run;

* Restricted Maximum Likelihood Method;

proc mixed data=calf_unb;

class sire dam;

model weight = / solution ;

random sire dam / solution G;

run;

* Maximum Likelihood Method;

proc mixed data=calf_unb method=ML;

class sire dam;

model weight = / solution ddfm=sat;

random sire dam / solution G;

run;


R program

sire <- factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3))

dam <- factor(c(1,1,2,1,2,2,3,3,3,1,1,2,2))

weight <- c(32.0,33.5,55.0,36.0,34.5,35.0,48.0,49.5,50.0,32.5,31.5,58.0,57.0)

(calf.dat<- data.frame(sire, dam, weight))

# Moment Method

calf.lm <- lm(weight ~ sire/dam, calf.dat)

anova(calf.lm)

(SireMSq = anova(calf.lm)$Mean[1])

(DamdSireMSq = anova(calf.lm)$Mean[2])

(ResMSq = anova(calf.lm)$Mean[3])

(k1 <- (13-(2^2+1^2)/3-(1^2+2^2+3^2)/6-(2^2+2^2)/4)/((2+3+2)-3))

(k2 <- ((2^2+1^2)*(1/3-1/13)+(1^2+2^2+3^2)*(1/6-1/13)+(2^2+2^2)*(1/4-1/13))/(3-1))

(k3 <- (13-(3^2+6^2+4^2)/13)/(3-1))

(sigma2D <- (DamdSireMSq - ResMSq)/k1) # sigmaD^2_hat

(sigma2S <- (SireMSq - ResMSq - k2*sigma2D)/ k3) # sigmaS^2_hat

# Restricted Maximum Likelihood Method

require(nlme)

calf.reml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="REML")

summary(calf.reml,corr = F)

(summary(calf.reml)$sigma)^2 # sigma^2

# Maximum Likelihood Method

calf.ml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="ML")

summary(calf.ml,corr = F)

(summary(calf.ml)$sigma)^2


Enzyme lipase data

Lipase is an enzyme used for certain types of medical diagnosis, and the different stages of its preparation can affect the required specifications.

Three laboratories preparing the enzyme were randomly selected for the experiment.

Four weeks were randomly assigned for each of the laboratories.

Measurements were obtained in the mornings and evenings of the sample days chosen from the selected weeks.

Only the averages for the days are presented in the following table; 450 is subtracted from each of the averages.


                     Laboratory
Week       1                   2                  3
1          43.4, 46.2, 46.5    7.0, 7.8, 15.7     22.4, 15.5, 29.7
2          37.0, 16.6          32.4, 16.8         25.4, 23.1
3          23.6, 33.6          13.4, 9.6          22.9, 0.6
4          51.0, 52.4          23.9, 19.3         18.4, 3.7

ANOVA table using R or SAS

Source              df     SSq        MSq
Laboratory           2     2874.03    1437.02
Week[Laboratory]     9     1625.45     180.61
Residual            15      910.82      60.72
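
The nested model for these data can be fitted in the same way as the calf example; the sketch below is not part of the slides, enters the table laboratory by laboratory, and uses illustrative object names.

# Sketch: enzyme averages (450 already subtracted), nested as week within laboratory.
library(nlme)
y <- c(43.4, 46.2, 46.5, 37.0, 16.6, 23.6, 33.6, 51.0, 52.4,   # laboratory 1, weeks 1-4
        7.0,  7.8, 15.7, 32.4, 16.8, 13.4,  9.6, 23.9, 19.3,   # laboratory 2, weeks 1-4
       22.4, 15.5, 29.7, 25.4, 23.1, 22.9,  0.6, 18.4,  3.7)   # laboratory 3, weeks 1-4
lab  <- factor(rep(1:3, each = 9))
week <- factor(rep(c(1, 1, 1, 2, 2, 3, 3, 4, 4), times = 3))
lipase.dat <- data.frame(y, lab, week)

anova(lm(y ~ lab/week, lipase.dat))    # sums of squares for the nested classification
lipase.reml <- lme(y ~ 1, random = ~ 1 | lab/week, lipase.dat, method = "REML")
summary(lipase.reml)                   # REML variance components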


Unequal variances and sample sizes

For this general case, the variances of β_(i)j are unequal and can be denoted by σ²_i. The variances of ε_(ij)k, denoted by σ²_ij, can also be unequal. Equality of the σ²_i, and of the σ²_ij, is usually assumed; in some practical situations these assumptions may not be valid.

Var(Y) = [ V1        0_{4×4} ]
         [ 0_{4×4}   V2      ]

where

V1 =
[ σ²_s + σ²_1 + σ²_11   σ²_s + σ²_1           σ²_s                  σ²_s                ]
[ σ²_s + σ²_1           σ²_s + σ²_1 + σ²_11   σ²_s                  σ²_s                ]
[ σ²_s                  σ²_s                  σ²_s + σ²_1 + σ²_12   σ²_s + σ²_1         ]
[ σ²_s                  σ²_s                  σ²_s + σ²_1           σ²_s + σ²_1 + σ²_12 ]

V2 =
[ σ²_s + σ²_2 + σ²_21   σ²_s + σ²_2           σ²_s                  σ²_s                ]
[ σ²_s + σ²_2           σ²_s + σ²_2 + σ²_21   σ²_s                  σ²_s                ]
[ σ²_s                  σ²_s                  σ²_s + σ²_2 + σ²_22   σ²_s + σ²_2         ]
[ σ²_s                  σ²_s                  σ²_s + σ²_2           σ²_s + σ²_2 + σ²_22 ]
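
A small numerical sketch of this structure (not from the slides); the values of σ²_s, σ²_1, σ²_11 and σ²_12 are placeholders.

# Sketch: covariance block V1 for one sire with two dams and two calves per dam,
# allowing the dam variance sigma2_1 and the calf variances sigma2_11, sigma2_12 to differ.
sigma2_s <- 28; sigma2_1 <- 1.5; sigma2_11 <- 4.0; sigma2_12 <- 6.0   # placeholder values
J2 <- matrix(1, 2, 2)
V1 <- sigma2_s * matrix(1, 4, 4) +                        # common sire component everywhere
      sigma2_1 * kronecker(diag(2), J2) +                 # dam component in within-dam blocks
      diag(rep(c(sigma2_11, sigma2_12), each = 2))        # calf (error) variances per dam
V1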
