Singularity in AI matrix when fitting ANIMAL model
Question
I want to estimate the heritability of fish body weight but get the ASReml error
Singularity in Average Information Matrix
The ASReml manual suggests I need to modify the model, but the model is very simple. How can I get the heritability of body weight?
My ASReml job is
Data analysis for Flounder
 animal !A !P
 sire !A !P
 dam !A !P
 tank * !I
 age
 bl
 wt
yaping.dat !ALPHA !SKIP 1 !MAKE
yaping.dat !SKIP 1 !MAXIT 500
wt ~ mu age !r animal
Note that !A is not required alongside !P: the !ALPHA qualifier on the pedigree file line
already declares that the identity fields are alphanumeric.
Answer
This is a common problem which arises because of the nature of the animal model.
What is happening?
Looking at the iteration summary we see
1 LogL=-3727.99 S2= 105.60 1298 df 0.1000 1.000
2 LogL=-3698.00 S2= 99.129 1298 df 0.1296 1.000
3 LogL=-3644.01 S2= 86.829 1298 df 0.2218 1.000
4 LogL=-3594.78 S2= 72.626 1298 df 0.4400 1.000
5 LogL=-3560.18 S2= 54.273 1298 df 1.062 1.000
6 LogL=-3545.76 S2= 35.421 1298 df 2.550 1.000
7 LogL=-3540.01 S2= 18.033 1298 df 6.809 1.000
8 LogL=-3537.88 S2= 5.9630 1298 df 24.47 1.000
9 LogL=-3537.25 S2= 0.84544 1298 df : 1 components restrained
10 LogL=-3537.16 S2= 0.53997E-01 1298 df : 1 components restrained
11 LogL=-3537.15 S2= 0.34172E-02 1298 df 0.4607E+05 1.000
Notice that the residual variance (S2) shrinks as the variance ratio explodes.
The job fails because the residual variance has become too small.
The singularity in the AI matrix did not appear in the first iteration, so the problem
is not structural (a common cause of this message) but data dependent.
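One way to read the iteration summary is to multiply the variance ratio on each line by the residual variance S2 on the same line, which gives the implied animal variance component. The Python sketch below does this for the iterations that report a ratio. The values are copied from the summary above; reading the second-to-last column as the animal variance ratio is an assumption about the output layout, and the function name is only illustrative.

```python
# Iteration summary values copied from the ASReml output above:
# (iteration, S2 = residual variance, animal variance ratio).
# Iterations 9 and 10 are omitted because no ratio was printed for them.
iterations = [
    (1, 105.60, 0.1000),
    (2, 99.129, 0.1296),
    (3, 86.829, 0.2218),
    (4, 72.626, 0.4400),
    (5, 54.273, 1.062),
    (6, 35.421, 2.550),
    (7, 18.033, 6.809),
    (8, 5.9630, 24.47),
    (11, 0.34172e-2, 0.4607e5),
]


def implied_components(rows):
    """Return (iteration, residual, animal component), component = ratio * S2."""
    return [(it, s2, ratio * s2) for it, s2, ratio in rows]


components = implied_components(iterations)
for it, s2, animal in components:
    print(f"iter {it:2d}: residual = {s2:10.5f}  animal = {animal:8.2f}")
```

Under this reading the animal component keeps growing, to roughly 157 by iteration 11, while the residual heads toward zero.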
Why is it happening?
The summary of the structure of the pedigree (given in ASReml 3) is
1339 identities in the pedigree over 1 generations
Sires  SiresofSire  DamsofSire  Dams  SiresofDam  DamsofDam
   26            0           0    13           0          0
There is no pedigree on the parents, and it looks like there are 26 families.
Sire and dam were then redefined as ordinary factors:
 animal !P
 sire !A
 dam !A
With these definitions, tabulate wt ~ sire dam confirms that there are 13 dams with 2 sires per dam.
Fitting the model wt ~ mu age !r sire dam, the model converges to give
10 LogL=-3534.80 S2= 77.627 1298 df 0.6284 1.006 1.000
- - - Results from analysis of wt - - -
Approximate stratum variance decomposition
Stratum             Degrees-Freedom   Variance      Component Coefficients
dam                       11.17        11412.2      128.1    65.2     1.0
sire                      12.88        3545.87        0.0    44.4     1.0
Residual Variance       1273.94        77.6266        0.0     0.0     1.0
Source       Model  terms     Gamma     Component    Comp/SE   % C
dam             13     13  0.628358       48.7773       1.19   0 P
sire            26     26   1.00558       78.0599       2.48   0 P
Variance      1300   1298   1.00000       77.6266      25.24   0 P
Wald F statistics
  Source of Variation   NumDF   DenDF    F_inc    Prob
8 mu                        1    11.0    71.29   <.001
5 age                       1    13.8     0.34   0.568
Fitting just sire gives
11 LogL=-3535.74 S2= 77.624 1298 df 1.586 1.000
Final parameter values 1.5864 1.0000
- - - Results from analysis of wt - - -
Source       Model  terms     Gamma     Component    Comp/SE   % C
sire            26     26   1.58636       123.139       3.42   0 P
Variance      1300   1298   1.00000       77.6240      25.24   0 P
which is almost the same LogL (-3535.74 against -3534.80). In the sire + dam model, the actual sire variance is 126.84 (48.78 + 78.06)
and the covariance between families with the same dam is 48.78. Assuming no covariance between families with the same dam,
the sire variance is 123.14.
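To put a number on "almost the same LogL", the two REML fits can be compared with a likelihood-ratio statistic, which is legitimate here because both models have the same fixed effects. The sketch below is illustrative only: the function name is made up, and halving the chi-square(1) tail probability (because the dam variance is tested on the boundary of its parameter space) is a common convention that is assumed rather than taken from the ASReml output.

```python
import math

# REML log-likelihoods copied from the two fits above (same fixed effects).
logl_sire_dam = -3534.80   # wt ~ mu age !r sire dam
logl_sire = -3535.74       # wt ~ mu age !r sire


def reml_lrt(logl_full, logl_reduced):
    """Likelihood-ratio statistic and approximate p-value for dropping
    one variance component.

    Assumption: because a variance cannot be negative, the reference
    distribution is taken as a 50:50 mixture of chi-square(0) and
    chi-square(1), so the chi-square(1) tail probability is halved.
    """
    stat = 2.0 * (logl_full - logl_reduced)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_chi1 = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, 0.5 * p_chi1


lrt_stat, lrt_p = reml_lrt(logl_sire_dam, logl_sire)
print(f"LRT statistic = {lrt_stat:.2f}, approximate p = {lrt_p:.3f}")
```

The statistic is 2 x 0.94 = 1.88, which is small for a single variance component.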
The animal model is based on the genetic assumption that the sire variance represents $0.25\sigma^2_A$ and the residual
represents $\sigma^2_E + 0.75\sigma^2_A$. This gives $\sigma^2_A = 4 \times 123.14 = 492.56$, so $0.75\sigma^2_A = 369.42$ and
$\sigma^2_E = 77.62 - 369.42 = -291.80$. The animal model falls over because ASReml can't estimate a negative residual variance
directly.
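The arithmetic above can be checked with a few lines of Python. This is a minimal sketch under the same genetic assumption (sire variance = 0.25 x additive variance, residual = environmental variance + 0.75 x additive variance). The function and variable names are illustrative, and the heritability line simply divides the additive variance by the phenotypic variance (sire + residual).

```python
# Estimates copied from the sire-only fit above.
sire_var = 123.139     # sire variance component
resid_var = 77.6240    # residual (within-family) variance


def animal_from_sire(sire_var, resid_var):
    """Translate sire-model components into animal-model parameters,
    assuming sire variance = 0.25 * sigma2_A and
    residual = sigma2_E + 0.75 * sigma2_A."""
    sigma2_a = 4.0 * sire_var
    sigma2_e = resid_var - 0.75 * sigma2_a
    # Phenotypic variance is sire + residual (= sigma2_A + sigma2_E).
    h2 = sigma2_a / (sire_var + resid_var)
    return sigma2_a, sigma2_e, h2


sigma2_a, sigma2_e, h2 = animal_from_sire(sire_var, resid_var)
print(f"sigma2_A = {sigma2_a:.2f}, sigma2_E = {sigma2_e:.2f}, h2 = {h2:.2f}")
```

The implied environmental variance is negative and the implied heritability is well above 1, which is another way of seeing why the animal model cannot be fitted to these data as they stand.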
What if the genetic variance is just the dam component?
I am not very familiar with fish experiments, but I understand that families are sometimes raised in separate tubs.
Since families are represented by sires in this case, if the families were raised in separate tubs then the sire variance
(in the sire + dam model) is tub variance and the dam variance is primarily genetic.
Under this scenario, sire is analogous to 'maternal environment' effects in animal experiments.
I have therefore fitted sire + dam on the log scale, with terms to allow for 6 outliers, using the model
log(wt) ~ mu age out(53) out(301) out(302) out(342) out(1004) out(631) !r sire dam
This gives
Source       Model  terms     Gamma     Component    Comp/SE   % C
dam             13     13  0.747327      0.104939       1.09   0 P
sire            26     26   1.40185      0.196846       2.50   0 P
Variance      1300   1292   1.00000      0.140419      25.18   0 P
The dam variance is still too large to give a plausible heritability, though.
That is, the residual still goes to the boundary in the 'sire + animal' model
because the dam component exceeds one third of the residual variance: 0.104939 > 0.140419/3 = 0.046806.
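The same check can be written out for this scenario. The Python sketch below assumes that the dam component is 0.25 x additive variance, that the sire component is purely tub (family environment) variance, and that the residual is environmental variance + 0.75 x additive variance. The names are illustrative and the components are copied from the log(wt) fit above.

```python
# Components copied from the log(wt) sire + dam fit above.
dam_var = 0.104939     # read here as 0.25 * additive genetic variance
tub_var = 0.196846     # sire component, read here as tub variance
resid_var = 0.140419   # residual variance


def dam_as_genetic(dam_var, tub_var, resid_var):
    """Implied animal-model parameters when dam carries the genetics.

    Assumes dam variance = 0.25 * sigma2_A and
    residual = sigma2_E + 0.75 * sigma2_A, so sigma2_E = residual - 3 * dam.
    """
    sigma2_a = 4.0 * dam_var
    sigma2_e = resid_var - 3.0 * dam_var
    # Phenotypic variance is the sum of the three fitted components.
    h2 = sigma2_a / (dam_var + tub_var + resid_var)
    return sigma2_a, sigma2_e, h2


sigma2_a, sigma2_e, h2 = dam_as_genetic(dam_var, tub_var, resid_var)
print(f"sigma2_A = {sigma2_a:.4f}, sigma2_E = {sigma2_e:.4f}, h2 = {h2:.2f}")
```

The implied environmental variance is again negative, and the implied heritability is close to 1, which matches the statement that the residual goes to the boundary.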
The bottom line is that there appear to be family effects over and above simple genetic effects.
You may need to replicate at the family level so that the variance can be partitioned better.
What else is happening?
A plot of the residuals against fitted values shows a few (at least 5) fish that
are very large relative to their full sibs. Additionally, there is a general fanning
of the residuals (but age was not significant) so that heavier families are more variable.
This suggests a sqrt transformation might be in order (after fixing outliers).
Plot of Residuals [-24.7561, 63.3026] vs Fitted values [6.6769, 43.7587]
(character plot from the ASReml output; column alignment not preserved)
10 November 2008