# Singularity in AI matrix when fitting ANIMAL model

## Question

I want to estimate the heritability of fish body weight, but I get the ASReml error "Singularity in Average Information Matrix". The ASReml manual suggests I need to modify the model, but the model is very simple. How can I get the heritability of body weight? My ASReml job is
```
 Data analysis for Flounder
animal !A !P
sire !A !P
dam !A !P
tank * !I
age
bl
wt
yaping.dat !ALPHA !SKIP 1 !MAKE
yaping.dat !SKIP 1 !MAXIT 500
wt ~ mu age !r animal
```
Note that !A is not required with !P since the fact that the fields are alphanumeric is declared by the !ALPHA qualifier on the pedigree file line.

This is a common problem which arises because of the nature of the animal model.

#### What is happening?

Looking at the iteration summary we see
```
  1 LogL=-3727.99     S2=  105.60       1298 df   0.1000      1.000
  2 LogL=-3698.00     S2=  99.129       1298 df   0.1296      1.000
  3 LogL=-3644.01     S2=  86.829       1298 df   0.2218      1.000
  4 LogL=-3594.78     S2=  72.626       1298 df   0.4400      1.000
  5 LogL=-3560.18     S2=  54.273       1298 df    1.062      1.000
  6 LogL=-3545.76     S2=  35.421       1298 df    2.550      1.000
  7 LogL=-3540.01     S2=  18.033       1298 df    6.809      1.000
  8 LogL=-3537.88     S2=  5.9630       1298 df    24.47      1.000
  9 LogL=-3537.25     S2= 0.84544       1298 df    :   1 components restrained
 10 LogL=-3537.16     S2= 0.53997E-01   1298 df    :   1 components restrained
 11 LogL=-3537.15     S2= 0.34172E-02   1298 df   0.4607E+05  1.000
```
Notice that the residual variance (S2) shrinks towards zero as the variance ratio explodes. The job fails because the residual variance has become too small.
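To see what the last two columns imply, the sketch below multiplies each iteration's residual variance (S2) by the animal variance ratio (called `gamma` here) to recover the animal variance and the implied heritability gamma/(1+gamma). The values are copied from the log above; iterations 9 and 10 are omitted because the log does not print a ratio for them. The variable and function names are illustrative and are not ASReml output.

```python
# (iteration, S2, gamma) copied from the iteration summary above.
# Iterations 9 and 10 are left out: the log shows "1 components restrained"
# instead of a ratio for them.
iterations = [
    (1, 105.60, 0.1000),
    (2, 99.129, 0.1296),
    (3, 86.829, 0.2218),
    (4, 72.626, 0.4400),
    (5, 54.273, 1.062),
    (6, 35.421, 2.550),
    (7, 18.033, 6.809),
    (8, 5.9630, 24.47),
    (11, 0.34172e-2, 0.4607e5),
]


def implied_h2(gamma):
    """Heritability implied by an animal/residual variance ratio."""
    return gamma / (1.0 + gamma)


# One row per iteration: (iteration, animal variance, implied heritability).
rows = [(it, s2 * g, implied_h2(g)) for it, s2, g in iterations]

for it, animal_var, h2 in rows:
    print(f"iter {it:2d}: animal variance = {animal_var:8.2f}, implied h2 = {h2:.4f}")
```

The animal variance climbs at every step while the implied heritability approaches 1, which is the fit trying to assign essentially all of the variance to the animal term.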

The singularity in the AI matrix did not appear in the first iteration, so the problem is not structural (a common cause of this message) but data dependent.

#### Why is it happening?

The summary of the structure of the pedigree (given in ASReml 3) is
```
     1339 identities in the pedigree over 1 generations
       Sires SiresofSire  DamsofSire        Dams  SiresofDam   DamsofDam
          26           0           0          13           0           0
```
There is no pedigree on the parents, and it looks like there are 26 families. After defining sire and dam as
```
 animal  !P
sire !A
dam !A
```
tabulating with tabulate wt ~ sire dam confirms that there are 13 dams, each mated to 2 sires.
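The same check can be made outside ASReml. The following is a minimal sketch in Python, not part of the original analysis: it counts the distinct sires mated to each dam from (sire, dam) pairs. The identifiers in `records` are made up for illustration; in practice they would come from the sire and dam columns of the data file.

```python
from collections import defaultdict


def sires_per_dam(records):
    """Map each dam to the set of distinct sires it was mated to."""
    mates = defaultdict(set)
    for sire, dam in records:
        mates[dam].add(sire)
    return dict(mates)


# Hypothetical (sire, dam) pairs, one per fish; these are not the flounder data.
records = [
    ("S01", "D01"), ("S01", "D01"), ("S02", "D01"),
    ("S03", "D02"), ("S04", "D02"), ("S04", "D02"),
]

mates = sires_per_dam(records)
for dam in sorted(mates):
    print(dam, len(mates[dam]), sorted(mates[dam]))
```

Each sire by dam combination that appears is one full-sib family, so the number of families is the sum of the set sizes.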

The model wt ~ mu age !r sire dam converges to give
```
  10 LogL=-3534.80     S2=  77.627       1298 df   0.6284      1.006      1.000

- - - Results from analysis of wt - - -

Approximate stratum variance decomposition
Stratum     Degrees-Freedom   Variance      Component Coefficients
dam                   11.17    11412.2       128.1    65.2     1.0
sire                  12.88    3545.87         0.0    44.4     1.0
Residual Variance   1273.94    77.6266         0.0     0.0     1.0

Source                Model  terms     Gamma     Component    Comp/SE   % C
dam                      13     13  0.628358       48.7773       1.19   0 P
sire                     26     26   1.00558       78.0599       2.48   0 P
Variance               1300   1298   1.00000       77.6266      25.24   0 P

Wald F statistics
Source of Variation           NumDF     DenDF    Fic             Prob
8 mu                                1      11.0    71.29            <.001
5 age                               1      13.8     0.34            0.568
```
Fitting just sire gives
```
  11 LogL=-3535.74     S2=  77.624       1298 df    1.586      1.000
Final parameter values                        1.5864     1.0000

- - - Results from analysis of wt - - -

Source                Model  terms     Gamma     Component    Comp/SE   % C
sire                     26     26   1.58636       123.139       3.42   0 P
Variance               1300   1298   1.00000       77.6240      25.24   0 P
```
which is almost the same LogL. In the sire + dam model, the actual sire variance is 126.84 (48.78 + 78.06) and the covariance between families with the same dam is 48.78. Assuming no covariance between families with the same dam, the sire variance is 123.14.
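One way to put a number on "almost the same" is a REML likelihood-ratio test for dropping dam, which is reasonable here because both fits have the same fixed effects (mu and age). The sketch below is an addition, not part of the original reply. It assumes the usual boundary correction, in which the p-value is half the tail probability of a chi-square with 1 degree of freedom, and the function name is hypothetical.

```python
import math


def boundary_lrt_pvalue(logl_full, logl_reduced):
    """P-value for dropping one variance component tested on its boundary.

    Assumes the reference distribution is a 50:50 mixture of chi-square(0)
    and chi-square(1), so the chi-square(1) tail probability is halved.
    """
    stat = 2.0 * (logl_full - logl_reduced)
    if stat <= 0.0:
        return 1.0
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))


logl_sire_dam = -3534.80  # sire + dam model, from the output above
logl_sire = -3535.74      # sire-only model, from the output above

lrt_stat = 2.0 * (logl_sire_dam - logl_sire)
p_value = boundary_lrt_pvalue(logl_sire_dam, logl_sire)
print(f"LRT statistic = {lrt_stat:.2f}, p = {p_value:.3f}")
```

The statistic is only 1.88, so under these assumptions the data give weak support for a separate dam component.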

The animal model is based on the genetic assumption that the sire variance represents $0.25\,\sigma^2_A$ and the residual represents $\sigma^2_E + 0.75\,\sigma^2_A$. This gives $\sigma^2_A = 4 \times 123.14 = 492.56$ and $\sigma^2_E = 77.62 - 0.75 \times 492.56 = 77.62 - 369.42 = -291.80$. The animal model falls over because ASReml can't estimate a negative residual variance directly.
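The arithmetic above can be checked with a short sketch. It assumes, as in the text, that the sire component is one quarter of the additive variance and that the sire-model residual contains the remaining three quarters. The function name is illustrative.

```python
def animal_model_components(sire_var, residual_var):
    """Convert sire-model components to the animal-model parameterisation.

    Assumes sire_var = 0.25 * sigma2_a and
    residual_var = sigma2_e + 0.75 * sigma2_a.
    """
    sigma2_a = 4.0 * sire_var
    sigma2_e = residual_var - 0.75 * sigma2_a
    return sigma2_a, sigma2_e


# Components from the sire-only fit reported above.
sigma2_a, sigma2_e = animal_model_components(sire_var=123.139, residual_var=77.6240)
print(f"sigma2_A = {sigma2_a:.2f}, sigma2_E = {sigma2_e:.2f}")
```

Any sire component larger than one third of the sire-model residual gives a negative environmental variance under this parameterisation, which is the situation here.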

#### What if the genetic variance is just the dam component?

I am not very familiar with fish experiments, but I understand that families are sometimes raised in different tubs. Since families are represented by sires in this case, if these families were raised in different tubs, then the sire variance is tub variance (in the sire + dam model) and the dam variance is primarily genetic. Under this scenario, sire is analogous to 'maternal environment' effects in animal experiments.

I have therefore fitted sire + dam on the log scale, allowing for 6 outliers (see below), with the model
```
 log(wt) ~ mu age out(53) out(301) out(302) out(342) out(1004) out(631) !r sire dam
```
This gives
```
 Source                Model  terms     Gamma     Component    Comp/SE   % C
dam                      13     13  0.747327      0.104939       1.09   0 P
sire                     26     26   1.40185      0.196846       2.50   0 P
Variance               1300   1292   1.00000      0.140419      25.18   0 P
```
The dam variance is still too large to give a plausible heritability, though. That is, the residual still goes to the boundary in the 'sire + animal' model because 0.104939 > 0.140419/3.
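The inequality is the quarter/three-quarter argument applied to the dam component: if the dam variance is one quarter of the additive variance, the environmental variance is the residual minus three times the dam variance, which is negative whenever the dam variance exceeds one third of the residual. A minimal sketch, with illustrative names, using the log-scale components above:

```python
dam_var = 0.104939       # dam component on the log scale, from the table above
residual_var = 0.140419  # residual variance on the log scale

# Largest dam variance compatible with a non-negative environmental variance,
# assuming dam_var = 0.25 * sigma2_a and residual_var = sigma2_e + 0.75 * sigma2_a.
threshold = residual_var / 3.0
sigma2_e = residual_var - 3.0 * dam_var

print(f"threshold = {threshold:.6f}, dam variance = {dam_var:.6f}")
print(f"implied sigma2_E = {sigma2_e:.6f}")
print("residual hits the boundary:", dam_var > threshold)
```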

The bottom line is that there appear to be family effects over and above simple genetic effects. Maybe you need to replicate at the family level so that you can partition the variance better.

#### What else is happening?

A plot of the residuals against fitted values shows a few (at least 5) fish that are very large relative to their full sibs. Additionally, there is a general fanning of the residuals (but age was not significant) so that heavier families are more variable. This suggests a sqrt transformation might be in order (after fixing outliers).
```
Plot of Residuals [  -24.7561   63.3026] vs Fitted values [    6.6769   43.7587] RE11
------------1-----------------------------------------------
.                                                          .
.                          1                               .
.         1                                                .
.                          1                               .
.                          1                1              .
.                                           1              .
.                                           1              .
.                                                          1
.                          1                      1        1
.                         1                 2              .
.      1  1               11         1      2     2    1  11
.           1       2 1           2               1    2  23
.      1  121   2         1 1     4 22      1     5    6  22
.          33   43  115   5 4     3 43      1          2  54
.      3  243   *5  365   127     3 53      1     2    4  31
*   *  2  *7*   *7  683   587     4 7*      5     5    5  *2
*   *  *  *2*   *9  989   755     8 9*      4     7    7  36
*---*--*--***1--*9--*87---*46-----6-56------8-----3----3--8*
7   7  9  ***   **  8**   668     8 82      8     *    3  33
.   1  3  554   *7  737   7*8     7 45      4     4    7  37
.               1    11   194     3 31      3     4    4  34
.                         21      2 25      3     3    4  22
.                                   11      5     1    1  12
-------------------------------------------------------1--31
```

10 November 2008