Singularity in AI matrix when fitting ANIMAL model
Question
 I want to estimate the heritability of fish body weight but get the ASReml error
 Singularity in Average Information Matrix
 The ASReml manual suggests I need to modify the model, but the model is very simple. How can I get the heritability of body weight?
 My ASReml job is
 Data analysis for Flounder
 animal !A !P
 sire !A !P
 dam !A !P
 tank * !I
 age
 bl
 wt
 yaping.dat !ALPHA !SKIP 1 !MAKE
 yaping.dat !SKIP 1 !MAXIT 500
 wt ~ mu age !r animal
 Note that  !A is not required with  !P, because the  !ALPHA qualifier on the pedigree file line
 already declares the fields as alphanumeric.
Answer
 This is a common problem which arises because of the nature of the animal model.
What is happening?
 Looking at the iteration summary we see
   1 LogL=-3727.99     S2=  105.60       1298 df   0.1000      1.000
   2 LogL=-3698.00     S2=  99.129       1298 df   0.1296      1.000
   3 LogL=-3644.01     S2=  86.829       1298 df   0.2218      1.000
   4 LogL=-3594.78     S2=  72.626       1298 df   0.4400      1.000
   5 LogL=-3560.18     S2=  54.273       1298 df    1.062      1.000
   6 LogL=-3545.76     S2=  35.421       1298 df    2.550      1.000
   7 LogL=-3540.01     S2=  18.033       1298 df    6.809      1.000
   8 LogL=-3537.88     S2=  5.9630       1298 df    24.47      1.000
   9 LogL=-3537.25     S2= 0.84544       1298 df    :   1 components restrained
  10 LogL=-3537.16     S2= 0.53997E-01   1298 df    :   1 components restrained
  11 LogL=-3537.15     S2= 0.34172E-02   1298 df   0.4607E+05  1.000
 Notice that the residual variance (S2) shrinks towards zero while the variance ratio explodes.
 The job fails because the residual has become too small.
 The singularity in the AI matrix did not appear in the first iteration, so the problem
 is not structural (a common cause of this message) but data dependent.
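 To make the trend easier to see, here is a rough Python sketch (not part of the ASReml job) that converts the iteration summary into implied variance components. It assumes the second-last column is the ratio of the animal variance to the residual variance, so that animal variance = ratio x S2; the names trace and implied_components are just illustrative.

```python
# (iteration, residual variance S2, animal/residual variance ratio),
# copied from the iteration summary; iterations 9 and 10 print no ratio.
trace = [
    (1, 105.60, 0.1000),
    (2, 99.129, 0.1296),
    (3, 86.829, 0.2218),
    (4, 72.626, 0.4400),
    (5, 54.273, 1.062),
    (6, 35.421, 2.550),
    (7, 18.033, 6.809),
    (8, 5.9630, 24.47),
    (11, 0.34172e-02, 0.4607e+05),
]


def implied_components(rows):
    """Return (iteration, animal variance, total variance) per iteration.

    Assumes the ratio column is animal variance / residual variance.
    """
    out = []
    for iteration, s2, ratio in rows:
        animal = ratio * s2
        out.append((iteration, animal, animal + s2))
    return out


for iteration, animal, total in implied_components(trace):
    residual_share = (total - animal) / total
    print(f"{iteration:2d}  animal={animal:8.2f}  total={total:8.2f}  "
          f"residual share={residual_share:.5f}")
```

 Under that assumption, the animal component climbs steadily while the residual share of the total heads to zero, which is the boundary behaviour described above.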
Why is it happening?
 The summary of the structure of the pedigree (given in ASReml 3) is
     1339 identities in the pedigree over 1 generations
       Sires SiresofSire  DamsofSire        Dams  SiresofDam   DamsofDam
          26           0           0          13           0           0
 There is no pedigree on the parents, and it looks like there are 26 families.
 Sire and dam can be treated as ordinary factors by defining the fields as
 animal  !P
 sire !A
 dam !A
 Then  tabulate wt ~ sire dam  confirms that there are 13 dams with 2 sires per dam.
 The model  wt ~ mu age !r sire dam  converges to give
  10 LogL=-3534.80     S2=  77.627       1298 df   0.6284      1.006      1.000
          - - - Results from analysis of wt - - -
          Approximate stratum variance decomposition
 Stratum     Degrees-Freedom   Variance      Component Coefficients
 dam                   11.17    11412.2       128.1    65.2     1.0
 sire                  12.88    3545.87         0.0    44.4     1.0
 Residual Variance   1273.94    77.6266         0.0     0.0     1.0
 Source                Model  terms     Gamma     Component    Comp/SE   % C
 dam                      13     13  0.628358       48.7773       1.19   0 P
 sire                     26     26   1.00558       78.0599       2.48   0 P
 Variance               1300   1298   1.00000       77.6266      25.24   0 P
                                   Wald F statistics
     Source of Variation           NumDF     DenDF    Fic             Prob
   8 mu                                1      11.0    71.29            <.001
   5 age                               1      13.8     0.34            0.568
 Fitting just  sire gives
  11 LogL=-3535.74     S2=  77.624       1298 df    1.586      1.000
 Final parameter values                        1.5864     1.0000
          - - - Results from analysis of wt - - -
 Source                Model  terms     Gamma     Component    Comp/SE   % C
 sire                     26     26   1.58636       123.139       3.42   0 P
 Variance               1300   1298   1.00000       77.6240      25.24   0 P
 which is almost the same LogL. In the sire + dam model, the actual sire variance is 126.84 (48.78 + 78.06)
 and the covariance between families with the same dam is 48.78.  Assuming no covariance between families with the same dam,
 the sire variance is 123.14.
 The animal model is based on the genetic assumption that the sire variance represents 0.25 σ²A and the residual
 represents σ²E + 0.75 σ²A.  This gives σ²A = 4 × 123.14 = 492.56 and
 σ²E = 77.62 - 0.75 × 492.56 = 77.62 - 369.42 = -291.80.  The animal model falls over because ASReml can't estimate a negative residual variance
 directly.
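 The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation just described, using the components reported for the sire-only model; the function name animal_model_components is made up for illustration, and the heritability line uses the usual definition σ²A / (σ²A + σ²E), which is not printed in the ASReml output.

```python
def animal_model_components(sire_var, resid_var):
    """Translate sire-model components into animal-model components.

    Assumes sire variance = 0.25 * sigma2_A and
    sire-model residual = sigma2_E + 0.75 * sigma2_A.
    """
    sigma2_a = 4.0 * sire_var
    sigma2_e = resid_var - 0.75 * sigma2_a
    return sigma2_a, sigma2_e


# Components from the sire-only fit of wt.
sigma2_a, sigma2_e = animal_model_components(sire_var=123.139, resid_var=77.6240)

# Implied heritability; a value above 1 signals an impossible partition.
h2 = sigma2_a / (sigma2_a + sigma2_e)

print(f"sigma2_A = {sigma2_a:.2f}")
print(f"sigma2_E = {sigma2_e:.2f}")
print(f"implied h2 = {h2:.2f}")
```

 The implied heritability comes out well above 1, which is another way of seeing that the residual variance would have to be negative.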
What if genetic is just the dam component?
 I am not very familiar with fish experiments, but I understand that families are sometimes raised in different tubs.
 Since families are represented by sires in this case, if these families were raised in different tubs then the sire variance
 (in the sire + dam model) is tub variance, and the dam variance is primarily genetic.
 Under this scenario, sire is analogous to 'maternal environment' effects in animal experiments.
 I have therefore fitted  sire + dam  on the log scale, allowing for 6 outliers (see below), with the model
 log(wt) ~ mu age out(53) out(301) out(302) out(342) out(1004) out(631) !r sire dam
 This gives
 Source                Model  terms     Gamma     Component    Comp/SE   % C
 dam                      13     13  0.747327      0.104939       1.09   0 P
 sire                     26     26   1.40185      0.196846       2.50   0 P
 Variance               1300   1292   1.00000      0.140419      25.18   0 P
 The dam variance is still too large to provide a plausible heritability, though.
 That is, the residual still goes to the boundary in the  'sire + animal'  model
 because  0.104939 > 0.140419/3 (= 0.046806).
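 The same check can be written as a small Python helper. It is a sketch resting on the assumption used throughout this answer, namely that the family component (here dam) estimates 0.25 σ²A, so that three times the family component has to come out of the residual; residual_after_genetic is an illustrative name, not an ASReml function.

```python
def residual_after_genetic(family_var, resid_var):
    """Residual left once 0.75 * sigma2_A is removed.

    Assumes family_var = 0.25 * sigma2_A, so 0.75 * sigma2_A = 3 * family_var.
    A negative result means the animal model is pushed to the boundary.
    """
    return resid_var - 3.0 * family_var


# Log-scale components from the sire + dam fit with outliers fitted.
dam_var = 0.104939
resid_var = 0.140419

left_over = residual_after_genetic(dam_var, resid_var)
largest_ok_family_var = resid_var / 3.0

print(f"largest admissible family variance = {largest_ok_family_var:.6f}")
print(f"residual left for sigma2_E         = {left_over:.6f}")
print("boundary problem" if left_over < 0 else "residual stays positive")
```

 With these estimates the dam component is more than double the largest value that would leave a positive environmental variance.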
 The bottom line is that there appear to be family effects over and above simple genetic effects.
 Maybe you need to replicate at the family level so that you can partition the variance better.
What else is happening?
 A plot of the residuals against fitted values shows a few (at least 5) fish that
 are very large relative to their full sibs.  Additionally, there is a general fanning
 of the residuals (although age was not significant), so that heavier families are more variable.
 This suggests a sqrt transformation might be in order (after dealing with the outliers).
 Plot of Residuals [  -24.7561   63.3026] vs Fitted values [    6.6769   43.7587] RE11
     ------------1-----------------------------------------------
     .                                                          .
     .                          1                               .
     .         1                                                .
     .                          1                               .
     .                          1                1              .
     .                                           1              .
     .                                           1              .
     .                                                          1
     .                          1                      1        1
     .                         1                 2              .
     .      1  1               11         1      2     2    1  11
     .           1       2 1           2               1    2  23
     .      1  121   2         1 1     4 22      1     5    6  22
     .          33   43  115   5 4     3 43      1          2  54
     .      3  243   *5  365   127     3 53      1     2    4  31
     *   *  2  *7*   *7  683   587     4 7*      5     5    5  *2
     *   *  *  *2*   *9  989   755     8 9*      4     7    7  36
     *---*--*--***1--*9--*87---*46-----6-56------8-----3----3--8*
     7   7  9  ***   **  8**   668     8 82      8     *    3  33
     .   1  3  554   *7  737   7*8     7 45      4     4    7  37
     .               1    11   194     3 31      3     4    4  34
     .                         21      2 25      3     3    4  22
     .                                   11      5     1    1  12
     -------------------------------------------------------1--31
10 November 2008