IMP GPI Lipid Anchor Project IMP-Bioinformatics

Jack-knife test II, restricted to the largest subset of non-homologous sequences: Protozoa

Contact:
Birgit Eisenhaber (IMP/Austria)
Peer Bork (MDC/EMBL)
Frank Eisenhaber (IMP/Austria)


Prediction Results
How to read the prediction results

# 
# ------------------------------------------------------------------- #
# 
# package *-> proto <-*
# *********************
# 
# name of executable           : proto
# time of program compilation  : Jul 16 1999 (13:14:40)
# time of program execution    : Fri Jul 16 13:16:33 1999
# version of the code          : Revision: 1.15 (Date: 1999/01/28 15:56:41)
# 
ReadAapLib: AAProperty library  opened
Number of entrys in AapLib: <641>
-->All selects were switched on.


Selection parameters:
---------------------
0:     PToken: OC   
       SToken: 
         RExp: PROTOZOA
    is_SToken: 0
       expect: 1
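
The selection block above can be read as "keep every entry whose OC (organism classification) line matches the regular expression PROTOZOA". The following is a minimal sketch of such a filter, not the code of `proto` itself. It assumes SWISS-PROT style flat-file records; the function name `select_entries` and the two sample records are hypothetical and serve only as illustration.

```python
import re

def select_entries(entries, token="OC", pattern="PROTOZOA"):
    """Return the IDs of entries with at least one `token` line matching `pattern`.

    `entries` maps an entry ID to its flat-file text (assumed layout:
    two-letter line code, spaces, then the line content).
    """
    rx = re.compile(pattern)
    selected = []
    for entry_id, text in entries.items():
        for line in text.splitlines():
            # Only lines that start with the primary token are inspected.
            if line.startswith(token + " ") and rx.search(line[len(token):]):
                selected.append(entry_id)
                break
    return selected

# Hypothetical records, invented for this example only.
SAMPLE = {
    "DEMO1_PROTO": "ID   DEMO1_PROTO\nOC   EUKARYOTA; PROTOZOA; SARCOMASTIGOPHORA.\n",
    "DEMO2_MAMM": "ID   DEMO2_MAMM\nOC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA.\n",
}

print(select_entries(SAMPLE))
```

In this run the filter keeps 38 of the 169 library entries, which matches the PROTOZOA count in the taxonomy statistic and the 131 unselected entries reported at the end of the log.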

-->OpenSWFile: Data file <./gpi.learn.sav> opened


-->begin GPI-site evaluations
-->ReadGPILib: GPI library  opened
 INFO> ReadGPILib: VARSPLIC line in entry <UPAR_RAT> .!!
 INFO> ReadGPILib: VARSPLIC line in entry <ACES_TORCA> .!!
 INFO> ReadGPILib: VARSPLIC line in entry <ACES_TORMA> .!!
 INFO> ReadGPILib: VARSPLIC line in entry <CEPU_CHICK> .!!
 INFO> ReadGPILib: VARSPLIC line in entry <DAF_HUMAN> .!!
 INFO> ReadGPILib: VARSPLIC line in entry <NRTR_HUMAN> .!!
 INFO> ReadGPILib: VARSPLIC line in entry <NRTR_MOUSE> .!!
 INFO> ReadGPILib: VARSPLIC line in entry <OPCM_RAT> .!!
number of entries of file 1                   :   169
number of accepted entries of file 1          :   169
total number of entries                       :   169

-->Statistic_confidence of GPILib 1 : 
number of accepted entries                    :   169
number of entries with certain GPI-site       :    40
number of entries with potential GPI-site     :    65
number of entries with GPI-site by similarity :    64

-->GPIStatistic_Length: GPILibNumber = 1
Lmin                :  14 ( entry: PAG1_TRYBB, file: 1 )
Lmax                :  31 ( entry: THY1_MACMU, file: 1 )
deltaL              :   5
number of intervals :   4
interval of length   from     to    number of entries
                 1     14     19                   21
                 2     20     24                   84
                 3     25     29                   53
                 4     30     34                   11
total number of entries                       :   169
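
The interval boundaries in the table above appear to follow from Lmin, Lmax and deltaL alone. The sketch below reproduces the printed boundaries under that assumption: the first interval includes Lmin, and every interval ends at Lmin + k * deltaL. The function names are hypothetical, and the rule is inferred from the printed table rather than taken from the program source.

```python
import math

def length_intervals(l_min, l_max, delta):
    """Return (index, from, to) rows covering l_min..l_max in steps of delta."""
    n = max(1, math.ceil((l_max - l_min) / delta))
    rows = []
    for k in range(1, n + 1):
        lo = l_min if k == 1 else l_min + (k - 1) * delta + 1
        hi = l_min + k * delta
        rows.append((k, lo, hi))
    return rows

def interval_index(length, l_min, delta):
    """Return the 1-based interval that a given length falls into."""
    return max(1, math.ceil((length - l_min) / delta))

rows = length_intervals(14, 31, 5)
for row in rows:
    print(row)

# Entry counts per interval as printed in the log; they add up to the library size.
counts = [21, 84, 53, 11]
print(sum(counts))
```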

-->Statistic_taxonomy of GPILib 1 / total number of entries 169
     1 VIRUSES
   168 EUKARYOTA
   120 ..METAZOA
     5 ....INSECTA
   113 ....VERTEBRATA
     3 ......PISCES
     5 ......AVES
   105 ......MAMMALIA
    55 ........PRIMATES
    10 ..FUNGI
    38 ..PROTOZOA

--> Jack-Knife-Test for largest subset is running!!

-->Calc_EVD_Param: file of values  opened
LinearCorrelation of 1061 data points [x,y]
           r=-0.975986 (P=0.000000)

LinearFit: from 1061 datapoints [x,y] for y=a+bx (1059 degrees of freedom)
           without weights (RMSD) for y (uniform sigma is estimated)
           x mean=-75.462328 (stan.dev.=24.061508)
           y mean=-0.574634 (stan.dev.=1.269734)
           interception a=-4.461182	(siga=0.027977)
           slope        b=-0.051503	(sigb=0.000353)
           correlation  R(siga,sigb)=0.952782

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > statistical validation
        1) total chi2=81.090823,	P=1.000000e+00 B=-ln(P)/df= 0.000
           (used for weight estimation !)

        2) t-Test for interception a=0: t=-159.459749	P=0.000000e+00
           [-4.553500,-4.368863] for alpha=0.001 (TWO_SIDED, Tna=3.299805)
           [-4.533376,-4.388987] for alpha=0.010 (TWO_SIDED, Tna=2.580505)
           [-4.516078,-4.406285] for alpha=0.050 (TWO_SIDED, Tna=1.962204)

        3) t-Test for slope        b=0: t=-145.804769	P=0.000000e+00
           [-0.052669,-0.050338] for alpha=0.001 (TWO_SIDED, Tna=3.299805)
           [-0.052415,-0.050592] for alpha=0.010 (TWO_SIDED, Tna=2.580505)
           [-0.052196,-0.050810] for alpha=0.050 (TWO_SIDED, Tna=1.962204)

        4) Fisher's test (average of function value versus regression)
           F=21259.030543 (df1=1, df2=1059) P=0.000000e+00
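
The validation figures above can be cross-checked from the printed estimates. The sketch below recomputes the t statistics as estimate divided by its standard error, the two-sided confidence intervals as estimate plus or minus Tna times the standard error, and uses the identity F = t² that holds for the slope of a simple linear regression. Because the printed standard errors are rounded, the agreement is approximate. The function names are hypothetical.

```python
def t_statistic(estimate, sigma):
    """t value for the null hypothesis that the parameter equals zero."""
    return estimate / sigma

def confidence_interval(estimate, sigma, t_crit):
    """Two-sided interval: estimate +/- t_crit * sigma."""
    half_width = t_crit * sigma
    return (estimate - half_width, estimate + half_width)

# Values copied from the log above.
a, sig_a = -4.461182, 0.027977
b, sig_b = -0.051503, 0.000353
t_crit_0001 = 3.299805  # Tna for alpha = 0.001, two-sided

print(t_statistic(a, sig_a))                      # log prints t = -159.459749
print(confidence_interval(a, sig_a, t_crit_0001)) # log prints [-4.553500, -4.368863]
print(confidence_interval(b, sig_b, t_crit_0001)) # log prints [-0.052669, -0.050338]

# With one regressor, Fisher's F equals the squared t value of the slope.
t_b_printed = -145.804769
print(t_b_printed ** 2)                           # log prints F = 21259.030543
```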

   u: -86.619580 lambda: 0.051503
Thresholds for GPI-anchor selecting:
   A-site: p = 0.0025 score =  29.69
   B-site: p = 0.0050 score =  16.21
   C-site: p = 0.0075 score =   8.31
   D-site: p = 0.0100 score =   2.70
   S-site: p = 0.0175 score =  -8.24
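
The printed numbers are consistent with a Gumbel-type extreme-value distribution, P(S >= x) = 1 - exp(-exp(-lambda (x - u))), with lambda = -b and u = a / lambda taken from the linear fit. This reading is inferred from the figures in this log and is not confirmed by the program source. The sketch below, with hypothetical function names, recomputes u, the five score thresholds and one P-value from the result list.

```python
import math

# Fit parameters and EVD parameters as printed in the log.
A_FIT, B_FIT = -4.461182, -0.051503
LAMBDA = 0.051503
U = -86.619580

def evd_from_fit(a, b):
    """Assumed mapping from the fit y = a + b*x to (u, lambda)."""
    lam = -b
    return a / lam, lam

def p_value(score, u=U, lam=LAMBDA):
    """Probability of reaching at least `score` under the assumed Gumbel form."""
    return -math.expm1(-math.exp(-lam * (score - u)))

def score_threshold(p, u=U, lam=LAMBDA):
    """Score at which the assumed distribution gives tail probability p."""
    return u - math.log(-math.log1p(-p)) / lam

# u recomputed from rounded a and b agrees with the printed u to about 1e-3.
print(evd_from_fit(A_FIT, B_FIT))

# Printed thresholds: 29.69, 16.21, 8.31, 2.70, -8.24.
for label, p in [("A", 0.0025), ("B", 0.0050), ("C", 0.0075), ("D", 0.0100), ("S", 0.0175)]:
    print(label, round(score_threshold(p), 2))

# The log lists Pv = 3.325570e-03 for a score of 24.14.
print(p_value(24.14))
```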
24.14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 24.14
ID: GP63_LEIMA AC: P08148 Len: 602 1:B 577 Sc: 24.14 Pv: 3.325570e-03
29.52 -0.25 -0.38 -0.29 0.00 0.00 0.00 -1.49 -0.59 -0.08 -0.10 0.00 0.00 0.00 0.00 0.00 26.34
ID: VSG7_TRYBR AC: P02898 Len: 467 1:B 444 Sc: 26.34 Pv: 2.970484e-03
21.63 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -0.05 -0.45 0.00 0.00 0.00 0.00 0.00 21.14
ID: G13A_DICDI AC: P34115 Len: 730 1:B 708 Sc: 21.14 Pv: 3.880739e-03
14.08 -1.43 -2.34 -3.74 0.00 0.00 0.00 -2.51 0.00 0.00 -0.15 0.00 0.00 0.00 0.00 0.00 3.91
ID: GP85_TRYCR AC: Q03877 Len: 714 1:D 691 Sc: 3.91 Pv: 9.399976e-03
18.07 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -0.67 0.00 0.00 0.00 0.00 0.00 0.00 17.40
ID: MSA1_SARMU AC: Q01416 Len: 280 1:B 256 Sc: 17.40 Pv: 4.703500e-03
17.59 0.00 0.00 0.00 0.00 0.00 0.00 -1.22 0.00 -0.52 -0.16 0.00 0.00 0.00 0.00 0.00 15.69
ID: PARB_TRYBB AC: P09791 Len: 129 1:C 107 Sc: 15.69 Pv: 5.133386e-03
23.76 0.00 0.00 0.00 0.00 0.00 0.00 -0.13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 23.63
ID: PONA_DICDI AC: P54660 Len: 143 1:B 118 Sc: 23.63 Pv: 3.413372e-03
16.51 0.00 0.00 0.00 0.00 0.00 0.00 -0.47 0.00 -0.24 -1.58 -12.00 0.00 0.00 0.00 0.00 2.22
ID: PSA_DICDI AC: P12729 Len: 168 1:I 147 Sc: 2.22 Pv: 1.025039e-02
23.46 0.00 0.00 0.00 0.00 0.00 0.00 -0.03 -0.49 0.00 0.00 0.00 0.00 0.00 0.00 0.00 22.94
ID: VSG2_TRYEQ AC: P20950 Len: 457 1:B 440 Sc: 22.94 Pv: 3.537403e-03
17.32 -0.13 -0.24 -0.52 0.00 0.00 0.00 -7.59 -0.12 0.00 0.00 0.00 0.00 0.00 0.00 0.00 8.71
ID: VSI1_TRYBB AC: P26326 Len: 471 1:C 454 Sc: 8.71 Pv: 7.347121e-03
21.28 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 21.28
ID: VSI2_TRYBB AC: P26327 Len: 479 1:B 462 Sc: 21.28 Pv: 3.852163e-03
23.88 -0.88 -1.37 -0.52 0.00 0.00 0.00 0.00 -0.76 -0.21 -0.34 0.00 0.00 0.00 0.00 0.00 19.80
ID: VSI3_TRYBB AC: P26328 Len: 532 1:B 509 Sc: 19.80 Pv: 4.155906e-03
33.90 -0.64 -0.88 -0.83 0.00 0.00 0.00 -0.02 -1.06 -0.08 -0.10 0.00 0.00 0.00 0.00 0.00 30.28
ID: VSI6_TRYBB AC: P06014 Len: 503 1:A 480 Sc: 30.28 Pv: 2.425395e-03
24.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -0.98 -0.56 -0.73 -12.00 0.00 0.00 0.00 0.00 10.71
ID: VSM0_TRYBB AC: P07209 Len: 72 1:C 50 Sc: 10.71 Pv: 6.630512e-03
21.71 0.00 0.00 -0.02 0.00 0.00 0.00 0.00 -0.13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 21.55
ID: VSM2_TRYBB AC: P26332 Len: 476 1:B 459 Sc: 21.55 Pv: 3.798682e-03
31.92 -0.64 -0.88 -0.83 0.00 0.00 0.00 -1.57 -0.50 -0.03 -0.01 0.00 0.00 0.00 0.00 0.00 27.45
ID: VSM4_TRYBB AC: P02896 Len: 526 1:B 503 Sc: 27.45 Pv: 2.804483e-03
29.23 -0.71 -0.93 -0.72 0.00 0.00 0.00 0.00 -0.19 -0.85 -0.50 -12.00 0.00 0.00 -12.00 0.00 1.32
ID: VSM5_TRYBB AC: P26333 Len: 474 1:I 451 Sc: 1.32 Pv: 1.073353e-02
19.30 0.00 0.00 -0.02 0.00 0.00 0.00 0.00 -0.13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 19.14
ID: VSWB_TRYBR AC: P20947 Len: 487 1:B 470 Sc: 19.14 Pv: 4.299438e-03
18.03 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -0.26 0.00 0.00 0.00 0.00 0.00 0.00 17.76
ID: VSY1_TRYCO AC: P20948 Len: 419 1:B 400 Sc: 17.76 Pv: 4.615047e-03
-->end of list of found GPI-sites

   number of found GPI-sites 17
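
Each numeric row in the list above holds sixteen score terms followed by their total, and the total reappears as Sc on the next line. The sketch below checks one row and assigns quality letters from the A to D thresholds printed earlier. Two points are assumptions drawn from this listing only: the terms add up to the total (up to rounding of the printed values), and scores below the D threshold carry the label "I" and are not counted as found sites. The role of the S threshold in labelling is not evident from the output, so it is left out.

```python
# Thresholds for the A to D sites as printed in the log, highest first.
THRESHOLDS = [("A", 29.69), ("B", 16.21), ("C", 8.31), ("D", 2.70)]

def split_row(row):
    """Split a score row into its terms and the printed total (last column)."""
    values = [float(x) for x in row.split()]
    return values[:-1], values[-1]

def quality(score):
    """Return the site label for a total score (assumed rule, see lead-in)."""
    for label, cutoff in THRESHOLDS:
        if score >= cutoff:
            return label
    return "I"

# Row for VSG7_TRYBR, copied from the log.
ROW = "29.52 -0.25 -0.38 -0.29 0.00 0.00 0.00 -1.49 -0.59 -0.08 -0.10 0.00 0.00 0.00 0.00 0.00 26.34"
terms, printed_total = split_row(ROW)
print(round(sum(terms), 2), printed_total)

# Sc values of the 19 listed entries, in log order.
SCORES = [24.14, 26.34, 21.14, 3.91, 17.40, 15.69, 23.63, 2.22, 22.94, 8.71,
          21.28, 19.80, 30.28, 10.71, 21.55, 27.45, 1.32, 19.14, 17.76]
labels = "".join(quality(s) for s in SCORES)
print(labels)
print(sum(1 for c in labels if c != "I"))
```

Under these assumptions the labels agree with the letters printed after `1:` in every listed entry, and the count of entries not labelled "I" equals the reported number of found GPI-sites.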
-->ReadSWEntry: End of File
   number of complete entries : 38
   number of too short entries : 0
   number of failed entries : 0
   number of unselected entries : 131
   number of entries belong to the largest subset: 19
   number of entries not belong to the largest subset: 19
-->CloseSWFile: Data file closed

# 
# 
# Normal termination of program proto.
# total program time: CPU 0:00:31.83 sec, elapsed time 0:00:41.00 sec
# 


Last modified: 12th June 2002