analysis of sequence from T20374.fa
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN
GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM
GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS
TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG
DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS
CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS
KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD
LFDSPRKNGT NDKTSEKNVD PDYQ

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

secondary structure prediction with PREDATOR

> T20374
              .         .         .         .         .
1    MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYG   50
     ___EEEEEEEEEE_______________HHHHHHHHHH__EEEEEE____

              .         .         .         .         .
51   NRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQII  100
     ___EEEEE____EEEE____________________HHHHHHHHHEEEEE

              .         .         .         .         .
101  HGHSTFSSLAHETLMIGGLMGLRTVFTDHSLFGFADASAILTNKLVLQYS  150
     E____HHHHHHHHHHHH_EEEEEEEEE______HHHHHHHHHHHHHHHHH

              .         .         .         .         .
151  LINVDQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQFFN  200
     H____EEEEEEE_____EEE_________________EEEE_________

              .         .         .         .         .
201  NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGDGPKRIELEE  250
     ___EEEE__EEEEE___HHHHHEEEEEEE____EEEEEE_____HHHHHH

              .         .         .         .         .
251  MLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS  300
     HHHHHHHHHHEEE________EEEEEE__EEEEE___HHHHHHHHHH___

              .         .         .         .         .
301  CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLM  350
     ___EEEEEEE____EEEE___EEEE________HHHHHHHHHHHHHHH__

              .         .         .         .         .
351  DPTEKHEAVSKMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIG  400
     ___HHHHHHHHH_____HHHHHHHHHHHHHH___________EEE_____

              .         .         .         .    
401  FGIMYIVVSCIIIFWLTVLDLFDSPRKNGTNDKTSEKNVDPDYQ        444
     __EEEEEEEHHHHHHHHHH_________________________


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~


method         :         1
alpha-contents :       0.8 %
beta-contents  :      62.2 %
coil-contents  :      37.0 %
class          :      beta


method         :         2
alpha-contents :       0.0 %
beta-contents  :      54.5 %
coil-contents  :      45.5 %
class          :      beta


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

GPI: learning from metazoa
-21.92  -0.11  -0.66  -0.35  -4.00   0.00  -8.00   0.00  -0.06 -13.36  -4.18 -12.00 -12.00  -4.00 -12.00   0.00  -92.64
 -8.85  -0.67  -0.29  -1.35   0.00   0.00  -4.00   0.00   0.00 -13.36  -4.18 -12.00 -12.00  -4.00 -12.00   0.00  -72.72
ID: T20374	AC: xxx Len:  444 1:I   428 Sc:  -72.72 Pv: 6.246781e-01 NO_GPI_SITE
GPI: learning from protozoa
-18.20  -2.35  -0.45  -0.17  -4.00   0.00 -16.00   0.00   0.00 -11.23 -13.79 -12.00 -12.00   0.00 -12.00   0.00  -102.19
-24.64  -0.69  -0.39  -0.13  -4.00   0.00  -4.00   0.00   0.00 -11.23 -13.79 -12.00 -12.00  -4.00 -12.00   0.00  -98.87
ID: T20374	AC: xxx Len:  444 1:I   428 Sc:  -98.87 Pv: 8.473754e-01 NO_GPI_SITE

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

# SignalP euk predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T20374       0.443 299 Y  0.399 430 Y  0.922 417 Y  0.156 N
# SignalP gram- predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T20374       0.553  24 Y  0.335  24 N  0.827   9 N  0.335 N
# SignalP gram+ predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T20374       0.513 299 Y  0.269 217 N  0.969 410 Y  0.250 N

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

low complexity regions: SEG 12 2.2 2.5
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

                                  1-327  MSLKIGPYSIALVSDFFCPNAGGVETHIYF
                                         LAQCLIELGHRVVVITHGYGNRKGIRYLSN
                                         GLKVYYLPFIVAYNGATLGSIVGSMPWLRK
                                         VLLRENVQIIHGHSTFSSLAHETLMIGGLM
                                         GLRTVFTDHSLFGFADASAILTNKLVLQYS
                                         LINVDQTICVSYTSKENTVLRGKLDPNKVS
                                         TIPNAIETSLFTPDRNQFFNNPTTIVFLGR
                                         LVYRKGADLLCEIVPKVCARHKSVRFIIGG
                                         DGPKRIELEEMLERFKLHERVVILGMLPHN
                                         QVKRVLNQGQIFINTSLTEAFCMSIVEAAS
                                         CGLHVVSTRVGGVPEVLPIGEFISLEE
               pvpddlvdallkavd  328-342  
                                343-444  RREKGLLMDPTEKHEAVSKMYNWPDVAART
                                         QVIYQKAVESEPTGRLGRLKGYYDQGIGFG
                                         IMYIVVSCIIIFWLTVLDLFDSPRKNGTND
                                         KTSEKNVDPDYQ

low complexity regions: SEG 25 3.0 3.3
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

                                  1-301  MSLKIGPYSIALVSDFFCPNAGGVETHIYF
                                         LAQCLIELGHRVVVITHGYGNRKGIRYLSN
                                         GLKVYYLPFIVAYNGATLGSIVGSMPWLRK
                                         VLLRENVQIIHGHSTFSSLAHETLMIGGLM
                                         GLRTVFTDHSLFGFADASAILTNKLVLQYS
                                         LINVDQTICVSYTSKENTVLRGKLDPNKVS
                                         TIPNAIETSLFTPDRNQFFNNPTTIVFLGR
                                         LVYRKGADLLCEIVPKVCARHKSVRFIIGG
                                         DGPKRIELEEMLERFKLHERVVILGMLPHN
                                         QVKRVLNQGQIFINTSLTEAFCMSIVEAAS
                                         C
glhvvstrvggvpevlpigefisleepvpd  302-342  
                   dlvdallkavd
                                343-444  RREKGLLMDPTEKHEAVSKMYNWPDVAART
                                         QVIYQKAVESEPTGRLGRLKGYYDQGIGFG
                                         IMYIVVSCIIIFWLTVLDLFDSPRKNGTND
                                         KTSEKNVDPDYQ

low complexity regions: SEG 45 3.4 3.75
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

                                  1-444  MSLKIGPYSIALVSDFFCPNAGGVETHIYF
                                         LAQCLIELGHRVVVITHGYGNRKGIRYLSN
                                         GLKVYYLPFIVAYNGATLGSIVGSMPWLRK
                                         VLLRENVQIIHGHSTFSSLAHETLMIGGLM
                                         GLRTVFTDHSLFGFADASAILTNKLVLQYS
                                         LINVDQTICVSYTSKENTVLRGKLDPNKVS
                                         TIPNAIETSLFTPDRNQFFNNPTTIVFLGR
                                         LVYRKGADLLCEIVPKVCARHKSVRFIIGG
                                         DGPKRIELEEMLERFKLHERVVILGMLPHN
                                         QVKRVLNQGQIFINTSLTEAFCMSIVEAAS
                                         CGLHVVSTRVGGVPEVLPIGEFISLEEPVP
                                         DDLVDALLKAVDRREKGLLMDPTEKHEAVS
                                         KMYNWPDVAARTQVIYQKAVESEPTGRLGR
                                         LKGYYDQGIGFGIMYIVVSCIIIFWLTVLD
                                         LFDSPRKNGTNDKTSEKNVDPDYQ


low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN
GLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGGLM
GLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTVLRGKLDPNKVS
TIPNAIETSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGG
DGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS
CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLMDPTEKHEAVS
KMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIGFGIMYIVVSCIIIFWLTVLD
LFDSPRKNGTNDKTSEKNVDPDYQ
    1 -  444 MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN 
             GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM 
             GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS 
             TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG 
             DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS 
             CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS 
             KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD 
             LFDSPRKNGT NDKTSEKNVD PDYQ

low complexity regions: DUST
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN
GLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGGLM
GLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTVLRGKLDPNKVS
TIPNAIETSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGG
DGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS
CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLMDPTEKHEAVS
KMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIGFGIMYIVVSCIIIFWLTVLD
LFDSPRKNGTNDKTSEKNVDPDYQ

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

coiled coil prediction for T20374
sequence: 444 amino acids, 0 residue(s) in coiled coil state

    .    |     .    |     .    |     .    |     .    |     .   60
MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  120
GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  180
GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  240
TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  300
DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~3336666 6666666666 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  360
CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  420
KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     
LFDSPRKNGT NDKTSEKNVD PDYQ
~~~~~~~~~~ ~~~~~~~~~~ ~~~~
---------- ---------- ----
~~~~~~~~~~ ~~~~~~~~~~ ~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

prediction of transmembrane regions with toppred2

     ***********************************
     *TOPPREDM with eukaryotic function*
     ***********************************

T20374.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: T20374.fa.___inter___

 (1 sequences)
MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYG
NRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQII
HGHSTFSSLAHETLMIGGLMGLRTVFTDHSLFGFADASAILTNKLVLQYS
LINVDQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQFFN
NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGDGPKRIELEE
MLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS
CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLM
DPTEKHEAVSKMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIG
FGIMYIVVSCIIIFWLTVLDLFDSPRKNGTNDKTSEKNVDPDYQ


(p)rokaryotic or (e)ukaryotic: e


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 4 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1    66    86   1.301 Certain
     2   102   122   0.903 Putative
     3   288   308   0.902 Putative
     4   399   419   2.415 Certain

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     2     4
 Loop length    65    15   276    25
 K+R profile     +           +      
                    3.00        4.00      
CYT-EXT prof  1.13        0.76      
                       -           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: -7.00
Tm probability: 0.76
-> Orientation: N-out

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   1.89
-> Orientation: N-out

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       1     3     4
 Loop length    65   201    90    25
 K+R profile     +           +      
                       +        4.00      
CYT-EXT prof  1.13        0.72      
                    0.45           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: -4.00
Tm probability: 0.76
-> Orientation: N-out

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   1.40
-> Orientation: N-out

----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
     Segment       1     4
 Loop length    65   312    25
 K+R profile     +        4.00      
                       +      
CYT-EXT prof  1.13           -      
                    0.71      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 4.00
Tm probability: 1.00
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   0.42
-> Orientation: N-out

----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
     Segment       1     2     3     4
 Loop length    65    15   165    90    25
 K+R profile     +           +        4.00      
                    3.00           +      
CYT-EXT prof  1.13        0.46           -      
                       -        0.72      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.57
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   0.87
-> Orientation: N-out

----------------------------------------------------------------------

"T20374" 444 
 66 86 #t 1.30104
 102 122 #f 0.903125
 288 308 #f 0.902083
 399 419 #t 2.41458
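
The candidate table and the four tested structures can be related with a small sketch. It assumes that a segment scoring at or above the upper cutoff (1) is "Certain", that one between the lower cutoff (0.6) and the upper cutoff is "Putative", and that every structure keeps all certain segments plus some subset of the putative ones. The exact comparison operators and the names below (classify, structures, loop_lengths) are illustrative assumptions, not TOPPRED internals, and the enumeration order may differ from the order printed above.

```python
from itertools import combinations

SEQ_LEN = 444
# (helix id, begin, end, score), copied from the candidate table above.
CANDIDATES = [
    (1, 66, 86, 1.301),
    (2, 102, 122, 0.903),
    (3, 288, 308, 0.902),
    (4, 399, 419, 2.415),
]
UPPER, LOWER = 1.0, 0.6  # cutoffs as printed in the run parameters


def classify(score):
    """Assumed rule: >= upper is certain, >= lower is putative."""
    if score >= UPPER:
        return "Certain"
    if score >= LOWER:
        return "Putative"
    return None


def structures(cands):
    """All certain segments plus every subset of the putative ones."""
    certain = [c for c in cands if classify(c[3]) == "Certain"]
    putative = [c for c in cands if classify(c[3]) == "Putative"]
    out = []
    for k in range(len(putative) + 1):
        for subset in combinations(putative, k):
            out.append(sorted(certain + list(subset), key=lambda c: c[1]))
    return out


def loop_lengths(segs, n=SEQ_LEN):
    """Residues before, between, and after the chosen segments."""
    loops, prev_end = [], 0
    for _, begin, end, _ in segs:
        loops.append(begin - prev_end - 1)
        prev_end = end
    loops.append(n - prev_end)
    return loops
```

With two putative segments this gives 2^2 = 4 structures, and the loop lengths for segments 1, 2, 4 come out as 65, 15, 276, 25, in agreement with Structure 1 above.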


     ************************************
     *TOPPREDM with prokaryotic function*
     ************************************

T20374.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: T20374.fa.___inter___

 (1 sequences)
MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYG
NRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQII
HGHSTFSSLAHETLMIGGLMGLRTVFTDHSLFGFADASAILTNKLVLQYS
LINVDQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQFFN
NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGDGPKRIELEE
MLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS
CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLM
DPTEKHEAVSKMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIG
FGIMYIVVSCIIIFWLTVLDLFDSPRKNGTNDKTSEKNVDPDYQ


(p)rokaryotic or (e)ukaryotic: p


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 4 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1    66    86   1.301 Certain
     2   102   122   0.903 Putative
     3   288   308   0.902 Putative
     4   399   419   2.415 Certain

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     2     4
 Loop length    65    15   276    25
 K+R profile     +           +      
                    3.00        4.00      
CYT-EXT prof  1.13        0.76      
                       -           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: -7.00
Tm probability: 0.76
-> Orientation: N-out

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   1.89
-> Orientation: N-out

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       1     3     4
 Loop length    65   201    90    25
 K+R profile     +           +      
                       +        4.00      
CYT-EXT prof  1.13        0.72      
                    0.45           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: -4.00
Tm probability: 0.76
-> Orientation: N-out

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   1.40
-> Orientation: N-out

----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
     Segment       1     4
 Loop length    65   312    25
 K+R profile     +        4.00      
                       +      
CYT-EXT prof  1.13           -      
                    0.71      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 4.00
Tm probability: 1.00
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   0.42
-> Orientation: N-out

----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
     Segment       1     2     3     4
 Loop length    65    15   165    90    25
 K+R profile     +           +        4.00      
                    3.00           +      
CYT-EXT prof  1.13        0.46           -      
                       -        0.72      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.57
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -0.3333
                 NEG: 3.0000
                 POS: 6.0000
-> Orientation: N-in

CYT-EXT difference:   0.87
-> Orientation: N-out

----------------------------------------------------------------------

"T20374" 444 
 66 86 #t 1.30104
 102 122 #f 0.903125
 288 308 #f 0.902083
 399 419 #t 2.41458


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

SAPS.  Version of April 11, 1996.
Date run: Thu Nov 22 14:08:45 2001

File: /people/b_eisen/T20374.fa.___saps___
ID   T20374
DE   hypothetical protein D2085.6 - Caenorhabditis elegans.

number of residues:  444;   molecular weight:  49.5 kdal
 
         1  MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN 
        61  GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM 
       121  GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS 
       181  TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG 
       241  DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS 
       301  CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS 
       361  KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD 
       421  LFDSPRKNGT NDKTSEKNVD PDYQ

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  : 21( 4.7%); C  :  8( 1.8%); D  : 20( 4.5%); E  : 24( 5.4%); F  : 20( 4.5%)
G  : 36( 8.1%); H  : 12( 2.7%); I  : 34( 7.7%); K  : 23( 5.2%); L  : 48(10.8%)
M  : 10( 2.3%); N  : 20( 4.5%); P  : 20( 4.5%); Q  : 12( 2.7%); R  : 23( 5.2%)
S  : 27( 6.1%); T  : 25( 5.6%); V  : 42( 9.5%); W  :  3( 0.7%); Y  : 16( 3.6%)

KR      :   46 ( 10.4%);   ED      :   44 (  9.9%);   AGP     :   77 ( 17.3%);
KRED    :   90 ( 20.3%);   KR-ED   :    2 (  0.5%);   FIKMNY  :  123 ( 27.7%);
LVIFM   :  154 ( 34.7%);   ST      :   52 ( 11.7%).
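
As a rough illustration of how the per-residue and grouped counts above can be obtained, here is a minimal sketch. It only tallies residues; it does not reproduce the comparison against the swp23s reference set that SAPS uses to flag extremes. To keep the example short it runs on the first 60 residues of T20374 only, and the function names are hypothetical.

```python
from collections import Counter

# First 60 residues of T20374, copied from the sequence listing above.
PREFIX = "MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN"

# Residue groups named as in the SAPS summary lines.
GROUPS = {"KR": "KR", "ED": "ED", "AGP": "AGP", "LVIFM": "LVIFM", "ST": "ST"}


def composition(seq):
    """Return {residue: (count, percent)} for every residue present."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: (c, 100.0 * c / n) for aa, c in sorted(counts.items())}


def group_counts(seq):
    """Return {group name: count} for the groups listed in GROUPS."""
    counts = Counter(seq)
    return {name: sum(counts[aa] for aa in members)
            for name, members in GROUPS.items()}
```

For the 60-residue prefix, group_counts gives 5 for KR and 3 for ED; applied to the full 444-residue sequence the same tally should correspond to the KR and ED lines above.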

--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS
 
         1  000+000000 0000-00000 0000-00000 000000-000 +000000000 0++00+0000 
        61  00+0000000 0000000000 00000000++ 000+-00000 0000000000 0-00000000 
       121  00+0000-00 00000-0000 000+000000 0000-00000 0000+-0000 +0+0-00+00 
       181  000000-000 000-+00000 000000000+ 000++00-00 0-000+000+ 0+00+00000 
       241  -00++0-0-- 00-+0+00-+ 0000000000 00++000000 00000000-0 000000-000 
       301  00000000+0 0000-00000 -0000--000 --00-000+0 0-++-+0000 -00-+0-000 
       361  +00000-000 +000000+00 -0-000+00+ 0+000-0000 0000000000 000000000- 
       421  00-00++000 0-+00-+00- 0-00
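
The charge rows above follow a simple encoding that can be sketched as below: K and R map to "+", E and D map to "-", and everything else (including H, which appears as "0" in the rows above) maps to "0". The helper names are hypothetical, and the example is limited to the first 60 residues.

```python
# First 60 residues of T20374, copied from the sequence listing above.
PREFIX = "MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN"


def charge_string(seq):
    """Encode each residue as '+', '-' or '0'."""
    out = []
    for aa in seq:
        if aa in "KR":
            out.append("+")
        elif aa in "ED":
            out.append("-")
        else:
            out.append("0")
    return "".join(out)


def blocks(s, size=10):
    """Split a string into space-separated blocks, as in the listing."""
    return " ".join(s[i:i + size] for i in range(0, len(s), size))
```

blocks(charge_string(PREFIX)) reproduces the first row of the listing above.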

A. CHARGE CLUSTERS.


Positive charge clusters (cmin =  9/30 or 12/45 or 15/60):  none


Negative charge clusters (cmin =  9/30 or 12/45 or 15/60):  none


Mixed charge clusters (cmin = 14/30 or 19/45 or 24/60):

 1) From  326 to  357:   EEPVPDDLVDALLKAVDRREKGLLMDPTEKHE
                         --000--00-000+00-++-+0000-00-+0-
    quartile: 3; size: 32, +count:  5, -count: 10, 0count: 17; t-value:  3.74
    L:  5 (15.6%);  E:  5 (15.6%);  D:  5 (15.6%);  LVIFM:  9 (28.1%);


B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.


C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     5 |   5 |   7 |  40 |   9 |   9 |  12 |  11 |  11 |  15 |   7 |   9 | 
lmin1     6 |   6 |   8 |  49 |  11 |  11 |  15 |  14 |  14 |  18 |   8 |  11 | 
lmin2     7 |   7 |  10 |  54 |  13 |  13 |  17 |  16 |  15 |  20 |  10 |  12 | 
 (Significance level: 0.010000; Minimal displayed length:  6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  +  runs >=   3:   0
  -  runs >=   3:   0
  *  runs >=   4:   1, at  342;
  0  runs >=  27:   0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.

__________________________________
High scoring hydrophobic segments:

   2.00 (LVIFM)   1.00 (AGYCW)   0.00 (BZX)  -2.00 (PH)  -4.00 (STNQ)
  -8.00 (KEDR)

 Expected score/letter:  -1.640
 M_0.01=  30.79; M_0.05=  25.32

 1) From  398 to  416:  length= 19, score=26.00  * 
     398  GIGFGIMYIV VSCIIIFWL
    G:  3(15.8%);  V:  2(10.5%);  I:  6(31.6%);  F:  2(10.5%);


____________________________________
High scoring transmembrane segments:

   5.00 (LVIF)   2.00 (AGM)   0.00 (BZX)  -1.00 (YCW)  -2.00 (ST)
  -6.00 (P)  -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)

 Expected score/letter:  -2.921
 M_0.01=  76.86; M_0.05=  62.67;     M_0.30=  45.80

 1) From  398 to  419:  length= 22, score=66.00  * 
     398  GIGFGIMYIV VSCIIIFWLT VL
    G:  3(13.6%);  V:  3(13.6%);  I:  6(27.3%);
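
The two reported segment scores can be checked by hand with the residue scores listed above. The sketch below applies a plain maximal-scoring-segment scan (Kadane's algorithm) to residues 396-420; this is an assumed stand-in for illustration, not the SAPS implementation, and it does not compute the M_0.01 / M_0.05 significance thresholds. The names make_scores and best_segment are hypothetical.

```python
def make_scores(spec):
    """Expand {score: letters} into a per-residue lookup table."""
    return {aa: s for s, letters in spec.items() for aa in letters}


# Scoring schemes copied from the two SAPS tables above.
HYDROPHOBIC = make_scores({2: "LVIFM", 1: "AGYCW", 0: "BZX",
                           -2: "PH", -4: "STNQ", -8: "KEDR"})
TRANSMEMBRANE = make_scores({5: "LVIF", 2: "AGM", 0: "BZX", -1: "YCW",
                             -2: "ST", -6: "P", -8: "H", -10: "NQ",
                             -16: "KR", -17: "ED"})

# Residues 396-420 of T20374, copied from the sequence listing.
WINDOW_START = 396
WINDOW = "DQGIGFGIMYIVVSCIIIFWLTVLD"


def best_segment(seq, scores):
    """Return (start, end, score) of the first maximal-scoring segment.

    start is inclusive and end is exclusive, both 0-based.
    """
    best = (0, 0, 0)
    cur, start = 0, 0
    for i, aa in enumerate(seq):
        if cur <= 0:          # restart when the running sum is used up
            cur, start = 0, i
        cur += scores[aa]
        if cur > best[2]:     # strict '>' keeps the earliest maximum
            best = (start, i + 1, cur)
    return best
```

With the hydrophobic scheme the scan returns offsets 2 to 21 with score 26, which is residues 398-416; with the transmembrane scheme it returns offsets 2 to 24 with score 66, which is residues 398-419 (1-based position = WINDOW_START + offset).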


2. SPACINGS OF C.


H2N-17-C-15-C-124-C-61-C-6-C-63-C-8-C-108-C-34-COOH


2*. SPACINGS OF C and H. (additional deluxe function for ALEX)


H2N-17-C-8-H-6-C-5-H-6-H-53-H-1-H-7-H-17-H-29-C-61-C-6-C-2-H-26-H-10-H-22-C-8-C-2-H-51-H-53-C-34-COOH
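
The spacing strings above list the number of residues between consecutive occurrences of the chosen residue types, from the N-terminus to the C-terminus. A minimal sketch of that bookkeeping follows; the function name is hypothetical and the example uses only the first 40 residues, so its trailing gap differs from the full-length output.

```python
# First 40 residues of T20374, copied from the sequence listing above.
PREFIX40 = "MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGH"


def spacings(seq, residues="C"):
    """Build an 'H2N-<gap>-X-<gap>-...-COOH' string for the given residues."""
    parts = ["H2N"]
    gap = 0
    for aa in seq:
        if aa in residues:
            parts += [str(gap), aa]
            gap = 0
        else:
            gap += 1
    parts += [str(gap), "COOH"]
    return "-".join(parts)
```

spacings(PREFIX40, "C") gives "H2N-17-C-15-C-6-COOH" and spacings(PREFIX40, "CH") gives "H2N-17-C-8-H-6-C-5-H-0-COOH"; the leading gaps agree with the two full-length strings above.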

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length:  4

Aligned matching blocks:


[ 208- 211]   LGRL
[ 388- 391]   LGRL


B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
   (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length:  8

--------------------------------------------------------------------------------

MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets:  28  (Expected range:  10-- 41)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 10   (6-10) 5   (11-20) 8   (>=21) 6

3. Clusters of amino acid multiplets (cmin = 11/30 or 14/45 or 17/60):  none


B. CHARGE ALPHABET.

1. Total number of charge multiplets:  10  (Expected range:   0-- 17)
   7 +plets (f+: 10.4%), 3 -plets (f-: 9.9%)
   Total number of charge altplets: 10 (Critical number: 20)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 2   (6-10) 0   (11-20) 2   (>=21) 7

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core:  4; !-core: 5)

Location	Period	Element		Copies	Core	Errors
  49-  80	 8	Y.......  	 4	 4  	 0


B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core:  5; !-core: 6)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core:  6; !-core:10)

Location	Period	Element		Copies	Core	Errors
   3-  51	 7	i000...   	 7	 7  	/0/2/2/2/./././
  64- 112	 7	i00.0..   	 7	 7  	/0/2/1/./2/././
 210- 233	 4	*..0      	 6	 6  	 0
 232- 261	 3	*..       	 9	 7  	 1
 283- 342	10	i0...0.0..	 6	 6  	/0/1/./././1/./1/././
 310- 327	 3	i..       	 6	 6  	 0


--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location (Quartile) Spacing     Rank       P-value   Interpretation

   0-   4  (1.)     K(   4)K    24 of  24   0.0037   large minimal spacing
 433- 437  (4.)     K(   4)K    23 of  24   0.0037     matching minimum


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/Pfam
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model           Description                             Score    E-value  N 
--------        -----------                             -----    ------- ---
Glycos_transf_1 Glycosyl transferases group 1            91.6    2.9e-24   1
SRP54           SRP54-type protein, GTPase domain         0.1         68   1
fer4_NifH       4Fe-4S iron sulfur cluster binding pr    -1.0         95   1
KH-domain       KH domain                                -8.4         61   1
LysM            LysM domain                             -11.8         74   1
DUF196          Uncharacterized ACR, COG1343            -37.5         38   1

Parsed for domains:
Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
LysM              1/1     123   161 ..     1    44 []   -11.8       74
fer4_NifH         1/1     247   259 ..   269   281 .]    -1.0       95
KH-domain         1/1     226   265 ..     1    49 []    -8.4       61
DUF196            1/1     234   323 ..     1    93 []   -37.5       38
SRP54             1/1     325   340 ..   200   215 .]     0.1       68
Glycos_transf_1   1/1     188   346 ..     1   180 [.    91.6  2.9e-24

Alignments of top-scoring domains:
LysM: domain 1 of 1, from 123 to 161: score -11.8, E = 74
                   *->YtVKsGDTLwkIArkygisvqeLkslNpgLssdn..lyvGQkLkip<
                       tV    +L++ A +  i        N +L   ++ + v Q+++++ 
      T20374   123    RTVFTDHSLFGFADASAIL------TN-KLVLQYslINVDQTICVS  161  

                   -*
                     
      T20374     -    -    

fer4_NifH: domain 1 of 1, from 247 to 259: score -1.0, E = 95
                   *->eLeeLlvkfgimd<-*
                      eLee+l+ f + +   
      T20374   247    ELEEMLERFKLHE    259  

KH-domain: domain 1 of 1, from 226 to 265: score -8.4, E = 61
                   *->evlvpasrvGliIGkgGsnIkeireetgakIdipddsegsverivti
                      +v  +++ v++iIG  G    e++e++  ++++ +         v i
      T20374   226    KVCARHKSVRFIIGGDGPKRIELEEML-ERFKLHER--------VVI 263  

                   tg<-*
                   +g   
      T20374   264 LG    265  

DUF196: domain 1 of 1, from 234 to 323: score -37.5, E = 38
                   *->myvLVvYDvsvdeRvnrlkKfLrkfGLn.wVQnSaFEGELtkadler
                      +  ++    ++   +  l   L++f L+++V   +  G+L  ++  r
      T20374   234    VRFIIG---GDGPKRIELEEMLERFKLHeRV---VILGMLPHNQVKR 274  

                   lkagidriid...eDrDsviIYkfrsRCSsaAvkrevlGl.EkspGeeev
                   +  + + +i+++  +  +  I    s C    v   v G++E+ p + e+
      T20374   275 VLNQGQIFINtslTEAFCMSIVEAAS-CGLHVVSTRVGGVpEVLP-IGEF 322  

                   i<-*
                   i   
      T20374   323 I    323  

SRP54: domain 1 of 1, from 325 to 340: score 0.1, E = 68
                   *->LepFdperfvsrLLgm<-*
                      Le+  p+ +v++LL     
      T20374   325    LEEPVPDDLVDALLKA    340  

Glycos_transf_1: domain 1 of 1, from 188 to 346: score 91.6, E = 2.9e-24
                   *->dreeirkklgikedkkiilfvGRlvpeKGidllieAfkkLkkkpkll
                       + + ++ +++++++ +i+f GRlv++KG dll e ++k++++    
      T20374   188    TSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCAR---- 230  

                   klnpnlkLvivGgpYdsedgeeedelkklaeklglednviflGfvpdedl
                     +  ++++i G      dg+++ el+++ e   l ++v +lG +p++++
      T20374   231 --HKSVRFIIGG------DGPKRIELEEMLERFKLHERVVILGMLPHNQV 272  

                   pelyksadvfvlPSryEgFGivllEAmAcGlPVIatncvgGipEvvkdge
                   + ++++  +f+ +S +E+F+++++EA +cGl V++t  vgG+pEv+  ge
      T20374   273 KRVLNQGQIFINTSLTEAFCMSIVEAASCGLHVVSTR-VGGVPEVLPIGE 321  

                   tGllvepgqdpealaeaiekllkdeekkdllel<-*
                      l ep   p++l++a++k+   +      e+   
      T20374   322 FISLEEPV--PDDLVDALLKAVDRR------EK    346  

//
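
A minimal sketch of how the "Parsed for domains" table above could be filtered by E-value. The rows are copied from the Pfam run; the function name parse_domains and the cutoff E_CUTOFF = 1e-3 are assumptions made for illustration, not part of HMMER or of this pipeline.

```python
# Rows copied verbatim from the "Parsed for domains" table of the Pfam run.
DOMAIN_TABLE = """\
LysM              1/1     123   161 ..     1    44 []   -11.8       74
fer4_NifH         1/1     247   259 ..   269   281 .]    -1.0       95
KH-domain         1/1     226   265 ..     1    49 []    -8.4       61
DUF196            1/1     234   323 ..     1    93 []   -37.5       38
SRP54             1/1     325   340 ..   200   215 .]     0.1       68
Glycos_transf_1   1/1     188   346 ..     1   180 [.    91.6  2.9e-24
"""

E_CUTOFF = 1e-3  # hypothetical significance threshold, chosen for illustration


def parse_domains(text):
    """Split each whitespace-separated table row into a small record."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # skip anything that is not a domain row
        rows.append({
            "model": fields[0],
            "seq_from": int(fields[2]),
            "seq_to": int(fields[3]),
            "score": float(fields[8]),
            "evalue": float(fields[9]),
        })
    return rows


domains = parse_domains(DOMAIN_TABLE)
significant = [d for d in domains if d["evalue"] <= E_CUTOFF]

for d in significant:
    print(d["model"], d["seq_from"], d["seq_to"], d["score"], d["evalue"])
```

With this cutoff only the Glycos_transf_1 domain (residues 188 to 346) is retained; the other five rows have E-values between 38 and 95 and are dropped.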

Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/PfamFrag
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model           Description                             Score    E-value  N 
--------        -----------                             -----    ------- ---
Glycos_transf_1 Glycosyl transferases group 1            91.6    2.9e-24   1
PK_C            Pyruvate kinase, alpha/beta domain        2.3         24   1
RuvA            RuvA N terminal domain                    0.7         31   1
Bac_export_1    Bacterial export proteins, family 1       0.4         41   1
PA_phosphat_N   Purple acid phosphatase, N-terminal i     0.4         92   1
SRP54           SRP54-type protein, GTPase domain         0.1         68   1
denso_VP4       Capsid protein VP4                       -0.3         25   1
fer4_NifH       4Fe-4S iron sulfur cluster binding pr    -1.0         95   1

Parsed for domains:
Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
PA_phosphat_N     1/1       9    22 ..     1    14 [.     0.4       92
denso_VP4         1/1      74    81 ..   435   442 .]    -0.3       25
RuvA              1/1     129   134 ..    63    68 .]     0.7       31
fer4_NifH         1/1     247   259 ..   269   281 .]    -1.0       95
PK_C              1/1     288   299 ..     1    12 [.     2.3       24
SRP54             1/1     325   340 ..   200   215 .]     0.1       68
Glycos_transf_1   1/1     188   346 ..     1   180 [.    91.6  2.9e-24
Bac_export_1      1/1     399   427 ..   227   255 .]     0.4       41

Alignments of top-scoring domains:
PA_phosphat_N: domain 1 of 1, from 9 to 22: score 0.4, E = 92
                   *->dmpldsdvFrvppG<-*
                      ++ l+sd+F+ ++G   
      T20374     9    SIALVSDFFCPNAG    22   

denso_VP4: domain 1 of 1, from 74 to 81: score -0.3, E = 25
                   *->ngAtLGnv<-*
                      ngAtLG++   
      T20374    74    NGATLGSI    81   

RuvA: domain 1 of 1, from 129 to 134: score 0.7, E = 31
                   *->hlLYGF<-*
                      h+L+GF   
      T20374   129    HSLFGF    134  

fer4_NifH: domain 1 of 1, from 247 to 259: score -1.0, E = 95
                   *->eLeeLlvkfgimd<-*
                      eLee+l+ f + +   
      T20374   247    ELEEMLERFKLHE    259  

PK_C: domain 1 of 1, from 288 to 299: score 2.3, E = 24
                   *->tEaiAmSAVrAA<-*
                      tEa +mS V+AA   
      T20374   288    TEAFCMSIVEAA    299  

SRP54: domain 1 of 1, from 325 to 340: score 0.1, E = 68
                   *->LepFdperfvsrLLgm<-*
                      Le+  p+ +v++LL     
      T20374   325    LEEPVPDDLVDALLKA    340  

Glycos_transf_1: domain 1 of 1, from 188 to 346: score 91.6, E = 2.9e-24
                   *->dreeirkklgikedkkiilfvGRlvpeKGidllieAfkkLkkkpkll
                       + + ++ +++++++ +i+f GRlv++KG dll e ++k++++    
      T20374   188    TSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCAR---- 230  

                   klnpnlkLvivGgpYdsedgeeedelkklaeklglednviflGfvpdedl
                     +  ++++i G      dg+++ el+++ e   l ++v +lG +p++++
      T20374   231 --HKSVRFIIGG------DGPKRIELEEMLERFKLHERVVILGMLPHNQV 272  

                   pelyksadvfvlPSryEgFGivllEAmAcGlPVIatncvgGipEvvkdge
                   + ++++  +f+ +S +E+F+++++EA +cGl V++t  vgG+pEv+  ge
      T20374   273 KRVLNQGQIFINTSLTEAFCMSIVEAASCGLHVVSTR-VGGVPEVLPIGE 321  

                   tGllvepgqdpealaeaiekllkdeekkdllel<-*
                      l ep   p++l++a++k+   +      e+   
      T20374   322 FISLEEPV--PDDLVDALLKAVDRR------EK    346  

Bac_export_1: domain 1 of 1, from 399 to 427: score 0.4, E = 41
                   *->vglllLvlylpyilplfkeelsllfdlls<-*
                      +g+ ++++ ++ i++++  +l+l++++ +   
      T20374   399    IGFGIMYIVVSCIIIFWLTVLDLFDSPRK    427  

//

Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Prosite
---------------------------------------------------------
|          ppsearch (c) 1994 EMBL Data Library          |
|       based on MacPattern (c) 1990-1994 R. Fuchs      |
---------------------------------------------------------

PROSITE pattern search started: Thu Nov 22 14:10:45 2001

Sequence file: T20374.fa

----------------------------------------
Sequence T20374 (444 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
  284: NTSL
  428: NGTN
Total matches: 2

Matching pattern PS00005 PKC_PHOSPHO_SITE:
    2: SLK
  142: TNK
  163: TSK
  233: SVR
  307: STR
  353: TEK
  385: TGR
  424: SPR
  435: SEK
Total matches: 9

Matching pattern PS00006 CK2_PHOSPHO_SITE:
  163: TSKE
  286: SLTE
  294: SIVE
  324: SLEE
  417: TVLD
Total matches: 5

Matching pattern PS00008 MYRISTYL:
   22: GGVETH
   50: GNRKGI
   75: GATLGS
   79: GSIVGS
  117: GGLMGL
  398: GIGFGI
Total matches: 6

Total no of hits in this sequence: 22

========================================

1314 pattern(s) searched in 1 sequence(s), 444 residues.
Total no of hits in all sequences: 22.
Search time: 00:00 min
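
A minimal sketch of how two of the PROSITE pattern hits above can be reproduced with regular expressions. It assumes the usual PROSITE definitions PS00001 = N-{P}-[ST]-{P} and PS00005 = [ST]-x-[RK]; the helper name scan and the variable names are illustrative and are not part of ppsearch.

```python
import re

# Query sequence T20374 (444 residues), as listed at the top of this report.
SEQ = (
    "MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN"
    "GLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGGLM"
    "GLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTVLRGKLDPNKVS"
    "TIPNAIETSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGG"
    "DGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS"
    "CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLMDPTEKHEAVS"
    "KMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIGFGIMYIVVSCIIIFWLTVLD"
    "LFDSPRKNGTNDKTSEKNVDPDYQ"
)

# PROSITE patterns translated to Python regex syntax (assumed definitions).
PATTERNS = {
    "PS00001 ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "PS00005 PKC_PHOSPHO_SITE": r"[ST].[RK]",
}


def scan(pattern, seq):
    """Return (1-based start, matched residues) for every match.

    The lookahead wrapper lets overlapping matches be reported too.
    """
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({pattern}))", seq)]


for name, pattern in PATTERNS.items():
    hits = scan(pattern, SEQ)
    print(f"Matching pattern {name}:")
    for pos, motif in hits:
        print(f"{pos:5d}: {motif}")
    print(f"Total matches: {len(hits)}")
```

Positions are 1-based to match the ppsearch listing. These short patterns match by chance in most proteins, so the hits are candidates rather than evidence of modification.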

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with motif search against own library
     ***** bioMotif : Version V41a DB, 1999 Nov 11 *****
          SeqTyp=2 : PROTEIN  search; 


>APC D-Box is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>ER-GOLGI-traffic signal is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M minimal SH3 binding  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>EXTRACELL-M minimal furin protease cleavage site motif  is the MOTIF name

>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. ;LENGTH=444; DIRECT_SEQUENCE
n 1 solutions 
m %_RXXR 387-390
f

>STATISTICS Total   : 1 solutions in 1 sequences, 444 units;  out of 1 sequences, 444 units
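
A minimal sketch of the minimal furin site search reported above, assuming the motif label %_RXXR means an arginine, any two residues, then an arginine. The function name find_rxxr is hypothetical and the code is not the bioMotif implementation.

```python
import re

# Query sequence T20374 (444 residues), as listed at the top of this report.
SEQ = (
    "MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN"
    "GLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGGLM"
    "GLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTVLRGKLDPNKVS"
    "TIPNAIETSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGG"
    "DGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS"
    "CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLMDPTEKHEAVS"
    "KMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIGFGIMYIVVSCIIIFWLTVLD"
    "LFDSPRKNGTNDKTSEKNVDPDYQ"
)


def find_rxxr(seq):
    """Return (start, end, residues) for each R-x-x-R site, 1-based inclusive."""
    return [(m.start() + 1, m.start() + 4, m.group(1))
            for m in re.finditer(r"(?=(R..R))", seq)]


for start, end, motif in find_rxxr(SEQ):
    print(f"RXXR {start}-{end} {motif}")
```

On this sequence the scan gives the single site at 387-390 (RLGR), in agreement with the one solution listed above.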

>EXTRACELL-M extended furin protease cleavage site motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>EXTRACELL-M  zinc binding motif in MMPs is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>EXTRACELL-M g alpha binding go loco is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS SV40 LrgT experimentally determined  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS hGlu.cort. experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS  hDNAtopoII experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS  hDNAtopoII experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Nucleoplasmin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS Max experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>PDZ domain binding motif science 278_2075_pawson is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units

>WW domain binding motif science 278_2075_pawson is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 444 units


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with HMM search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm.lib
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm-f.lib
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB + PSSMs from other authors
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), 
"IMPALA: Matching a Protein Sequence Against a Collection of 
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
         (444 letters)

Searching..................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

ACYC Adenylyl/Guanylyl cyclase domain                              22  1.9
PUM Pumilio repeat RNA binding domain                              21  3.0
INSL Insulinase like Metallo protease domain                       21  3.5
RASGAP RAS-type GTPase GTP hydrolysis activating protein           21  3.7
CALC Calcineurin like Phosphoesterase domain                       21  4.3
ARM Armadillo repeat                                               21  4.4
MBL Metallo-betalactamase domain                                   20  4.4
CYCL cyclophilin like peptidyl prolyl isomerases                   20  4.7
S1  S1 RNA binding domain                                          21  4.8
UBA Ubiquitin pathway associated domain                            20  5.7
BRIGHT BRIGHT domain (Alpha helical DNA binding domain)            20  6.8
CYCLIN Cyclin/TFIIB domain                                         20  9.1

>ACYC Adenylyl/Guanylyl cyclase domain 
          Length = 244

 Score = 21.8 bits (46), Expect = 1.9
 Identities = 5/41 (12%), Positives = 5/41 (12%)

Query: 16  FFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIR 56
                                                    
Sbjct: 94  WFHGYNEATPAEIIQILHAVNRLQAMTAKLNQKYELPFPLR 134


>PUM Pumilio repeat RNA binding domain 
          Length = 337

 Score = 21.3 bits (45), Expect = 3.0
 Identities = 5/28 (17%), Positives = 5/28 (17%)

Query: 25  ETHIYFLAQCLIELGHRVVVITHGYGNR 52
                                       
Sbjct: 153 PSKFGFIIDAIVEQNNIITISTHKHGCC 180


>INSL Insulinase like Metallo protease domain 
          Length = 433

 Score = 21.0 bits (44), Expect = 3.5
 Identities = 23/124 (18%), Positives = 23/124 (18%), Gaps = 15/124 (12%)

Query: 161 SYTSKENTVLRGKLDPNKVSTIPNAIET---SLFTP--DRNQFFNNPTTIVFLGRLVYRK 215
                                                                       
Sbjct: 96  AGTSKDYTYYHVEIAHPYW---KQALEVLYQLTMKATLDEEMIEKEKPIVIEELRRGKDN 152

Query: 216 GADLLCEIVPKVCARHKSVRFIIGG--DGPKRIELEEMLERFKLHER-----VVILGMLP 268
                                                                       
Sbjct: 153 PTTVLWEEFEKLVYKVSPYRFPIIGFEETIRKFTREKLLKFYKSFYQPRNMAVVIVGKVN 212

Query: 269 HNQV 272
               
Sbjct: 213 PKEV 216


 Score = 19.5 bits (40), Expect = 9.3
 Identities = 7/14 (50%), Positives = 7/14 (50%)

Query: 55 IRYLSNGLKVYYLP 68
                        
Sbjct: 23 IRDLPNGAKLIVKP 36


>RASGAP RAS-type GTPase GTP hydrolysis activating protein  
          Length = 292

 Score = 20.9 bits (43), Expect = 3.7
 Identities = 8/14 (57%), Positives = 8/14 (57%)

Query: 331 DDLVDALLKAVDRR 344
                         
Sbjct: 25  DDLMNLLLESVDQR 38


>CALC Calcineurin like Phosphoesterase domain 
          Length = 274

 Score = 20.5 bits (42), Expect = 4.3
 Identities = 9/39 (23%), Positives = 9/39 (23%)

Query: 201 NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIG 239
                                                  
Sbjct: 54  EFDVILATGDLVQDSSDEGYIRFVEMMKPFNKPVFWIPG 92


>ARM Armadillo repeat 
          Length = 532

 Score = 20.6 bits (43), Expect = 4.4
 Identities = 10/35 (28%), Positives = 10/35 (28%)

Query: 269 HNQVKRVLNQGQIFINTSLTEAFCMSIVEAASCGL 303
                                              
Sbjct: 402 HDQIKYLVEQGCIKPLCDLLVCPDPRIITVCLEGL 436


>MBL Metallo-betalactamase domain 
          Length = 256

 Score = 20.5 bits (42), Expect = 4.4
 Identities = 5/39 (12%), Positives = 5/39 (12%), Gaps = 3/39 (7%)

Query: 229 ARHKSVRFIIGGDGPKR---IELEEMLERFKLHERVVIL 264
                                                  
Sbjct: 218 PAAIKAKMWLYGYQPGPLPPALEDGFLGFVKRGQRFDLV 256


>CYCL cyclophilin like peptidyl prolyl isomerases 
          Length = 165

 Score = 20.5 bits (43), Expect = 4.7
 Identities = 18/56 (32%), Positives = 18/56 (32%), Gaps = 12/56 (21%)

Query: 238 IGGDGPKRIELEEMLERFKL-HERVVILGM---LPHNQVKRVLNQGQIFINTSLTE 289
                                                                   
Sbjct: 73  TGGKSIYGEKFED--ENFILKHTGPGILSMANAGPNT------NGSQFFICTAKTE 120


>S1  S1 RNA binding domain 
          Length = 305

 Score = 20.6 bits (43), Expect = 4.8
 Identities = 19/129 (14%), Positives = 19/129 (14%), Gaps = 40/129 (31%)

Query: 244 KRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAASCGL 303
                                                                       
Sbjct: 180 RRIQQAESMGKIAAGNIYE-------GKVAKIQPYG-VFVEIEGVTGL-----------L 220

Query: 304 HV---VSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRRE-------KGLLMDPT 353
                                                                       
Sbjct: 221 HVSQVSGTRVDSLNTLFAFG-----------QAISVYVQEIDEYKNRISLSTRILETYPG 269

Query: 354 EKHEAVSKM 362
                    
Sbjct: 270 ELVEKFDEM 278


>UBA Ubiquitin pathway associated domain 
          Length = 255

 Score = 20.0 bits (41), Expect = 5.7
 Identities = 15/81 (18%), Positives = 15/81 (18%), Gaps = 3/81 (3%)

Query: 332 DLVDALLKAVDRREKGLLMDPTEKHEAVSKMYNWPDVAARTQVIYQ--KAVESEPTGRLG 389
                                                                       
Sbjct: 147 EALAPLLENISARYPQLREHIMANPEVFVSMLLEAVGDNMQDVMEGADDMVEGEDIEVTG 206

Query: 390 RLKGY-YDQGIGFGIMYIVVS 409
                                
Sbjct: 207 EAAAAGLGQGEGEGSFQVDYT 227


>BRIGHT BRIGHT domain (Alpha helical DNA binding domain) 
          Length = 172

 Score = 19.9 bits (41), Expect = 6.8
 Identities = 13/66 (19%), Positives = 13/66 (19%), Gaps = 8/66 (12%)

Query: 17  FCPNAGGVETHIYFLAQCLIELGHRVVV--------ITHGYGNRKGIRYLSNGLKVYYLP 68
                                                                       
Sbjct: 44  RLPIMAKSVLDLYELYNLVIARGGLVDVINKKLWQEIIKGLHLPSSITSAAFTLRTQYMK 103

Query: 69  FIVAYN 74
                 
Sbjct: 104 YLYPYE 109


>CYCLIN Cyclin/TFIIB domain 
          Length = 317

 Score = 19.5 bits (40), Expect = 9.1
 Identities = 11/94 (11%), Positives = 11/94 (11%), Gaps = 6/94 (6%)

Query: 274 RVLNQGQIFINTSLTEAFCMS---IVEAASCGLHVVSTRVGGVPEVLPIGEFISLEEPV- 329
                                                                       
Sbjct: 180 ILRKTADDFLNRIALTDAYLLYTPSQIALTA-ILSSASRAGITMESYLSESLMLKENRTC 238

Query: 330 PDDLVDALLKAVDRREKGLLMDPTEKHEAVSKMY 363
                                             
Sbjct: 239 LSQLLDIMKSMRNLVKK-YEPPRSEEVAVLKQKL 271


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 12 
Number of calls to ALIGN: 13 
Length of query: 444 
Total length of test sequences: 20182  
Effective length of test sequences: 16637.0
Effective search space size: 6828066.3
Initial X dropoff for ALIGN: 25.0 bits
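
Note on the E values: the Expect value of the top hit appears consistent with the usual BLAST-style relation E = N * 2^(-S), where S is the bit score and N is the "Effective search space size" printed in the statistics above. The sketch below is only a consistency check under that assumption; the function name is illustrative and not part of IMPALA. Other hits in the list deviate somewhat from this simple formula, so treat it as an approximation rather than the program's exact calculation.

```python
def expect_from_bits(bit_score: float, search_space: float) -> float:
    """Approximate expected number of chance hits: E = N * 2**(-S).

    Assumes the standard BLAST bit-score relation applies to this report.
    """
    return search_space * 2.0 ** (-bit_score)


# Values copied from the report above (ACYC hit, first profile search).
search_space = 6828066.3  # "Effective search space size"
acyc_bits = 21.8          # "Score = 21.8 bits (46)"

# Rounds to 1.9, the Expect value reported for the ACYC hit.
print(round(expect_from_bits(acyc_bits, search_space), 1))
```

With every hit in this list at E > 1, none of the matches would normally be considered significant.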

Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
         (444 letters)

Searching.................................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|585096 [110..367] Zn-dependent exopeptidases                    26  0.94
gi|1174715 [19..356] Thiamin-binding                               26  1.2
gi|1652715 [5..195] NAD(P)-binding Rossmann-fold domains           26  1.7
gi|1945717 [15..256] alpha/beta-Hydrolases                         26  1.7
gi|398985 [18..447] PLP-dependent transferases                     25  2.1
gi|2145124 [22..373] Serpins                                       25  2.2
gi|2414449 [42..312] alpha/beta-Hydrolases                         25  2.6
gi|2597838 [186..359] Cupredoxins                                  25  3.0
gi|1652197 [32..277] alpha/beta-Hydrolases                         24  4.1
gi|2127787 [36..400] Periplasmic binding protein-like I            24  4.5
gi|2246648 [5..196] NAD(P)-binding Rossmann-fold domains           24  4.7
gi|451954 [433..740] R1 subunit of ribonucleotide reductase,...    24  5.0
gi|1174715 [357..558] Thiamin-binding                              24  5.3
gi|442927 [3..324] FAD/NAD(P)-binding domain                       24  5.5
gi|1518938 [387..612] Heat shock protein 70kD (HSP70), C-ter...    24  6.7
gi|1345687 [59..410] Heme-dependent peroxidases                    23  8.0
gi|2194045 [11..394] Ferritin-like                                 23  8.9
gi|416581 [8..168] Ribonuclease H-like motif                       23  9.0

>gi|585096 [110..367] Zn-dependent exopeptidases 
          Length = 258

 Score = 26.4 bits (58), Expect = 0.94
 Identities = 7/46 (15%), Positives = 7/46 (15%), Gaps = 6/46 (13%)

Query: 23 GVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKVYYLP 68
                                                        
Sbjct: 59 TTPIIMTFLNDYLLAL------TNQTTIRGLSMGPLYNQTTLSLVP 98


>gi|1174715 [19..356] Thiamin-binding 
          Length = 338

 Score = 26.1 bits (57), Expect = 1.2
 Identities = 15/110 (13%), Positives = 15/110 (13%), Gaps = 20/110 (18%)

Query: 275 VLNQGQIFINTSLTEAFCMSIVE-AASCGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDL 333
                                                                       
Sbjct: 185 FYDHNQISIEGDTKITLCEDTAARYRAYGWHVQ--EVEGGEN--------------VVGI 228

Query: 334 VDALLKAVDRREKGLLMD-PTEKHEAVSKMYNWPDVAARTQVIYQKAVES 382
                                                             
Sbjct: 229 EEAIANAKAATDRPSFISLRTIIGYPAPTLINTG--KAHGAALGEDEVAA 276


>gi|1652715 [5..195] NAD(P)-binding Rossmann-fold domains 
          Length = 191

 Score = 25.8 bits (55), Expect = 1.7
 Identities = 7/52 (13%), Positives = 7/52 (13%), Gaps = 3/52 (5%)

Query: 21 AGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKVYYLPFIVA 72
                                                              
Sbjct: 14 NRGIGKVL---VESFLEHGAAKVYAAVRKLESAAFLVDKYGNKIVPILIDLA 62


>gi|1945717 [15..256] alpha/beta-Hydrolases 
          Length = 242

 Score = 25.5 bits (54), Expect = 1.7
 Identities = 10/81 (12%), Positives = 10/81 (12%), Gaps = 16/81 (19%)

Query: 39  GHRVVVITHGYGN-----RKGIRYLSNGLKVYYL----------PFIVAYNGATLGSIVG 83
                                                                       
Sbjct: 3   GKASIMFAPGFGCDQSVWNAVAPAFEEDHRVILFDYVGSGHSDLRAYDLNRYQTLDGYAQ 62

Query: 84  SMPWLRKVLLRENVQIIHGHS 104
                                
Sbjct: 63  DVLDVCEALDLKETVFV-GHS 82


>gi|398985 [18..447] PLP-dependent transferases 
          Length = 430

 Score = 25.2 bits (54), Expect = 2.1
 Identities = 21/183 (11%), Positives = 21/183 (11%), Gaps = 19/183 (10%)

Query: 6   GPYSIA-LVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKV 64
                                                                       
Sbjct: 241 DAYLLRLCLNVNKYPNWSNGIFLCQSFAKNMGLYGERVGSL---SVITPATANNGKFNPL 297

Query: 65  YYLPFIVAYNGATLGSIVGSMP-------------WLRKVLLRENVQIIHGHSTFSSLAH 111
                                                                       
Sbjct: 298 QQKNSLQQNIDSQLKKIVRGMYSSPPGYGSRVVNVVLSDFKLKQQWFKDVDFMVQRLHHV 357

Query: 112 ETLMIGGL--MGLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTV 169
                                                                       
Sbjct: 358 RQEMFDRLGWPDLVNFAQQHGMFYYTRFSPKQVEILRNNSFVYLTGDGRLSLSGVNDSNV 417

Query: 170 LRG 172
              
Sbjct: 418 DYL 420


>gi|2145124 [22..373] Serpins 
          Length = 352

 Score = 25.3 bits (55), Expect = 2.2
 Identities = 22/149 (14%), Positives = 22/149 (14%), Gaps = 21/149 (14%)

Query: 163 TSKENTVLRGKLDPNKVSTIPNAIE-----TSLFTPD--RNQFFNNPTTIV------FLG 209
                                                                       
Sbjct: 127 SGMSNVVDSTMLDDNTLWTIINTIYFKGTWQCPFDIAKTHNASFTNKYGTKTVPMMNVVT 186

Query: 210 RLVYRKGAD--LLCEIVPKVCARHKSVRFIIGGDGPKRIELE---EMLERF--KLHERVV 262
                                                                       
Sbjct: 187 KLQGNTITVDDEEYDMARLPYKDTNISMYLAIGDNMTHFTDSITAAKLDYWSSQLGNKMY 246

Query: 263 ILGMLPHNQVKRVLNQGQIFINTSLTEAF 291
                                        
Sbjct: 247 NL-KLPRFSIENKRDIKSIAEMIAPGMFN 274


>gi|2414449 [42..312] alpha/beta-Hydrolases 
          Length = 271

 Score = 24.9 bits (54), Expect = 2.6
 Identities = 21/133 (15%), Positives = 21/133 (15%), Gaps = 29/133 (21%)

Query: 103 HSTFSSLAHE---TLMIGGLMGLRTVFTDHSLFGFADAS---AILTNKLVLQYSLINV-- 154
                                                                       
Sbjct: 92  RSGHEKTWQYVQDALSISQYRNYDVYVTGHSL-GGALAGLCAPRIVHDGLRQSQKIKVVT 150

Query: 155 -------DQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPD------------- 194
                                                                       
Sbjct: 151 FGEPRVGNIEFSRAYDQLVPYSFRVVHSGDVVPHLPGCVKDLSYTPPAGSDGSMPCDPVS 210

Query: 195 RNQFFNNPTTIVF 207
                        
Sbjct: 211 TNGGYHHAIEIWY 223


>gi|2597838 [186..359] Cupredoxins 
          Length = 174

 Score = 24.8 bits (53), Expect = 3.0
 Identities = 17/51 (33%), Positives = 17/51 (33%), Gaps = 6/51 (11%)

Query: 68  PFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGG 118
                                                              
Sbjct: 49  PSHVVFNGK-VGALTGKNALTANV--GENVLIVHSQANRDSRPH---LIGG 93


>gi|1652197 [32..277] alpha/beta-Hydrolases 
          Length = 246

 Score = 24.3 bits (51), Expect = 4.1
 Identities = 7/89 (7%), Positives = 7/89 (7%), Gaps = 16/89 (17%)

Query: 17  FCPNAGGVETHIYFLAQCLIELGHRVVVITH-GYGNRKGIRYLSNGLKVYYLPFIVAYNG 75
                                                                       
Sbjct: 2   LLHGLPSQSLCWTGVMPLLAEKGLTAIAPDWLGFGFSDILD--------------KRDFA 47

Query: 76  ATLGSIVGSMPWLRKVLLRENVQIIHGHS 104
                                        
Sbjct: 48  YTTAAYEQALGEFFQSLELAKIFLV-VQG 75


>gi|2127787 [36..400] Periplasmic binding protein-like I 
          Length = 365

 Score = 24.2 bits (52), Expect = 4.5
 Identities = 4/46 (8%), Positives = 4/46 (8%)

Query: 233 SVRFIIGGDGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQ 278
                                                         
Sbjct: 175 DEIPYDPNIGDWSPIIQTTTNKIAGKGNDTGVIFIGYEEVATLLSQ 220


>gi|2246648 [5..196] NAD(P)-binding Rossmann-fold domains 
          Length = 192

 Score = 24.2 bits (51), Expect = 4.7
 Identities = 4/52 (7%), Positives = 4/52 (7%), Gaps = 3/52 (5%)

Query: 21 AGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKVYYLPFIVA 72
                                                              
Sbjct: 6  SNGLGRCW---TESVIHEYGDRVIGITRSVEAAQEMTALYPEHFIPCIADVR 54


>gi|451954 [433..740] R1 subunit of ribonucleotide reductase, C-terminal domain 
          Length = 308

 Score = 24.1 bits (52), Expect = 5.0
 Identities = 10/51 (19%), Positives = 10/51 (19%), Gaps = 5/51 (9%)

Query: 116 IG-GLMGLRTVFTDHSLFGFADASAILTNKLV---LQYSLINVDQTICVSY 162
                                                              
Sbjct: 84  LGICVTGLHSVFMTVGL-SYAHPDARRLYRMICEHIYYTCVRTSVDCCMKG 133


>gi|1174715 [357..558] Thiamin-binding 
          Length = 202

 Score = 24.1 bits (52), Expect = 5.3
 Identities = 5/23 (21%), Positives = 5/23 (21%)

Query: 104 STFSSLAHETLMIGGLMGLRTVF 126
                                  
Sbjct: 109 LQFSDYMRPSVRLASLMDIDTIY 131


>gi|442927 [3..324] FAD/NAD(P)-binding domain 
          Length = 322

 Score = 23.8 bits (51), Expect = 5.5
 Identities = 10/57 (17%), Positives = 10/57 (17%), Gaps = 4/57 (7%)

Query: 47  HGYGNRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGH 103
                                                                    
Sbjct: 194 RGVPTKKDFGC-GDPHGVSMFPNTLHEDQVRSDAARE---WLLPNYQRPNLQVLTGQ 246


>gi|1518938 [387..612] Heat shock protein 70kD (HSP70), C-terminal, substrate-binding fragment 
          Length = 226

 Score = 23.7 bits (51), Expect = 6.7
 Identities = 11/35 (31%), Positives = 11/35 (31%)

Query: 163 TSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQ 197
                                              
Sbjct: 162 ASVEDDKVGGKLSAEDKKTILDKCSESLSWLDNNH 196


>gi|1345687 [59..410] Heme-dependent peroxidases 
          Length = 352

 Score = 23.2 bits (49), Expect = 8.0
 Identities = 14/39 (35%), Positives = 14/39 (35%)

Query: 340 AVDRREKGLLMDPTEKHEAVSKMYNWPDVAARTQVIYQK 378
                                                  
Sbjct: 291 AVDPDEKDLAPDAEDPSKKVPTMMMTTDLALRFDPEYEK 329


>gi|2194045 [11..394] Ferritin-like 
          Length = 384

 Score = 23.0 bits (49), Expect = 8.9
 Identities = 6/34 (17%), Positives = 6/34 (17%)

Query: 331 DDLVDALLKAVDRREKGLLMDPTEKHEAVSKMYN 364
                                             
Sbjct: 110 ARYTQRFLAAYSSEGSIRTIDPYWRDEILNKYFG 143


>gi|416581 [8..168] Ribonuclease H-like motif 
          Length = 161

 Score = 23.2 bits (49), Expect = 9.0
 Identities = 14/46 (30%), Positives = 14/46 (30%), Gaps = 3/46 (6%)

Query: 43 VVITHGYGNRKGIRYLSNGLKVYYLPFIVAYN--GATLGSIVGSMP 86
                                                        
Sbjct: 1  IIMDNGTGYSK-LGYAGNDAPSYVFPTVIATRSAGASSGPAVSSKP 45


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 18 
Number of calls to ALIGN: 18 
Length of query: 444 
Total length of test sequences: 256703  
Effective length of test sequences: 209547.0
Effective search space size: 84727213.0
Initial X dropoff for ALIGN: 25.0 bits

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

calculation of internal repeats with prospero
***** PROSPERO v1.3  Thu Nov 22 14:11:26 2001 *****

Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford
For help see http://www.well.ox.ac.uk/ariadne  For usage use -help
using gap penalty 11+1k
using matrix BLOSUM62
printing all alignments with eval < 0.100000
using sequence1 T20374
using self-comparison


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

TIGRFAM
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model     Description                                   Score    E-value  N 
--------  -----------                                   -----    ------- ---
TIGR00045 TIGR00045: conserved hypothetical protein T    -0.5         45   1
TIGR00118 acolac_lg: acetolactate synthase, biosynthe  -511.5         78   1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
TIGR00045   1/1     271   280 ..   372   381 .]    -0.5       45
TIGR00118   1/1      88   378 ..     1   593 []  -511.5       78

Alignments of top-scoring domains:
TIGR00045: domain 1 of 1, from 271 to 280: score -0.5, E = 45
                   *->NvAqvLaigq<-*
                      +v++vL +gq   
      T20374   271    QVKRVLNQGQ    280  

TIGR00118: domain 1 of 1, from 88 to 378: score -511.5, E = 78
                   *->msGAeaiveSLkdeGVetVFGYPGGAiLPiYDaLYrfetdsgieHIL
                            + + L +e+V ++ G             +   t s + H  
      T20374    88    ------LRKVLLRENVQIIHG-------------H--STFSSLAH-- 111  

                   vRHEQgAvHAADGYARASGKvGVvlaTSGPGATNlVTGIAtAYmDSvPlV
                                                                     
      T20374     - -------------------------------------------------- -    

                   VfTGQVpTslIGsDAFQEaDilGItmPiTKHS.fqVksaeDlP...riik
                          T +IG          G+    T HS f   +a+ +++++ ++ 
      T20374   112 ------ETLMIGG-------LMGLRTVFTDHSlFGFADASAILtnkLVLQ 148  

                   eAFhIAtTGRPGPVlvDLPKDvttaeiefpyddPekvnLPGYkPtveGhp
                    +             vD     +t+   +     e + L+G   ++ +++
      T20374   149 YSL----------INVD-----QTICVSYTSK--ENTVLRG---KLDPNK 178  

                   lQDeFvmqsIkKAaeLiekAkKPVilvGGGvIniagAseeLkelAErlqi
                                                    i +A e  + + +r q 
      T20374   179 VS------------------------------TIPNAIETSLFTPDRNQF 198  

                   PVttTLmGlGsFPedHPlsLGMLGMHGTktANlAvhEcDLlIAVGaRFDD
                   +             + P    +L   G+++       +DLl  +      
      T20374   199 F-------------NNPTTIVFL---GRLV---YRKGADLLCEI------ 223  

                   RvTGNlakFAPnAKRaaaeGRGGIIHIDIDPaeIGKnVrvdIPIVGDArn
                        + k + + K                       Vr  I   G  r+
      T20374   224 -----VPKVCARHK----------------------SVRFIIGGDGPKRI 246  

                   VLeeLlkklekekalkerseeqaWleqInkWKkeyplaYmdyteegkiKP
                    Lee+l++ +      er                 +l  +++  ++    
      T20374   247 ELEEMLERFKLH----ERVV---------------ILGMLPHNQVK---- 273  

                   QqVIeeisrvtkdigreAiVTTDVGQHQMWAAqFypfkkPRkfItSGGLG
                           rv++            GQ       F ++             
      T20374   274 --------RVLNQ-----------GQ------IFINTS------------ 286  

                   TMGFGlPAAiGAkVAkPeetVicitGDGSFqMnlQELsTivqYdiPVkiv
                                  + e+          F M + E +++   ++ V+  
      T20374   287 ---------------LTEA----------FCMSIVEAASC---GLHVV-- 306  

                   ILNNryLGMVrQWQeLFYeeRySethmgselPDFvkLAEaYGikGirIek
                                           t  g ++P    + E    + +    
      T20374   307 -----------------------STRVG-GVPEVLPIGE---FISLEEPV 329  

                   peEldeKLkEAleskrnNePVllDvvVDkseeVyPMV..aPGggLdEmig
                   p++l ++L  A+  +   +  l+D + +  e V  M + +  ++ ++ i 
      T20374   330 PDDLVDALLKAVDRR--EKGLLMDPTEK-HEAVSKMYnwPDVAARTQVIY 376  

                   ek<-*
                    k   
      T20374   377 QK    378  

//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm-f
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model     Description                                   Score    E-value  N 
--------  -----------                                   -----    ------- ---
TIGR00008 infA: translation initiation factor IF-1        1.7         26   1
TIGR00490 aEF-2: translation elongation factor aEF-2     -0.2         17   1
TIGR00045 TIGR00045: conserved hypothetical protein T    -0.5         33   1
TIGR00282 TIGR00282: conserved hypothetical protein T    -0.5         48   1
TIGR00381 cdhD: CO dehydrogenase/acetyl-CoA synthase,    -1.4         37   1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
TIGR00008   1/1     208   215 ..    62    69 .]     1.7       26
TIGR00282   1/1     203   241 ..     1    39 [.    -0.5       48
TIGR00045   1/1     271   280 ..   372   381 .]    -0.5       33
TIGR00490   1/1     327   351 ..   701   724 .]    -0.2       17
TIGR00381   1/1     337   353 ..   265   281 ..    -1.4       37

Alignments of top-scoring domains:
TIGR00008: domain 1 of 1, from 208 to 215: score 1.7, E = 26
                   *->rGRIiyRl<-*
                      +GR +yR+   
      T20374   208    LGRLVYRK    215  

TIGR00282: domain 1 of 1, from 203 to 241: score -0.5, E = 48
                   *->ikvlflGdvyGkaGrkivkenlpklknkykpdlviange<-*
                        ++flG ++ + G +++ e +pk+  ++k    i+ g+   
      T20374   203    TTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGD    241  

TIGR00045: domain 1 of 1, from 271 to 280: score -0.5, E = 33
                   *->NvAqvLaigq<-*
                      +v++vL +gq   
      T20374   271    QVKRVLNQGQ    280  

TIGR00490: domain 1 of 1, from 327 to 351: score -0.2, E = 17
                   *->EkvPrelqeelvkev.RkRKGLklE<-*
                      E vP +l + l+k v+R+ KGL ++   
      T20374   327    EPVPDDLVDALLKAVdRREKGLLMD    351  

TIGR00381: domain 1 of 1, from 337 to 353: score -1.4, E = 37
                   *->LLkrglkpedSIVMDPT<-*
                      LLk++  +e    MDPT   
      T20374   337    LLKAVDRREKGLLMDPT    353  

//
SMART
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/iprscan/data/smart.HMMs
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
LysM     Lysin motif                                    -10.4         49   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
LysM       1/1     122   161 ..     1    45 []   -10.4       49

Alignments of top-scoring domains:
LysM: domain 1 of 1, from 122 to 161: score -10.4, E = 49
                   *->tYtVkkGDTLssIArkygvsvkdLlklNpilnpdnlyvGQkLkip<-
                        tV    +L++ A +  + +++L         + + v+Q+++++  
      T20374   122    LRTVFTDHSLFGFADASAILTNKLVLQY-----SLINVDQTICVS   161  

                   *
                    
      T20374     -   -    

//
COG
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
COG0438                                                 126.3    5.6e-34   1
COG0297                                                -262.0       0.11   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG0297    1/1       6   384 ..     1   556 []  -262.0     0.11
COG0438    1/1     201   384 ..     1   255 []   126.3  5.6e-34

Alignments of top-scoring domains:
COG0297: domain 1 of 1, from 6 to 384: score -262.0, E = 0.11
                   *->lnsqdryserMkILfvasEvtPfvKvGGLADVlgaLPkaLkklGhdV
                          ++ys     + ++s ++    +GG    ++ L++ L+ lGh V
      T20374     6    ----GPYS-----IALVSDFFC-PNAGGVETHIYFLAQCLIELGHRV 42   

                   rVlLPkYgriqgepieqlykvsegetvavvgreqqfdvlesyldGt.vgl
                   +V++  Yg  +++++ +     + ++++ +    ++ ++ +   G+ vg 
      T20374    43 VVITHGYGN-RKGIRYLS---NGLKVYYLP----FIVAYNGATLGSiVGS 84   

                   ylidKndyyfnregnPYhDanlygypDnaeRFafFsaAalelldgldpfw
                   +  ++ + + ++                    ++ ++ + + l ++    
      T20374    85 MPWLRKVLLRENVQ------------------IIHGHSTFSSLAHET--- 113  

                   qPDiVHaHDWhTGLvpalLKteyrklPFfervKtVFTIHNLaYQGEmIEY
                                ++++l+     +l+      tVFT H L   G     
      T20374   114 ------------LMIGGLM-----GLR------TVFTDHSLF--G----- 133  

                   GEVmTFLifpahylhellglplylfhyeglefpGqinflKaGivfaDhVT
                        F  ++a +++  l+l   l +      +++i              
      T20374   134 -----FADASAILTNK-LVLQYSLIN-----VDQTI-------------- 158  

                   TVSPTYAqEIqTpeygygLeglLkarssegklsGILNGIDyeiWnPetDp
                    VS           y  +   +L+ +  + k+s I N I ++ + P    
      T20374   159 CVS-----------YTSKENTVLRGKLDPNKVSTIPNAIETSLFTPD--- 194  

                   ylaanYdagsledpvlFkkKaeNKtaLqeelGLpedddaPligiVsRLte
                          ++++                         ++ + i + +RL+ 
      T20374   195 -------RNQFF------------------------NNPTTIVFLGRLVY 213  

                   QKGvdLlleiideLlekEFqdaqlViLGtGdPeLE.nafrnlaerhpdsg
                    KG dLl ei +    +  + ++++i+G G   +E +++++  + h    
      T20374   214 RKGADLLCEIVPKVCAR-HKSVRFIIGGDGPKRIElEEMLERFKLHE--- 259  

                   nvavligfdepLArriYAGaDfilMPSrFEPCGLtQLiaMrYGTvPIVRe
                   +v++l ++    + r+     +++  S  E++  +  +a   G  ++ ++
      T20374   260 RVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAASCGLHVVSTR 309  

                   TGGLaDTVvdldydeenleekgtGflFkepdaeallnalsRAla.....l
                   +GG    V++    ++     g ++  +ep + +l +al +A+ +++++l
      T20374   310 VGG----VPE--VLPI-----GEFISLEEPVPDDLVDALLKAVDrrekgL 348  

                   YrqelNEICmFmQYIRYCPHpdewqnlvtraMaNCYYHVFadfSWdkSPA
                    ++                 p e+   v+++++           W +  A
      T20374   349 LMD-----------------PTEKHEAVSKMYN-----------WPDV-A 369  

                   keYvelYegllaktrd<-*
                    + + +Y++++ + ++   
      T20374   370 ARTQVIYQKAVES-EP    384  

COG0438: domain 1 of 1, from 201 to 384: score 126.3, E = 5.6e-34
                   *->dkpvilfvGRlvpeKgldllieafaklkeeipellpdlklvivGgts
                      ++ +i+f+GRlv++Kg dll e+++k+++++      ++++i G   
      T20374   201    NPTTIVFLGRLVYRKGADLLCEIVPKVCARH----KSVRFIIGG--- 240  

                   yiaaeacdGpeeerlrlleklakklglednVeflGfvpdprvldeelpel
                          dGp++ +   le++ ++ +l ++V+ lG +p     + ++ ++
      T20374   241 -------DGPKRIE---LEEMLERFKLHERVVILGMLP-----HNQVKRV 275  

                   lkaadvfvlPSrysekrgedrEgfglvllEAmAaGtPViatdvgslelga
                   l++  +f+ +S++        E+f+++++EA ++G+ V++t+vg      
      T20374   276 LNQGQIFINTSLT--------EAFCMSIVEAASCGLHVVSTRVG------ 311  

                   neereladkGipEvvedgarylfgenGregkrrlnlllvdpgdeddidsi
                            G+pEv+ +         G++       +  ++  +d     
      T20374   312 ---------GVPEVL-P--------IGEF-------ISLEEPVPD----- 331  

                   ealaeaierlledpelreregvsllgrearrrvaerfswekiakrllkly
                    +l++a++++ + +e+   ++      e+ ++v++ ++w  +a r++ +y
      T20374   332 -DLVDALLKAVDRREKGLLMD----PTEKHEAVSKMYNWPDVAARTQVIY 376  

                   eellekre<-*
                   ++++e ++   
      T20374   377 QKAVESEP    384  

//
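
Note on reading the domain table: of the two COG models reported above, only COG0438 has an E-value far below 1. A minimal sketch of how the "Parsed for domains" rows of an hmmpfam (HMMER 2.1.1) report could be filtered by E-value follows. The regular expression is inferred from the column layout shown in this file, the names `DomainHit` and `parse_domains` are hypothetical, and the 1e-3 cutoff is an arbitrary assumption, not a value taken from the report.

```python
import re
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    model: str
    seq_from: int
    seq_to: int
    score: float
    evalue: float


# Columns: model, domain n/m, seq-f, seq-t, 2-char flag,
#          hmm-f, hmm-t, 2-char flag, score, E-value
ROW = re.compile(
    r"^(\S+)\s+\d+/\d+\s+(\d+)\s+(\d+)\s+\S\S"
    r"\s+\d+\s+\d+\s+\S\S\s+(-?[\d.]+)\s+(\S+)\s*$"
)


def parse_domains(text: str) -> List[DomainHit]:
    """Collect every line that looks like a 'Parsed for domains' row."""
    hits = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            hits.append(DomainHit(m.group(1), int(m.group(2)),
                                  int(m.group(3)), float(m.group(4)),
                                  float(m.group(5))))
    return hits


# Rows copied from the COG search above.
report = """\
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG0297    1/1       6   384 ..     1   556 []  -262.0     0.11
COG0438    1/1     201   384 ..     1   255 []   126.3  5.6e-34
"""

# Assumed cutoff: keep only hits with E-value below 1e-3.
significant = [h for h in parse_domains(report) if h.evalue < 1e-3]
for h in significant:
    print(h.model, h.seq_from, h.seq_to, h.evalue)
```

Under this cutoff only the COG0438 match to residues 201 to 384 is kept; the COG0297 row (score -262.0, E = 0.11) is dropped.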
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm-f
Sequence file:            T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T20374  hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
COG0438                                                 124.6    7.6e-38   1
COG0564                                                   2.0        9.9   1
COG0212                                                   0.6         40   1
COG0840                                                   0.5         52   1
COG0334                                                   0.1         29   1
COG3201                                                  -0.3         54   1
COG2823                                                  -0.6         90   1
COG2119                                                  -1.0         81   1
COG0524                                                  -1.1         88   1
COG2403                                                  -1.6         87   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG0334    1/1      29    45 ..   236   253 ..     0.1       29
COG2403    1/1      31    47 ..   161   177 ..    -1.6       87
COG0524    1/1      18    48 ..   224   253 ..    -1.1       88
COG3201    1/1      77   101 ..   233   257 .]    -0.3       54
COG2119    1/1     119   133 ..   244   258 .]    -1.0       81
COG0840    1/1     247   257 ..   383   393 .]     0.5       52
COG2823    1/1     272   284 ..   198   210 .]    -0.6       90
COG0564    1/1     319   345 ..   315   341 .]     2.0      9.9
COG0438    1/1     201   384 ..     1   255 []   124.6  7.6e-38
COG0212    1/1     387   397 ..   149   159 ..     0.6       40

Alignments of top-scoring domains:
COG0334: domain 1 of 1, from 29 to 45: score 0.1, E = 29
                   *->qyaAeklleesGAkVVav<-*
                      +++A++l+e+ G++VV++   
      T20374    29    YFLAQCLIEL-GHRVVVI    45   

COG2403: domain 1 of 1, from 31 to 47: score -1.6, E = 87
                   *->vAqlLrelGyrVvavRH<-*
                      +Aq L elG rVv++ H   
      T20374    31    LAQCLIELGHRVVVITH    47   

COG0524: domain 1 of 1, from 18 to 48: score -1.1, E = 88
                   *->desad.l.r.aeaaaarlllnekgaklVvvTlG<-*
                      +++a +++++++  a++l   e+g ++Vv+T+G   
      T20374    18    CPNAGgVeThIYFLAQCLI--ELGHRVVVITHG    48   

COG3201: domain 1 of 1, from 77 to 101: score -0.3, E = 54
                   *->tflailglriWlrdaalResrAlkq<-*
                      t+  i g+  Wlr   lRe++   +   
      T20374    77    TLGSIVGSMPWLRKVLLRENVQIIH    101  

COG2119: domain 1 of 1, from 119 to 133: score -1.0, E = 81
                   *->lfALlllwdvaegvs<-*
                      l++L++ +++ ++++   
      T20374   119    LMGLRTVFTDHSLFG    133  

COG0840: domain 1 of 1, from 247 to 257: score 0.5, E = 52
                   *->eLqelverFkv<-*
                      eL+e++erFk+   
      T20374   247    ELEEMLERFKL    257  

COG2823: domain 1 of 1, from 272 to 284: score -0.6, E = 90
                   *->VKkVvklfkkyvn<-*
                      VK+V + +++++n   
      T20374   272    VKRVLNQGQIFIN    284  

COG0564: domain 1 of 1, from 319 to 345: score 2.0, E = 9.9
                   *->ngeemefeaplpedflellvkllkeei<-*
                      +ge++ +e p+p+d++ +l k +++     
      T20374   319    IGEFISLEEPVPDDLVDALLKAVDRRE    345  

COG0438: domain 1 of 1, from 201 to 384: score 124.6, E = 7.6e-38
                   *->dkpvilfvGRlvpeKgldllieafaklkeeipellpdlklvivGgts
                      ++ +i+f+GRlv++Kg dll e+++k+++++      ++++i G   
      T20374   201    NPTTIVFLGRLVYRKGADLLCEIVPKVCARH----KSVRFIIGG--- 240  

                   yiaaeacdGpeeerlrlleklakklglednVeflGfvpdprvldeelpel
                          dGp++ +   le++ ++ +l ++V+ lG +p     + ++ ++
      T20374   241 -------DGPKRIE---LEEMLERFKLHERVVILGMLP-----HNQVKRV 275  

                   lkaadvfvlPSrysekrgedrEgfglvllEAmAaGtPViatdvgslelga
                   l++  +f+ +S++        E+f+++++EA ++G+ V++t+vg      
      T20374   276 LNQGQIFINTSLT--------EAFCMSIVEAASCGLHVVSTRVG------ 311  

                   neereladkGipEvvedgarylfgenGregkrrlnlllvdpgdeddidsi
                            G+pEv+ +         G++       +  ++  +d     
      T20374   312 ---------GVPEVL-P--------IGEF-------ISLEEPVPD----- 331  

                   ealaeaierlledpelreregvsllgrearrrvaerfswekiakrllkly
                    +l++a++++ + +e+   ++      e+ ++v++ ++w  +a r++ +y
      T20374   332 -DLVDALLKAVDRREKGLLMD----PTEKHEAVSKMYNWPDVAARTQVIY 376  

                   eellekre<-*
                   ++++e ++   
      T20374   377 QKAVESEP    384  

COG0212: domain 1 of 1, from 387 to 397: score 0.6, E = 40
                   *->RLGyGgGYYDR<-*
                      RLG+ +GYYD+   
      T20374   387    RLGRLKGYYDQ    397  

//