analysis of sequence from T00731.fa
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR
LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV
LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR
RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL
AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD
SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG
LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR
PPRESGEA

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

secondary structure prediction with predator

> T00731
              .         .         .         .         .
1    MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFS   50
     ___HHHHHHHHHH________EEE___EEEEEEE______HHHHHHHHHH

              .         .         .         .         .
51   LLGVSRTVKRLGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYG  100
     H_____EEEE_____HHHHHHHH___HHHHH____EEEE___________

              .         .         .         .         .
101  DNVEVDYRGYEVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGH  150
     ___________EEEHHHHHHH__________HHHHH_____EEEEEE___

              .         .         .         .         .
151  GGDEFLKFQDAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQ  200
     ____EEE_HHHHHHHH_HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

              .         .         .         .         .
201  LQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTLAFFERLNIYD  250
     H_____EEE_________EEEEE_____________HHHHHHHHHH____

              .         .         .         .         .
251  NASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD  300
     ______________HHHHHHHH_______EEEEEE_____HHHHHHH___

              .         .         .         .         .
301  SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVY  350
     __HHHHHHHHHHHHHH_________HHHHHHHHH________HHHHHHEE

              .         .         .         .         .
351  TLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIE  400
     EE_____EEEHHHHHHHH___EEEE_____HHHHHHHHHHHHH_____EE

              .         .        
401  IPCYVVVAEAKMPIFTNAGRPPRESGEA                        428
     E__EEEEEE___________________
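
The per-residue states in the predator block above can be tallied directly. The following is a minimal Python sketch, assuming H = helix, E = strand and _ = coil. The function name state_fractions is hypothetical, and the tally is not expected to reproduce the "method 1" / "method 2" content estimates reported further down, which may come from a separate predictor.

```python
from collections import Counter


def state_fractions(states):
    """Return the percentage of each state symbol in a prediction string."""
    counts = Counter(states)
    total = len(states)
    return {symbol: 100.0 * n / total for symbol, n in counts.items()}


# State line for residues 1-50, copied from the predator block above.
block = "___HHHHHHHHHH________EEE___EEEEEEE______HHHHHHHHHH"
fractions = state_fractions(block)
for symbol in sorted(fractions):
    print(symbol, round(fractions[symbol], 1))
```

Concatenating all nine state lines before calling the function would give whole-sequence percentages.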


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~


method         :         1
alpha-contents :      29.0 %
beta-contents  :      44.0 %
coil-contents  :      27.0 %
class          :     mixed


method         :         2
alpha-contents :      33.1 %
beta-contents  :      31.2 %
coil-contents  :      35.7 %
class          :     mixed


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

GPI: learning from metazoa
-17.80  -0.33  -0.56  -0.03  -4.00   0.00   0.00   0.00  -1.47 -10.02  -3.65 -12.00 -12.00  -8.00 -12.00   0.00  -81.86
  2.19   0.00   0.00   0.00   0.00   0.00 -24.00   0.00  -0.44 -11.85  -3.65 -12.00 -12.00   0.00 -12.00   0.00  -73.75
ID: T00731	AC: xxx Len:  428 1:I   417 Sc:  -73.75 Pv: 6.472730e-01 NO_GPI_SITE
GPI: learning from protozoa
-13.59   0.00  -0.08   0.00  -4.00   0.00 -24.00   0.00  -0.06 -10.01 -12.33 -12.00 -12.00   0.00 -12.00   0.00  -100.07
-18.10  -0.55   0.00   0.00  -4.00   0.00 -28.00   0.00   0.00 -10.01 -12.33 -12.00   0.00   0.00 -12.00   0.00  -96.99
ID: T00731	AC: xxx Len:  428 1:I   418 Sc:  -96.99 Pv: 8.183256e-01 NO_GPI_SITE

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

# SignalP euk predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T00731       0.765 399 Y  0.708 399 Y  0.968   5 Y  0.200 N
# SignalP gram- predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T00731       0.472 209 N  0.305  19 N  0.992   9 Y  0.882 Y
# SignalP gram+ predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T00731       0.514 385 Y  0.431  61 Y  0.991  45 Y  0.725 Y

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

low complexity regions: SEG 12 2.2 2.5
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.

                                  1-379  MKILTLVMLLCYSFVSSTGDTTIHTNNWAV
                                         LVCTSRFCSLHSLVLTFIFSLLGVSRTVKR
                                         LGIPDERIILMLADDMACNARNEYPAQVFN
                                         NENHKLNLYGDNVEVDYRGYEVTVENFLRV
                                         LTGRHENAVPRSKRLLSDEGSHILLYMTGH
                                         GGDEFLKFQDAEELQSHDLADAVKQMKEKR
                                         RFKELMIMVDTCQAATLFNQLQSPGVLAIG
                                         SSLKGENSYSHHLDSDIGVSVVDRFTYYTL
                                         AFFERLNIYDNASLNSLFRSYDPRLLMSTA
                                         YYRTDLYQPHLVEVPVTNFFGSVMETIHTD
                                         SAYKAFSSKISERKINSEMPFNQLSEHDLK
                                         EELENTNIPNDELIAEVTVYTLFPGLSYFG
                                         LSTLLRYMNLSRVRVLSMI
               ddvfafwlvfvllld  380-394  
                                395-428  STNRIEIPCYVVVAEAKMPIFTNAGRPPRE
                                         SGEA

low complexity regions: SEG 25 3.0 3.3
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.

                                  1-428  MKILTLVMLLCYSFVSSTGDTTIHTNNWAV
                                         LVCTSRFCSLHSLVLTFIFSLLGVSRTVKR
                                         LGIPDERIILMLADDMACNARNEYPAQVFN
                                         NENHKLNLYGDNVEVDYRGYEVTVENFLRV
                                         LTGRHENAVPRSKRLLSDEGSHILLYMTGH
                                         GGDEFLKFQDAEELQSHDLADAVKQMKEKR
                                         RFKELMIMVDTCQAATLFNQLQSPGVLAIG
                                         SSLKGENSYSHHLDSDIGVSVVDRFTYYTL
                                         AFFERLNIYDNASLNSLFRSYDPRLLMSTA
                                         YYRTDLYQPHLVEVPVTNFFGSVMETIHTD
                                         SAYKAFSSKISERKINSEMPFNQLSEHDLK
                                         EELENTNIPNDELIAEVTVYTLFPGLSYFG
                                         LSTLLRYMNLSRVRVLSMIDDVFAFWLVFV
                                         LLLDSTNRIEIPCYVVVAEAKMPIFTNAGR
                                         PPRESGEA

low complexity regions: SEG 45 3.4 3.75
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.

                                  1-428  MKILTLVMLLCYSFVSSTGDTTIHTNNWAV
                                         LVCTSRFCSLHSLVLTFIFSLLGVSRTVKR
                                         LGIPDERIILMLADDMACNARNEYPAQVFN
                                         NENHKLNLYGDNVEVDYRGYEVTVENFLRV
                                         LTGRHENAVPRSKRLLSDEGSHILLYMTGH
                                         GGDEFLKFQDAEELQSHDLADAVKQMKEKR
                                         RFKELMIMVDTCQAATLFNQLQSPGVLAIG
                                         SSLKGENSYSHHLDSDIGVSVVDRFTYYTL
                                         AFFERLNIYDNASLNSLFRSYDPRLLMSTA
                                         YYRTDLYQPHLVEVPVTNFFGSVMETIHTD
                                         SAYKAFSSKISERKINSEMPFNQLSEHDLK
                                         EELENTNIPNDELIAEVTVYTLFPGLSYFG
                                         LSTLLRYMNLSRVRVLSMIDDVFAFWLVFV
                                         LLLDSTNRIEIPCYVVVAEAKMPIFTNAGR
                                         PPRESGEA


low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFSLLGVSRTVKR
LGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRV
LTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR
RFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTL
AFFERLNIYDNASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD
SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVYTLFPGLSYFG
LSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIEIPCYVVVAEAKMPIFTNAGR
PPRESGEA
    1 -  428 MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR 
             LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV 
             LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR 
             RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL 
             AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD 
             SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG 
             LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR 
             PPRESGEA

low complexity regions: DUST
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFSLLGVSRTVKR
LGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRV
LTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR
RFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTL
AFFERLNIYDNASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD
SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVYTLFPGLSYFG
LSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIEIPCYVVVAEAKMPIFTNAGR
PPRESGEA

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

coiled coil prediction for T00731
sequence: 428 amino acids, 0 residue(s) in coiled coil state

    .    |     .    |     .    |     .    |     .    |     .   60
MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  120
LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  180
LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~3 3333333333 3333333333 * 21 M'95 -w border
---------- ---------- ---------- ---------b cdefgabcde fgabcdefga * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~1 1111111111 1111111111 * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~1 1111111111 1111111111 * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~2555 5555555555 * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  240
RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL
2~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
b--------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
1~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
1111111~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
5222~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  300
AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  360
SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~1111 1111111111 11111111~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ------bcde fgabcdefga bcdefgab-- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~4444 4444444444 444444431~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~111 1111111111 11111111~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~11111111 111111~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  420
LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .   
PPRESGEA
~~~~~~~~
--------
~~~~~~~~
~~~~~~~~
~~~~~~~~
~~~~~~~~


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

prediction of transmembrane regions with toppred2

     ***********************************
     *TOPPREDM with eukaryotic function*
     ***********************************

T00731.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: T00731.fa.___inter___

 (1 sequences)
MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFS
LLGVSRTVKRLGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYG
DNVEVDYRGYEVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGH
GGDEFLKFQDAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQ
LQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTLAFFERLNIYD
NASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD
SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVY
TLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIE
IPCYVVVAEAKMPIFTNAGRPPRESGEA


(p)rokaryotic or (e)ukaryotic: e


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 4 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1     1    21   1.555 Certain
     2    37    57   1.664 Certain
     3   193   213   0.619 Putative
     4   345   365   1.446 Certain
     5   375   395   0.997 Putative

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     2     4     5
 Loop length     0    15   287     9    33
 K+R profile  2.00           +        4.00      
                    1.00        3.00      
CYT-EXT prof     -        0.50           -      
                       -           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 2.00
Tm probability: 0.99
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:   0.50
-> Orientation: N-out

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       1     2     3     4
 Loop length     0    15   135   131    63
 K+R profile  2.00           +           +      
                    1.00           +      
CYT-EXT prof     -       -0.28        0.93      
                       -        0.80      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.05
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.16
-> Orientation: N-in

----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
     Segment       1     2     4
 Loop length     0    15   287    63
 K+R profile  2.00           +      
                    1.00           +      
CYT-EXT prof     -        0.50      
                       -        0.93      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 1.00
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.43
-> Orientation: N-in

----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
     Segment       1     2     3     4     5
 Loop length     0    15   135   131     9    33
 K+R profile  2.00           +        3.00      
                    1.00           +        4.00      
CYT-EXT prof     -       -0.28           -      
                       -        0.80           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.05
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -1.09
-> Orientation: N-in

----------------------------------------------------------------------

"T00731" 428 
 1 21 #t 1.55521
 37 57 #t 1.66354
 193 213 #f 0.61875
 345 365 #t 1.44583
 375 395 #f 0.996875
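
The compact summary records above are easy to read back into a program. This is a small Python sketch; the function name parse_toppred_summary is hypothetical, and reading "#t" as Certain and "#f" as Putative is an assumption inferred from the candidate table earlier in this section.

```python
def parse_toppred_summary(lines):
    """Parse 'begin end flag score' records into dictionaries."""
    segments = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4 or fields[2] not in ("#t", "#f"):
            continue  # skip the name/length header and blank lines
        begin, end, flag, score = fields
        segments.append({
            "begin": int(begin),
            "end": int(end),
            "certain": flag == "#t",  # assumption: #t = Certain, #f = Putative
            "score": float(score),
        })
    return segments


summary = """\
"T00731" 428
 1 21 #t 1.55521
 37 57 #t 1.66354
 193 213 #f 0.61875
 345 365 #t 1.44583
 375 395 #f 0.996875
""".splitlines()

segments = parse_toppred_summary(summary)
certain = [(s["begin"], s["end"]) for s in segments if s["certain"]]
print(certain)  # [(1, 21), (37, 57), (345, 365)]
```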


     ************************************
     *TOPPREDM with prokaryotic function*
     ************************************

T00731.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: T00731.fa.___inter___

 (1 sequences)
MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFS
LLGVSRTVKRLGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYG
DNVEVDYRGYEVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGH
GGDEFLKFQDAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQ
LQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTLAFFERLNIYD
NASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD
SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVY
TLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIE
IPCYVVVAEAKMPIFTNAGRPPRESGEA


(p)rokaryotic or (e)ukaryotic: p


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 4 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1     1    21   1.555 Certain
     2    37    57   1.664 Certain
     3   193   213   0.619 Putative
     4   345   365   1.446 Certain
     5   375   395   0.997 Putative

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     2     3     4     5
 Loop length     0    15   135   131     9    33
 K+R profile  1.00           +        3.00      
                    1.00           +        4.00      
CYT-EXT prof     -       -0.28           -      
                       -        0.80           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: -1.00
Tm probability: 0.05
-> Orientation: N-out

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -1.09
-> Orientation: N-in

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       1     2     4     5
 Loop length     0    15   287     9    33
 K+R profile  1.00           +        4.00      
                    1.00        3.00      
CYT-EXT prof     -        0.50           -      
                       -           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.99
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:   0.50
-> Orientation: N-out

----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
     Segment       1     2     3     4
 Loop length     0    15   135   131    63
 K+R profile  1.00           +           +      
                    1.00           +      
CYT-EXT prof     -       -0.28        0.93      
                       -        0.80      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.05
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.16
-> Orientation: N-in

----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
     Segment       1     2     4
 Loop length     0    15   287    63
 K+R profile  1.00           +      
                    1.00           +      
CYT-EXT prof     -        0.50      
                       -        0.93      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 1.00
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 2.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.43
-> Orientation: N-in

----------------------------------------------------------------------

"T00731" 428 
 1 21 #t 1.55521
 37 57 #t 1.66354
 193 213 #f 0.61875
 345 365 #t 1.44583
 375 395 #f 0.996875


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

SAPS.  Version of April 11, 1996.
Date run: Thu Feb 21 12:37:25 2002

File: /people/b_eisen/T00731.fa.___saps___
ID   T00731
DE   hypothetical protein F22O13.26 - Arabidopsis thaliana.

number of residues:  428;   molecular weight:  48.8 kdal
 
         1  MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR 
        61  LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV 
       121  LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR 
       181  RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL 
       241  AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD 
       301  SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG 
       361  LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR 
       421  PPRESGEA

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  : 23( 5.4%); C  :  6( 1.4%); D  : 24( 5.6%); E  : 29( 6.8%); F  : 24( 5.6%)
G  : 19( 4.4%); H  : 12( 2.8%); I  : 20( 4.7%); K  : 15( 3.5%); L  : 53(12.4%)
M  : 14( 3.3%); N  : 25( 5.8%); P  : 14( 3.3%); Q  :  9( 2.1%); R  : 24( 5.6%)
S  : 36( 8.4%); T  : 26( 6.1%); V  : 34( 7.9%); W  :  2( 0.5%); Y  : 19( 4.4%)

KR      :   39 (  9.1%);   ED      :   53 ( 12.4%);   AGP     :   56 ( 13.1%);
KRED    :   92 ( 21.5%);   KR-ED   :  -14 ( -3.3%);   FIKMNY  :  117 ( 27.3%);
LVIFM   :  145 ( 33.9%);   ST      :   62 ( 14.5%).

--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS
 
         1  0+00000000 000000000- 0000000000 00000+0000 0000000000 00000+00++ 
        61  0000--+000 000--00000 +0-0000000 0-00+00000 -00-0-0+00 -000-000+0 
       121  000+0-0000 +0++000--0 0000000000 00--00+00- 0--0000-00 -00+00+-++ 
       181  +0+-00000- 0000000000 0000000000 000+0-0000 000-0-0000 00-+000000 
       241  000-+0000- 00000000+0 0-0+000000 00+0-00000 00-0000000 0000-0000- 
       301  000+0000+0 0-++000-00 00000-0-0+ --0-000000 --000-0000 0000000000 
       361  00000+0000 0+0+00000- -000000000 000-000+0- 00000000-0 +00000000+ 
       421  00+-00-0
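
The charge map above follows from a simple per-residue recoding: K and R become "+", E and D become "-", and every other residue becomes "0". Here is a minimal Python sketch of that recoding; the function names charge_string and grouped are hypothetical.

```python
def charge_string(seq):
    """Recode a protein sequence as '+', '-' and '0' symbols."""
    table = {"K": "+", "R": "+", "E": "-", "D": "-"}
    return "".join(table.get(aa, "0") for aa in seq)


def grouped(text, size=10):
    """Insert a space every `size` characters, as in the listing above."""
    return " ".join(text[i:i + size] for i in range(0, len(text), size))


first20 = "MKILTLVMLLCYSFVSSTGD"  # residues 1-20 of T00731
print(grouped(charge_string(first20)))  # 0+00000000 000000000-
```

Note that histidine is treated as uncharged here, which matches the listing (for example, residues 41-50, HSLVLTFIFS, map to ten zeros).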

A. CHARGE CLUSTERS.


Positive charge clusters (cmin =  9/30 or 11/45 or 14/60):  none


Negative charge clusters (cmin = 10/30 or 14/45 or 17/60):  none


Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60):  none


B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.

________________________________
High scoring uncharged segments:

score=   1.00 frequency=   0.785  ( LAGSVTIPNFQYHMCW )
score=   0.00 frequency=   0.000  ( BZX )
score=  -8.00 frequency=   0.215  ( KEDR )

 Expected score/letter:  -0.935
 M_0.01=  42.90; M_0.05=  34.59

 1) From    3 to   55:  length= 53, score=35.00  * 
       3  ILTLVMLLCY SFVSSTGDTT IHTNNWAVLV CTSRFCSLHS LVLTFIFSLL 
      53  GVS
    L: 10(18.9%);  S:  8(15.1%);  V:  6(11.3%);  T:  7(13.2%);
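
The reported score of 35.00 for residues 3 to 55 can be checked by hand: the segment holds 51 uncharged residues (+1 each) and 2 charged ones, D20 and R36 (-8 each), so 51 - 16 = 35. The Python sketch below finds the same segment with a standard maximal-scoring-segment scan (Kadane-style) using the letter scores listed above. It is an illustration only and not necessarily how SAPS computes the segment; the function name best_uncharged_segment is hypothetical.

```python
def best_uncharged_segment(seq, charged="KEDR", ambiguous="BZX"):
    """Return (score, start, end) of the best-scoring segment, 1-based."""
    def letter_score(aa):
        if aa in charged:
            return -8
        if aa in ambiguous:
            return 0
        return 1

    best = (0, 0, 0)
    run_score, run_start = 0, 1
    for pos, aa in enumerate(seq, start=1):
        run_score += letter_score(aa)
        if run_score <= 0:
            # The current run cannot help any later segment: restart after it.
            run_score, run_start = 0, pos + 1
        elif run_score > best[0]:
            best = (run_score, run_start, pos)
    return best


# Residues 1-60 of T00731, taken from the sequence listing.
first60 = "MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFSLLGVSRTVKR"
print(best_uncharged_segment(first60))  # (35, 3, 55)
```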


C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     4 |   5 |   7 |  38 |   9 |  10 |  13 |  11 |  12 |  15 |   7 |   9 | 
lmin1     6 |   6 |   9 |  46 |  11 |  12 |  15 |  13 |  14 |  18 |   8 |  11 | 
lmin2     7 |   8 |  10 |  51 |  12 |  13 |  17 |  15 |  16 |  20 |  10 |  12 | 
 (Significance level: 0.010000; Minimal displayed length:  6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  +  runs >=   3:   1, at  179;
  -  runs >=   3:   0
  *  runs >=   5:   1, at  177;
  0  runs >=  25:   0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.
There are no high scoring hydrophobic segments.
There are no high scoring transmembrane segments.


2. SPACINGS OF C.


H2N-10-C-21-C-4-C-39-C-113-C-210-C-25-COOH
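
The spacing string above lists the number of residues before the first cysteine, between consecutive cysteines, and after the last one. For T00731 the cysteines sit at positions 11, 33, 38, 78, 192 and 403. Below is a minimal Python sketch of that bookkeeping; the function names are hypothetical, and unlike the listing in section 2* (which prints adjacent residues as "H-H"), this sketch would print a zero gap explicitly.

```python
def residue_spacings(seq, residue="C"):
    """Gaps before, between and after each occurrence of `residue`."""
    gaps = []
    previous = 0  # 1-based position of the previous occurrence (0 = N-terminus)
    for pos, aa in enumerate(seq, start=1):
        if aa == residue:
            gaps.append(pos - previous - 1)
            previous = pos
    gaps.append(len(seq) - previous)  # residues after the last occurrence
    return gaps


def format_spacings(gaps, residue="C"):
    """Render gaps in the H2N-...-COOH style used above."""
    return "H2N-" + f"-{residue}-".join(str(g) for g in gaps) + "-COOH"


# Residues 1-40 of T00731 (cysteines at 11, 33 and 38).
first40 = "MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSL"
print(format_spacings(residue_spacings(first40)))  # H2N-10-C-21-C-4-C-2-COOH
```

Run on the full 428-residue sequence, the first three gaps (10, 21, 4) stay the same and the remaining gaps follow the listing above.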


2*. SPACINGS OF C and H. (additional deluxe function for ALEX)


H2N-10-C-12-H-8-C-4-C-2-H-36-C-15-H-30-H-16-H-7-H-16-H-24-C-28-H-H-57-H-17-H-28-H-75-C-25-COOH

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length:  4

Aligned matching blocks:


[  22-  25]   TIHT
[ 296- 299]   TIHT

______________________________

[ 111- 114]   EVTV
[ 346- 349]   EVTV

______________________________

[ 198- 201]   FNQL
[ 321- 324]   FNQL


B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
   (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length:  8

--------------------------------------------------------------------------------

MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets:  29  (Expected range:  10-- 40)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 6   (6-10) 9   (11-20) 10   (>=21) 5

3. Clusters of amino acid multiplets (cmin = 12/30 or 15/45 or 18/60):  none


B. CHARGE ALPHABET.

1. Total number of charge multiplets:  12  (Expected range:   0-- 18)
   4 +plets (f+: 9.1%), 8 -plets (f-: 12.4%)
   Total number of charge altplets: 8 (Critical number: 21)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 2   (6-10) 3   (11-20) 3   (>=21) 5

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core:  4; !-core: 5)

Location	Period	Element		Copies	Core	Errors
  43-  78	 9	L........ 	 4	 4  	 0
 318- 349	 8	E.......  	 4	 4  	 0
 343- 378	 9	L........ 	 4	 4  	 0


B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core:  5; !-core: 6)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core:  6; !-core:10)

Location	Period	Element		Copies	Core	Errors
  23-  64	 7	i.0.0.0   	 6	 6  	/0/./2/./1/./2/
  65- 109	 9	-.0.000.0 	 5	 5  	/0/./1/./0/1/0/./0/
 168- 185	 3	*..       	 6	 6  	 0
 343- 423	 9	i......0. 	 8	 6  	/1/././././././2/./
 361- 393	 3	i..       	10	 8  	 1
 368- 397	 5	i....     	 6	 6  	 0
 375- 422	 8	i..0....  	 6	 6  	/0/././2/././././
 387- 393	 1	i         	 7	 7  	 0


--------------------------------------------------------------------------------
SPACING ANALYSIS.

There are no unusual spacings.


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/Pfam
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model         Description                               Score    E-value  N 
--------      -----------                               -----    ------- ---
Peptidase_C13 Peptidase C13 family                      627.4   8.1e-185   1
NodD_C_term   NodD transcription activator carboxyl t     7.5       0.39   1
Birna_VP3     Birnavirus VP3 protein                      3.9        3.5   1
MATH          MATH domain                                 1.8         34   1
Peptidase_A3  Cauliflower mosaic virus peptidase (A3)     0.5         66   1
SAP           SAP domain                                 -3.9         53   1
DUF190        Uncharacterized ACR, COG1993              -42.9         56   1
CMAS          Cyclopropane-fatty-acyl-phospholipid sy  -106.7         32   1

Parsed for domains:
Model         Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------      ------- ----- -----    ----- -----      -----  -------
Birna_VP3       1/1     165   179 ..   244   258 .]     3.9      3.5
DUF190          1/1     113   195 ..     1   103 []   -42.9       56
CMAS            1/1     103   239 ..     1   174 []  -106.7       32
NodD_C_term     1/1     263   282 ..     1    20 [.     7.5     0.39
Peptidase_C13   1/1       2   329 ..     1   364 []   627.4 8.1e-185
Peptidase_A3    1/1     308   340 ..   172   207 ..     0.5       66
MATH            1/1     331   351 ..   136   159 .]     1.8       34
SAP             1/1     321   351 ..     1    35 []    -3.9       53

Alignments of top-scoring domains:
Birna_VP3: domain 1 of 1, from 165 to 179: score 3.9, E = 3.5
                   *->QmkdLrhlarqmkrr<-*
                      Q  dL ++++qmk +   
      T00731   165    QSHDLADAVKQMKEK    179  

DUF190: domain 1 of 1, from 113 to 195: score -42.9, E = 56
                   *->vkkklLrIYtsEddkfEGkplYkalverLkeSeGirGATVlrGIaGF
                      + +  Lr+ t+    +E        v r k+    +G   l    G 
      T00731   113    TVENFLRVLTGR---HE------NAVPRSKRLLSDEGSHILLYMTGH 150  

                   GkkkevhsedlfrLsveLPVvvEvVDeeekIkrvLeeikel..iknhGLI
                   G++      d + L +            + ++  ++  ke++++k   +I
      T00731   151 GGDEFLKFQDAEELQS------------HDLADAVKQMKEKrrFKE-LMI 187  

                   TlEdvkVl<-*
                    +  ++++   
      T00731   188 MVDTCQAA    195  

CMAS: domain 1 of 1, from 103 to 239: score -106.7, E = 32
                   *->kevlLqDwedfdepvDrIVSvGaFEHvGGhenYdtFFkklyrilpad
                      +ev+++++e   e++ r+ +       G   +++    +++r+l ++
      T00731   103    VEVDYRGYEVTVENFLRVLT-------G---RHENAVPRSKRLLSDE 139  

                   GlmLLHtItslhpkelserGlkltmslaRFlkFIdkyIFPGGeLPs...i
                   G    H  +  +     e           FlkF d +     eL s++  
      T00731   140 GS---HILLYMTGHGGDE-----------FLKFQDAE-----ELQShdlA 170  

                   emIvesaqeaGFt..vedvqsLrpHYAkTLdlWaenLqank..deAialg
                    ++ +  ++  F++ +++v+  +          a++L ++++++++ a+g
      T00731   171 DAVKQMKEKRRFKelMIMVDTCQ----------AATLFNQLqsPGVLAIG 210  

                   qsEevyrmymlYLtGCakaFRkGyidvhQftltK<-*
                    s +  + y + L        +G   v  ft+     
      T00731   211 SSLKGENSYSHHLDS-----DIGVSVVDRFTYYT    239  

NodD_C_term: domain 1 of 1, from 263 to 282: score 7.5, E = 0.39
                   *->pelfmssaHprakLFeerlV<-*
                      p l ms+a+ r  L++ +lV   
      T00731   263    PRLLMSTAYYRTDLYQPHLV    282  

Peptidase_C13: domain 1 of 1, from 2 to 329: score 627.4, E = 8.1e-185
                   *->avfllvvLlilavvaaRdnfgdnislpsEevkffrDddghTnnWAVL
                      ++++lv+Ll+++         +++s++  ++++      hTnnWAVL
      T00731     2    KILTLVMLLCYS---------FVSSTG--DTTI------HTNNWAVL 31   

                   VAGSnGwfNYRHqA.fifDVChaYqllkrlGipDEnIIvmmyDDIAcNer
                   V++S++++ ++ + +fif+++++++++krlGipDE+II+m++DD+AcN+r
      T00731    32 VCTSRFCSLHSLVLtFIFSLLGVSRTVKRLGIPDERIILMLADDMACNAR 81   

                   NPrPGvviNhpnngtDvYggdVpvDYrGeeVTveNFlrVLtGdksavtgg
                   N++P++v+N++n+++++Yg++V+vDYrG+eVTveNFlrVLtG++++++++
      T00731    82 NEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRVLTGRHENAVPR 131  

                   SGKvLlSdpnDhIFIYyTDHGGpGvLkFPdseeLyakDLadalkkmhekk
                    +K+LlSd+++hI++Y+T+HGG+++LkF+d+eeL+++DLada+k+m+ek+
      T00731   132 -SKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR 180  

                   rYkeLvfyiEACeSGSmFegllspdLNIyAtTASnagEsSYstycDgdip
                   r+keL++++++C+++++F++l+sp  +++A+++S++gE+SYs+++D+di+
      T00731   181 RFKELMIMVDTCQAATLFNQLQSP--GVLAIGSSLKGENSYSHHLDSDIG 228  

                   sPPpvyvtcLgDlYSvaWlEdsekHnlskeTLqqqYksvkkrtclynysy
                       v+v++++++Y++a++E++++  +++++L+++++s+++r+ +     
      T00731   229 ----VSVVDRFTYYTLAFFERLNI--YDNASLNSLFRSYDPRLLM----- 267  

                   GSHVmqygDlyisklklvlftgffpavsNftivdepplrkplevvnqrDa
                    S++++++Dly+++l++v++t+ff++v++ ti+++++   ++++++++++
      T00731   268 -STAYYRTDLYQPHLVEVPVTNFFGSVME-TIHTDSA---YKAFSSKISE 312  

                   dLhtlwrkyqlanngsek<-*
                   +++++++++++ ++ +++   
      T00731   313 RKINSEMPFNQLSE-HDL    329  

Peptidase_A3: domain 1 of 1, from 308 to 340: score 0.5, E = 66
                   *->cslnpgdelgEeeklfntiivkiqlIEpLlEkNVcS<-*
                      +++++ +++  +e+ fn+++ + +l+E+L   N+++   
      T00731   308    SKISE-RKIN-SEMPFNQLS-EHDLKEELENTNIPN    340  

MATH: domain 1 of 1, from 331 to 351: score 1.8, E = 34
                   *->ddLeddyngylvdDsiiiEaeVkI<-*
                      ++Le+ +   +++D +i E++V     
      T00731   331    EELENTN---IPNDELIAEVTVYT    351  

SAP: domain 1 of 1, from 321 to 351: score -3.9, E = 53
                   *->lskLkVseLKeeLkkrGLstsGkKaeLveRLkeal<-*
                      ++ L+  +LKeeL+  +++      eL+   + +    
      T00731   321    FNQLSEHDLKEELENTNIPND----ELIAEVTVYT    351  

//
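
Note: the "Parsed for domains" table of the Pfam run above can be filtered by E-value. The following is a minimal Python sketch, not part of the original pipeline. The helper name parse_domain_table and the 0.01 cutoff are illustrative assumptions; the rows are copied from the table above.

```python
# Hypothetical helper (not part of HMMER): parse rows of an hmmpfam
# "Parsed for domains" table and keep hits below an E-value cutoff.

DOMAIN_ROWS = """\
Birna_VP3       1/1     165   179 ..   244   258 .]     3.9      3.5
DUF190          1/1     113   195 ..     1   103 []   -42.9       56
CMAS            1/1     103   239 ..     1   174 []  -106.7       32
NodD_C_term     1/1     263   282 ..     1    20 [.     7.5     0.39
Peptidase_C13   1/1       2   329 ..     1   364 []   627.4 8.1e-185
Peptidase_A3    1/1     308   340 ..   172   207 ..     0.5       66
MATH            1/1     331   351 ..   136   159 .]     1.8       34
SAP             1/1     321   351 ..     1    35 []    -3.9       53
"""


def parse_domain_table(text):
    """Return one dict per table row; rows without 10 columns are skipped."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue
        (model, _domain, seq_f, seq_t, _seq_flag,
         hmm_f, hmm_t, _hmm_flag, score, evalue) = fields
        hits.append({
            "model": model,
            "seq_from": int(seq_f),
            "seq_to": int(seq_t),
            "hmm_from": int(hmm_f),
            "hmm_to": int(hmm_t),
            "score": float(score),
            "evalue": float(evalue),
        })
    return hits


E_VALUE_CUTOFF = 0.01  # assumed threshold, chosen for illustration only

hits = parse_domain_table(DOMAIN_ROWS)
significant = [h["model"] for h in hits if h["evalue"] <= E_VALUE_CUTOFF]
print(significant)
```

With this cutoff only Peptidase_C13 is retained; the remaining rows have E-values of 0.39 or higher.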

Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/PfamFrag
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model         Description                               Score    E-value  N 
--------      -----------                               -----    ------- ---
Peptidase_C13 Peptidase C13 family                      625.6   2.8e-184   1
NodD_C_term   NodD transcription activator carboxyl t     7.5       0.39   1
Birna_VP3     Birnavirus VP3 protein                      3.9        3.5   1
LEM           LEM domain                                  3.6         25   1
MATH          MATH domain                                 1.8         34   1
DUF140        Domain of unknown function DUF140           1.0         36   1
UCR_hinge     Ubiquinol-cytochrome C reductase hinge      0.9         75   1
TP_methylase  Tetrapyrrole (Corrin/Porphyrin) Methyla     0.5         41   1
Peptidase_A3  Cauliflower mosaic virus peptidase (A3)     0.5         66   1
Nitrophorin   Nitrophorin                                -0.2         52   1
CMAS          Cyclopropane-fatty-acyl-phospholipid sy    -0.3         80   1
EAV_env_prot  Equine arteritis virus small envelope g    -1.7         89   1

Parsed for domains:
Model         Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------      ------- ----- -----    ----- -----      -----  -------
EAV_env_prot    1/1      35    50 ..     1    17 [.    -1.7       89
DUF140          1/1      36    54 ..   248   266 .]     1.0       36
TP_methylase    1/1      58    77 ..   207   226 .]     0.5       41
CMAS            1/1     103   111 ..     1     9 [.    -0.3       80
Birna_VP3       1/1     165   179 ..   244   258 .]     3.9      3.5
UCR_hinge       1/1     189   202 ..    52    65 .]     0.9       75
Nitrophorin     1/1     249   257 ..   173   181 .]    -0.2       52
NodD_C_term     1/1     263   282 ..     1    20 [.     7.5     0.39
Peptidase_C13   1/1       2   329 ..     1   364 []   625.6 2.8e-184
LEM             1/1     318   339 ..     1    22 [.     3.6       25
Peptidase_A3    1/1     308   340 ..   172   207 ..     0.5       66
MATH            1/1     331   351 ..   136   159 .]     1.8       34

Alignments of top-scoring domains:
EAV_env_prot: domain 1 of 1, from 35 to 50: score -1.7, E = 89
                   *->FsflCylHWLLLLcfFs<-*
                       s  C lH L+L ++Fs   
      T00731    35    -SRFCSLHSLVLTFIFS    50   

DUF140: domain 1 of 1, from 36 to 54: score 1.0, E = 36
                   *->VtsllvifildfvlTaimf<-*
                      ++++l  ++l+f+++++++   
      T00731    36    RFCSLHSLVLTFIFSLLGV    54   

TP_methylase: domain 1 of 1, from 58 to 77: score 0.5, E = 41
                   *->venatkpderilrgtLgeia<-*
                      v++++ pderi+ + + ++a   
      T00731    58    VKRLGIPDERIILMLADDMA    77   

CMAS: domain 1 of 1, from 103 to 111: score -0.3, E = 80
                   *->kevlLqDwe<-*
                      +ev+++++e   
      T00731   103    VEVDYRGYE    111  

Birna_VP3: domain 1 of 1, from 165 to 179: score 3.9, E = 3.5
                   *->QmkdLrhlarqmkrr<-*
                      Q  dL ++++qmk +   
      T00731   165    QSHDLADAVKQMKEK    179  

UCR_hinge: domain 1 of 1, from 189 to 202: score 0.9, E = 75
                   *->lDhCvaaKlFdsLK<-*
                      +D C aa lF+ L    
      T00731   189    VDTCQAATLFNQLQ    202  

Nitrophorin: domain 1 of 1, from 249 to 257: score -0.2, E = 52
                   *->YDdvqltSL<-*
                      YD+ +l SL   
      T00731   249    YDNASLNSL    257  

NodD_C_term: domain 1 of 1, from 263 to 282: score 7.5, E = 0.39
                   *->pelfmssaHprakLFeerlV<-*
                      p l ms+a+ r  L++ +lV   
      T00731   263    PRLLMSTAYYRTDLYQPHLV    282  

Peptidase_C13: domain 1 of 1, from 2 to 329: score 625.6, E = 2.8e-184
                   *->avfllvvLlilavvaaRdnfgdnislpsEevkffrDddghTnnWAVL
                      ++++lv+Ll+++         +++s++  ++++      hTnnWAVL
      T00731     2    KILTLVMLLCYS---------FVSSTG--DTTI------HTNNWAVL 31   

                   VAGSnGwfNYRHqA.fifDVChaYqllkrlGipDEnIIvmmyDDIAcNer
                   V++S++++ ++ + +fif+++++++++krlGipDE+II+m++DD+AcN+r
      T00731    32 VCTSRFCSLHSLVLtFIFSLLGVSRTVKRLGIPDERIILMLADDMACNAR 81   

                   NPrPGvviNhpnngtDvYggdVpvDYrGeeVTveNFlrVLtGdksavtgg
                   N++P++v+N++n+++++Yg++V+vDYrG+eVTveNFlrVLtG++++++++
      T00731    82 NEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRVLTGRHENAVPR 131  

                   SGKvLlSdpnDhIFIYyTDHGGpGvLkFPdseeLyakDLadalkkmhekk
                    +K+LlSd+++hI++Y+T+HGG+++LkF+d+eeL+++DLada+k+m+ek+
      T00731   132 -SKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR 180  

                   rYkeLvfyiEACeSGSmFegllspdLNIyAtTASnagEsSYstycDgdip
                   r+keL++++++C+++++F++l+sp  +++A+++S++gE+SYs+++D+di+
      T00731   181 RFKELMIMVDTCQAATLFNQLQSP--GVLAIGSSLKGENSYSHHLDSDIG 228  

                   sPPpvyvtcLgDlYSvaWlEdsekHnlskeTLqqqYksvkkrtclynysy
                       v+v++++++Y++a++E++++  +++++L+++++s+++r+ +     
      T00731   229 ----VSVVDRFTYYTLAFFERLNI--YDNASLNSLFRSYDPRLLM----- 267  

                   GSHVmqygDlyisklklvlftgffpavsNftivdepplrkplevvnqrDa
                    S++++++Dly+++l++v++t+ff++v++ ti+++++   ++++++++++
      T00731   268 -STAYYRTDLYQPHLVEVPVTNFFGSVME-TIHTDSA---YKAFSSKISE 312  

                   dLhtlwrkyqlanngsek<-*
                   +++++++++++ ++ +++   
      T00731   313 RKINSEMPFNQLSE-HDL    329  

LEM: domain 1 of 1, from 318 to 339: score 3.6, E = 25
                   *->mldvaqLsDaELrseLrkyGis<-*
                       ++  qLs+ +L++eL   +i+   
      T00731   318    EMPFNQLSEHDLKEELENTNIP    339  

Peptidase_A3: domain 1 of 1, from 308 to 340: score 0.5, E = 66
                   *->cslnpgdelgEeeklfntiivkiqlIEpLlEkNVcS<-*
                      +++++ +++  +e+ fn+++ + +l+E+L   N+++   
      T00731   308    SKISE-RKIN-SEMPFNQLS-EHDLKEELENTNIPN    340  

MATH: domain 1 of 1, from 331 to 351: score 1.8, E = 34
                   *->ddLeddyngylvdDsiiiEaeVkI<-*
                      ++Le+ +   +++D +i E++V     
      T00731   331    EELENTN---IPNDELIAEVTVYT    351  

//

Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Prosite
---------------------------------------------------------
|          ppsearch (c) 1994 EMBL Data Library          |
|       based on MacPattern (c) 1990-1994 R. Fuchs      |
---------------------------------------------------------

PROSITE pattern search started: Thu Feb 21 12:39:30 2002

Sequence file: T00731.fa

----------------------------------------
Sequence T00731 (428 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
  251: NASL
  369: NLSR
Total matches: 2

Matching pattern PS00005 PKC_PHOSPHO_SITE:
   34: TSR
   57: TVK
  122: TGR
  132: SKR
  212: SLK
  307: SSK
  311: SER
  396: TNR
Total matches: 8

Matching pattern PS00006 CK2_PHOSPHO_SITE:
   17: STGD
  230: SVVD
  292: SVME
  325: SEHD
  377: SMID
Total matches: 5

Matching pattern PS00008 MYRISTYL:
   53: GVSRTV
Total matches: 1

Total no of hits in this sequence: 16

========================================

1314 pattern(s) searched in 1 sequence(s), 428 residues.
Total no of hits in all sequences: 16.
Search time: 00:00 min
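
Note: the PS00001 hits reported above can be cross-checked with a regular expression. The following is a minimal Python sketch, not the ppsearch implementation. It assumes the PROSITE pattern N-{P}-[ST]-{P} for ASN_GLYCOSYLATION and uses the T00731 sequence as printed at the top of this report; the helper name find_sites is hypothetical.

```python
import re

# T00731 sequence as printed at the top of the report (blocks of ten).
RAW = """
MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR
LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV
LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR
RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL
AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD
SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG
LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR
PPRESGEA
"""
SEQ = "".join(RAW.split())  # drop spaces and newlines

# PS00001 ASN_GLYCOSYLATION: N-{P}-[ST]-{P}, written as a lookahead so
# that overlapping candidate sites are not skipped.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sites(pattern, seq):
    """Return (1-based start, matched residues) for every match."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]


sites = find_sites(N_GLYC, SEQ)
for start, residues in sites:
    print(f"{start:5d}: {residues}")
```

The two positions found this way (251 NASL, 369 NLSR) agree with the ppsearch listing above.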

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with motif search against own library
     ***** bioMotif : Version V41a DB, 1999 Nov 11 *****
          SeqTyp=2 : PROTEIN  search; 


>APC D-Box is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>ER-GOLGI-traffic signal is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M minimal SH3 binding  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name

>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. ;LENGTH=428; DIRECT_SEQUENCE
n 1 solutions 
m %_RSXXP 259-263
f

>STATISTICS Total   : 1 solutions in 1 sequences, 428 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>EXTRACELL-M minimal furin protease cleavage site motif  is the MOTIF name

>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. ;LENGTH=428; DIRECT_SEQUENCE
n 2 solutions 
m %_RXXR 131-134
f
m %_RXXR 420-423
f

>STATISTICS Total   : 2 solutions in 1 sequences, 428 units;  out of 1 sequences, 428 units
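
Note: the two minimal furin-site solutions above can be reproduced with a simple scan. The following is a minimal Python sketch, not the bioMotif implementation. It assumes the motif label %_RXXR means R, any two residues, then R; the helper name find_rxxr is hypothetical, and the sequence is T00731 as printed at the top of this report.

```python
import re

# T00731 sequence as printed at the top of the report (blocks of ten).
RAW = """
MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR
LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV
LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR
RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL
AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD
SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG
LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR
PPRESGEA
"""
SEQ = "".join(RAW.split())  # drop spaces and newlines

# Assumed reading of %_RXXR: arginine, two arbitrary residues, arginine.
# The lookahead keeps overlapping candidates.
RXXR = re.compile(r"(?=(R..R))")


def find_rxxr(seq):
    """Return (1-based start, 1-based end, residues) for each RXXR match."""
    return [(m.start() + 1, m.start() + 4, m.group(1))
            for m in RXXR.finditer(seq)]


furin_sites = find_rxxr(SEQ)
for start, end, residues in furin_sites:
    print(f"{start}-{end} {residues}")
```

Under that assumption the scan returns 131-134 (RSKR) and 420-423 (RPPR), the same ranges listed above.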

>EXTRACELL-M extended furin protease cleavage site motif  is the MOTIF name

>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. ;LENGTH=428; DIRECT_SEQUENCE
n 1 solutions 
m %_RX 131-132 %_K 133-133 %_R 134-134
f

>STATISTICS Total   : 1 solutions in 1 sequences, 428 units;  out of 1 sequences, 428 units

>EXTRACELL-M  zinc binding motif in MMPs is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>EXTRACELL-M g alpha binding go loco is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS SV40 LrgT experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS hGlu.cort. experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS  hDNAtopoII experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS  hDNAtopoII experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Nucleoplasmin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS Max experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>PDZ domain binding motif science 278_2075_pawson is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units

>WW domain binding motif science 278_2075_pawson is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 428 units


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with HMM search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm.lib
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm-f.lib
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB+ PSSM from other authors
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), 
"IMPALA: Matching a Protein Sequence Against a Collection of 
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
         (428 letters)

Searching..................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

CYCLIN Cyclin/TFIIB domain                                         25  0.14
MATH The Meprin associated TRAF homology domain                    22  1.2
SEC14D Sec14 related lipid binding domain                          21  3.6
INSL Insulinase like Metallo protease domain                       21  3.7
AN1 AN1 like cysteine rich zinc coordinating domain                21  3.9
CATH  Cathepsin like protease domain                               21  4.6
KIN Protein kinase domain                                          20  5.6
UBHYD  Ubiquitin C-terminal hydrolase domain                       20  5.7
FKBP FK506 binding protein (Peptidyl prolyl isomerase)             20  5.9
CYCL cyclophilin like peptidyl prolyl isomerases                   20  6.5
14-3-3 14-3-3 protein alpha Helical domain                         20  9.1

>CYCLIN Cyclin/TFIIB domain 
          Length = 317

 Score = 25.4 bits (55), Expect = 0.14
 Identities = 8/44 (18%), Positives = 8/44 (18%)

Query: 229 VSVVDRFTYYTLAFFERLNIYDNASLNSLFRSYDPRLLMSTAYY 272
                                                       
Sbjct: 61  FCSVFKPAMPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAF 104


>MATH The Meprin associated TRAF homology domain 
          Length = 209

 Score = 22.1 bits (47), Expect = 1.2
 Identities = 7/34 (20%), Positives = 7/34 (20%)

Query: 321 FNQLSEHDLKEELENTNIPNDELIAEVTVYTLFP 354
                                             
Sbjct: 133 FKKFIRRDFLLDEANGLLPDDKLTLFCEVSVVQD 166


>SEC14D Sec14 related lipid binding domain 
          Length = 248

 Score = 21.0 bits (44), Expect = 3.6
 Identities = 15/91 (16%), Positives = 15/91 (16%), Gaps = 10/91 (10%)

Query: 111 EVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLA 170
                                                                       
Sbjct: 101 EITFDEILQAYCFILEKLLENEETQI--NGFCIIENFKG------FTMQQAASLRTSDLR 152

Query: 171 DAVKQMKEKRRFKELMIMVDTCQAATLFNQL 201
                                          
Sbjct: 153 KMVDMLQD--SFPARFKAIHFIHQPWYFTTT 181


>INSL Insulinase like Metallo protease domain 
          Length = 433

 Score = 20.6 bits (43), Expect = 3.7
 Identities = 8/76 (10%), Positives = 8/76 (10%), Gaps = 12/76 (15%)

Query: 104 EVDYRGYEVTV-ENFLRVLTGRHENAVPRSKRLLSDEGS-----------HILLYMTGHG 151
                                                                       
Sbjct: 15  VLTAQELYIRDLPNGAKLIVKPRDDTEAVALHVWFRVGSVYEKYDEKGMAHFLEHMLFNG 74

Query: 152 GDEFLKFQDAEELQSH 167
                           
Sbjct: 75  TEKYKYGEIDRIIESL 90


>AN1 AN1 like cysteine rich zinc coordinating domain 
          Length = 57

 Score = 20.8 bits (43), Expect = 3.9
 Identities = 5/11 (45%), Positives = 5/11 (45%)

Query: 33 CTSRFCSLHSL 43
                     
Sbjct: 27 CSRRYCLSHHL 37


>CATH  Cathepsin like protease domain 
          Length = 371

 Score = 20.6 bits (43), Expect = 4.6
 Identities = 9/89 (10%), Positives = 9/89 (10%), Gaps = 13/89 (14%)

Query: 15  VSSTGDTTIHTNNWAVLVCT-SRFCSLHSLVLTFIFSLLGVSRTVKRLGIPDERIILMLA 73
                                                                       
Sbjct: 235 VDVDNGLTVCKDGCEAIVDTGTSLITGPTDEIKQLQKAIGAKPIIKGQYMLP-------- 286

Query: 74  DDMACNARNEYPAQVFNNENHKLNLYGDN 102
                                        
Sbjct: 287 ----CDKLSSLPNVNLVLGGKSYALTPNQ 311


>KIN Protein kinase domain 
          Length = 313

 Score = 20.0 bits (41), Expect = 5.6
 Identities = 11/51 (21%), Positives = 11/51 (21%)

Query: 290 FGSVMETIHTDSAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPN 340
                                                              
Sbjct: 42  YGVVCSAKDNLTGEKVAIKKISKAFDNLKDTKRTLREIHLLRHFKHENLIS 92


 Score = 20.0 bits (41), Expect = 5.9
 Identities = 14/83 (16%), Positives = 14/83 (16%), Gaps = 9/83 (10%)

Query: 162 EELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGEN---- 217
                                                                       
Sbjct: 112 SELMDTDLHQ---IITSPQPLSDDHCQYFVYQMLRGLKHIHSANV--LHRDLKPSNLLIN 166

Query: 218 SYSHHLDSDIGVSVVDRFTYYTL 240
                                  
Sbjct: 167 EDCLLKICDLGLARVEDATHQGF 189


>UBHYD  Ubiquitin C-terminal hydrolase domain 
          Length = 884

 Score = 19.9 bits (41), Expect = 5.7
 Identities = 10/53 (18%), Positives = 10/53 (18%), Gaps = 6/53 (11%)

Query: 283 EVPVTNFFGSVMETI--HTDSAYKAFSSKISERKINSEMPFNQLSEHDLKEEL 333
                                                                
Sbjct: 274 EAIEHNYGGHDDDLSVRHCTNAYMLV----YIRESKLSEVLQAVTDHDIPQQL 322


>FKBP FK506 binding protein (Peptidyl prolyl isomerase) 
          Length = 149

 Score = 20.2 bits (42), Expect = 5.9
 Identities = 2/11 (18%), Positives = 2/11 (18%)

Query: 100 GDNVEVDYRGY 110
                      
Sbjct: 8   NSAVLVHFTLK 18


>CYCL cyclophilin like peptidyl prolyl isomerases 
          Length = 165

 Score = 20.1 bits (42), Expect = 6.5
 Identities = 9/28 (32%), Positives = 9/28 (32%), Gaps = 7/28 (25%)

Query: 96  LNLYGDNVEVDYRGYEVTVENFLRVLTG 123
                                       
Sbjct: 22  FELFADKV-------PKTAENFRALSTG 42


>14-3-3 14-3-3 protein alpha Helical domain 
          Length = 270

 Score = 19.5 bits (40), Expect = 9.1
 Identities = 8/22 (36%), Positives = 8/22 (36%), Gaps = 1/22 (4%)

Query: 2   KILTLVM-LLCYSFVSSTGDTT 22
                                 
Sbjct: 217 KDSTLIMQLLRDNLTLWTSDAE 238


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 11 
Number of calls to ALIGN: 12 
Length of query: 428 
Total length of test sequences: 20182  
Effective length of test sequences: 16435.0
Effective search space size: 6453006.9
Initial X dropoff for ALIGN: 25.0 bits

Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), 
"IMPALA: Matching a Protein Sequence Against a Collection of 
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
         (428 letters)

Searching.................................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|478310 [306..563] beta/alpha (TIM)-barrel                       31  0.045
gi|1790450 [741..896] Flavodoxin-like                              26  0.86
gi|999515 [1..176] NAD(P)-binding Rossmann-fold domains            26  1.2
gi|2088870 [834..1045] (Phosphotyrosine) protein phosphatase...    25  2.1
gi|730428 [2..438] P-loop containing nucleotide triphosphate...    25  2.4
gi|1942733 [1..168] Lysozyme-like                                  25  2.7
gi|2194029 [140..303] Ferredoxin reductase-like, C-terminal ...    24  3.6
gi|1172572 [1..540] Phosphoenolpyruvate carboxykinase (ATP-o...    24  4.7
gi|2707940 [1..184] Ribonuclease H-like motif                      24  5.0
gi|555731 [3..383] Serpins                                         24  5.7
gi|1817676 [157..389] Protein kinases (PK), catalytic core         24  6.3
gi|1902913 [26..315] Protein kinases (PK), catalytic core          24  7.5
gi|1708972 [46..317] FAD/NAD(P)-binding domain                     23  9.5

>gi|478310 [306..563] beta/alpha (TIM)-barrel 
          Length = 258

 Score = 30.8 bits (69), Expect = 0.045
 Identities = 13/138 (9%), Positives = 13/138 (9%), Gaps = 14/138 (10%)

Query: 247 NIYDNASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTDSAYKAF 306
                                                                       
Sbjct: 127 QSTQGGYFQTALNVKDILTVVNMQYYNSGTMLGC-----DGKVYAQGTVDFLTALACIQL 181

Query: 307 SSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVYTLFPGLSYFGLSTLLR 366
                                                                       
Sbjct: 182 EGGLAPSQVGLGLPASTRA-------AGGGYVSPSVVNAALD--CLTKATNCGSFKPSKT 232

Query: 367 YMNLSRVRVLSMIDDVFA 384
                             
Sbjct: 233 YPDLRGAMTWSTNWDATA 250


>gi|1790450 [741..896] Flavodoxin-like 
          Length = 156

 Score = 26.5 bits (58), Expect = 0.86
 Identities = 15/101 (14%), Positives = 15/101 (14%), Gaps = 20/101 (19%)

Query: 98  LYGDN-VEVDYRGYEVTVENFLRV-------------LTGRHENAVPRSKRLLSDEGSHI 143
                                                                       
Sbjct: 29  VLQCNNYEIVDLGVMVPAEKILRTAKEVNADLIGLSGLITPSLDEMVNVAKEMERQGFTI 88

Query: 144 LLYMTGHGGDEFLKFQDAEELQSH------DLADAVKQMKE 178
                                                    
Sbjct: 89  PLLIGGATTSKAHTAVKIEQNYSGPTVYVQNASRTVGVVAA 129


>gi|999515 [1..176] NAD(P)-binding Rossmann-fold domains 
          Length = 176

 Score = 25.9 bits (56), Expect = 1.2
 Identities = 26/94 (27%), Positives = 26/94 (27%), Gaps = 19/94 (20%)

Query: 140 GSHILLYMTGHG------------GDEFL--KFQDAEELQSHDLADAVKQMKEKRRFKEL 185
                                                                       
Sbjct: 14  GQNLILNMNDHGFVVCAFNRTVSKVDDFLANEAKGTKVLGAHSLEEMVSKLKKPRR---I 70

Query: 186 MIMVDTCQAATLFNQLQSPGVLAIGS-SLKGENS 218
                                             
Sbjct: 71  ILLVKAGQAVDNFIEKLVP-LLDIGDIIIDGGNS 103


>gi|2088870 [834..1045] (Phosphotyrosine) protein phosphatases II 
          Length = 212

 Score = 25.4 bits (55), Expect = 2.1
 Identities = 9/66 (13%), Positives = 9/66 (13%), Gaps = 2/66 (3%)

Query: 67  RIILMLADDM-ACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRVLTGRH 125
                                                                       
Sbjct: 49  TQLVMMCDFDEKCPGKTNSCARYYPESVGESMKFK-NLTVECKSKIAEKDFETRELEVKF 107

Query: 126 ENAVPR 131
                 
Sbjct: 108 DGHEPH 113


>gi|730428 [2..438] P-loop containing nucleotide triphosphate hydrolases 
          Length = 437

 Score = 24.9 bits (54), Expect = 2.4
 Identities = 19/90 (21%), Positives = 19/90 (21%), Gaps = 11/90 (12%)

Query: 67  RIILMLADDMACNAR-----NEYPAQVFNNENHKLNL------YGDNVEVDYRGYEVTVE 115
                                                                       
Sbjct: 18  KIVDLLTEDAKYVVRYQGGHNAGHTLVIDGEKTVLHLIPSGILRDNVKCVIGNGVVLSPE 77

Query: 116 NFLRVLTGRHENAVPRSKRLLSDEGSHILL 145
                                         
Sbjct: 78  ALLKEMKPLEERGIPVRERLFISEACPLIL 107


>gi|1942733 [1..168] Lysozyme-like 
          Length = 168

 Score = 24.8 bits (53), Expect = 2.7
 Identities = 16/58 (27%), Positives = 16/58 (27%), Gaps = 3/58 (5%)

Query: 160 DAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFN---QLQSPGVLAIGSSLK 214
                                                                     
Sbjct: 62  EAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNSLR 119


>gi|2194029 [140..303] Ferredoxin reductase-like, C-terminal NADP-linked domain 
          Length = 164

 Score = 24.5 bits (53), Expect = 3.6
 Identities = 10/55 (18%), Positives = 10/55 (18%), Gaps = 5/55 (9%)

Query: 133 KRLLSDEGSHILLYMTGHGG--DEFLK-FQDAEELQSHDLADAVKQMKEKRRFKE 184
                                                                  
Sbjct: 109 WQLIKNQKTHT--YICGLRGMEEGIDAALSAAAAKEGVTWSDYQKDLKKAGRWHV 161


>gi|1172572 [1..540] Phosphoenolpyruvate carboxykinase (ATP-oxaloacetate carboxy-lyase) 
          Length = 540

 Score = 24.2 bits (52), Expect = 4.7
 Identities = 20/90 (22%), Positives = 20/90 (22%), Gaps = 7/90 (7%)

Query: 98  LYGDNVEVDYRGYE---VTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMT--GHGG 152
                                                                       
Sbjct: 32  LYQEELDPSLTGYERGVLTNLGAVAVDTGIFTGRSPKDKYIVRDDTTRDTFWWADKGKGK 91

Query: 153 DEFLKFQDAEELQSHDLADAVKQMKEKRRF 182
                                         
Sbjct: 92  NDNKPL--SPETWQHLKGLVTRQLSGKRLF 119


>gi|2707940 [1..184] Ribonuclease H-like motif 
          Length = 184

 Score = 24.1 bits (52), Expect = 5.0
 Identities = 7/43 (16%), Positives = 7/43 (16%), Gaps = 6/43 (13%)

Query: 73  ADDMACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVE 115
                                                      
Sbjct: 77  LDEYLNRLKPHYSVRLIKIGS------GLNETVSIGNFGGTVK 113


>gi|555731 [3..383] Serpins 
          Length = 381

 Score = 23.8 bits (51), Expect = 5.7
 Identities = 10/50 (20%), Positives = 10/50 (20%), Gaps = 1/50 (2%)

Query: 292 SVMETIHTDSAYKAFSSKISERKINSEMP-FNQLSEHDLKEELENTNIPN 340
                                                             
Sbjct: 247 GAIEVLNGNKILSHYVDKLEETSVSLKMPKFTLTKKLQLVGTLKSIGIKN 296


>gi|1817676 [157..389] Protein kinases (PK), catalytic core 
          Length = 233

 Score = 23.5 bits (49), Expect = 6.3
 Identities = 11/112 (9%), Positives = 11/112 (9%), Gaps = 1/112 (0%)

Query: 290 FGSVMETIHTDSAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTV 349
                                                                       
Sbjct: 6   LGWIYLALDRNVNGRPVVLKGLVHSGDAEAQAMAMAERQFLAEVVHPSIVQIF-NFVEHT 64

Query: 350 YTLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIEI 401
                                                               
Sbjct: 65  DRHGDPVGYIVMEYVGGQSLKRSKGQKLPVAEAIAYLLEILPALSYLHSIGL 116


>gi|1902913 [26..315] Protein kinases (PK), catalytic core 
          Length = 290

 Score = 23.5 bits (49), Expect = 7.5
 Identities = 19/147 (12%), Positives = 19/147 (12%), Gaps = 10/147 (6%)

Query: 115 ENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVK 174
                                                                       
Sbjct: 56  AIIQEVCFLKKLSGHPNIVQFC-----SAASIGKEESDTGQAEFLLLTELCKGQLVEFLR 110

Query: 175 QMKEKRRFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGENSYS----HHLDSDIGVS 230
                                                                       
Sbjct: 111 RVECKGPLSCDSILKIFYQTCRAVQHMHRQKPPIIHRDLKVENLLLSNQGTIKLCDFGSA 170

Query: 231 -VVDRFTYYTLAFFERLNIYDNASLNS 256
                                      
Sbjct: 171 TTISHYPDYSWSAQKRAMVEEEITRNT 197


>gi|1708972 [46..317] FAD/NAD(P)-binding domain 
          Length = 272

 Score = 23.0 bits (48), Expect = 9.5
 Identities = 7/36 (19%), Positives = 7/36 (19%), Gaps = 4/36 (11%)

Query: 29  AVLVCTSRFCSLHSLVLTFIFSLLGVSRTVKRLGIP 64
                                               
Sbjct: 235 EVILSAGPIGSPQLLLLSGV----GPESYLTSLNIS 266


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 13 
Number of calls to ALIGN: 13 
Length of query: 428 
Total length of test sequences: 256703  
Effective length of test sequences: 207231.0
Effective search space size: 80033993.1
Initial X dropoff for ALIGN: 25.0 bits
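
Note on reading the IMPALA E-values: a minimal sketch, assuming the standard
Karlin-Altschul relation E = N * 2^(-S) between a bit score S and an effective
search space N. The function name expected_evalue is hypothetical and is not
part of IMPALA. The inputs are the top hit of each run above and the
"Effective search space size" reported in its statistics block. Agreement with
the printed Expect values is only approximate.

```python
def expected_evalue(bit_score: float, search_space: float) -> float:
    """Expected number of chance hits scoring at least bit_score.

    Uses the Karlin-Altschul relation E = N * 2**(-S), where S is a
    normalized (bit) score and N is the effective search space size.
    """
    return search_space * 2.0 ** (-bit_score)


# Top hit of the signalling-domain run: 25.4 bits, reported Expect = 0.14
e_cyclin = expected_evalue(25.4, 6453006.9)

# Top hit of the SCOP run: 30.8 bits, reported Expect = 0.045
e_tim = expected_evalue(30.8, 80033993.1)

print(f"CYCLIN     E ~ {e_cyclin:.3f}")
print(f"TIM-barrel E ~ {e_tim:.3f}")
```

Both estimates are well above the thresholds usually taken as significant, so
none of the profile hits listed above should be treated as a confident
assignment without further evidence.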

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

calculation of internal repeats with prospero
***** PROSPERO v1.3  Thu Feb 21 12:40:10 2002 *****

Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford
For help see http://www.well.ox.ac.uk/ariadne  For usage use -help
using gap penalty 11+1k
using matrix BLOSUM62
printing all alignments with eval < 0.100000
using sequence1 T00731
using self-comparison


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

TIGRFAM
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm-f
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//
SMART
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/iprscan/data/smart.HMMs
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
SAP      Putative DNA-binding (bihelical) motif predi    -6.1         85   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
SAP        1/1     321   351 ..     1    35 []    -6.1       85

Alignments of top-scoring domains:
SAP: domain 1 of 1, from 321 to 351: score -6.1, E = 85
                   *->lskLkVseLkdeLkkrGLstsGrKaeLvkRLleal<-*
                      ++ L+  +Lk+eL+  + + +    eL++ ++ ++   
      T00731   321    FNQLSEHDLKEELENTNIPND----ELIAEVTVYT    351  

//
COG
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
COG2965                                                 -49.1         26   1
COG2143                                                 -96.1         76   1
COG1985                                                -112.1         69   1
COG2232                                                -206.6         54   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG2965    1/1      43   144 ..     1   103 []   -49.1       26
COG2143    1/1      28   181 ..     1   242 []   -96.1       76
COG1985    1/1      43   237 ..     1   234 []  -112.1       69
COG2232    1/1       1   336 [.     1   399 []  -206.6       54

Alignments of top-scoring domains:
COG2965: domain 1 of 1, from 43 to 144: score -49.1, E = 26
                   *->mGmtNrvsLsGvVekapvrrkSPSGIphcdfiLeHRStQeEaGfqRq
                      + +t   sL G V++  +r+    GIp   +iL     +   +  + 
      T00731    43    LVLTFIFSLLG-VSRTVKRL----GIPDERIIL-----MLADDMACN 79   

                   vwlemPVriSGrqae.........eltqsitqGSkIlVeGFlaqhkrr..
                   +  e P+++  +++ + +  +++ e  +    G  ++Ve Fl   + r++
      T00731    80 ARNEYPAQVFNNENHklnlygdnvEVDY---RGYEVTVENFLRVLTGRhe 126  

                   sGlpk.LvLhAeQiekID<-*
                   + +p++ +L  +  + I    
      T00731   127 NAVPRsKRLLSDEGSHIL    144  

COG2143: domain 1 of 1, from 28 to 181: score -96.1, E = 76
                   *->mFSLSYvmRvl.lilLliislFllAcksdNKDKLDENLLSSGsqSSK
                         L   +R+ +l  L + ++F+l   s                   
      T00731    28    WAVLVCTSRFCsLHSLVLTFIFSLLGVSR------------------ 56   

                   ELfekksnldKKSYAGLEDlvedlksikpedKYlllmFeseeCiYCeklK
                                           ++ +p ++  l+      C   +   
      T00731    57 ---------------------TVKRLGIPDERIILMLADDMACNARNEYP 85   

                   KdvfnkkrlrEylkehFsiveldikdsk.pvkfkvGdkg.NdEKeeklSe
                     vfn +  +  l +  + ve+d+ +++ +v  +  ++ +  E     S 
      T00731    86 AQVFNNENHKLNLYG--DNVEVDYRGYEvTVENFLRVLTgRHENAVPRS- 132  

                   kELArkfkVrsTPtfvFfDkkGkkIlelPGYlPpeeFllvlkYVaeekyk
                                   ++D     +l++ G              +  ++ 
      T00731   133 -------------KRLLSDEGSHILLYMTGH-------------GGDEFL 156  

                   dtktYLKKDDPFVGEPLiiEiFKEdeDfvkklkedikkkdtlskekrr<-
                   ++++                     e +     +d++k     kekrr  
      T00731   157 KFQD--------------------AEELQSHDLADAVKQM---KEKRR   181  

                   *
                    
      T00731     -   -    

COG1985: domain 1 of 1, from 43 to 237: score -112.1, E = 69
                   *->rgrPfVilKlAmSLDGKtAtasGeSkwItgeeaRadVhrlRaesdAI
                            +l +  SL G    +s+  k+   ++ R+ ++         
      T00731    43    -----LVLTFIFSLLG----VSRTVKRLGIPDERIILM--------- 71   

                   lVGsgTVLaDn...........................PsLtvRwaelpe
                          LaD+   + +++ + +  ++++++ +  +++   + R++e+  
      T00731    72 -------LADDmacnarneypaqvfnnenhklnlygdnVEVDYRGYEVTV 114  

                   gtqryargasrqPlRVvlDsr.lrvppearvldtgeAptlvvtterapee
                              + lRV   ++++ vp + r+l ++ +++l+  t++    
      T00731   115 ----------ENFLRVLTGRHeNAVPRSKRLLSDEGSHILLYMTGH---- 150  

                   rekkekledvgvevvvagdgrVDlkkllelLaerg.insvmVEGGgtLag
                       e l+   +e+++++d    l+ ++++ +e+ +++++m+        
      T00731   151 -GGDEFLKFQDAEELQSHD----LADAVKQMKEKRrFKELMI-------- 187  

                   sflkegLVDElilyiAPkilGGddartlvdglgfrkladalqlakikeve
                    +++         + A  ++        ++ +g+ ++ + l+  +   + 
      T00731   188 -MVDT--------CQAATLFN------QLQSPGVLAIGSSLKG-ENSYSH 221  

                   qiGpdlkvtarvkpke<-*
                   +++ d+ v  +++ ++   
      T00731   222 HLDSDIGVSVVDRFTY    237  

COG2232: domain 1 of 1, from 1 to 336: score -206.6, E = 54
                   *->mNNFtLFLFSCLYFisknekvLVlGvNtRpVveSakklGFeVYSvsy
                      m   tL +  C  F+s      +  +N   V    ++       v  
      T00731     1    MKILTLVMLLCYSFVSSTGDTTIHTNNW-AVLVCTSRFCSLHSLVLT 46   

                   YvdaDLkaytERRcklversdeslGRlkENydeekLleiaedlaeevDai
                   ++   L        + v r +   +      de  +l  a+d+a  +   
      T00731    47 FIFSLLGVS-----RTVKRLG--IP------DERIILMLADDMACNARNE 83   

                   vvlsgafefetekVrGndNViGNGPKkvdevsnkYkkyk.rvkNLkfkip
                    +   +f  e+ k   +           d v+  Y+ y+  v N  + + 
      T00731    84 YP-AQVFNNENHKLNLYG----------DNVEVDYRGYEvTVEN-FLRVL 121  

                   eTklikdklelyell.eeGekKyIlKPVvGaGGeeVvkieendkdfllqe
                     + +++      ll+ eG    Il    G GG           +fl+ +
      T00731   122 TGRHENAVPRSKRLLsDEGSH--ILLYMTGHGG----------DEFLKFQ 159  

                   yikGvPvsasvlarGesalavlisRnifatfkkqiiskFvYAGNmTPFiv
                     +       +l     a av    +   +fk+ +i        m     
      T00731   160 DAE-------ELQSHDLADAVKQM-KEKRRFKELMI--------MVD--- 190  

                   eeelskeleeLaseviesf..eLkGssGVDfvl.kdkelYiveiNPRiqG
                          +  L s  ++  +++LkG +     l+ d +  +v+   R+  
      T00731   191 TCQAATLFNQLQSPGVLAIgsSLKGENSYSHHLdSDIGVSVVD---RF-- 235  

                   tyesvEaSldvNLvkvhleAfdgklaekvkPrky....avkrILFApadv
                   ty +       N+ +   +A  + l    +Pr   ++ + +  L+ p   
      T00731   236 TYYTLAFFERLNIYD---NASLNSLFRSYDPRLLmstaYYRTDLYQP--- 279  

                   kikenlakrdFvhDvPkkgavie..kgePLvtVLAkenskeaveslae.e
                      e    + F       g v e+   +       +  s   ++s    +
      T00731   280 HLVEVPVTN-FF------GSVMEtiHTDSAYKAFSSKISERKINSEMPfN 322  

                   vlerekkkldleri<-*
                    l+  ++k++le+    
      T00731   323 QLSEHDLKEELENT    336  

//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm-f
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
COG1053                                                   2.8        1.4   1
COG1235                                                   2.7        8.9   1
COG1182                                                   2.5         25   1
COG1131                                                   2.4        8.3   1
COG2241                                                   0.1         37   1
COG2258                                                   0.1         58   1
COG0285                                                   0.0         34   1
COG0462                                                  -0.8         69   1
COG2920                                                  -0.9         98   1
COG3274                                                  -1.1         59   1
COG2414                                                  -1.8         78   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG2414    1/1      57    68 ..   674   685 .]    -1.8       78
COG2241    1/1      60    78 ..   216   234 ..     0.1       37
COG2920    1/1      90   110 ..     1    21 [.    -0.9       98
COG1235    1/1      99   115 ..   299   315 .]     2.7      8.9
COG0285    1/1     113   127 ..   472   486 .]     0.0       34
COG2258    1/1     100   128 ..   181   215 ..     0.1       58
COG3274    1/1     129   137 ..   357   365 .]    -1.1       59
COG1131    1/1      92   154 ..   267   325 .]     2.4      8.3
COG1053    1/1     149   202 ..   458   511 ..     2.8      1.4
COG1182    1/1     206   223 ..     1    21 [.     2.5       25
COG0462    1/1     246   259 ..   315   328 .]    -0.8       69

Alignments of top-scoring domains:
COG2414: domain 1 of 1, from 57 to 68: score -1.8, E = 78
                   *->tLkeLGledeva<-*
                      t k+LG++de++   
      T00731    57    TVKRLGIPDERI    68   

COG2241: domain 1 of 1, from 60 to 78: score 0.1, E = 37
                   *->rLtapdERitagtLkdlal<-*
                      rL+ pdERi   +++d a+   
      T00731    60    RLGIPDERIILMLADDMAC    78   

COG2920: domain 1 of 1, from 90 to 110: score -0.9, E = 98
                   *->knvmtmLeyeGkeietDkdGY<-*
                       n   +L++ G ++e D  GY   
      T00731    90    NNENHKLNLYGDNVEVDYRGY    110  

COG1235: domain 1 of 1, from 99 to 115: score 2.7, E = 8.9
                   *->laeevevaydgmeiyli<-*
                      ++++vev+y+g+e+ +    
      T00731    99    YGDNVEVDYRGYEVTVE    115  

COG0285: domain 1 of 1, from 113 to 127: score 0.0, E = 34
                   *->lvgevlellqrkkdk<-*
                      +v ++l++l++++++   
      T00731   113    TVENFLRVLTGRHEN    127  

COG2258: domain 1 of 1, from 100 to 128: score 0.1, E = 58
                   *->GDplklverprepapTvlelnrllfsPHqikpknp<-*
                      GD++++++r   +++Tv    r+l++    + +n+   
      T00731   100    GDNVEVDYRG--YEVTVENFLRVLTG----RHENA    128  

COG3274: domain 1 of 1, from 129 to 137: score -1.1, E = 59
                   *->iprsnkLvs<-*
                      +prs++L+s   
      T00731   129    VPRSKRLLS    137  

COG1131: domain 1 of 1, from 92 to 154: score 2.4, E = 8.3
                   *->lvglkgveevvglgvgleveveeggnkvlvevd.ae.av.ellalli
                      ++  k+ ++++ + v+++  +++++n ++v++ ++e+av+   +ll+
      T00731    92    ENH-KLNLYGDNVEVDYRGYEVTVENFLRVLTGrHEnAVpRSKRLLS 137  

                   .eginvlsi.veepsLE<-*
                   +eg  +l + + +   E   
      T00731   138 dEGSHILLYmTGHGGDE    154  

COG1053: domain 1 of 1, from 149 to 202: score 2.8, E = 1.4
                   *->GryaaeyakeaspskeaeseaeeerakkkeeeerldeLlkaeG.env
                      G +  e++k +++++e++s++++++ k+++e++r++eL+ + ++  +
      T00731   149    GHGGDEFLK-FQDAEELQSHDLADAVKQMKEKRRFKELMIMVDtCQA 194  

                   aairkelq<-*
                   a +   lq   
      T00731   195 ATLFNQLQ    202  

COG1182: domain 1 of 1, from 206 to 223: score 2.5, E = 25
                   *->MskVLviksSirgeeSvSrqL<-*
                         VL+i sS  ge+S+S+ L   
      T00731   206    ---VLAIGSSLKGENSYSHHL    223  

COG0462: domain 1 of 1, from 246 to 259: score -0.8, E = 69
                   *->rrihngeSVSsLFd<-*
                      ++i+ + S+ sLF+   
      T00731   246    LNIYDNASLNSLFR    259  

//
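
Note on post-processing: a minimal sketch for pulling the rows of a
"Parsed for domains:" table out of hmmpfam (HMMER 2.x) text output like the
blocks above. The names DomainHit and parse_domain_rows are hypothetical
helpers, not part of HMMER. The parser assumes whitespace-separated rows with
exactly ten fields, as seen in this report, and skips everything else.

```python
from typing import Iterable, List, NamedTuple


class DomainHit(NamedTuple):
    model: str
    domain: str      # e.g. "1/1"
    seq_from: int
    seq_to: int
    hmm_from: int
    hmm_to: int
    score: float
    evalue: float


def parse_domain_rows(lines: Iterable[str]) -> List[DomainHit]:
    """Return one DomainHit per row of a 'Parsed for domains' table."""
    hits = []
    for line in lines:
        fields = line.split()
        # Data rows: model, n/m, seq-f, seq-t, flag, hmm-f, hmm-t, flag, score, E
        if len(fields) != 10 or "/" not in fields[1]:
            continue
        try:
            hits.append(DomainHit(
                fields[0], fields[1],
                int(fields[2]), int(fields[3]),
                int(fields[5]), int(fields[6]),
                float(fields[8]), float(fields[9]),
            ))
        except ValueError:
            # Not a table row after all (e.g. an alignment line); ignore it.
            continue
    return hits


sample = [
    "Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value",
    "-------- ------- ----- -----    ----- -----      -----  -------",
    "COG1053    1/1     149   202 ..   458   511 ..     2.8      1.4",
    "COG2232    1/1       1   336 [.     1   399 []  -206.6       54",
    "\t[no hits above thresholds]",
]
for hit in parse_domain_rows(sample):
    print(hit.model, hit.seq_from, hit.seq_to, hit.score, hit.evalue)
```

Filtering the returned list by evalue (for example, keeping hits below 0.01)
would leave nothing for this query, consistent with the weak scores reported
above.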