analysis of sequence from NP_566299.1.fa
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

>NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)
     MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS
     HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF
     DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC
     TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL
     TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI
     EAHQLWKNAE FELSLKPERV IRESCSFLFI FDIDKSSDSE PFDLGLTWKR PSKWSCQQAP
     LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC TIKANIFQIF PWYIKVYYHT
     LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP CEVKSVAISI EYDKGFLHID
     EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS LKEKSLVRSY TEVLLVPLTT
     PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK SQAGKKTGGL KQLLSRITAK
     IRGRPIEAPS SSEAESSVLS SKLILKIILV AGAAAAWQYF STDE

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

secondary structure prediction with predator

> NP_566299.1
              .         .         .         .         .
1    MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFH   50
     ___HHHHHHHHHHHHHHHHHHH____HHHHHHHHHHH_____________

              .         .         .         .         .
51   FENRAPPSNSHGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGG  100
     _________________HHHHHHHHHHHHHHHHHHHH_____________

              .         .         .         .         .
101  FDPLSSMNAKPVGVELWAVFDVPQSQVDTSWKNLTHALSGLFCASINFLE  150
     ___________EEEEEEEEEE_________HHHHHH_____EEEEEEE__

              .         .         .         .         .
151  SSTSYAAPTWGFGPNSDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGIS  200
     __________________________EEEEE_____EEEE__________

              .         .         .         .         .
201  ALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTLTVVLQPETTS  250
     ______EEEEE______EEEEEE__________EEEE____EEE______

              .         .         .         .         .
251  VESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI  300
     __________HHHHHHH____________________________HHHHH

              .         .         .         .         .
301  EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKR  350
     HHHHHHHHHHHHH______HHHHH__EEEEE_____________EEEE__

              .         .         .         .         .
351  PSKWSCQQAPLHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQC  400
     _______________EEE_____HHHHHHHHHH_________________

              .         .         .         .         .
401  TIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSS  450
     HHHHHHHH____EEEEE___EEE___________EEEEE___________

              .         .         .         .         .
451  GMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFDIPSALISFP  500
     HHHHHHH____EEEEEEEEEE_____EEEE______________EEEE__

              .         .         .         .         .
501  DHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTTPDFSMPYNVI  550
     ____HHHHHHH______HHHHHHHHHHHHHHEEEEE___________EEE

              .         .         .         .         .
551  TITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK  600
     EHHHHHHHHH__HHHHHHHHHHHHHHHHHHH________HHHHHHHHHHH

              .         .         .         .    
601  IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE        644
     HH___________HHHHHHHHHHHHHHHHHHHHHHHHHHH____


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~


method         :         1
alpha-contents :      37.5 %
beta-contents  :      25.3 %
coil-contents  :      37.2 %
class          :     mixed


method         :         2
alpha-contents :      21.5 %
beta-contents  :      28.1 %
coil-contents  :      50.5 %
class          :     mixed


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

GPI: learning from metazoa
  0.33  -0.01  -0.03   0.00   0.00   0.00  -4.00  -0.07   0.00  -1.66  -4.30   0.00 -12.00   0.00   0.00   0.00  -21.75
  5.22  -0.45  -0.64  -1.00   0.00   0.00   0.00  -2.77  -0.31  -3.16  -4.30   0.00   0.00   0.00 -12.00   0.00  -19.40
ID: NP_566299.1	AC: xxx Len:  644 1:I   616 Sc:  -19.40 Pv: 4.089127e-02 NO_GPI_SITE
GPI: learning from protozoa
  0.71   0.00   0.00  -0.01   0.00   0.00 -12.00   0.00   0.00  -1.95 -11.99   0.00 -12.00   0.00 -12.00   0.00  -49.24
-11.77   0.00  -0.01   0.00  -4.00   0.00   0.00  -5.14   0.00  -2.09 -14.11   0.00   0.00   0.00 -12.00   0.00  -49.12
ID: NP_566299.1	AC: xxx Len:  644 1:I   615 Sc:  -49.12 Pv: 1.349119e-01 NO_GPI_SITE

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

# SignalP euk predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
NP_566299.1  0.876  25 Y  0.834  25 Y  0.989   9 Y  0.913 Y
# SignalP gram- predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
NP_566299.1  0.669 157 Y  0.458 157 Y  0.994   9 Y  0.265 N
# SignalP gram+ predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
NP_566299.1  0.734 157 Y  0.597  23 Y  0.992  10 Y  0.950 Y
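
The SignalP summary rows above share one whitespace-separated layout (name, then score, position and Y/N call for Cmax, Ymax and Smax, then Smean and its call). A minimal parsing sketch, assuming that 12-field layout holds for every row; the `SignalPRow` type and `parse_signalp_row` function are hypothetical names, not part of SignalP:

```python
from typing import NamedTuple


class SignalPRow(NamedTuple):
    """One summary row; field names are illustrative, not SignalP's own."""
    name: str
    cmax: float
    cmax_pos: int
    cmax_call: str
    ymax: float
    ymax_pos: int
    ymax_call: str
    smax: float
    smax_pos: int
    smax_call: str
    smean: float
    smean_call: str


def parse_signalp_row(line: str) -> SignalPRow:
    """Split a non-comment summary row into typed fields."""
    f = line.split()
    if len(f) != 12:
        raise ValueError(f"expected 12 fields, got {len(f)}")
    return SignalPRow(
        f[0],
        float(f[1]), int(f[2]), f[3],
        float(f[4]), int(f[5]), f[6],
        float(f[7]), int(f[8]), f[9],
        float(f[10]), f[11],
    )


# The eukaryotic and gram- rows copied from the report above.
euk = parse_signalp_row("NP_566299.1  0.876  25 Y  0.834  25 Y  0.989   9 Y  0.913 Y")
gram_neg = parse_signalp_row("NP_566299.1  0.669 157 Y  0.458 157 Y  0.994   9 Y  0.265 N")
```

Lines starting with `#` are headers and would need to be skipped before calling the parser.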

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

low complexity regions: SEG 12 2.2 2.5
>NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

                                  1-2    MA
                   sllrslillli    3-13   
                                 14-509  VQSFLVAIAFGSKEVEEFSEALLLKPLPDR
                                         KVLAHFHFENRAPPSNSHGRHHHLFPKAIS
                                         QLVQKFRVKEMELSFTQGRWNHEHWGGFDP
                                         LSSMNAKPVGVELWAVFDVPQSQVDTSWKN
                                         LTHALSGLFCASINFLESSTSYAAPTWGFG
                                         PNSDKLRYGSLPREAVCTENLTPWLKLLPC
                                         RDKDGISALMNRPSVYRGFYHSQRLHLSTV
                                         ESGQEGLGSGIVLEQTLTVVLQPETTSVES
                                         NMQPSWSLSSLFGRQVVGRCVLAKSSNVYL
                                         QLEGLLGYESKNVDTEIEAHQLWKNAEFEL
                                         SLKPERVIRESCSFLFIFDIDKSSDSEPFD
                                         LGLTWKRPSKWSCQQAPLHSSRFLMGSGNE
                                         RGAIAILLKATESQEKLSGRDLTNGQCTIK
                                         ANIFQIFPWYIKVYYHTLQIFVDQQQKTDS
                                         EVLKKINVSPSTDKVSSGMMEMMLELPCEV
                                         KSVAISIEYDKGFLHIDEYPPDANQGFDIP
                                         SALISFPDHHASLDFQ
             eelsnspllsslkeksl  510-526  
                                527-606  VRSYTEVLLVPLTTPDFSMPYNVITITCTI
                                         FALYFGSLLNVLRRRIGEEERFLKSQAGKK
                                         TGGLKQLLSRITAKIRGRPI
      eapssseaessvlssklilkiilv  607-630  
                                631-644  AGAAAAWQYFSTDE

low complexity regions: SEG 25 3.0 3.3
>NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

                                  1-1    M
asllrslilllivqsflvaiafgskeveef    2-37   
                        sealll
                                 38-214  KPLPDRKVLAHFHFENRAPPSNSHGRHHHL
                                         FPKAISQLVQKFRVKEMELSFTQGRWNHEH
                                         WGGFDPLSSMNAKPVGVELWAVFDVPQSQV
                                         DTSWKNLTHALSGLFCASINFLESSTSYAA
                                         PTWGFGPNSDKLRYGSLPREAVCTENLTPW
                                         LKLLPCRDKDGISALMNRPSVYRGFYH
sqrlhlstvesgqeglgsgivleqtltvvl  215-264  
          qpettsvesnmqpswslssl
                                265-493  FGRQVVGRCVLAKSSNVYLQLEGLLGYESK
                                         NVDTEIEAHQLWKNAEFELSLKPERVIRES
                                         CSFLFIFDIDKSSDSEPFDLGLTWKRPSKW
                                         SCQQAPLHSSRFLMGSGNERGAIAILLKAT
                                         ESQEKLSGRDLTNGQCTIKANIFQIFPWYI
                                         KVYYHTLQIFVDQQQKTDSEVLKKINVSPS
                                         TDKVSSGMMEMMLELPCEVKSVAISIEYDK
                                         GFLHIDEYPPDANQGFDIP
salisfpdhhasldfqeelsnspllsslke  494-535  
                  kslvrsytevll
                                536-592  VPLTTPDFSMPYNVITITCTIFALYFGSLL
                                         NVLRRRIGEEERFLKSQAGKKTGGLKQ
llsritakirgrpieapssseaessvlssk  593-636  
                lilkiilvagaaaa
                                637-644  WQYFSTDE

low complexity regions: SEG 45 3.4 3.75
>NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

                                  1-3    MAS
llrslilllivqsflvaiafgskeveefse    4-83   
alllkplpdrkvlahfhfenrappsnshgr
          hhhlfpkaisqlvqkfrvke
                                 84-214  MELSFTQGRWNHEHWGGFDPLSSMNAKPVG
                                         VELWAVFDVPQSQVDTSWKNLTHALSGLFC
                                         ASINFLESSTSYAAPTWGFGPNSDKLRYGS
                                         LPREAVCTENLTPWLKLLPCRDKDGISALM
                                         NRPSVYRGFYH
sqrlhlstvesgqeglgsgivleqtltvvl  215-293  
qpettsvesnmqpswslsslfgrqvvgrcv
           lakssnvylqlegllgyes
                                294-567  KNVDTEIEAHQLWKNAEFELSLKPERVIRE
                                         SCSFLFIFDIDKSSDSEPFDLGLTWKRPSK
                                         WSCQQAPLHSSRFLMGSGNERGAIAILLKA
                                         TESQEKLSGRDLTNGQCTIKANIFQIFPWY
                                         IKVYYHTLQIFVDQQQKTDSEVLKKINVSP
                                         STDKVSSGMMEMMLELPCEVKSVAISIEYD
                                         KGFLHIDEYPPDANQGFDIPSALISFPDHH
                                         ASLDFQEELSNSPLLSSLKEKSLVRSYTEV
                                         LLVPLTTPDFSMPYNVITITCTIFALYFGS
                                         LLNV
lrrrigeeerflksqagkktgglkqllsri  568-636  
takirgrpieapssseaessvlssklilki
                     ilvagaaaa
                                637-644  WQYFSTDE


low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)
MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFHFENRAPPSNS
HGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGGFDPLSSMNAKPVGVELWAVF
DVPQSQVDTSWKNLTHALSGLFCASINFLESSTSYAAPTWGFGPNSDKLRYGSLPREAVC
TENLTPWLKLLPCRDKDGISALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTL
TVVLQPETTSVESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI
EAHQLWKNAEFELSLKPERVIRESCSflfifdidkssdsepFDLGLTWKRPSKWSCQQAP
LHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQCTIKANIFQIFPWYIKVYYHT
LQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHID
EYPPDANQGFDIPSALISFPDHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTT
PDFSMPYNVITITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGlkqllsritak
irgrpieapssseaessvlssklilkiilVAGAAAAWQYFSTDE
    1 -  326 MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS 
             HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF 
             DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC 
             TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL 
             TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI 
             EAHQLWKNAE FELSLKPERV IRESCS
  327 -  341   flfi fdidkssdse p
  342 -  589 FDLGLTWKR PSKWSCQQAP LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC T
             IKANIFQIF PWYIKVYYHT LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP C
             EVKSVAISI EYDKGFLHID EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS L
             KEKSLVRSY TEVLLVPLTT PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK S
             QAGKKTGG
  590 -  629   l kqllsritak irgrpieaps sseaessvls sklilkiil
  630 -  644 V AGAAAAWQYF STDE

low complexity regions: DUST
>NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)
MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFHFENRAPPSNS
HGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGGFDPLSSMNAKPVGVELWAVF
DVPQSQVDTSWKNLTHALSGLFCASINFLESSTSYAAPTWGFGPNSDKLRYGSLPREAVC
TENLTPWLKLLPCRDKDGISALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTL
TVVLQPETTSVESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI
EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKRPSKWSCQQAP
LHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQCTIKANIFQIFPWYIKVYYHT
LQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHID
EYPPDANQGFDIPSALISFPDHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTT
PDFSMPYNVITITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK
IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

coiled coil prediction for NP_566299.1
sequence: 644 amino acids, 0 residue(s) in coiled coil state

    .    |     .    |     .    |     .    |     .    |     .   60
MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  120
HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  180
DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  240
TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  300
TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~222222222 * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  360
EAHQLWKNAE FELSLKPERV IRESCSFLFI FDIDKSSDSE PFDLGLTWKR PSKWSCQQAP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
22222~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  420
LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC TIKANIFQIF PWYIKVYYHT
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  480
LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP CEVKSVAISI EYDKGFLHID
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  540
EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS LKEKSLVRSY TEVLLVPLTT
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  600
PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK SQAGKKTGGL KQLLSRITAK
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~1111111 1111111~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     
IRGRPIEAPS SSEAESSVLS SKLILKIILV AGAAAAWQYF STDE
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~
---------- ---------- ---------- ---------- ----
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

prediction of transmembrane regions with toppred2

     ***********************************
     *TOPPREDM with eukaryotic function*
     ***********************************

NP_566299.1.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: NP_566299.1.fa.___inter___

 (1 sequences)
MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFH
FENRAPPSNSHGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGG
FDPLSSMNAKPVGVELWAVFDVPQSQVDTSWKNLTHALSGLFCASINFLE
SSTSYAAPTWGFGPNSDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGIS
ALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTLTVVLQPETTS
VESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI
EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKR
PSKWSCQQAPLHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQC
TIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSS
GMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFDIPSALISFP
DHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTTPDFSMPYNVI
TITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK
IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE


(p)rokaryotic or (e)ukaryotic: e


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 2 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1     6    26   1.786 Certain
     2   133   153   0.921 Putative
     3   548   568   1.815 Certain
     4   622   642   1.055 Certain

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     3     4
 Loop length     5   521    53     2
 K+R profile  2.00       14.00      
                       +        0.00      
CYT-EXT prof     -           -      
                    0.85           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 16.00
Tm probability: 1.00
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 3.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.85
-> Orientation: N-in

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       1     2     3     4
 Loop length     5   106   394    53     2
 K+R profile  2.00           +        0.00      
                       +       14.00      
CYT-EXT prof     -        0.75           -      
                    0.93           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: -12.00
Tm probability: 0.80
-> Orientation: N-out

Charge-difference over N-terminal Tm (+-15 residues): 3.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.18
-> Orientation: N-in

----------------------------------------------------------------------

"NP_566299" 644 
 6 26 #t 1.78646
 133 153 #f 0.920833
 548 568 #t 1.81458
 622 642 #t 1.05521
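
The "Loop length" rows in the two structures above appear to be plain residue counts between consecutive segments (plus the N- and C-terminal tails). A small sketch that reproduces them from the Begin/End columns, assuming 1-based inclusive segment coordinates; `loop_lengths` is a hypothetical helper, not part of toppred:

```python
def loop_lengths(segments, seq_len):
    """Return lengths of the stretches before, between and after segments.

    segments: list of (begin, end) pairs, 1-based inclusive, sorted by begin.
    seq_len:  total number of residues in the sequence.
    """
    loops = []
    prev_end = 0
    for begin, end in segments:
        loops.append(begin - prev_end - 1)  # residues strictly between segments
        prev_end = end
    loops.append(seq_len - prev_end)        # C-terminal tail
    return loops


# Structure 1 uses helices 1, 3 and 4; structure 2 uses all four candidates.
structure1 = loop_lengths([(6, 26), (548, 568), (622, 642)], 644)
structure2 = loop_lengths([(6, 26), (133, 153), (548, 568), (622, 642)], 644)
```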


     ************************************
     *TOPPREDM with prokaryotic function*
     ************************************

NP_566299.1.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: NP_566299.1.fa.___inter___

 (1 sequences)
MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFH
FENRAPPSNSHGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGG
FDPLSSMNAKPVGVELWAVFDVPQSQVDTSWKNLTHALSGLFCASINFLE
SSTSYAAPTWGFGPNSDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGIS
ALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTLTVVLQPETTS
VESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI
EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKR
PSKWSCQQAPLHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQC
TIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSS
GMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFDIPSALISFP
DHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTTPDFSMPYNVI
TITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK
IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE


(p)rokaryotic or (e)ukaryotic: p


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 2 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1     6    26   1.786 Certain
     2   133   153   0.921 Putative
     3   548   568   1.815 Certain
     4   622   642   1.055 Certain

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     3     4
 Loop length     5   521    53     2
 K+R profile  2.00       14.00      
                       +        0.00      
CYT-EXT prof     -           -      
                    0.85           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 16.00
Tm probability: 1.00
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 3.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.85
-> Orientation: N-in

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       1     2     3     4
 Loop length     5   106   394    53     2
 K+R profile  2.00           +        0.00      
                       +       14.00      
CYT-EXT prof     -        0.75           -      
                    0.93           -      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: -12.00
Tm probability: 0.80
-> Orientation: N-out

Charge-difference over N-terminal Tm (+-15 residues): 3.00
 (NEG-POS)/(NEG+POS): -1.0000
                 NEG: 0.0000
                 POS: 1.0000
-> Orientation: N-in

CYT-EXT difference:  -0.18
-> Orientation: N-in

----------------------------------------------------------------------

"NP_566299" 644 
 6 26 #t 1.78646
 133 153 #f 0.920833
 548 568 #t 1.81458
 622 642 #t 1.05521


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

SAPS.  Version of April 11, 1996.
Date run: Mon Feb 25 11:33:19 2002

File: /people/b_eisen/NP_566299.1.fa.___saps___
ID   NP_566299.1
DE   (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

number of residues:  644;   molecular weight:  72.2 kdal
 
         1  MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS 
        61  HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF 
       121  DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC 
       181  TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL 
       241  TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI 
       301  EAHQLWKNAE FELSLKPERV IRESCSFLFI FDIDKSSDSE PFDLGLTWKR PSKWSCQQAP 
       361  LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC TIKANIFQIF PWYIKVYYHT 
       421  LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP CEVKSVAISI EYDKGFLHID 
       481  EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS LKEKSLVRSY TEVLLVPLTT 
       541  PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK SQAGKKTGGL KQLLSRITAK 
       601  IRGRPIEAPS SSEAESSVLS SKLILKIILV AGAAAAWQYF STDE

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  : 37( 5.7%); C  :  9( 1.4%); D  : 24( 3.7%); E  : 46( 7.1%); F  : 33( 5.1%)
G  : 36( 5.6%); H  : 17( 2.6%); I  : 34( 5.3%); K  : 37( 5.7%); L  : 80(12.4%)
M  : 11( 1.7%); N  : 21( 3.3%); P  : 33( 5.1%); Q  : 28( 4.3%); R  : 28( 4.3%)
S+ : 76(11.8%); T  : 29( 4.5%); V  : 38( 5.9%); W  : 12( 1.9%); Y  : 15( 2.3%)

KR      :   65 ( 10.1%);   ED      :   70 ( 10.9%);   AGP     :  106 ( 16.5%);
KRED    :  135 ( 21.0%);   KR-ED   :   -5 ( -0.8%);   FIKMNY  :  151 ( 23.4%);
LVIFM   :  196 ( 30.4%);   ST      :  105 ( 16.3%).
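
The grouped totals above (KR, ED, AGP, ...) are sums of the per-residue counts in the preceding table. A minimal sketch that recomputes them from those counts, assuming percentages are taken over the 644-residue total and rounded to one decimal; `COUNTS`, `group_count` and `group_percent` are illustrative names, not SAPS functions:

```python
# Per-residue counts copied from the compositional table above.
COUNTS = {
    "A": 37, "C": 9,  "D": 24, "E": 46, "F": 33,
    "G": 36, "H": 17, "I": 34, "K": 37, "L": 80,
    "M": 11, "N": 21, "P": 33, "Q": 28, "R": 28,
    "S": 76, "T": 29, "V": 38, "W": 12, "Y": 15,
}


def group_count(counts, letters):
    """Sum the counts of all residues named in `letters`."""
    return sum(counts.get(ch, 0) for ch in letters)


def group_percent(counts, letters):
    """Group count as a percentage of all residues, one decimal place."""
    total = sum(counts.values())
    return round(100.0 * group_count(counts, letters) / total, 1)


# Net charge as reported in the KR-ED column.
net_charge = group_count(COUNTS, "KR") - group_count(COUNTS, "ED")
```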

--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS
 
         1  00000+0000 0000000000 00000+-0-- 00-0000+00 0-++000000 0-0+000000 
        61  00+000000+ 0000000+0+ 0+-0-00000 0+000-0000 0-0000000+ 0000-00000 
       121  -000000-00 0+00000000 000000000- 0000000000 000000-+0+ 00000+-000 
       181  0-000000+0 000+-+-000 0000+0000+ 000000+000 000-000-00 000000-000 
       241  000000-000 0-00000000 000000+000 0+0000+000 00000-0000 0-0+00-0-0 
       301  -00000+00- 0-000+0-+0 0+-0000000 0-0-+00-0- 00-00000++ 00+0000000 
       361  0000+00000 00-+000000 0+00-00-+0 00+-000000 00+0000000 0000+00000 
       421  00000-000+ 0-0-00++00 00000-+000 000-000-00 0-0+000000 -0-+00000- 
       481  -000-00000 -000000000 -00000-00- -000000000 0+-+000+00 0-00000000 
       541  0-00000000 0000000000 00000000++ +00---+00+ 0000++0000 +0000+000+ 
       601  0+0+00-000 00-0-00000 0+000+0000 0000000000 00--
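
The charge map above can be reproduced with the charge alphabet this report states in its periodicity section (+ = KR, - = ED, 0 otherwise). A minimal sketch under that assumption; `charge_string` and `grouped` are hypothetical helpers, not SAPS code:

```python
def charge_string(seq):
    """Encode each residue as '+', '-' or '0' (K/R, E/D, everything else)."""
    out = []
    for ch in seq.upper():
        if ch in "KR":
            out.append("+")
        elif ch in "ED":
            out.append("-")
        else:
            out.append("0")
    return "".join(out)


def grouped(s, block=10):
    """Insert a space every `block` characters, as in the report layout."""
    return " ".join(s[i:i + block] for i in range(0, len(s), block))


# First 30 residues of NP_566299.1, taken from the sequence listing.
first30 = grouped(charge_string("MASLLRSLILLLIVQSFLVAIAFGSKEVEE"))
```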

A. CHARGE CLUSTERS.


Positive charge clusters (cmin =  9/30 or 12/45 or 15/60):

 1) From  577 to  604:   RFLKSQAGKKTGGLKQLLSRITAKIRGR
                         +00+0000++0000+0000+000+0+0+
    quartile: 4; size: 28, +count:  9, -count:  0, 0count: 19; t-value:  3.87
    L:  4 (14.3%);  G:  4 (14.3%);  K:  5 (17.9%);  R:  4 (14.3%);
    LVIFM:  7 (25.0%);


Negative charge clusters (cmin = 10/30 or 13/45 or 16/60):  none


Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60):  none


B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.


C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     5 |   5 |   7 |  40 |  10 |  10 |  13 |  11 |  12 |  15 |   7 |   9 | 
lmin1     6 |   6 |   9 |  49 |  12 |  12 |  15 |  14 |  14 |  19 |   9 |  11 | 
lmin2     7 |   7 |  10 |  54 |  13 |  13 |  17 |  16 |  16 |  21 |  10 |  13 | 
 (Significance level: 0.010000; Minimal displayed length:  6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  +  runs >=   3:   1, at  569;
  -  runs >=   3:   1, at  574;
  *  runs >=   5:   0
  0  runs >=  27:   0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.
There are no high scoring hydrophobic segments.

____________________________________
High scoring transmembrane segments:

   5.00 (LVIF)   2.00 (AGM)   0.00 (BZX)  -1.00 (YCW)  -2.00 (ST)
  -6.00 (P)  -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)

 Expected score/letter:  -3.427
 M_0.01=  65.88; M_0.05=  54.48;     M_0.30=  40.92

 1) From    8 to   24:  length= 17, score=54.00 
       8  LILLLIVQSF LVAIAFG
    L:  5(29.4%);  A:  2(11.8%);  V:  2(11.8%);  I:  3(17.6%);
    F:  2(11.8%);
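
The reported score of 54.00 for segment 8-24 is consistent with summing the per-residue values from the scoring table above. A small sketch of that sum only; how SAPS selects the segment boundaries is not shown here, and `SCORES` and `segment_score` are illustrative names:

```python
# Scoring groups copied from the "High scoring transmembrane segments" table.
SCORE_GROUPS = [
    ("LVIF", 5.0), ("AGM", 2.0), ("BZX", 0.0), ("YCW", -1.0), ("ST", -2.0),
    ("P", -6.0), ("H", -8.0), ("NQ", -10.0), ("KR", -16.0), ("ED", -17.0),
]

# Flatten the groups into a per-letter lookup.
SCORES = {ch: value for letters, value in SCORE_GROUPS for ch in letters}


def segment_score(segment):
    """Sum the per-residue scores over a segment."""
    return sum(SCORES[ch] for ch in segment.upper())


# Residues 8-24 of NP_566299.1, as listed in the report.
tm_score = segment_score("LILLLIVQSFLVAIAFG")
```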


2. SPACINGS OF C.


H2N-142-C-36-C-12-C-79-C-51-C-30-C-43-C-60-C-92-C-90-COOH


2*. SPACINGS OF C and H. (additional deluxe function for ALEX)


H2N-47-H-1-H-10-H-2-H-H-H-28-H-1-H-38-H-6-C-36-C-12-C-20-H-4-H-53-C-29-H-21-C-30-C-5-H-37-C-18-H-41-C-16-H-23-H-H-50-C-90-COOH
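
The two spacing strings above list the number of residues between successive marked residues, from the N-terminus (H2N) to the C-terminus (COOH). A minimal sketch of that encoding, assuming a gap of zero is simply omitted (inferred from the `H-H-H` run in the C and H string); `spacing_string` is a hypothetical helper, not SAPS code:

```python
def spacing_string(seq, letters="C"):
    """Build an 'H2N-<gap>-<letter>-...-<gap>-COOH' spacing string.

    Gaps count residues that are not in `letters`; zero-length gaps
    are left out, so adjacent marked residues appear back to back.
    """
    parts = ["H2N"]
    gap = 0
    for ch in seq.upper():
        if ch in letters:
            if gap:
                parts.append(str(gap))
            parts.append(ch)
            gap = 0
        else:
            gap += 1
    if gap:
        parts.append(str(gap))
    parts.append("COOH")
    return "-".join(parts)


# First 66 residues of NP_566299.1 (no C, six H), from the sequence listing.
prefix = "MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFHFENRAPPSNSHGRHHH"
prefix_spacing = spacing_string(prefix, "CH")
```

For the 66-residue prefix this matches the start of the C and H string in the report, with COOH marking the end of the truncated input rather than the true C-terminus.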

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length:  5

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
   (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length:  9

--------------------------------------------------------------------------------

MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets:  45  (Expected range:  20-- 58)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 18   (6-10) 11   (11-20) 9   (>=21) 8

3. Clusters of amino acid multiplets (cmin = 12/30 or 15/45 or 18/60):  none


B. CHARGE ALPHABET.

1. Total number of charge multiplets:  10  (Expected range:   2-- 24)
   5 +plets (f+: 10.1%), 5 -plets (f-: 10.9%)
   Total number of charge altplets: 16 (Critical number: 27)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 2   (6-10) 1   (11-20) 1   (>=21) 7

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core:  4; !-core: 5)

Location	Period	Element		Copies	Core	Errors
 633- 636	 1	A         	 4	 4  	 0


B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core:  5; !-core: 6)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core:  6; !-core: 9)

Location	Period	Element		Copies	Core	Errors
   1-  24	 4	i...      	 6	 6  	 0
   1-  54	 9	i.....0.. 	 6	 6  	/0/./././././2/././
   8-  14	 1	i         	 7	 7  	 0
 471- 520	10	-00000000.	 5	 5  	/0/0/1/1/1/0/1/0/0/./
 490- 543	 9	i..000... 	 6	 6  	/0/././2/2/1/./././


--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location (Quartile) Spacing     Rank       P-value   Interpretation

  53-  59  (1.)     N(   6)N    22 of  22   0.0081   large minimal spacing


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/Pfam
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model        Description                                Score    E-value  N 
--------     -----------                                -----    ------- ---
PDGF         Platelet-derived growth factor (PDGF)        0.5         61   1
Paramyx_ncap Paramyxovirus nucleocapsid protein           0.2         16   1
MIP          Major intrinsic protein                     -0.2         37   1
PsbN         Photosystem II reaction centre N protein   -13.4         96   1
NTR          NTR/C345C module                           -34.1         95   1

Parsed for domains:
Model        Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------     ------- ----- -----    ----- -----      -----  -------
Paramyx_ncap   1/1       1    10 [.     1    10 [.     0.2       16
PDGF           1/1     119   130 ..     7    19 ..     0.5       61
PsbN           1/1     134   178 ..     1    44 []   -13.4       96
NTR            1/1     303   403 ..     1   123 []   -34.1       95
MIP            1/1     629   639 ..   258   268 .]    -0.2       37

Alignments of top-scoring domains:
Paramyx_ncap: domain 1 of 1, from 1 to 10: score 0.2, E = 16
                   *->mAsLLksLaL<-*
                      mAsLL+sL L   
  NP_566299.     1    MASLLRSLIL    10   

PDGF: domain 1 of 1, from 119 to 130: score 0.5, E = 61
                   *->lveIfreyvDrTe<-*
                      ++++++++vD T+   
  NP_566299.   119    VFDVPQSQVD-TS    130  

PsbN: domain 1 of 1, from 134 to 178: score -13.4, E = 96
                   *->MEtiAtvltIFlas..LLlsiTgYSiYt.sFGPpSkeLrDPFEEHEd
                      +     ++ +F as ++L s T+Y+  t +FGP+S +Lr      E 
  NP_566299.   134    LTH--ALSGLFCASinFLESSTSYAAPTwGFGPNSDKLRYGSLPREA 178  

                   <-*
                      
  NP_566299.     -     -    

NTR: domain 1 of 1, from 303 to 403: score -34.1, E = 95
                   *->lkkaCkpdRvayvykVkvldeeeedwfdvdkRqEiiytvtileViKs
                      ++   +++ +   +k  ++ + e+ +f         ++++i    Ks
  NP_566299.   303    HQLWKNAE-FELSLKP-ERVIRESCSF--------LFIFDID---KS 336  

                   GsgddergpgslrtfisdisCrcplilvkgkdYLiMGqsstwdekgglqy
                   +  +++   g ++   s+ sC+++++   +    +MG+ +   e+g + +
  NP_566299.   337 S-DSEPFDLGLTWKRPSKWSCQQAPLHSSRF---LMGSGN---ERGAIAI 379  

                   ilgsdvitWiEeWprelkcqqrrlqk<-*
                   +l     + +E+++ ++  + +   k   
  NP_566299.   380 LLKAT--ESQEKLSGRDLTNGQCTIK    403  

MIP: domain 1 of 1, from 629 to 639: score -0.2, E = 37
                   *->liGAalaalvY<-*
                      l+++a+aa+ Y   
  NP_566299.   629    LVAGAAAAWQY    639  

//
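
None of the Pfam hits above has an E-value below 10, so none would normally be taken as evidence of a domain. The following sketch shows one way to filter the "Parsed for domains" table programmatically. The parser, its field names, and the cutoff of 0.01 are assumptions made for illustration and are not part of HMMER.

```python
DOMAIN_TABLE = """\
Paramyx_ncap   1/1       1    10 [.     1    10 [.     0.2       16
PDGF           1/1     119   130 ..     7    19 ..     0.5       61
PsbN           1/1     134   178 ..     1    44 []   -13.4       96
NTR            1/1     303   403 ..     1   123 []   -34.1       95
MIP            1/1     629   639 ..   258   268 .]    -0.2       37
"""


def parse_domains(text):
    """Parse whitespace-separated rows of the hmmpfam domain table."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:  # skip headers, rulers and blank lines
            continue
        rows.append({
            "model": fields[0],
            "seq_from": int(fields[2]),
            "seq_to": int(fields[3]),
            "score": float(fields[8]),
            "evalue": float(fields[9]),
        })
    return rows


def significant(rows, max_evalue=0.01):
    """Keep rows at or below an E-value cutoff (hypothetical default)."""
    return [r for r in rows if r["evalue"] <= max_evalue]


rows = parse_domains(DOMAIN_TABLE)
print(len(rows), significant(rows))  # 5 []
```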

Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/PfamFrag
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model         Description                               Score    E-value  N 
--------      -----------                               -----    ------- ---
KIX           KIX domain                                  3.6         15   1
PsbN          Photosystem II reaction centre N protei     1.6         93   1
Glucokinase   Glucokinase                                 1.0         19   1
CDI           Cyclin-dependent kinase inhibitor           0.7         80   1
PDGF          Platelet-derived growth factor (PDGF)       0.5         61   1
complex1_24kD Respiratory-chain NADH dehydrogenase 24     0.3         69   1
Paramyx_ncap  Paramyxovirus nucleocapsid protein          0.2         16   1
HypA          Hydrogenase expression/synthesis hypA f     0.1         56   1
MIP           Major intrinsic protein                    -0.2         37   1
DUF212        Uncharacterized BCR, COG1963               -0.2         79   1

Parsed for domains:
Model         Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------      ------- ----- -----    ----- -----      -----  -------
Paramyx_ncap    1/1       1    10 [.     1    10 [.     0.2       16
complex1_24kD   1/1      64    80 ..   145   161 .]     0.3       69
HypA            1/1      79    87 ..   113   121 .]     0.1       56
PDGF            1/1     119   130 ..     7    19 ..     0.5       61
PsbN            1/1     161   170 ..    27    36 ..     1.6       93
Glucokinase     1/1     230   239 ..   344   353 .]     1.0       19
CDI             1/1     237   258 ..    85   108 .]     0.7       80
KIX             1/1     417   434 ..    64    81 .]     3.6       15
DUF212          1/1     509   521 ..     1    13 [.    -0.2       79
MIP             1/1     629   639 ..   258   268 .]    -0.2       37

Alignments of top-scoring domains:
Paramyx_ncap: domain 1 of 1, from 1 to 10: score 0.2, E = 16
                   *->mAsLLksLaL<-*
                      mAsLL+sL L   
  NP_566299.     1    MASLLRSLIL    10   

complex1_24kD: domain 1 of 1, from 64 to 80: score 0.3, E = 69
                   *->yEdLTpekieeLLdrlk<-*
                      + +L p++i +L  +++   
  NP_566299.    64    HHHLFPKAISQLVQKFR    80   

HypA: domain 1 of 1, from 79 to 87: score 0.1, E = 56
                   *->LrIkslEVe<-*
                      +r+k++E++   
  NP_566299.    79    FRVKEMELS    87   

PDGF: domain 1 of 1, from 119 to 130: score 0.5, E = 61
                   *->lveIfreyvDrTe<-*
                      ++++++++vD T+   
  NP_566299.   119    VFDVPQSQVD-TS    130  

PsbN: domain 1 of 1, from 161 to 170: score 1.6, E = 93
                   *->sFGPpSkeLr<-*
                      +FGP+S +Lr   
  NP_566299.   161    GFGPNSDKLR    170  

Glucokinase: domain 1 of 1, from 230 to 239: score 1.0, E = 19
                   *->lGAgvaleqs<-*
                      lG+g+ leq+   
  NP_566299.   230    LGSGIVLEQT    239  

CDI: domain 1 of 1, from 237 to 258: score 0.7, E = 80
                   *->pstslvllqpseaePaeEskedls<-*
                      ++t+ v+lqp    +++Es+  +s   
  NP_566299.   237    EQTLTVVLQP--ETTSVESNMQPS    258  

KIX: domain 1 of 1, from 417 to 434: score 3.6, E = 15
                   *->YYhLlaekiykiqKeLqe<-*
                      YYh l++ +++ qK+  e   
  NP_566299.   417    YYHTLQIFVDQQQKTDSE    434  

DUF212: domain 1 of 1, from 509 to 521: score -0.2, E = 79
                   *->rAlltNevlLSsL<-*
                       + l+N +lLSsL   
  NP_566299.   509    QEELSNSPLLSSL    521  

MIP: domain 1 of 1, from 629 to 639: score -0.2, E = 37
                   *->liGAalaalvY<-*
                      l+++a+aa+ Y   
  NP_566299.   629    LVAGAAAAWQY    639  

//

Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Prosite
---------------------------------------------------------
|          ppsearch (c) 1994 EMBL Data Library          |
|       based on MacPattern (c) 1990-1994 R. Fuchs      |
---------------------------------------------------------

PROSITE pattern search started: Mon Feb 25 11:35:38 2002

Sequence file: NP_566299.1.fa

----------------------------------------
Sequence NP_566299.1 (644 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
  133: NLTH
Total matches: 1

Matching pattern PS00004 CAMP_PHOSPHO_SITE:
  349: KRPS
Total matches: 1

Matching pattern PS00005 PKC_PHOSPHO_SITE:
  130: SWK
  166: SDK
  215: SQR
  314: SLK
  347: TWK
  363: SSR
  391: SGR
  401: TIK
  445: TDK
  520: SLK
  598: TAK
  620: SSK
Total matches: 12

Matching pattern PS00006 CK2_PHOSPHO_SITE:
  125: SQVD
  221: STVE
  225: SGQE
  249: TSVE
  298: TEIE
  337: SDSE
  370: SGNE
  391: SGRD
  431: TDSE
  498: SFPD
  520: SLKE
  529: SYTE
  539: TTPD
  610: SSSE
  612: SEAE
  641: STDE
Total matches: 16

Matching pattern PS00008 MYRISTYL:
  140: GLFCAS
  229: GLGSGI
  371: GNERGA
  562: GSLLNV
  632: GAAAAW
Total matches: 5

Matching pattern PS00009 AMIDATION:
  583: AGKK
Total matches: 1

Total no of hits in this sequence: 36

========================================

1314 pattern(s) searched in 1 sequence(s), 644 residues.
Total no of hits in all sequences: 36.
Search time: 00:00 min
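
The PROSITE hits above come from short, low-specificity patterns, and they can be reproduced with regular expressions. The sketch below translates two of them, PS00001 (N-{P}-[ST]-{P}) and PS00005 ([ST]-x-[RK]), and scans residues 131 to 190 of the query. The helper `scan` and the choice of fragment are assumptions for illustration; ppsearch itself is not used here.

```python
import re

# Residues 131-190 of NP_566299.1, copied from the sequence listing.
FRAGMENT = "WKNLTHALSGLFCASINFLESSTSYAAPTWGFGPNSDKLRYGSLPREAVCTENLTPWLKL"
OFFSET = 131  # 1-based position of the first residue in FRAGMENT

# PROSITE syntax rewritten as Python regular expressions.
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # N-{P}-[ST]-{P}
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",       # [ST]-x-[RK]
}


def scan(pattern, seq, offset=1):
    """Return (position, matched text) pairs, including overlapping hits."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start() + offset, m.group(1)) for m in lookahead.finditer(seq)]


for name, pattern in PATTERNS.items():
    print(name, scan(pattern, FRAGMENT, OFFSET))
# ASN_GLYCOSYLATION [(133, 'NLTH')]
# PKC_PHOSPHO_SITE [(166, 'SDK')]
```

Note that NLTP at position 183 is not reported as a glycosylation site because the pattern excludes proline in the fourth position.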

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with motif search against own library
     ***** bioMotif : Version V41a DB, 1999 Nov 11 *****
          SeqTyp=2 : PROTEIN  search; 


>APC D-Box is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>ER-GOLGI-traffic signal is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M minimal SH3 binding  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>EXTRACELL-M minimal furin protease cleavage site motif  is the MOTIF name

>NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) ;LENGTH=644; DIRECT_SEQUENCE
n 1 solutions 
m %_RXXR 319-322
f

>STATISTICS Total   : 1 solutions in 1 sequences, 644 units;  out of 1 sequences, 644 units
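
The single furin hit above can be checked by hand. The sketch below reads the reported motif `RXXR` as arginine, any two residues, arginine, which is an assumption about the bioMotif notation, and scans residues 311 to 330 of the query. The function name `find_rxxr` is hypothetical.

```python
import re

# Residues 311-330 of NP_566299.1, copied from the sequence listing.
FRAGMENT = "FELSLKPERVIRESCSFLFI"
OFFSET = 311  # 1-based position of the first residue in FRAGMENT


def find_rxxr(seq, offset=1):
    """Return (start, end, text) for every R-x-x-R match, overlaps included."""
    hits = []
    for m in re.finditer(r"(?=(R..R))", seq):
        start = m.start() + offset
        hits.append((start, start + 3, m.group(1)))
    return hits


print(find_rxxr(FRAGMENT, OFFSET))  # [(319, 322, 'RVIR')]
```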

>EXTRACELL-M extended furin protease cleavage site motif  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>EXTRACELL-M  zinc binding motif in MMPs is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>EXTRACELL-M g alpha binding go loco is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS SV40 LrgT experimentally determined  is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS hGlu.cort.experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS  hDNAtopoII experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS  hDNAtopoII experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Nucleoplasmin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS Max experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>PDZ domain binding motif science 278_2075_pawson is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units

>WW domain binding motif science 278_2075_pawson is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 644 units


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with HMM search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm.lib
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm-f.lib
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB+ PSSM from other authors
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), 
"IMPALA: Matching a Protein Sequence Against a Collection of 
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= NP_566299.1 (NM_111594) expressed protein [Arabidopsis
thaliana] (gb|AAK92821.1|)
         (644 letters)

Searching..................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

AAA AAA+ ATPase Module                                             26  0.18
S1  S1 RNA binding domain                                          25  0.28
AP2  A plant specific DNA binding domain (Apetala 2 like)          24  0.43
INSL Insulinase like Metallo protease domain                       22  2.5
ARM Armadillo repeat                                               21  5.4
CALC Calcineurin like Phosphoesterase domain                       20  8.0
UBHYD  Ubiquitin C-terminal hydrolase domain                       20  9.4
CALMO Calmodulin like EF-hand domains                              20  9.4

>AAA AAA+ ATPase Module 
          Length = 298

 Score = 25.7 bits (55), Expect = 0.18
 Identities = 21/172 (12%), Positives = 21/172 (12%), Gaps = 13/172 (7%)

Query: 369 GSGNERGAIAILLKATESQEKLSGRDLTNGQCTIKANIFQIFPWYIKVYYHTLQIFVDQQ 428
                                                                       
Sbjct: 90  GTGKTLLARAVAHHTDCTFIRVSGSELVQKFIGEGARMVRELFVMAREHAPSI-IFMDEI 148

Query: 429 QKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAI------------SIEYDKGF 476
                                                                       
Sbjct: 149 DSIGSRLEGGSGGDSEVQRTMLELLNQLDGFEATKNIKVIMATNRIDILDSALLRPGRID 208

Query: 477 LHIDEYPPDANQGFDIPSALISFPDHHASLDFQEELSNSPLLSSLKEKSLVR 528
                                                               
Sbjct: 209 RKIEFPPPNEEARLDILKIHSRKMNLTRGINLRKIAELMPGASGAEVKGVCT 260


>S1  S1 RNA binding domain 
          Length = 305

 Score = 24.9 bits (54), Expect = 0.28
 Identities = 12/74 (16%), Positives = 12/74 (16%), Gaps = 11/74 (14%)

Query: 418 YHTLQI-FVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVA-----ISIE 471
                                                                       
Sbjct: 160 VLKAHILEANQDNNKLVLTQRRIQQAESMGKIAAGNIYE-----GKVAKIQPYGVFVEIE 214

Query: 472 YDKGFLHIDEYPPD 485
                         
Sbjct: 215 GVTGLLHVSQVSGT 228


>AP2  A plant specific DNA binding domain (Apetala 2 like) 
          Length = 218

 Score = 24.4 bits (52), Expect = 0.43
 Identities = 8/28 (28%), Positives = 8/28 (28%)

Query: 493 PSALISFPDHHASLDFQEELSNSPLLSS 520
                                       
Sbjct: 85  ASAILNFPDLAGSFPRPSSLSPRDIQVA 112


>INSL Insulinase like Metallo protease domain 
          Length = 433

 Score = 21.8 bits (46), Expect = 2.5
 Identities = 20/79 (25%), Positives = 20/79 (25%), Gaps = 11/79 (13%)

Query: 462 EVKSVAISIEYDKGFLHIDEYPPDANQ------GFDIPSALISFPDHHASLDFQEELSN- 514
                                                                       
Sbjct: 231 PVPKVQIPTEPEQIGIRFKKLKDPRIEKAYWIIGWRVPA--IGKTDYKGLLVFSEILCGG 288

Query: 515 --SPLLSSLKEKSLVRSYT 531
                              
Sbjct: 289 RISVFYRELREKGLVYSYS 307


>ARM Armadillo repeat 
          Length = 532

 Score = 20.6 bits (43), Expect = 5.4
 Identities = 4/34 (11%), Positives = 4/34 (11%)

Query: 566 NVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITA 599
                                             
Sbjct: 50  REGMQALQGFPSASAASVDKKLDSLKDMVAGVWS 83


>CALC Calcineurin like Phosphoesterase domain 
          Length = 274

 Score = 20.1 bits (41), Expect = 8.0
 Identities = 20/144 (13%), Positives = 20/144 (13%), Gaps = 15/144 (10%)

Query: 8   LILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFHFENRAPPSNSHGRHHHL 67
                                                                       
Sbjct: 123 ALLLDSQVYGVPHGQLSQHQLDLLKETLGKNPERYTLVVLHHHLLPTNSAWLDQHNLRN- 181

Query: 68  FPKAISQLVQKFRVKEM----ELSFTQGRWNHEHWGGFDPLSSMNAKPVGVEL------- 116
                                                                       
Sbjct: 182 SHELAEVLAPFTNVKAILYGHIHQEVNSEWNGYQVMA-TPATCIQFKPDCQYFSLDTLQP 240

Query: 117 -WAVFDV-PQSQVDTSWKNLTHAL 138
                                   
Sbjct: 241 GWREIELHSDGSIRTEVKRIQQAE 264


>UBHYD  Ubiquitin C-terminal hydrolase domain 
          Length = 884

 Score = 19.9 bits (41), Expect = 9.4
 Identities = 9/50 (18%), Positives = 9/50 (18%), Gaps = 11/50 (22%)

Query: 45  VLAHFHFENRAPPSNSHG-------RHHHLFPKAISQLVQKFRVKEMELS 87
                                                             
Sbjct: 773 TVAHFHKE----VFGTFGIPFLLRIHQGEHFREVMKRIQSLLDIQEKEFE 818


>CALMO Calmodulin like EF-hand domains 
          Length = 147

 Score = 20.0 bits (41), Expect = 9.4
 Identities = 11/83 (13%), Positives = 11/83 (13%)

Query: 428 QQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDAN 487
                                                                       
Sbjct: 42  LSPSEAEVNDLMNEIDVDGNHQIEFSEFLALMSRQLKSNDSEQELLEAFKVFDKNGDGLI 101

Query: 488 QGFDIPSALISFPDHHASLDFQE 510
                                  
Sbjct: 102 SAAELKHVLTSIGEKLTDAEVDD 124


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 8 
Number of calls to ALIGN: 8 
Length of query: 644 
Total length of test sequences: 20182  
Effective length of test sequences: 16335.0
Effective search space size: 9924239.9
Initial X dropoff for ALIGN: 25.0 bits

Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), 
"IMPALA: Matching a Protein Sequence Against a Collection of 
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= NP_566299.1 (NM_111594) expressed protein [Arabidopsis
thaliana] (gb|AAK92821.1|)
         (644 letters)

Searching.................................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|1127167 [1..256] Chorismate mutase II                           27  1.2
gi|2128579 [47..289] Sugar phosphatases                            26  2.5
gi|1902913 [26..315] Protein kinases (PK), catalytic core          25  3.8
gi|3656 [176..456] Cytochrome P450                                 24  7.4
gi|2808703 [93..336] P-loop containing nucleotide triphospha...    24  8.1
gi|1123091 [57..237] Protein kinases (PK), catalytic core          24  8.2
gi|2117285 [256..463] Cytochrome P450                              24  8.8
gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyl...    24  9.2

>gi|1127167 [1..256] Chorismate mutase II 
          Length = 256

 Score = 26.8 bits (58), Expect = 1.2
 Identities = 13/63 (20%), Positives = 13/63 (20%), Gaps = 2/63 (3%)

Query: 166 SDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGISALMNRPSVYRGFYHSQRLHLSTVES 225
                                                                       
Sbjct: 155 SRRIHFGKFVAEAKFQSDIPLYTKLIKSKDVEGIMKNITNSAVEEKI--LERLTKKAEVY 212

Query: 226 GQE 228
              
Sbjct: 213 GVD 215


>gi|2128579 [47..289] Sugar phosphatases 
          Length = 243

 Score = 25.8 bits (56), Expect = 2.5
 Identities = 9/67 (13%), Positives = 9/67 (13%), Gaps = 2/67 (2%)

Query: 226 GQEGLGSGIVLEQTLTVVLQP-ETTSVESNMQPSWSLS-SLFGRQVVGRCVLAKSSNVYL 283
                                                                       
Sbjct: 22  SEEIGLKVVGDELEYIFILDPIDGTYNALKSIPIYSTSIAVAKIKGEDKKLIRENINNID 81

Query: 284 QLEGLLG 290
                  
Sbjct: 82  WIKSFIA 88


>gi|1902913 [26..315] Protein kinases (PK), catalytic core 
          Length = 290

 Score = 25.1 bits (53), Expect = 3.8
 Identities = 13/169 (7%), Positives = 13/169 (7%), Gaps = 13/169 (7%)

Query: 327 FLFIFDIDKSSDSEPFDLGLTWKRPSKWSCQQAP----LHSSRFLMGSGNERGAIAILLK 382
                                                                       
Sbjct: 99  ELCKGQLVEFLRRVECKGPLSCDSILKIFYQTCRAVQHMHRQKPPIIHRDLKVENLLLSN 158

Query: 383 ATESQEKLSG--RDLTNGQCTIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKIN 440
                                                                       
Sbjct: 159 QGTIKLCDFGSATTISHYPDYSWSAQKRAMVEEEITRNTT----PMYRTPEIVDLYSNFP 214

Query: 441 VSPSTDKVSSG--MMEMML-ELPCEVKSVAISIEYDKGFLHIDEYPPDA 486
                                                            
Sbjct: 215 IGEKQDIWALGCILYLLCFRQHPFEDGAKLRIVNGKYSIPVNDTRYTVF 263


>gi|3656 [176..456] Cytochrome P450 
          Length = 281

 Score = 23.9 bits (51), Expect = 7.4
 Identities = 7/75 (9%), Positives = 7/75 (9%)

Query: 405 NIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVK 464
                                                                       
Sbjct: 40  PFFLTFPFLDVLPIPSRKKAFKDVVSFRELLVKRVQDELVNNYKFEQTTFAASDLIRAHN 99

Query: 465 SVAISIEYDKGFLHI 479
                          
Sbjct: 100 NEIIDYKQLTDNIVI 114


>gi|2808703 [93..336] P-loop containing nucleotide triphosphate hydrolases 
          Length = 244

 Score = 24.0 bits (51), Expect = 8.1
 Identities = 19/200 (9%), Positives = 19/200 (9%), Gaps = 20/200 (10%)

Query: 445 TDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFD-IPSALISFPDHH 503
                                                                       
Sbjct: 29  LDPQGNASTALGITDRQSGTPSSYEMLIGEVSLHTALRRSPHSERLFCIPATIDLAGAEI 88

Query: 504 ASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLV-PLTTPDFSMP------YNVITITCTI 556
                                                                       
Sbjct: 89  ELVSMVAREN---RLRTALAALDNFDFDYVFVDCPPSLGLLTINALVAAPEVMIPIQCEY 145

Query: 557 FALY----FGSLLNVLRRRIGEEERFLK---SQAGKKTGGLKQLLSRITAKIRGRPI--E 607
                                                                       
Sbjct: 146 YALEGVSQLMRNIEMVKAHLNPQLEVTTVILTMYDGRTKLADQVADEVRQYFGSKVLRTV 205

Query: 608 APSSSEAESSVLSSKLILKI 627
                               
Sbjct: 206 IPRSVKVSEAPGYSMTIIDY 225


>gi|1123091 [57..237] Protein kinases (PK), catalytic core 
          Length = 181

 Score = 23.9 bits (50), Expect = 8.2
 Identities = 5/51 (9%), Positives = 5/51 (9%), Gaps = 2/51 (3%)

Query: 411 PWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPC 461
                                                              
Sbjct: 73  ILYESPEMLKNREKNRVRRVDQDWMRQTQTRRQLGDVYAFGL--VMYEIIF 121


>gi|2117285 [256..463] Cytochrome P450 
          Length = 208

 Score = 23.8 bits (51), Expect = 8.8
 Identities = 6/49 (12%), Positives = 6/49 (12%), Gaps = 6/49 (12%)

Query: 304 QLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSE---PFDLGLTWK 349
                                                            
Sbjct: 139 DIHPEPT---TFKYDRFLNPNGSRKVDFYKAGQKIHHYTMPWGSGVSIC 184


>gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyltransferases 
          Length = 402

 Score = 23.6 bits (50), Expect = 9.2
 Identities = 14/113 (12%), Positives = 14/113 (12%), Gaps = 18/113 (15%)

Query: 366 FLMGSGNERGAIAILLKA---TESQEKLSGRDLTNG-QCTIKANIFQ---IFPWYIKVYY 418
                                                                       
Sbjct: 272 VLTGRNLKPGWIDYESNHSGLWMPKERAKELRDFYATPHLVVAHTKGTKVVAAWDERAYP 331

Query: 419 HTLQIFVDQQQKT--DSEVLKKINVSPSTDK---------VSSGMMEMMLELP 460
                                                                
Sbjct: 332 WREEFHLLPKEGVELDPLFLVEWLNSDKIQEYVKTLYRDFVPHLTLRMLERIP 384


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 8 
Number of calls to ALIGN: 8 
Length of query: 644 
Total length of test sequences: 256703  
Effective length of test sequences: 206078.0
Effective search space size: 123871885.5
Initial X dropoff for ALIGN: 25.0 bits

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

calculation of internal repeats with prospero
***** PROSPERO v1.3  Mon Feb 25 11:36:22 2002 *****

Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford
For help see http://www.well.ox.ac.uk/ariadne  For usage use -help
using gap penalty 11+1k
using matrix BLOSUM62
printing all alignments with eval < 0.100000
using sequence1 NP_566299.1
using self-comparison


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

TIGRFAM
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm-f
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model     Description                                   Score    E-value  N 
--------  -----------                                   -----    ------- ---
TIGR00570 cdk7: cdk-activating kinase assembly factor     1.0         22   1
TIGR01096 3A0103s03R: lysine-arginine-ornithine-bindi    -0.2         56   1
TIGR00893 2A0114: d-galactonate transporter              -0.3         47   1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
TIGR00570   1/1     134   146 ..   310   322 .]     1.0       22
TIGR00893   1/1     142   159 ..     1    19 [.    -0.3       47
TIGR01096   1/1     622   636 ..     1    17 [.    -0.2       56

Alignments of top-scoring domains:
TIGR00570: domain 1 of 1, from 134 to 146: score 1.0, E = 22
                   *->LQEAFsGLfyvps<-*
                      L+ A+sGLf+      
  NP_566299.   134    LTHALSGLFCASI    146  

TIGR00893: domain 1 of 1, from 142 to 159: score -0.3, E = 47
                   *->LvtvinYLDRanlSfAapt<-*
                       + +in+L  +++S+Aapt   
  NP_566299.   142    FCASINFLE-SSTSYAAPT    159  

TIGR01096: domain 1 of 1, from 622 to 636: score -0.2, E = 56
                   *->klvllaaLvaggdassa<-*
                      kl+l+ +Lvag  a++a   
  NP_566299.   622    KLILKIILVAG--AAAA    636  

//
SMART
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/iprscan/data/smart.HMMs
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
POLAc    DNA polymerase A domain                       -131.0         64   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
POLAc      1/1     489   644 .]     1   247 []  -131.0       64

Alignments of top-scoring domains:
POLAc: domain 1 of 1, from 489 to 644: score -131.0, E = 64
                   *->GreIRraFvAepGyrwvlvsADYSQIELRiLAHLSgDhFklHGgtAl
                      G  I +a +  p+++   +s D+ Q+EL      S+           
  NP_566299.   489    GFDIPSALISFPDHH---ASLDF-QEEL------SNS---------- 515  

                   GwenLieaFnnGr.....................DiHtkTAaaiFgVpee
                     + L   +++   ++ ++    + ++++ + +  + t T + iF     
  NP_566299.   516 --PLLSSLKEKSLvrsytevllvplttpdfsmpyNVITIT-CTIFA---- 558  

                   evTpelRraAKaiNFGiiYGmgqkFAfgLaeqlgpsIsraEAEElkelik
                                    +Y ++      L++ l+ +I ++E      l  
  NP_566299.   559 -----------------LYFGS------LLNVLRRRIGEEER----FLKS 581  

                   kYfarfPgtrvkryikrtkkveearrkGyvtTlfGRRryipdinqSrnpv
                   +  + + g   k+++ r+           +  + GR    p++  S+ ++
  NP_566299.   582 QAGKKTGG--LKQLLSRI-----------TAKIRGRPIEAPSS--SEAES 616  

                   lragIsaLenlknnaaaERaAvNapIQGsAADilKlAmikidkalkekgL
                                  +++              ilK+ +++   a   +  
  NP_566299.   617 S-------------VLSSK------------LILKIILVAGAAAAWQY-- 639  

                   raRllLqWVHDElvfEvpeee<-*
                         +           ++e   
  NP_566299.   640 -----FS-----------TDE    644  

//
COG
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
COG1238                                                 -93.6         94   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG1238    1/1     496   642 ..     1   175 []   -93.6       94

Alignments of top-scoring domains:
COG1238: domain 1 of 1, from 496 to 642: score -93.6, E = 94
                   *->MmkifgelyketlellihryayagLFlvsFleAtllPgpsEvflaam
                       ++ f +    ++ ++ + ++   L l+s  e +l    +Ev+l+++
  NP_566299.   496    LIS-FPD--HHASLDFQEELSNSPL-LSSLKEKSLVRSYTEVLLVPL 538  

                   slalgsFqlnalllalvAtl.GnvLGglvgYaLGrflpekvakklfgeGg
                    +  + F  + +++ +++t+ +  +G+l++++  r ++e ++ k ++   
  NP_566299.   539 -TTPD-FSMPYNVITITCTIfALYFGSLLNVLRRRIGEEERFLKSQA--- 583  

                   lekleKaeawlrrLVLEeyrGvwaLllaGflPipgdvfclaaGi.lrlpf
                        K +  l++        ++++ +       g ++   + ++++ ++
  NP_566299.   584 ----GKKTGGLKQ-----LLSRITAKIR------GRPIEAPSSSeAESSV 618  

                   lpfvlfillGrllRyllvaalavlgggrlk<-*
                   l++ l++ +       lva++a+ ++ +     
  NP_566299.   619 LSSKLILKI------ILVAGAAAAWQYFST    642  

//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm-f
Sequence file:            NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  NP_566299.1  (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
COG0842                                                   4.4        3.6   1
COG0837                                                   1.4         14   1
COG0109                                                   0.8         26   1
COG2801                                                  -0.3         50   1
COG0174                                                  -1.1         68   1
COG0441                                                  -1.8         87   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG2801    1/1      69    85 ..     1    17 [.    -0.3       50
COG0837    1/1     230   244 ..   368   382 .]     1.4       14
COG0109    1/1     230   259 ..     1    30 [.     0.8       26
COG0174    1/1     408   420 ..   493   505 .]    -1.1       68
COG0842    1/1     543   576 ..   332   365 .]     4.4      3.6
COG0441    1/1     589   609 ..   649   669 .]    -1.8       87

Alignments of top-scoring domains:
COG2801: domain 1 of 1, from 69 to 85: score -0.3, E = 50
                   *->dsaieelaqefgvklmc<-*
                      ++ai  l+q f+vk m+   
  NP_566299.    69    PKAISQLVQKFRVKEME    85   

COG0837: domain 1 of 1, from 230 to 244: score 1.4, E = 14
                   *->lGAAaalrqtlaheq<-*
                      lG+++ l+qtl +++   
  NP_566299.   230    LGSGIVLEQTLTVVL    244  

COG0109: domain 1 of 1, from 230 to 259: score 0.8, E = 26
                   *->lvdplvrksarssiaisesarvkasqqstl<-*
                      l+   v++++  +++ +e+++v++++q+ +   
  NP_566299.   230    LGSGIVLEQTLTVVLQPETTSVESNMQPSW    259  

COG0174: domain 1 of 1, from 408 to 420: score -1.1, E = 68
                   *->avhpwEferYlsl<-*
                      +++pw+++ Y+++   
  NP_566299.   408    QIFPWYIKVYYHT    420  

COG0842: domain 1 of 1, from 543 to 576: score 4.4, E = 3.6
                   *->lsdvwfsllvLallgllllllgllllrrrekkar<-*
                      +s  ++++ +   +++l+++  l +lrrr+++++   
  NP_566299.   543    FSMPYNVITITCTIFALYFGSLLNVLRRRIGEEE    576  

COG0441: domain 1 of 1, from 589 to 609: score -1.8, E = 87
                   *->sldefieklkkeienrrlkpl<-*
                      +l +++ ++ ++i++r+ ++    
  NP_566299.   589    GLKQLLSRITAKIRGRPIEAP    609  

//