analysis of sequence from NP_566299.1.fa ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ >NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI EAHQLWKNAE FELSLKPERV IRESCSFLFI FDIDKSSDSE PFDLGLTWKR PSKWSCQQAP LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC TIKANIFQIF PWYIKVYYHT LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP CEVKSVAISI EYDKGFLHID EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS LKEKSLVRSY TEVLLVPLTT PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK SQAGKKTGGL KQLLSRITAK IRGRPIEAPS SSEAESSVLS SKLILKIILV AGAAAAWQYF STDE ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ sec.str. with predator > NP_566299.1 . . . . . 1 MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFH 50 ___HHHHHHHHHHHHHHHHHHH____HHHHHHHHHHH_____________ . . . . . 51 FENRAPPSNSHGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGG 100 _________________HHHHHHHHHHHHHHHHHHHH_____________ . . . . . 101 FDPLSSMNAKPVGVELWAVFDVPQSQVDTSWKNLTHALSGLFCASINFLE 150 ___________EEEEEEEEEE_________HHHHHH_____EEEEEEE__ . . . . . 151 SSTSYAAPTWGFGPNSDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGIS 200 __________________________EEEEE_____EEEE__________ . . . . . 201 ALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTLTVVLQPETTS 250 ______EEEEE______EEEEEE__________EEEE____EEE______ . . . . . 251 VESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI 300 __________HHHHHHH____________________________HHHHH . . . . . 301 EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKR 350 HHHHHHHHHHHHH______HHHHH__EEEEE_____________EEEE__ . . . . . 351 PSKWSCQQAPLHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQC 400 _______________EEE_____HHHHHHHHHH_________________ . . . . . 
401 TIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSS 450 HHHHHHHH____EEEEE___EEE___________EEEEE___________ . . . . . 451 GMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFDIPSALISFP 500 HHHHHHH____EEEEEEEEEE_____EEEE______________EEEE__ . . . . . 501 DHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTTPDFSMPYNVI 550 ____HHHHHHH______HHHHHHHHHHHHHHEEEEE___________EEE . . . . . 551 TITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK 600 EHHHHHHHHH__HHHHHHHHHHHHHHHHHHH________HHHHHHHHHHH . . . . 601 IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE 644 HH___________HHHHHHHHHHHHHHHHHHHHHHHHHHH____ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ method : 1 alpha-contents : 37.5 % beta-contents : 25.3 % coil-contents : 37.2 % class : mixed method : 2 alpha-contents : 21.5 % beta-contents : 28.1 % coil-contents : 50.5 % class : mixed ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ GPI: learning from metazoa 0.33 -0.01 -0.03 0.00 0.00 0.00 -4.00 -0.07 0.00 -1.66 -4.30 0.00 -12.00 0.00 0.00 0.00 -21.75 5.22 -0.45 -0.64 -1.00 0.00 0.00 0.00 -2.77 -0.31 -3.16 -4.30 0.00 0.00 0.00 -12.00 0.00 -19.40 ID: NP_566299.1 AC: xxx Len: 644 1:I 616 Sc: -19.40 Pv: 4.089127e-02 NO_GPI_SITE GPI: learning from protozoa 0.71 0.00 0.00 -0.01 0.00 0.00 -12.00 0.00 0.00 -1.95 -11.99 0.00 -12.00 0.00 -12.00 0.00 -49.24 -11.77 0.00 -0.01 0.00 -4.00 0.00 0.00 -5.14 0.00 -2.09 -14.11 0.00 0.00 0.00 -12.00 0.00 -49.12 ID: NP_566299.1 AC: xxx Len: 644 1:I 615 Sc: -49.12 Pv: 1.349119e-01 NO_GPI_SITE ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ # SignalP euk predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? NP_566299.1 0.876 25 Y 0.834 25 Y 0.989 9 Y 0.913 Y # SignalP gram- predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? NP_566299.1 0.669 157 Y 0.458 157 Y 0.994 9 Y 0.265 N # SignalP gram+ predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? 
NP_566299.1 0.734 157 Y 0.597 23 Y 0.992 10 Y 0.950 Y ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ low complexity regions: SEG 12 2.2 2.5 >NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) 1-2 MA sllrslillli 3-13 14-509 VQSFLVAIAFGSKEVEEFSEALLLKPLPDR KVLAHFHFENRAPPSNSHGRHHHLFPKAIS QLVQKFRVKEMELSFTQGRWNHEHWGGFDP LSSMNAKPVGVELWAVFDVPQSQVDTSWKN LTHALSGLFCASINFLESSTSYAAPTWGFG PNSDKLRYGSLPREAVCTENLTPWLKLLPC RDKDGISALMNRPSVYRGFYHSQRLHLSTV ESGQEGLGSGIVLEQTLTVVLQPETTSVES NMQPSWSLSSLFGRQVVGRCVLAKSSNVYL QLEGLLGYESKNVDTEIEAHQLWKNAEFEL SLKPERVIRESCSFLFIFDIDKSSDSEPFD LGLTWKRPSKWSCQQAPLHSSRFLMGSGNE RGAIAILLKATESQEKLSGRDLTNGQCTIK ANIFQIFPWYIKVYYHTLQIFVDQQQKTDS EVLKKINVSPSTDKVSSGMMEMMLELPCEV KSVAISIEYDKGFLHIDEYPPDANQGFDIP SALISFPDHHASLDFQ eelsnspllsslkeksl 510-526 527-606 VRSYTEVLLVPLTTPDFSMPYNVITITCTI FALYFGSLLNVLRRRIGEEERFLKSQAGKK TGGLKQLLSRITAKIRGRPI eapssseaessvlssklilkiilv 607-630 631-644 AGAAAAWQYFSTDE low complexity regions: SEG 25 3.0 3.3 >NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) 1-1 M asllrslilllivqsflvaiafgskeveef 2-37 sealll 38-214 KPLPDRKVLAHFHFENRAPPSNSHGRHHHL FPKAISQLVQKFRVKEMELSFTQGRWNHEH WGGFDPLSSMNAKPVGVELWAVFDVPQSQV DTSWKNLTHALSGLFCASINFLESSTSYAA PTWGFGPNSDKLRYGSLPREAVCTENLTPW LKLLPCRDKDGISALMNRPSVYRGFYH sqrlhlstvesgqeglgsgivleqtltvvl 215-264 qpettsvesnmqpswslssl 265-493 FGRQVVGRCVLAKSSNVYLQLEGLLGYESK NVDTEIEAHQLWKNAEFELSLKPERVIRES CSFLFIFDIDKSSDSEPFDLGLTWKRPSKW SCQQAPLHSSRFLMGSGNERGAIAILLKAT ESQEKLSGRDLTNGQCTIKANIFQIFPWYI KVYYHTLQIFVDQQQKTDSEVLKKINVSPS TDKVSSGMMEMMLELPCEVKSVAISIEYDK GFLHIDEYPPDANQGFDIP salisfpdhhasldfqeelsnspllsslke 494-535 kslvrsytevll 536-592 VPLTTPDFSMPYNVITITCTIFALYFGSLL NVLRRRIGEEERFLKSQAGKKTGGLKQ llsritakirgrpieapssseaessvlssk 593-636 lilkiilvagaaaa 637-644 WQYFSTDE low complexity regions: SEG 45 3.4 3.75 >NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) 1-3 MAS llrslilllivqsflvaiafgskeveefse 4-83 
alllkplpdrkvlahfhfenrappsnshgr hhhlfpkaisqlvqkfrvke 84-214 MELSFTQGRWNHEHWGGFDPLSSMNAKPVG VELWAVFDVPQSQVDTSWKNLTHALSGLFC ASINFLESSTSYAAPTWGFGPNSDKLRYGS LPREAVCTENLTPWLKLLPCRDKDGISALM NRPSVYRGFYH sqrlhlstvesgqeglgsgivleqtltvvl 215-293 qpettsvesnmqpswslsslfgrqvvgrcv lakssnvylqlegllgyes 294-567 KNVDTEIEAHQLWKNAEFELSLKPERVIRE SCSFLFIFDIDKSSDSEPFDLGLTWKRPSK WSCQQAPLHSSRFLMGSGNERGAIAILLKA TESQEKLSGRDLTNGQCTIKANIFQIFPWY IKVYYHTLQIFVDQQQKTDSEVLKKINVSP STDKVSSGMMEMMLELPCEVKSVAISIEYD KGFLHIDEYPPDANQGFDIPSALISFPDHH ASLDFQEELSNSPLLSSLKEKSLVRSYTEV LLVPLTTPDFSMPYNVITITCTIFALYFGS LLNV lrrrigeeerflksqagkktgglkqllsri 568-636 takirgrpieapssseaessvlssklilki ilvagaaaa 637-644 WQYFSTDE low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFHFENRAPPSNS HGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGGFDPLSSMNAKPVGVELWAVF DVPQSQVDTSWKNLTHALSGLFCASINFLESSTSYAAPTWGFGPNSDKLRYGSLPREAVC TENLTPWLKLLPCRDKDGISALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTL TVVLQPETTSVESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI EAHQLWKNAEFELSLKPERVIRESCSflfifdidkssdsepFDLGLTWKRPSKWSCQQAP LHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQCTIKANIFQIFPWYIKVYYHT LQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHID EYPPDANQGFDIPSALISFPDHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTT PDFSMPYNVITITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGlkqllsritak irgrpieapssseaessvlssklilkiilVAGAAAAWQYFSTDE 1 - 326 MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI EAHQLWKNAE FELSLKPERV IRESCS 327 - 341 flfi fdidkssdse p 342 - 589 FDLGLTWKR PSKWSCQQAP 
LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC T IKANIFQIF PWYIKVYYHT LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP C EVKSVAISI EYDKGFLHID EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS L KEKSLVRSY TEVLLVPLTT PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK S QAGKKTGG 590 - 629 l kqllsritak irgrpieaps sseaessvls sklilkiil 630 - 644 V AGAAAAWQYF STDE low complexity regions: DUST >NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFHFENRAPPSNS HGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGGFDPLSSMNAKPVGVELWAVF DVPQSQVDTSWKNLTHALSGLFCASINFLESSTSYAAPTWGFGPNSDKLRYGSLPREAVC TENLTPWLKLLPCRDKDGISALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTL TVVLQPETTSVESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKRPSKWSCQQAP LHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQCTIKANIFQIFPWYIKVYYHT LQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHID EYPPDANQGFDIPSALISFPDHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTT PDFSMPYNVITITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for NP_566299.1 sequence: 644 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 60 MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
120 HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 180 DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 240 TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
300 TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~222222222 * 14 M'95 -w local . | . | . | . | . | . 360 EAHQLWKNAE FELSLKPERV IRESCSFLFI FDIDKSSDSE PFDLGLTWKR PSKWSCQQAP ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. 22222~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 420 LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC TIKANIFQIF PWYIKVYYHT ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
480 LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP CEVKSVAISI EYDKGFLHID ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 540 EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS LKEKSLVRSY TEVLLVPLTT ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 600 PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK SQAGKKTGGL KQLLSRITAK ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~1111111 1111111~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . 
| IRGRPIEAPS SSEAESSVLS SKLILKIILV AGAAAAWQYF STDE ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ---------- ---------- ---------- ---------- ---- ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** NP_566299.1.fa.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: NP_566299.1.fa.___inter___ (1 sequences) MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFH FENRAPPSNSHGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGG FDPLSSMNAKPVGVELWAVFDVPQSQVDTSWKNLTHALSGLFCASINFLE SSTSYAAPTWGFGPNSDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGIS ALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTLTVVLQPETTS VESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKR PSKWSCQQAPLHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQC TIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSS GMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFDIPSALISFP DHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTTPDFSMPYNVI TITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 2 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 6 26 1.786 Certain 2 133 153 0.921 Putative 3 548 568 1.815 Certain 4 622 642 1.055 Certain 
---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 3 4 Loop length 5 521 53 2 K+R profile 2.00 14.00 + 0.00 CYT-EXT prof - - 0.85 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 16.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 3.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -0.85 -> Orientation: N-in ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 1 2 3 4 Loop length 5 106 394 53 2 K+R profile 2.00 + 0.00 + 14.00 CYT-EXT prof - 0.75 - 0.93 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: -12.00 Tm probability: 0.80 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): 3.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -0.18 -> Orientation: N-in ---------------------------------------------------------------------- "NP_566299" 644 6 26 #t 1.78646 133 153 #f 0.920833 548 568 #t 1.81458 622 642 #t 1.05521 ************************************ *TOPPREDM with prokaryotic function* ************************************ NP_566299.1.fa.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: NP_566299.1.fa.___inter___ (1 sequences) MASLLRSLILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFH FENRAPPSNSHGRHHHLFPKAISQLVQKFRVKEMELSFTQGRWNHEHWGG FDPLSSMNAKPVGVELWAVFDVPQSQVDTSWKNLTHALSGLFCASINFLE SSTSYAAPTWGFGPNSDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGIS ALMNRPSVYRGFYHSQRLHLSTVESGQEGLGSGIVLEQTLTVVLQPETTS VESNMQPSWSLSSLFGRQVVGRCVLAKSSNVYLQLEGLLGYESKNVDTEI EAHQLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSEPFDLGLTWKR 
PSKWSCQQAPLHSSRFLMGSGNERGAIAILLKATESQEKLSGRDLTNGQC TIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSS GMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFDIPSALISFP DHHASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLVPLTTPDFSMPYNVI TITCTIFALYFGSLLNVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITAK IRGRPIEAPSSSEAESSVLSSKLILKIILVAGAAAAWQYFSTDE (p)rokaryotic or (e)ukaryotic: p Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 2 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 6 26 1.786 Certain 2 133 153 0.921 Putative 3 548 568 1.815 Certain 4 622 642 1.055 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 3 4 Loop length 5 521 53 2 K+R profile 2.00 14.00 + 0.00 CYT-EXT prof - - 0.85 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 16.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 3.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -0.85 -> Orientation: N-in ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 1 2 3 4 Loop length 5 106 394 53 2 K+R profile 2.00 + 0.00 + 14.00 CYT-EXT prof - 0.75 - 0.93 - For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: -12.00 Tm probability: 0.80 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): 3.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -0.18 -> Orientation: N-in ---------------------------------------------------------------------- "NP_566299" 644 6 26 #t 1.78646 133 153 #f 0.920833 548 568 #t 1.81458 622 642 #t 1.05521 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ SAPS. Version of April 11, 1996. Date run: Mon Feb 25 11:33:19 2002 File: /people/b_eisen/NP_566299.1.fa.___saps___ ID NP_566299.1 DE (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) number of residues: 644; molecular weight: 72.2 kdal 1 MASLLRSLIL LLIVQSFLVA IAFGSKEVEE FSEALLLKPL PDRKVLAHFH FENRAPPSNS 61 HGRHHHLFPK AISQLVQKFR VKEMELSFTQ GRWNHEHWGG FDPLSSMNAK PVGVELWAVF 121 DVPQSQVDTS WKNLTHALSG LFCASINFLE SSTSYAAPTW GFGPNSDKLR YGSLPREAVC 181 TENLTPWLKL LPCRDKDGIS ALMNRPSVYR GFYHSQRLHL STVESGQEGL GSGIVLEQTL 241 TVVLQPETTS VESNMQPSWS LSSLFGRQVV GRCVLAKSSN VYLQLEGLLG YESKNVDTEI 301 EAHQLWKNAE FELSLKPERV IRESCSFLFI FDIDKSSDSE PFDLGLTWKR PSKWSCQQAP 361 LHSSRFLMGS GNERGAIAIL LKATESQEKL SGRDLTNGQC TIKANIFQIF PWYIKVYYHT 421 LQIFVDQQQK TDSEVLKKIN VSPSTDKVSS GMMEMMLELP CEVKSVAISI EYDKGFLHID 481 EYPPDANQGF DIPSALISFP DHHASLDFQE ELSNSPLLSS LKEKSLVRSY TEVLLVPLTT 541 PDFSMPYNVI TITCTIFALY FGSLLNVLRR RIGEEERFLK SQAGKKTGGL KQLLSRITAK 601 IRGRPIEAPS SSEAESSVLS SKLILKIILV AGAAAAWQYF STDE -------------------------------------------------------------------------------- COMPOSITIONAL ANALYSIS (extremes relative to: swp23s) A : 37( 5.7%); C : 9( 1.4%); D : 24( 3.7%); E : 46( 7.1%); F : 33( 5.1%) G : 36( 5.6%); H : 17( 2.6%); I : 34( 5.3%); K : 37( 5.7%); L : 80(12.4%) M : 11( 1.7%); N : 21( 3.3%); P : 33( 5.1%); Q : 28( 4.3%); R : 28( 4.3%) S+ : 76(11.8%); T : 29( 4.5%); V : 38( 5.9%); W : 12( 1.9%); Y : 15( 2.3%) KR : 65 ( 10.1%); ED : 70 ( 10.9%); AGP : 106 ( 16.5%); KRED : 135 ( 21.0%); 
KR-ED   :   -5 ( -0.8%);   FIKMNY  :  151 ( 23.4%);
LVIFM   :  196 ( 30.4%);   ST      :  105 ( 16.3%).

--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

  1  00000+0000 0000000000 00000+-0-- 00-0000+00 0-++000000 0-0+000000
 61  00+000000+ 0000000+0+ 0+-0-00000 0+000-0000 0-0000000+ 0000-00000
121  -000000-00 0+00000000 000000000- 0000000000 000000-+0+ 00000+-000
181  0-000000+0 000+-+-000 0000+0000+ 000000+000 000-000-00 000000-000
241  000000-000 0-00000000 000000+000 0+0000+000 00000-0000 0-0+00-0-0
301  -00000+00- 0-000+0-+0 0+-0000000 0-0-+00-0- 00-00000++ 00+0000000
361  0000+00000 00-+000000 0+00-00-+0 00+-000000 00+0000000 0000+00000
421  00000-000+ 0-0-00++00 00000-+000 000-000-00 0-0+000000 -0-+00000-
481  -000-00000 -000000000 -00000-00- -000000000 0+-+000+00 0-00000000
541  0-00000000 0000000000 00000000++ +00---+00+ 0000++0000 +0000+000+
601  0+0+00-000 00-0-00000 0+000+0000 0000000000 00--

A. CHARGE CLUSTERS.

Positive charge clusters (cmin = 9/30 or 12/45 or 15/60):

 1) From 577 to 604:
    RFLKSQAGKKTGGLKQLLSRITAKIRGR
    +00+0000++0000+0000+000+0+0+
    quartile: 4; size: 28, +count: 9, -count: 0, 0count: 19; t-value: 3.87
    L: 4 (14.3%); G: 4 (14.3%); K: 5 (17.9%); R: 4 (14.3%); LVIFM: 7 (25.0%);

Negative charge clusters (cmin = 10/30 or 13/45 or 16/60): none

Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.

C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     5 |   5 |   7 |  40 |  10 |  10 |  13 |  11 |  12 |  15 |   7 |   9 |
lmin1     6 |   6 |   9 |  49 |  12 |  12 |  15 |  14 |  14 |  19 |   9 |  11 |
lmin2     7 |   7 |  10 |  54 |  13 |  13 |  17 |  16 |  16 |  21 |  10 |  13 |
(Significance level: 0.010000; Minimal displayed length: 6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  +  runs >=  3:   1, at 569;
  -  runs >=  3:   1, at 574;
  *  runs >=  5:   0
  0  runs >= 27:   0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.

There are no high scoring hydrophobic segments.
____________________________________
High scoring transmembrane segments:
  5.00 (LVIF)   2.00 (AGM)   0.00 (BZX)  -1.00 (YCW)  -2.00 (ST)
 -6.00 (P)     -8.00 (H)   -10.00 (NQ)  -16.00 (KR)  -17.00 (ED)

Expected score/letter: -3.427
M_0.01= 65.88; M_0.05= 54.48; M_0.30= 40.92

 1) From 8 to 24: length= 17, score=54.00
      8  LILLLIVQSF LVAIAFG
    L: 5(29.4%); A: 2(11.8%); V: 2(11.8%); I: 3(17.6%); F: 2(11.8%);

2. SPACINGS OF C.

H2N-142-C-36-C-12-C-79-C-51-C-30-C-43-C-60-C-92-C-90-COOH

2*. SPACINGS OF C and H. (additional deluxe function for ALEX)

H2N-47-H-1-H-10-H-2-H-H-H-28-H-1-H-38-H-6-C-36-C-12-C-20-H-4-H-53-C-29-H-21-C-30-C-5-H-37-C-18-H-41-C-16-H-23-H-H-50-C-90-COOH

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 5

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length: 9

--------------------------------------------------------------------------------
MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets: 45 (Expected range: 20-- 58)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 18   (6-10) 11   (11-20) 9   (>=21) 8

3. Clusters of amino acid multiplets (cmin = 12/30 or 15/45 or 18/60): none

B. CHARGE ALPHABET.

1. Total number of charge multiplets: 10 (Expected range: 2-- 24)
   5 +plets (f+: 10.1%), 5 -plets (f-: 10.9%)
   Total number of charge altplets: 16 (Critical number: 27)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 2   (6-10) 1   (11-20) 1   (>=21) 7

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core: 4; !-core: 5)

Location     Period  Element      Copies  Core  Errors
 633- 636       1    A               4      4     0

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9)

Location     Period  Element      Copies  Core  Errors
   1-  24       4    i...            6      6     0
   1-  54       9    i.....0..       6      6     /0/./././././2/././
   8-  14       1    i               7      7     0
 471- 520      10    -00000000.      5      5     /0/0/1/1/1/0/1/0/0/./
 490- 543       9    i..000...       6      6     /0/././2/2/1/./././

--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location (Quartile)  Spacing   Rank      P-value  Interpretation
  53-  59  (1.)      N( 6)N    22 of 22  0.0081   large minimal spacing

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Pfam (from /data/patterns/pfam)

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/Pfam Sequence file: NP_566299.1.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- PDGF Platelet-derived growth factor (PDGF) 0.5 61 1 Paramyx_ncap Paramyxovirus nucleocapsid protein 0.2 16 1 MIP Major intrinsic protein -0.2 37 1 PsbN Photosystem II reaction centre N protein -13.4 96 1 NTR NTR/C345C module -34.1 95 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- Paramyx_ncap 1/1 1 10 [. 1 10 [. 0.2 16 PDGF 1/1 119 130 .. 7 19 .. 0.5 61 PsbN 1/1 134 178 .. 1 44 [] -13.4 96 NTR 1/1 303 403 .. 1 123 [] -34.1 95 MIP 1/1 629 639 .. 258 268 .] -0.2 37 Alignments of top-scoring domains: Paramyx_ncap: domain 1 of 1, from 1 to 10: score 0.2, E = 16 *->mAsLLksLaL<-* mAsLL+sL L NP_566299. 1 MASLLRSLIL 10 PDGF: domain 1 of 1, from 119 to 130: score 0.5, E = 61 *->lveIfreyvDrTe<-* ++++++++vD T+ NP_566299. 119 VFDVPQSQVD-TS 130 PsbN: domain 1 of 1, from 134 to 178: score -13.4, E = 96 *->MEtiAtvltIFlas..LLlsiTgYSiYt.sFGPpSkeLrDPFEEHEd + ++ +F as ++L s T+Y+ t +FGP+S +Lr E NP_566299. 134 LTH--ALSGLFCASinFLESSTSYAAPTwGFGPNSDKLRYGSLPREA 178 <-* NP_566299. - - NTR: domain 1 of 1, from 303 to 403: score -34.1, E = 95 *->lkkaCkpdRvayvykVkvldeeeedwfdvdkRqEiiytvtileViKs ++ +++ + +k ++ + e+ +f ++++i Ks NP_566299. 303 HQLWKNAE-FELSLKP-ERVIRESCSF--------LFIFDID---KS 336 GsgddergpgslrtfisdisCrcplilvkgkdYLiMGqsstwdekgglqy + +++ g ++ s+ sC+++++ + +MG+ + e+g + + NP_566299. 337 S-DSEPFDLGLTWKRPSKWSCQQAPLHSSRF---LMGSGN---ERGAIAI 379 ilgsdvitWiEeWprelkcqqrrlqk<-* +l + +E+++ ++ + + k NP_566299. 
380 LLKAT--ESQEKLSGRDLTNGQCTIK 403 MIP: domain 1 of 1, from 629 to 639: score -0.2, E = 37 *->liGAalaalvY<-* l+++a+aa+ Y NP_566299. 629 LVAGAAAAWQY 639 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: NP_566299.1.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- KIX KIX domain 3.6 15 1 PsbN Photosystem II reaction centre N protei 1.6 93 1 Glucokinase Glucokinase 1.0 19 1 CDI Cyclin-dependent kinase inhibitor 0.7 80 1 PDGF Platelet-derived growth factor (PDGF) 0.5 61 1 complex1_24kD Respiratory-chain NADH dehydrogenase 24 0.3 69 1 Paramyx_ncap Paramyxovirus nucleocapsid protein 0.2 16 1 HypA Hydrogenase expression/synthesis hypA f 0.1 56 1 MIP Major intrinsic protein -0.2 37 1 DUF212 Uncharacterized BCR, COG1963 -0.2 79 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- Paramyx_ncap 1/1 1 10 [. 1 10 [. 0.2 16 complex1_24kD 1/1 64 80 .. 145 161 .] 0.3 69 HypA 1/1 79 87 .. 113 121 .] 0.1 56 PDGF 1/1 119 130 .. 7 19 .. 0.5 61 PsbN 1/1 161 170 .. 27 36 .. 1.6 93 Glucokinase 1/1 230 239 .. 344 353 .] 1.0 19 CDI 1/1 237 258 .. 85 108 .] 0.7 80 KIX 1/1 417 434 .. 64 81 .] 3.6 15 DUF212 1/1 509 521 .. 1 13 [. -0.2 79 MIP 1/1 629 639 .. 258 268 .] -0.2 37 Alignments of top-scoring domains: Paramyx_ncap: domain 1 of 1, from 1 to 10: score 0.2, E = 16 *->mAsLLksLaL<-* mAsLL+sL L NP_566299. 
1 MASLLRSLIL 10 complex1_24kD: domain 1 of 1, from 64 to 80: score 0.3, E = 69 *->yEdLTpekieeLLdrlk<-* + +L p++i +L +++ NP_566299. 64 HHHLFPKAISQLVQKFR 80 HypA: domain 1 of 1, from 79 to 87: score 0.1, E = 56 *->LrIkslEVe<-* +r+k++E++ NP_566299. 79 FRVKEMELS 87 PDGF: domain 1 of 1, from 119 to 130: score 0.5, E = 61 *->lveIfreyvDrTe<-* ++++++++vD T+ NP_566299. 119 VFDVPQSQVD-TS 130 PsbN: domain 1 of 1, from 161 to 170: score 1.6, E = 93 *->sFGPpSkeLr<-* +FGP+S +Lr NP_566299. 161 GFGPNSDKLR 170 Glucokinase: domain 1 of 1, from 230 to 239: score 1.0, E = 19 *->lGAgvaleqs<-* lG+g+ leq+ NP_566299. 230 LGSGIVLEQT 239 CDI: domain 1 of 1, from 237 to 258: score 0.7, E = 80 *->pstslvllqpseaePaeEskedls<-* ++t+ v+lqp +++Es+ +s NP_566299. 237 EQTLTVVLQP--ETTSVESNMQPS 258 KIX: domain 1 of 1, from 417 to 434: score 3.6, E = 15 *->YYhLlaekiykiqKeLqe<-* YYh l++ +++ qK+ e NP_566299. 417 YYHTLQIFVDQQQKTDSE 434 DUF212: domain 1 of 1, from 509 to 521: score -0.2, E = 79 *->rAlltNevlLSsL<-* + l+N +lLSsL NP_566299. 509 QEELSNSPLLSSL 521 MIP: domain 1 of 1, from 629 to 639: score -0.2, E = 37 *->liGAalaalvY<-* l+++a+aa+ Y NP_566299. 629 LVAGAAAAWQY 639 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description    Score    E-value  N
-------- -----------    -----    ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Prosite
---------------------------------------------------------
| ppsearch (c) 1994 EMBL Data Library |
| based on MacPattern (c) 1990-1994 R. Fuchs |
---------------------------------------------------------
PROSITE pattern search started: Mon Feb 25 11:35:38 2002
Sequence file: NP_566299.1.fa
----------------------------------------
Sequence NP_566299.1 (644 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
  133: NLTH
Total matches: 1

Matching pattern PS00004 CAMP_PHOSPHO_SITE:
  349: KRPS
Total matches: 1

Matching pattern PS00005 PKC_PHOSPHO_SITE:
  130: SWK
  166: SDK
  215: SQR
  314: SLK
  347: TWK
  363: SSR
  391: SGR
  401: TIK
  445: TDK
  520: SLK
  598: TAK
  620: SSK
Total matches: 12

Matching pattern PS00006 CK2_PHOSPHO_SITE:
  125: SQVD
  221: STVE
  225: SGQE
  249: TSVE
  298: TEIE
  337: SDSE
  370: SGNE
  391: SGRD
  431: TDSE
  498: SFPD
  520: SLKE
  529: SYTE
  539: TTPD
  610: SSSE
  612: SEAE
  641: STDE
Total matches: 16

Matching pattern PS00008 MYRISTYL:
  140: GLFCAS
  229: GLGSGI
  371: GNERGA
  562: GSLLNV
  632: GAAAAW
Total matches: 5

Matching pattern PS00009 AMIDATION:
  583: AGKK
Total matches: 1

Total no of hits in this sequence: 36
========================================
1314 pattern(s) searched in 1
sequence(s), 644 residues. Total no of hits in all sequences: 36. Search time: 00:00 min ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Profile Search ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with motif search against own library ***** bioMotif : Version V41a DB, 1999 Nov 11 ***** SeqTyp=2 : PROTEIN search; >APC D-Box is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >ER-GOLGI-traffic signal is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M minimal SH3 binding is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >TM-CYTOPLASMIC-M 
di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >EXTRACELL-M minimal furin protease cleavage site motif is the MOTIF name >NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) ;LENGTH=644; DIRECT_SEQUENCE n 1 solutions m %_RXXR 319-322 f >STATISTICS Total : 1 solutions in 1 sequences, 644 units; out of 1 sequences, 644 units >EXTRACELL-M extended furin protease cleavage site motif is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >EXTRACELL-M zinc binding motif in MMPs is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >EXTRACELL-M g alpha binding go loco is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS SV40 LrgT experimentally determined is the MOTIF name >STATISTICS 
Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS act/inh betaA experimentally determined NLS is the 
MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS SOX9 experimentally determined NLS 
is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hGlu.cort.experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR 
NLS hDNAtopoII experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; 
out of 1 sequences, 644 units >NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Nucloplasmin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 
0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF 
name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS Max experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >PDZ domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units >WW domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 644 units ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~ ~~~ Start with HMM-search search against own library hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright 
(C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm.lib
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description    Score    E-value  N
-------- -----------    -----    ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description    Score    E-value  N
-------- -----------    -----    ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
L. Aravind's signalling DB+ PSSM from other authors
IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F.
Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|) (644 letters) Searching..................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value AAA AAA+ ATPase Module 26 0.18 S1 S1 RNA binding domain 25 0.28 AP2 A plant specific DNA binding domain (Apetala 2 like) 24 0.43 INSL Insulinase like Metallo protease domain 22 2.5 ARM Armadillo repeat 21 5.4 CALC Calcineurin like Phosphoesterase domain 20 8.0 UBHYD Ubiquitin C-terminal hydrolase domain 20 9.4 CALMO Calmodulin like EF-hand domains 20 9.4 >AAA AAA+ ATPase Module Length = 298 Score = 25.7 bits (55), Expect = 0.18 Identities = 21/172 (12%), Positives = 21/172 (12%), Gaps = 13/172 (7%) Query: 369 GSGNERGAIAILLKATESQEKLSGRDLTNGQCTIKANIFQIFPWYIKVYYHTLQIFVDQQ 428 Sbjct: 90 GTGKTLLARAVAHHTDCTFIRVSGSELVQKFIGEGARMVRELFVMAREHAPSI-IFMDEI 148 Query: 429 QKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAI------------SIEYDKGF 476 Sbjct: 149 DSIGSRLEGGSGGDSEVQRTMLELLNQLDGFEATKNIKVIMATNRIDILDSALLRPGRID 208 Query: 477 LHIDEYPPDANQGFDIPSALISFPDHHASLDFQEELSNSPLLSSLKEKSLVR 528 Sbjct: 209 RKIEFPPPNEEARLDILKIHSRKMNLTRGINLRKIAELMPGASGAEVKGVCT 260 >S1 S1 RNA binding domain Length = 305 Score = 24.9 bits (54), Expect = 0.28 Identities = 12/74 (16%), Positives = 12/74 (16%), Gaps = 11/74 (14%) Query: 418 YHTLQI-FVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVA-----ISIE 471 Sbjct: 160 VLKAHILEANQDNNKLVLTQRRIQQAESMGKIAAGNIYE-----GKVAKIQPYGVFVEIE 214 Query: 472 YDKGFLHIDEYPPD 485 Sbjct: 215 GVTGLLHVSQVSGT 228 >AP2 A plant specific DNA binding domain (Apetala 2 like) Length = 218 Score = 24.4 bits (52), Expect = 0.43 Identities = 8/28 (28%), Positives = 8/28 (28%) Query: 493 PSALISFPDHHASLDFQEELSNSPLLSS 520 Sbjct: 85 ASAILNFPDLAGSFPRPSSLSPRDIQVA 112 >INSL Insulinase like Metallo protease 
domain Length = 433 Score = 21.8 bits (46), Expect = 2.5 Identities = 20/79 (25%), Positives = 20/79 (25%), Gaps = 11/79 (13%) Query: 462 EVKSVAISIEYDKGFLHIDEYPPDANQ------GFDIPSALISFPDHHASLDFQEELSN- 514 Sbjct: 231 PVPKVQIPTEPEQIGIRFKKLKDPRIEKAYWIIGWRVPA--IGKTDYKGLLVFSEILCGG 288 Query: 515 --SPLLSSLKEKSLVRSYT 531 Sbjct: 289 RISVFYRELREKGLVYSYS 307 >ARM Armadillo repeat Length = 532 Score = 20.6 bits (43), Expect = 5.4 Identities = 4/34 (11%), Positives = 4/34 (11%) Query: 566 NVLRRRIGEEERFLKSQAGKKTGGLKQLLSRITA 599 Sbjct: 50 REGMQALQGFPSASAASVDKKLDSLKDMVAGVWS 83 >CALC Calcineurin like Phosphoesterase domain Length = 274 Score = 20.1 bits (41), Expect = 8.0 Identities = 20/144 (13%), Positives = 20/144 (13%), Gaps = 15/144 (10%) Query: 8 LILLLIVQSFLVAIAFGSKEVEEFSEALLLKPLPDRKVLAHFHFENRAPPSNSHGRHHHL 67 Sbjct: 123 ALLLDSQVYGVPHGQLSQHQLDLLKETLGKNPERYTLVVLHHHLLPTNSAWLDQHNLRN- 181 Query: 68 FPKAISQLVQKFRVKEM----ELSFTQGRWNHEHWGGFDPLSSMNAKPVGVEL------- 116 Sbjct: 182 SHELAEVLAPFTNVKAILYGHIHQEVNSEWNGYQVMA-TPATCIQFKPDCQYFSLDTLQP 240 Query: 117 -WAVFDV-PQSQVDTSWKNLTHAL 138 Sbjct: 241 GWREIELHSDGSIRTEVKRIQQAE 264 >UBHYD Ubiquitin C-terminal hydrolase domain Length = 884 Score = 19.9 bits (41), Expect = 9.4 Identities = 9/50 (18%), Positives = 9/50 (18%), Gaps = 11/50 (22%) Query: 45 VLAHFHFENRAPPSNSHG-------RHHHLFPKAISQLVQKFRVKEMELS 87 Sbjct: 773 TVAHFHKE----VFGTFGIPFLLRIHQGEHFREVMKRIQSLLDIQEKEFE 818 >CALMO Calmodulin like EF-hand domains Length = 147 Score = 20.0 bits (41), Expect = 9.4 Identities = 11/83 (13%), Positives = 11/83 (13%) Query: 428 QQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDAN 487 Sbjct: 42 LSPSEAEVNDLMNEIDVDGNHQIEFSEFLALMSRQLKSNDSEQELLEAFKVFDKNGDGLI 101 Query: 488 QGFDIPSALISFPDHHASLDFQE 510 Sbjct: 102 SAAELKHVLTSIGEKLTDAEVDD 124 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 8 Number of calls to ALIGN: 8 Length of query: 644 Total length of test sequences: 20182 Effective length of 
test sequences: 16335.0
Effective search space size: 9924239.9
Initial X dropoff for ALIGN: 25.0 bits

Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011.

Query= NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)
       (644 letters)

Searching.................................................done

Results from profile search

                                                                    Score    E
Sequences producing significant alignments:                         (bits) Value

gi|1127167 [1..256] Chorismate mutase II                               27   1.2
gi|2128579 [47..289] Sugar phosphatases                                26   2.5
gi|1902913 [26..315] Protein kinases (PK), catalytic core              25   3.8
gi|3656 [176..456] Cytochrome P450                                     24   7.4
gi|2808703 [93..336] P-loop containing nucleotide triphospha...        24   8.1
gi|1123091 [57..237] Protein kinases (PK), catalytic core              24   8.2
gi|2117285 [256..463] Cytochrome P450                                  24   8.8
gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyl...
24 9.2 >gi|1127167 [1..256] Chorismate mutase II Length = 256 Score = 26.8 bits (58), Expect = 1.2 Identities = 13/63 (20%), Positives = 13/63 (20%), Gaps = 2/63 (3%) Query: 166 SDKLRYGSLPREAVCTENLTPWLKLLPCRDKDGISALMNRPSVYRGFYHSQRLHLSTVES 225 Sbjct: 155 SRRIHFGKFVAEAKFQSDIPLYTKLIKSKDVEGIMKNITNSAVEEKI--LERLTKKAEVY 212 Query: 226 GQE 228 Sbjct: 213 GVD 215 >gi|2128579 [47..289] Sugar phosphatases Length = 243 Score = 25.8 bits (56), Expect = 2.5 Identities = 9/67 (13%), Positives = 9/67 (13%), Gaps = 2/67 (2%) Query: 226 GQEGLGSGIVLEQTLTVVLQP-ETTSVESNMQPSWSLS-SLFGRQVVGRCVLAKSSNVYL 283 Sbjct: 22 SEEIGLKVVGDELEYIFILDPIDGTYNALKSIPIYSTSIAVAKIKGEDKKLIRENINNID 81 Query: 284 QLEGLLG 290 Sbjct: 82 WIKSFIA 88 >gi|1902913 [26..315] Protein kinases (PK), catalytic core Length = 290 Score = 25.1 bits (53), Expect = 3.8 Identities = 13/169 (7%), Positives = 13/169 (7%), Gaps = 13/169 (7%) Query: 327 FLFIFDIDKSSDSEPFDLGLTWKRPSKWSCQQAP----LHSSRFLMGSGNERGAIAILLK 382 Sbjct: 99 ELCKGQLVEFLRRVECKGPLSCDSILKIFYQTCRAVQHMHRQKPPIIHRDLKVENLLLSN 158 Query: 383 ATESQEKLSG--RDLTNGQCTIKANIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKIN 440 Sbjct: 159 QGTIKLCDFGSATTISHYPDYSWSAQKRAMVEEEITRNTT----PMYRTPEIVDLYSNFP 214 Query: 441 VSPSTDKVSSG--MMEMML-ELPCEVKSVAISIEYDKGFLHIDEYPPDA 486 Sbjct: 215 IGEKQDIWALGCILYLLCFRQHPFEDGAKLRIVNGKYSIPVNDTRYTVF 263 >gi|3656 [176..456] Cytochrome P450 Length = 281 Score = 23.9 bits (51), Expect = 7.4 Identities = 7/75 (9%), Positives = 7/75 (9%) Query: 405 NIFQIFPWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPCEVK 464 Sbjct: 40 PFFLTFPFLDVLPIPSRKKAFKDVVSFRELLVKRVQDELVNNYKFEQTTFAASDLIRAHN 99 Query: 465 SVAISIEYDKGFLHI 479 Sbjct: 100 NEIIDYKQLTDNIVI 114 >gi|2808703 [93..336] P-loop containing nucleotide triphosphate hydrolases Length = 244 Score = 24.0 bits (51), Expect = 8.1 Identities = 19/200 (9%), Positives = 19/200 (9%), Gaps = 20/200 (10%) Query: 445 TDKVSSGMMEMMLELPCEVKSVAISIEYDKGFLHIDEYPPDANQGFD-IPSALISFPDHH 503 Sbjct: 29 
LDPQGNASTALGITDRQSGTPSSYEMLIGEVSLHTALRRSPHSERLFCIPATIDLAGAEI 88 Query: 504 ASLDFQEELSNSPLLSSLKEKSLVRSYTEVLLV-PLTTPDFSMP------YNVITITCTI 556 Sbjct: 89 ELVSMVAREN---RLRTALAALDNFDFDYVFVDCPPSLGLLTINALVAAPEVMIPIQCEY 145 Query: 557 FALY----FGSLLNVLRRRIGEEERFLK---SQAGKKTGGLKQLLSRITAKIRGRPI--E 607 Sbjct: 146 YALEGVSQLMRNIEMVKAHLNPQLEVTTVILTMYDGRTKLADQVADEVRQYFGSKVLRTV 205 Query: 608 APSSSEAESSVLSSKLILKI 627 Sbjct: 206 IPRSVKVSEAPGYSMTIIDY 225 >gi|1123091 [57..237] Protein kinases (PK), catalytic core Length = 181 Score = 23.9 bits (50), Expect = 8.2 Identities = 5/51 (9%), Positives = 5/51 (9%), Gaps = 2/51 (3%) Query: 411 PWYIKVYYHTLQIFVDQQQKTDSEVLKKINVSPSTDKVSSGMMEMMLELPC 461 Sbjct: 73 ILYESPEMLKNREKNRVRRVDQDWMRQTQTRRQLGDVYAFGL--VMYEIIF 121 >gi|2117285 [256..463] Cytochrome P450 Length = 208 Score = 23.8 bits (51), Expect = 8.8 Identities = 6/49 (12%), Positives = 6/49 (12%), Gaps = 6/49 (12%) Query: 304 QLWKNAEFELSLKPERVIRESCSFLFIFDIDKSSDSE---PFDLGLTWK 349 Sbjct: 139 DIHPEPT---TFKYDRFLNPNGSRKVDFYKAGQKIHHYTMPWGSGVSIC 184 >gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyltransferases Length = 402 Score = 23.6 bits (50), Expect = 9.2 Identities = 14/113 (12%), Positives = 14/113 (12%), Gaps = 18/113 (15%) Query: 366 FLMGSGNERGAIAILLKA---TESQEKLSGRDLTNG-QCTIKANIFQ---IFPWYIKVYY 418 Sbjct: 272 VLTGRNLKPGWIDYESNHSGLWMPKERAKELRDFYATPHLVVAHTKGTKVVAAWDERAYP 331 Query: 419 HTLQIFVDQQQKT--DSEVLKKINVSPSTDK---------VSSGMMEMMLELP 460 Sbjct: 332 WREEFHLLPKEGVELDPLFLVEWLNSDKIQEYVKTLYRDFVPHLTLRMLERIP 384 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 1187 Number of sequences better than 10.0: 8 Number of calls to ALIGN: 8 Length of query: 644 Total length of test sequences: 256703 Effective length of test sequences: 206078.0 Effective search space size: 123871885.5 Initial X dropoff for ALIGN: 25.0 bits ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ calculation of internal repeats with prospero ***** PROSPERO v1.3 Mon Feb 
25 11:36:22 2002 *****
Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford
For help see http://www.well.ox.ac.uk/ariadne
For usage use -help
using gap penalty 11+1k
using matrix BLOSUM62
printing all alignments with eval < 0.100000
using sequence1 NP_566299.1
using self-comparison
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
TIGRFAM
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/tigrfam/tigrfam.hmm
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model    Description    Score    E-value  N
-------- -----------    -----    ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/tigrfam/tigrfam.hmm-f
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model     Description                                   Score  E-value    N
--------  -----------                                   -----  -------  ---
TIGR00570 cdk7: cdk-activating kinase assembly factor     1.0       22    1
TIGR01096 3A0103s03R: lysine-arginine-ornithine-bindi    -0.2       56    1
TIGR00893 2A0114: d-galactonate transporter              -0.3       47    1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t     score E-value
--------  ------- ----- -----    ----- -----     ----- -------
TIGR00570 1/1       134   146 ..   310   322 .]    1.0      22
TIGR00893 1/1       142   159 ..     1    19 [.   -0.3      47
TIGR01096 1/1       622   636 ..     1    17 [.   -0.2      56

Alignments of top-scoring domains:
TIGR00570: domain 1 of 1, from 134 to 146: score 1.0, E = 22
                 *->LQEAFsGLfyvps<-*
                    L+ A+sGLf+
NP_566299.   134    LTHALSGLFCASI    146

TIGR00893: domain 1 of 1, from 142 to 159: score -0.3, E = 47
                 *->LvtvinYLDRanlSfAapt<-*
                    + +in+L +++S+Aapt
NP_566299.   142    FCASINFLE-SSTSYAAPT    159

TIGR01096: domain 1 of 1, from 622 to 636: score -0.2, E = 56
                 *->klvllaaLvaggdassa<-*
                    kl+l+ +Lvag a++a
NP_566299.   622    KLILKIILVAG--AAAA    636
//
SMART
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/iprscan/data/smart.HMMs
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model     Description              Score  E-value    N
--------  -----------              -----  -------  ---
POLAc     DNA polymerase A domain -131.0       64    1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t     score E-value
--------  ------- ----- -----    ----- -----     ----- -------
POLAc     1/1       489   644 .]     1   247 [] -131.0      64

Alignments of top-scoring domains:
POLAc: domain 1 of 1, from 489 to 644: score -131.0, E = 64
                 *->GreIRraFvAepGyrwvlvsADYSQIELRiLAHLSgDhFklHGgtAl
                    G I +a + p+++ +s D+ Q+EL S+
NP_566299.   489    GFDIPSALISFPDHH---ASLDF-QEEL------SNS----------    515

                    GwenLieaFnnGr.....................DiHtkTAaaiFgVpee
                    + L +++ ++ ++ + ++++ + + + t T + iF
NP_566299.   516    --PLLSSLKEKSLvrsytevllvplttpdfsmpyNVITIT-CTIFA----    558

                    evTpelRraAKaiNFGiiYGmgqkFAfgLaeqlgpsIsraEAEElkelik
                    +Y ++ L++ l+ +I ++E l
NP_566299.   559    -----------------LYFGS------LLNVLRRRIGEEER----FLKS    581

                    kYfarfPgtrvkryikrtkkveearrkGyvtTlfGRRryipdinqSrnpv
                    + + + + g k+++ r+ + + GR p++ S+ ++
NP_566299.   582    QAGKKTGG--LKQLLSRI-----------TAKIRGRPIEAPSS--SEAES    616

                    lragIsaLenlknnaaaERaAvNapIQGsAADilKlAmikidkalkekgL
                    +++ ilK+ +++ a +
NP_566299.   617    S-------------VLSSK------------LILKIILVAGAAAAWQY--    639

                    raRllLqWVHDElvfEvpeee<-*
                    + ++e
NP_566299.   640    -----FS-----------TDE    644
//
COG
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/cogs/cogs.hmm
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model     Description  Score  E-value    N
--------  -----------  -----  -------  ---
COG1238                -93.6       94    1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t     score E-value
--------  ------- ----- -----    ----- -----     ----- -------
COG1238   1/1       496   642 ..     1   175 []  -93.6      94

Alignments of top-scoring domains:
COG1238: domain 1 of 1, from 496 to 642: score -93.6, E = 94
                 *->MmkifgelyketlellihryayagLFlvsFleAtllPgpsEvflaam
                    ++ f + ++ ++ + ++ L l+s e +l +Ev+l+++
NP_566299.   496    LIS-FPD--HHASLDFQEELSNSPL-LSSLKEKSLVRSYTEVLLVPL    538

                    slalgsFqlnalllalvAtl.GnvLGglvgYaLGrflpekvakklfgeGg
                    + + F + +++ +++t+ + +G+l++++ r ++e ++ k ++
NP_566299.   539    -TTPD-FSMPYNVITITCTIfALYFGSLLNVLRRRIGEEERFLKSQA---    583

                    lekleKaeawlrrLVLEeyrGvwaLllaGflPipgdvfclaaGi.lrlpf
                    K + l++ ++++ + g ++ + ++++ ++
NP_566299.   584    ----GKKTGGLKQ-----LLSRITAKIR------GRPIEAPSSSeAESSV    618

                    lpfvlfillGrllRyllvaalavlgggrlk<-*
                    l++ l++ + lva++a+ ++ +
NP_566299.   619    LSSKLILKI------ILVAGAAAAWQYFST    642
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/cogs/cogs.hmm-f
Sequence file: NP_566299.1.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: NP_566299.1 (NM_111594) expressed protein [Arabidopsis thaliana] (gb|AAK92821.1|)

Scores for sequence family classification (score includes all domains):
Model     Description  Score  E-value    N
--------  -----------  -----  -------  ---
COG0842                  4.4      3.6    1
COG0837                  1.4       14    1
COG0109                  0.8       26    1
COG2801                 -0.3       50    1
COG0174                 -1.1       68    1
COG0441                 -1.8       87    1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t     score E-value
--------  ------- ----- -----    ----- -----     ----- -------
COG2801   1/1        69    85 ..     1    17 [.   -0.3      50
COG0837   1/1       230   244 ..   368   382 .]    1.4      14
COG0109   1/1       230   259 ..     1    30 [.    0.8      26
COG0174   1/1       408   420 ..   493   505 .]   -1.1      68
COG0842   1/1       543   576 ..   332   365 .]    4.4     3.6
COG0441   1/1       589   609 ..   649   669 .]   -1.8      87

Alignments of top-scoring domains:
COG2801: domain 1 of 1, from 69 to 85: score -0.3, E = 50
                 *->dsaieelaqefgvklmc<-*
                    ++ai l+q f+vk m+
NP_566299.    69    PKAISQLVQKFRVKEME    85

COG0837: domain 1 of 1, from 230 to 244: score 1.4, E = 14
                 *->lGAAaalrqtlaheq<-*
                    lG+++ l+qtl +++
NP_566299.   230    LGSGIVLEQTLTVVL    244

COG0109: domain 1 of 1, from 230 to 259: score 0.8, E = 26
                 *->lvdplvrksarssiaisesarvkasqqstl<-*
                    l+ v++++ +++ +e+++v++++q+ +
NP_566299.   230    LGSGIVLEQTLTVVLQPETTSVESNMQPSW    259

COG0174: domain 1 of 1, from 408 to 420: score -1.1, E = 68
                 *->avhpwEferYlsl<-*
                    +++pw+++ Y+++
NP_566299.   408    QIFPWYIKVYYHT    420

COG0842: domain 1 of 1, from 543 to 576: score 4.4, E = 3.6
                 *->lsdvwfsllvLallgllllllgllllrrrekkar<-*
                    +s ++++ + +++l+++ l +lrrr+++++
NP_566299.   543    FSMPYNVITITCTIFALYFGSLLNVLRRRIGEEE    576

COG0441: domain 1 of 1, from 589 to 609: score -1.8, E = 87
                 *->sldefieklkkeienrrlkpl<-*
                    +l +++ ++ ++i++r+ ++
NP_566299.   589    GLKQLLSRITAKIRGRPIEAP    609
//