IMP Bioinformatics Group

The PTS1 predictor

The learning set

Metazoan LH set

Identifier  Sequence
Ce025#1  CVWRLDWALGKL
Ce033#1  ILWRMATMSASM
Ce036#1  VAGQHRSLVARL
Ce039#1  WHWKCFEHVARL
Ce051#1  GEGAGFNQLAKL
Ce096#1  *DPNEAMQWLARL
Ce123#1  LGGPAPEFWSSL
Ce145#5  QPGSTAMRMSKL
Ce146#3  AAPVRLVLGHKL
Ce147#1  GKATGRRLKAKL
Ce148#2  VVWPLHRLRASL
Ce149#2  DRKGWTHQPSKL
Ce150#1  AVSFLSMRRARL
Ce152#3  GLSDSFYRRSML
Ce153#1  LWTALLGRTCSL
Ce154#1  AGSAFSSCRSKL
Ce155#2  KQWTGKGGRSKL
Ce157#1  RERHSYMQRAKL
Ce159#1  NLEFWGMMRSKM
Ce161#2  YGVGRELRQAKL
Ce167#1  EGKIRTLLWCKL
Ce168#1  DGAHREAPASKL
Ce169#1  GFRGYMSGLCKL
Ce171  RKLMSEECRGKL
Ce172  MRREVVLLRAKI
Ce173#1  TCARFGRLRAKV
Ce175#1  RWRLGWSWRAML
Ce176  WCWRAPNLGSKL
Ce179#2  RTEPKLMLRASL
Ce182#1  RGGGPCQWKAKL
Ce185#1  GHAPSRPAKARL
Ce186#1  VSCRCRRLQSRL
Ce187#1  *PDVRRCAGAAKL
Ce188#1  GSTGWWLLQAKL
Ce190  EHRRSSTSTAKL
Ce191  VYTVPGTPRCSL
Ce192  VYGGGVTRGHKL
Ce193  CHVGSTPWRAAL
Ce194  CALVGSHIRSRL
Ce195  QCVHAEVHLARL
Ce196  *LVDPLITSRSML
Ce199#1  RSEEMARSPAKL
Ce200#1  RLEMARWPRFKM
Ce201  CQVRGACGMAKL
Ce202  RSGSMVELRARL
Ce204#1  NYPQTARVCAKL
Ce205  CVSDLGHIVHKL
Ce208  GEKFRDAWRARL
CeD1#1  IGRRMAPLRSSL
CeF#1  *LELVDPGMRAKL
Hs01  IGLDWLVVISKL
Hs02  GEYGVGLLWSKL
Hs03  WGDPLIMPGSKL
Hs04  SLLALAVQCSKL
Hs05  RGMSWFSSNSKL
Hs06  ALYGWWALGSKL
Hs07  GNVGSAVTAAKL
Hs08  KLTVRWAWRAKL
Hs09  ERVSGRHPQAKL
Hs10  GVEVLVPGTAKL
Hs11  GPDPLMSILAKL
Hs12  *PVKCVRKRLSRL
Hs13  ICDAIFQGGSRL
Hs14  *PVSWNWMVRSRL
Hs15  IQTGGGRAISRL
Hs16  RWDIPLRWWSRL
Hs17  GAMGMLGGVSRL
Hs18  GEALLGGFLSRL
Hs19  PPNNIGCTASRL
Hs20  *VDPIVKVPESRL
Hs21  LAGHAWKALSRL
Hs22  *ELVDPEGLRSRL
Hs23  NNGRGKVVLSRL
Hs24  LRHGVNGLWSRL
Hs25  VRGGYLVRTARL
Hs26  VAGQHRSLVARL
Hs27  NGIEQGKKWARL
Hs28  CWRDLPQMIARL
Hs29  AILGGEYTMARL
Hs30  *KEILELVDPARL
Hs31  SPYTGGSLSCKL
Hs32  MQIEHSLPFCRL
Hs33  LVPLYHLIPCRL
Hs34  FWSYWMVESCRL
Hs35  VSGWGAGGMSRM
Hs36  VWREGWGVRARM
Hs37  PLTPRCILICRM
Hs38  QDWAGLPLRSHL
Hs39  KEEWPWGYRAHM
Hs40  EGLIVMLERGKL
Hs41  *LVDPLYILWGRL
Hs42  YCYVLRVGGGRL
Hs43  *LVDPFQVMFGRL
Hs44  MGMCTGMLWPRL
Hs45  *KEILELVDPPRL
Hs46  EQSGDSGLKPKM
Hs47  *PAYRLVAVLANL
Hs48  WSMELTPIWCNL
Hs49  SNMSLTFMSPNL
Hs50  IAMNCVQVKSQL
Hs51  WARYENIMSSQL
Hs52  *VDPSERRLRSML
Hs53  *LVDPLGPLFSML
Hs54  TTMRGDTLTSLL
Hs55  RGMGFPAVRSLL
Hs56  CRSGLPCLQSLL
Hs57  RAGQGTMWRSLL
Hs58  VEYMMKWPRALL
Hs59  SWSRSRLVSALL
Hs60  GRDRTSVVRCLL
Hs61  VVGVGGCVKSYL
Hs62  LGNFGVSLCSSL
Hs63  NGQVRDWCRPSL
Hs64  RIASNCNLVSAL
Hs65  MVVRMNPLKCVL

* random C-terminus shorter than 12 residues
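Each row of the listing above ends with a 12-residue C-terminal sequence, optionally preceded by the `*` footnote marker, and everything before that is the identifier. The following is a minimal Python sketch of how such rows could be split for further processing. The names `LearningSetRow` and `parse_row` are hypothetical and not part of the PTS1 predictor; the only assumption taken from the page is that every listed sequence is exactly 12 residues long.

```python
from typing import NamedTuple

SEQ_LEN = 12  # assumption from the page: each listed C-terminus has 12 residues


class LearningSetRow(NamedTuple):
    identifier: str
    sequence: str
    starred: bool  # True when the row carries the "*" footnote marker


def parse_row(line: str) -> LearningSetRow:
    """Split one row into identifier, 12-residue sequence, and footnote flag.

    Works whether or not whitespace separates the identifier from the
    sequence, because the sequence is taken as the last 12 characters.
    """
    text = line.strip()
    if len(text) <= SEQ_LEN:
        raise ValueError(f"row too short: {line!r}")
    sequence = text[-SEQ_LEN:]
    head = text[:-SEQ_LEN].rstrip()
    starred = head.endswith("*")
    identifier = head.rstrip("*").rstrip()
    if not identifier or not (sequence.isalpha() and sequence.isupper()):
        raise ValueError(f"unexpected row layout: {line!r}")
    return LearningSetRow(identifier, sequence, starred)


if __name__ == "__main__":
    for row in ("Ce025#1CVWRLDWALGKL", "Ce096#1*DPNEAMQWLARL", "Hs01IGLDWLVVISKL"):
        print(parse_row(row))
```

Taking the sequence from the right-hand end avoids guessing where an identifier such as `Ce171` or `Ce145#5` stops, since identifiers vary in length while the sequences do not.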

Metazoan SW set

SPROT-ID (AC)  Sequence
ADAS_CAEEL (O45218)  LIDIIGSPHCKL
ADAS_DROME (Q9V778)  PPTSSTPPKAKL
AK11_RAT (Q62924)  REDGKWAMSCRL
AMAC_HUMAN (Q9UHK6)  KIIESNKVKASL
AMAC_MOUSE (O09174)  RIVESDKLKANL
AMAC_RAT (P70473)  RIIESNKLKANL
AOPP_HUMAN (P30044)  TCSLAPNIISQL
AOPP_MOUSE (P99029)  TCSLAPNILSQL
CACP_COLLI (P52826)  RSLLQSAPKSKL
CACP_HUMAN (P43155)  RALLQSHPRAKL
CACP_MOUSE (P47934)  RTLLQNHPRAKL
CAOP_CAEEL (P34355)  VEKYLKPMTSKL
CAOP_HUMAN (Q15067)  SYKHLKSLQSKL
CAOP_RAT (P07872)  YHKHLKPLQSKL
CAOQ_RAT (Q63448)  NKSVANRLKSQL
CATA_ASCSU (P90682)  KNISNLAKYCKY
CATA_BRARE (Q9PT92)  GGASAVAAASKM
CATA_CANFA (O97492)  GSHLAAREKANL
CATA_CAVPO (Q64405)  GSHLSAKEKANL
CATA_DROME (P17336)  TEELNLAKSSKF
CATA_HUMAN (P04040)  GSHLAAREKANL
CATA_MOUSE (P24270)  GSHMAAKGKANL
CATA_RANRU (Q9PWF7)  SAHVTANDKANL
CATA_RAT (P04762)  GSHIAAKGKANL
DAPT_HUMAN (O15228)  KTPIGKPATAKL
DAPT_MOUSE (P98192)  KKPIGKPATAKL
DAPT_RAT (Q9ES71)  KKPIGKPATAKL
ECH1_HUMAN (Q13011)  NKELKTVTFSKL
ECH1_MOUSE (O35459)  KRDTKSITFSKL
ECH1_RAT (Q62651)  KKDSKSITFSKL
ECHP_CAVPO (P55100)  WQSLAGLPSSKL
ECHP_HUMAN (Q08426)  WQSLAGSPSSKL
ECHP_RAT (P07896)  WQSLAGPHGSKL
HAO1_HUMAN (Q9UJM8)  LVRKNPLAVSKI
HAO1_MOUSE (Q9WU19)  LVRKNPLAVSKI
HAO2_HUMAN (Q9NYQ3)  EINRNLVQFSRL
HAO3_HUMAN (Q9NYQ2)  EISPDLIQFSRL
HAO3_MOUSE (Q9JI00)  EISPDLIQFSRL
HAO3_RAT (Q07523)  EISPDLIQFSRL
HMGL_BOVIN (Q29448)  TNSKVAQATCKL
HMGL_CHICK (P35915)  TNSKVSQAACRL
HMGL_HUMAN (P35914)  TSSKVAQATCKL
HMGL_MOUSE (P38060)  TSSKVAQATCKL
HMGL_RAT (P97519)  TSSKVAQATCKL
HYES_HUMAN (P34913)  SDARNPPVVSKM
HYES_MOUSE (P34914)  TEVQNPSVTSKI
HYES_RAT (P80299)  TEIQNPSVTSKI
IDHC_HUMAN (O75874)  ENLKIKLAQAKL
IDHC_MICME (Q9Z2K9)  ENLKAKLAQAKL
IDHC_MICOH (Q9Z2K8)  ENLKAKLAQAKL
IDHC_MOUSE (O88844)  ENLKAKLAQAKL
IDHC_RAT (P41562)  ENLKAKLAQAKL
IDI1_HUMAN (Q13907)  NQFVDHEKIYRM
IDI1_MESAU (O35586)  SQFVDHEKIHRM
IDI1_MOUSE (P58044)  SPFVDHEKIHRL
IDI1_RAT (O35760)  SPFVDHEKIHRM
LUCI_LUCCR (P13129)  IREILKKPVAKM
LUCI_LUCLA (Q01158)  IREILKKPVAKM
LUCI_PHOPY (P08659)  LIKAKKGGKSKL
NLTP_CHICK (Q07598)  QNLQLQPGKAKL
NLTP_HUMAN (P22307)  QNLQLQPGNAKL
NLTP_MOUSE (P32020)  QNLQLQPGKAKL
NLTP_RAT (P11915)  QSLQLQPDKAKL
O55223 (O55223)  MSRFSTLSKAHL
OXDA_HUMAN (P14920)  EKKLSRMPPSHL
OXDA_MOUSE (P18894)  EKKLSRLPPSHL
OXDA_PIG (P00371)  ERNLLTMPPSHL
OXDA_RABIT (P22942)  EKKSSRMPPSHL
OXDA_RAT (O35078)  EKNLSRMPPSHL
OXDD_BOVIN (P31228)  QVLRTPAPKSKL
OXDD_HUMAN (Q99489)  HALRTPIPKSNL
P79371 (P79371)  ISRFPSLGKAHL
PECI_HUMAN (O75521)  AVVNFLSRKSKL
PECI_MOUSE (Q9WUR2)  AIMSFVSRKPKL
PMVK_HUMAN (Q15126)  LENLIEFIRSRL
PTE1_HUMAN (O14734)  IRVKPQVSESKL
PTE1_MOUSE (P58137)  IRLKPQVSESKL
PTE2_HUMAN (P49753)  LGGREGTIPSKV
PTE2_MOUSE (Q9QYR7)  LDGKKKTIPAKL
Q27757 (Q27757)  LRQMFEKHKSKL
Q99424 (Q99424)  IRPLLQSWRSKL
SPYA_CALJA (P31029)  REALQHCPKKKL
SPYA_FELCA (P41689)  QEALQRCSRNKL
SPYA_HUMAN (P21549)  RAALQHCPKKKL
SPYA_MOUSE (O35423)  REALQHCPKNKL
SPYA_RABIT (P31030)  REALQHCAQSQL
SPYA_RAT (P09139)  REALQHCPKNKL
URIC_DROME (P16163)  AQLARKNINSHL
URIC_DROPS (P22673)  AQLARKNISSHL
URIC_DROSU (O44111)  AQLARKNLNSHL
URIC_DROVI (P23194)  AQLSRKSLKSHL
URIC_MOUSE (P25688)  TGTVKRKLPSRL
URIC_PAPHA (P25689)  TGTVKRKLSSRL
URIC_PIG (P16164)  TGTVKRKLTSRL
URIC_RABIT (P11645)  TGTVKRKLSSRL
URIC_RAT (P09118)  TGTVRRKLPSRL

Fungal LH set

Identifier  Sequence
Sc01  *LELVDPCERSKL
Sc02  RMDATKRRESKL
Sc03  *VDPRCLARISKL
Sc04  LSRGRSVSRSRL
Sc05  AVHGTFSWRSRL
Sc06  NGWGFMTRLSRL
Sc07  NGRDRGGWWAKL
Sc08  LSANALGGLAKL
Sc09  RSGRQGGGFAKL
Sc10  GWDWAVSPRAKL
Sc11  RDRGTGQGLARL
Sc12  WTRDGSHRMARL
Sc13  SLLGGAAGWARL
Sc14  SGSAVCSRVCRL
Sc15  EWEEKSFIKCRL
Sc16  STGKRSRSGAHL
Sc17  VAWVPRKRVCHL
Sc18  *PSGGVVARAAKM
Sc19  WRATGVSRQAKF
Sc20  SSCCVQTPKAKF
Sc21  RAPGGVGHKCNL
Sc22  ETKGLNAVYGKL
Sc23  EWFPVYNRSTKL
Sc24  GSESHGSARQKL
Sc25  KAGEIPGRMHRL
Sc26  RRQWSTGRKLKL
Sc27  GPGCCRRRDLKL
Sc28  TWGPCDGRRVKL
Sc29  ERSVRHRREFRL
Sc30  ELGISGARWYKL
Sc31  IWDGSRTWAPKL
Sc32  *PVWVSLGRRWKL
Sc33  PLVGRKGGPWKL
Sc34  GGIGRKSCGWKL
Sc35  SMNGYQRRQWRL

* random C-terminus shorter than 12 residues

Fungal SW set

Identifier  Sequence
ACEA_ASHGO (O94198)  EEQFGSSNGAKL
ACEA_CANTR (P20014)  TEDQFKETKAKV
ACEA_COPCI (O13439)  AGVTESQFTSKL
ACEA_YARLI (P41555)  AGVTEDQFKSKL
AHP1_YEAST (P38013)  TVSSVESVLAHL
ALOX_CANBO (Q00922)  LKTYEQTGAARY
ALOX_PICAN (P04841)  LGTYEETGLARF
AOFN_ASPNG (P46882)  ELGTKREVKARL
CACP_CANTR (Q00614)  TKGLLTDAKPKL
CACP_YEAST (P32796)  ALENENKRKAKL
CATA_PICAN (P30263)  ELKRKASSPSKI
CATA_YEAST (P15202)  KHASELSSNSKF
CISZ_YEAST (P08679)  YKELVKNIESKL
DAS_PICAN (P06834)  KEKPNHDKVNKL
FAT2_YEAST (P38137)  TFAKSSRNKSKL
FOX2_CANTR (P22414)  AAIKLVGDKAKI
FOX2_YEAST (Q02207)  AAVKLSQAKSKL
MASY_EMENI (P28344)  NEISSPGTASKL
MASY_NEUCR (P28345)  TSAGNSLPASKL
MASY_YEAST (P30952)  STKATPTDLSKL
MASZ_YEAST (P21826)  KPSAKPVDLSKL
MDHP_YEAST (P32419)  KGKSFILDSSKL
O93884 (O93884)  LTEKPKHDQNHL
OXDA_FUSSO (P24552)  VDKVGKAAKSKL
OXDA_RHOTO (P80324)  QRYHGAARESKL
PEX8_PICAN (Q00925)  EHVNESQEKAKL
PEX8_PICPA (Q01962)  YENVNAQSTAKL
PEX8_YEAST (P53248)  YTTVLSSQSSKL
PTE1_YEAST (P41903)  VYGSERDIRAKF
PX18_CANMA (Q00680)  SVFKKLDPRPKL
PX18_CANTR (P22009)  AVFKKLDPRPKL
Q12598 (Q12598)  VVIEKIDADAKL
Q96VB8 (Q96VB8)  KKSPRGASKNKF
URIC_ASPFL (Q00511)  CTVGRSSLKSKL
URIC_EMENI (P33282)  KCTVGRKSKAKL
URIC_PICJA (P78609)  KCTVVRKEKTKL
VAOX_PENSI (P56216)  WPSQYSHVTWKL

Remaining learning set sequences

Identifier  Sequence
ACE1_SOYBN (P45456)  DRGSIVVAKARM
ACE2_SOYBN (P45457)  DRGSIVVAKARM
ACEA_ARATH (P28297)  EGTSLVVAKSRM
ACEA_BRANA (P25248)  EGTSLVVAKSRM
ACEA_CUCMA (P93110)  EEGSVVVAKSRM
ACEA_CUCSA (P49296)  EEGNVVVAKSRM
ACEA_DENCR (Q9SE26)  RGGITVNAKSRL
ACEA_GOSHI (P17069)  SEGNLVVAKARM
ACEA_LYCES (P49297)  GDGSVVIAKARM
ACEA_PINTA (Q43097)  IGAGTVLAKSRM
ACEA_RICCO (P15479)  SAGSEVVAKARM
ADAS_DICDI (O96759)  LFDVVNVKYPKL
ADAS_TRYBB (O97157)  KMGIPGALQAHL
CAT1_CUCPE (P48350)  KLASHLNVRPSI
CAT1_GOSHI (P17598)  KLASLLNVRPSI
CAT1_HORVU (P55307)  KLASRLKIKPNM
CAT1_LYCES (P30264)  KVASRLTVKPTM
CAT1_MAIZE (P18122)  KLPSRLNLKPSM
CAT1_NICPL (P49315)  KLASRLNVRPSI
CAT1_RICCO (Q01297)  KLATRLNVKPSI
CAT1_SOLTU (P49284)  KVASRLTVKPTM
CAT1_TOBAC (P49319)  KVASRLTLKPTM
CAT1_WHEAT (Q43206)  KLASRLSSKPSM
CAT2_ARATH (P25819)  KLASRLNVRPSI
CAT2_CUCPE (P48351)  KIASRMNARPNM
CAT2_GOSHI (P30567)  KIASRLNVRPSI
CAT2_HORVU (P55308)  KVANRLNVKPSM
CAT2_MAIZE (P12365)  KLASRLSAKPSM
CAT2_NICPL (P49316)  KVASRLTLKPTM
CAT2_RICCO (P49318)  KLASRLNVRPNI
CAT2_SOLTU (P55312)  KVASRLTVKPTM
CAT2_WHEAT (P55313)  KLASRLKIKPNM
CAT3_ARATH (Q42547)  KLASRLNVRPSI
CAT3_CUCPE (P48352)  KIASRLNVRPNI
CAT3_NICPL (P49317)  KIASRLNVRPTM
CATA_DICDI (O77229)  NDVIKFAARSNL
CATA_HELAN (P45739)  KIASRLNVKPNY
CATA_IPOBA (P07145)  KVASRLNIRPTM
CATA_ORYSA (P29611)  KIANRLNVKPSM
CATA_PEA (P25890)  KLASHLNMRPSI
CATA_PHAAU (P32290)  KIASHLNMRPNI
CATA_SECCE (P55310)  KVANRLNVKPSM
CATA_SOLME (P55311)  KVASRLLVKPTM
CATA_SOYBN (P29756)  KIASHLNLKPSI
CATA_TOXGO (Q9XZD5)  GLPTAACYPAKM
CATB_ORYSA (P55309)  KLASRLNLKPNM
DHAB_HORVU (Q40024)  ELYGWYQRPSKL
DHAB_ORYSA (O24174)  EPYGWYRPPSKL
G3PG_LEIME (Q27890)  YMAAKDAASSKM
G3PG_TRYBB (P22512)  RHMAARDRAAKL
G3PG_TRYCR (P22513)  RHMASKDRSARL
G6PI_TRYBB (P13377)  GLINMFNELSHL
GOX1_ARATH (Q9LRS0)  TEWDTPRHLPRL
GOX2_ARATH (Q9LRR9)  TEWDTPRPSARL
GOX_SPIOL (P05414)  WDGPSSRAVARL
GPDA_TRYBB (P90593)  EGLPALPRTSKM
GPDA_TRYBR (Q26756)  EGLPALPRTSKM
MASY_BRANA (P13244)  IVAHYPINASRL
MASY_CUCMA (P24571)  IVIHHPRELSRL
MASY_CUCSA (P08216)  IVIHHPRELSKL
MASY_GOSHI (P17432)  VIHHPKDVSSKL
MASY_MAIZE (P49081)  VAHHPGASPCKL
MASY_RAPSA (Q43827)  IVAHYPINVSRL
MASY_RICCO (P17815)  IVIHYPKGSSRL
MASY_SOYBN (P45458)  IVVHHPRETSKL
PGKC_TRYBB (P07378)  GTGTLSNRWSSL
URIC_ARATH (O04420)  IEATLSRITSKL
URIC_CANLI (P34798)  IQASLRRLWSKL
URIC_PHAVU (P53763)  IEASLSRVWSKL
URIC_SOYBN (P04670)  IQASLSRLWSKL
URID_CANLI (P34799)  IQASLSRLWSKL
URID_SOYBN (O04104)  IQASLSRLWSKL

Gapless alignments of the 12 C-terminal residues in ClustalX colors:

Fungal learning set: postscript pdf
Metazoan learning set: postscript pdf
Complete learning set: postscript pdf
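
Because every learning-set sequence has the same length, a gapless alignment amounts to stacking the sequences so that equal positions fall into the same column. As a rough illustration (not the method used to produce the linked files), the Python sketch below counts residues per alignment column for three sequences taken from the Metazoan LH set. The function name `column_counts` is hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def column_counts(sequences: Sequence[str]) -> List[Counter]:
    """Return one residue Counter per column of a gapless alignment.

    All sequences must have the same length; no gaps are introduced.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("gapless alignment requires equal-length sequences")
    # zip(*sequences) yields the residues found at each column in turn.
    return [Counter(column) for column in zip(*sequences)]


if __name__ == "__main__":
    # Three sequences copied from the Metazoan LH set listing.
    example = ["CVWRLDWALGKL", "ILWRMATMSASM", "VAGQHRSLVARL"]
    counts = column_counts(example)
    # Print the three C-terminal columns (positions -3, -2, -1).
    for offset in (-3, -2, -1):
        print(offset, dict(counts[offset]))
```

Per-column counts of this kind are the raw material for position-specific residue frequencies, which is one common way to summarize a gapless alignment.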

Secondary structure prediction using the PREDATOR program

Search for low complexity regions [12-2.2-2.5] [25-3.0-3.3] [45-3.4-3.75] using the SEG program
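
The bracketed triples above are assumed here to be SEG parameter sets in the usual order: window length, trigger complexity, extension complexity. To illustrate what a window-based complexity threshold means, the Python sketch below computes Shannon entropy (in bits) over sliding windows and reports the windows at or below a trigger value. It is a simplified illustration only: it is not the SEG program, it omits SEG's extension and refinement stages, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(window: str) -> float:
    """Shannon entropy of the residue composition of a window, in bits."""
    n = len(window)
    counts = Counter(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(sequence: str, window: int, trigger: float) -> List[int]:
    """Start indices of all windows whose entropy is at or below `trigger`.

    Only the trigger step is modelled; windows are not merged or extended.
    """
    return [
        start
        for start in range(len(sequence) - window + 1)
        if shannon_entropy(sequence[start:start + window]) <= trigger
    ]


if __name__ == "__main__":
    # Made-up test sequence: a poly-alanine stretch followed by K and L.
    print(low_complexity_windows("AAAAAAAAAAAAKL", window=12, trigger=2.2))
```

A window of 12 distinct residues has entropy log2(12), about 3.58 bits, so a trigger of 2.2 bits would flag only windows dominated by a few residue types.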