NMT MyristoylCoA:Protein N-Myristoyltransferase
Supplementary material to N-terminal N-Myristoylation of Proteins: I. Refinement of the Sequence Motif and its Taxon-specific Differences; II. Prediction of Substrate Proteins from Amino Acid Sequence
Correlation data with amino acid properties:
for the effective amino acid composition of the total alignment (profile approach)
Below is a complete listing of the learning set. Each row shows the SWISSPROT ID, the corresponding 30 residues of the sequence starting from the myristoylation site (whose position is given by LIPIDPos), the annotation by SWISSPROT, and our own comment, derived from an extensive literature search.
Abbreviations:
SWISSPROT comment (SW_Comment):
N ... not annotated for myristoylation by SWISSPROT at all!
. ... assumed to be verified by SWISSPROT
S ... by similarity
P ... potential
Our comment (Our_Comment):
HPLC ... experimental verification by HPLC
3HLC ... experimental verification by HPLC including radioactive labeling
3HGE ... experimental verification by SDS-PAGE including radioactive labeling
3HTC ... experimental verification by thin layer chromatography including radioactive labeling
MASS ... experimental verification by mass spectrometry
XRAY ... experimental verification through strong evidence from X-ray diffraction or NMR
SPEC ... experimental verification not clear, further evidence required
ID SEQUENCE LIPIDPos SW_Comment Our_Comment
GAG_SIVM1 GARNSVLSGKKADELEKIRLRPGGKKKYML 2 N HPLC
KCRF_STRPU GCAASSQQTTATGGQPAAGEKANPAPANNN 2 N 3HLC
Q26368 GCNTSQELKTKDGAAMDAVSNGEPEPSAPP 2 N 3HGE
GBAZ_HUMAN GCRQSSEEKEAARRSRRIDRHLRSESQRQR 2 N 3HGE
GAG_MPMV GQELSQHERYVEQLKQALKTRGVKVKYADL 2 N 3HGE
Q67940 GQNLSTSNPLGFFPDHQLDPAFRANTANPD 2 N 3HGE
COA2_POVM3 GAALTILVDLIEGLAEVSTLTGLSAEAILS 2 N 3HTC
COA2_SV40 GAALTLLGDLIATVSEAAAATGFSVAEIAA 2 N 3HTC
RASH_RRASV GQSLTTPLSLTLDHWKDVRDRARDQSVEIK 2 N 3HGE
GBA1_SPOSC GCGMSVEEKEGKARNEEIENQLKRDRMQQR 1 S
GCA3_HUMAN GNGKSIAGDQKAVPTQETHVWYRTFMMEYP 1 P SPEC
BASP_BOVIN GGKLSKKKKGYNVNDEKAKDKDKKAEGAGT 1 . MASS
CP23_CHICK GGKLSKKKKGYSVNDEKAKDKDKKAEGAAT 1 P
25A2_HUMAN GNGESQLSSVPAQKLGWFIQEYLKPYEECQ 1 P
42_HUMAN GQALGIKSCDFQAARNNEEHHTKALSSRRL 1 . 3HTC
42_MOUSE GQALSIKSCDFHAAENNEEHYTKAISSQHL 1 S
ANXD_CANFA GNRHAKAKSHHGFDVDHDAKKLNKACKGMG 1 S
ANXD_HUMAN GNRHAKASSPQGFDVDRDAKKLNKACKGMG 1 . 3HLC
APKA_ARATH GICLSAQVKAESSGASTKYDAKDIGSLGSK 2 S
APLC_APLCA GKRASKLKPEEVEELKQQTYFTEAEIKQWH 1 P
ARF1_ARATH GLSFGKLFSRLFAKKEMRILMVGLDAAGKT 1 P
ARF1_BRARP GILFTRMFSSVFGNKEARILVLGLDNAGKT 1 P
ARF1_CAEEL GNVFGSLFKGLFGKREMRILMVGLDAAGKT 1 P
ARF1_CATRO GLSFTKLFSRLFAKKEMRILMVGLDAAGKT 1 P
ARF1_CHLRE GLRFTKALSRLFGKKEMRILMVGLDAAGKT 1 P
ARF1_DAUCA GLSFTKLFSRLFAKKEMRILMVGLDAAGKT 1 P
ARF1_DICDI GLAFGKLFSRFFGKKDMRILMVGLDAAGKT 1 P
ARF1_DROME GNVFANLFKGLFGKKEMRILMVGLDAAGKT 1 P
ARF1_HUMAN GNIFANLFKGLFGKKEMRILMVGLDAAGKT 1 P
ARF1_PLAFO GLYVSRLFNRLFQKKDVRILMVGLDAAGKT 1 P
ARF1_SCHPO GLSISKLFQSLFGKREMRILMVGLDAAGKT 1 P
ARF1_SOLTU GLTISKLFSRLFAKKEMRILMVGLDAAGKT 1 P
ARF1_XENLA GNMFANLFKGLFGKKEMRILMVGLDAAGKT 1 P
ARF1_YEAST GLFASKLFSNLFGNKEMRILMVGLDGAGKT 1 .
ARF2_CAEEL GGVMSYFRGLFGAREMRILILGLDGAGKTT 1 P
ARF2_DROME GLTISSLLTRLFGKKQMRILMVGLDAAGKT 1 P
ARF2_YEAST GLYASKLFSNLFGNKEMRILMVGLDGAGKT 1 .
ARF3_ARATH GILFTRMFSSVFGNKEARILVLGLDNAGKT 1 P
ARF3_CAEEL GLFFSKISSFMFPNIECRTLMLGLDGAGKT 1 P
ARF3_DROME GKLLSKIFGNKEMRILMLGLDAAGKTTILY 1 P
ARF3_HUMAN GNIFGNLLKSLIGKKEMRILMVGLDAAGKT 1 P
ARF3_YEAST GNSISKVLGKLFGSKEMKILMLGLDKAGKT 1 P
ARF4_HUMAN GLTISSLFSRLFGKKQMRILMVGLDAAGKT 1 P
ARF4_MOUSE GLTISSLFSRLFGKKQMRILMVGLDAAGKT 1 P
ARF4_XENLA GLTISSLFSRLFGKKQMRILMVGLDAAGKT 1 P
ARF5_CHICK GLTVSAIFSRIFGKKQMRILMVGLDAAGKT 1 P
ARF5_HUMAN GLTVSALFSRIFGKKQMRILMVGLDAAGKT 1 P
ARF6_CHICK GKVLSKIFGNKEMRILMLGLDAAGKTTILY 1 S
ARF6_HUMAN GKVLSKIFGNKEMRILMLGLDAAGKTTILY 1 . 3HGE
ARF6_XENLA GKMFSKIFGNKEMRILMRGLDAAGKTTILY 1 S
ARFL_CAEEL GLIMAKLFQSWWIGKKYKIIVVGLDNAGKT 1 P
ARF_AJECA GMAFSKLFDRIWGKKEMRILMVGLDAAGKT 1 . 3HGE
ARF_CANAL GLTISKLFASLLGRREMRILMVGLDAAGKT 1 . 3HGE
ARF_CRYNE GLSVSKLLNGLFGKKEMRILMVGLDAAGKT 1 . 3HGE
ARF_DUGJA GNLVTHLLDRLFGKKEMRILMVGLDAAGKT 1 P
ARF_GIALA GQGASKIFGKLFSKKEVRILMVGLDAAGKT 1 P
ARF_MAIZE GLTFTKLFSRLFAKKEMRILMVGLDAAGKT 1 P
ARF_ORYSA GLTFTKLFSRLFAKKEMRILMVGLDAAGKT 1 P
ARF_PLAFA GLYVSRLFNRLFQKKDVRILMVGLDAAGKT 1 P
BASP_HUMAN GGKLSKKKKGYNVNDEKAKEKDKKAEGAAT 1 . MASS
BLK_HUMAN GLVSSKKPDKEKPIKEKDKGQWSPLKVSAQ 1 S
BLK_MOUSE GLLSSKRQVSEKGKGWSPVKIRTQDKAPPP 1 S
CA22_HUMAN GSRASTLLRDEELEEIKKETGFSHSQITRL 1 S
CA22_MOUSE GSRASTLLRDEELEEIKKETGFSHSQITRL 1 P 3HGE
CALB_HUMAN GNEASYPLEMCSHFDADEIKRLGKRFKKLD 1 . MASS
CALB_MOUSE GSEASYPLEMCSHFDADEIKRLGKRFKKLD 1 S
CALB_YEAST GAAPSKIVDGLLEDTNFDRDEIERLRKRFM 1 . 3HGE
CBLP_RAT GNEASYHSEMGTHFDHDEIKRLGRSFKKMD 1 S
ENTK_BOVIN GSKRSVPSRHRSLTTYEVMFAVLFVILVAL 2 P
ENTK_HUMAN GSKRGISSRHHSLSSYEIMFAALFAILVVL 2 P
ENTK_PIG GSKRIIPSRHRSLSTYEVMFTALFAILMVL 2 P
EST2_CAEEL GGFLSHLTPEQNVEALKASCGPVRGNIYKH 1 P
FYN_CHICK GCVQCKDKEATKLTDERDGSLTQSSGYRYG 1 S
FYN_HUMAN GCVQCKDKEATKLTEERDGSLNQSSGYRYG 1 . 3HGE
FYN_MOUSE GCVQCKDKEAAKLTEERDGSLNQSSGYRYG 1 . 3HGE
FYN_XENLA GCVQCKDKEATKLTDERDNSLTQSLGYRYG 1 S
FYN_XIPHE GCVQCKDKEATKLTDDRDASISQGAGYRYG 1 S
GAG_BAEVM GQTLTTPLSLTLTHFSDVRARAHNLSVGVR 2 . 3HGE
GAG_BLVAU GNSPSYNPPAGISPSDWLNLLQSAQRLNPR 2 S
GAG_BLVJ GNSPSYNPPAGISPSDWLNLLQSAQRLNPR 2 .
GAG_FLV GQTVTTPLSLTLDHWSEVRARAHNQGVEVR 76 .
GAG_FRSFV GQTVTTPLSLTLEHWEDVQRTASNQSVDVK 2 .
GAG_FSVGA GQTITTPLSLTLDHWSEVRARAHNQGVEVR 79 .
GAG_FSVGR GQTITTPLSLTLDHWSEVRARAHNQGVEVR 2 . 3HGE
GAG_FSVHZ GQTIATPLSLTLDHWSEVRARAHNQGVEVR 76 .
GAG_FSVMD GQTVTTPPSLTLDHWSEVRTRAHNQGIEVR 79 .
GAG_FSVST GQTVTTPLSLTLDHWSEVRARAHNQGVEVR 76 .
GAG_HV1A2 GARASVLSGGELDKWEKIRLRPGGKKKYKL 1 S
GAG_HV1B1 GARASVLSGGELDRWEKIRLRPGGKKKYKL 1 S
GAG_HV1B5 GARASVLSGGELDRWEKIRLRPGGKKKYKL 1 S
GAG_HV1BR GARASVLSGGELDRWEKIRLRPGGKKKYKL 1 S
GAG_HV1C4 GARASVLSGGELDRWEKIRLRPGGKKQYRL 1 S
GAG_HV1EL GARASVLSGGKLDKWEKIRLRPGGKKKYRL 1 S
GAG_HV1H2 GARASVLSGGELDRWEKIRLRPGGKKKYKL 1 S
GAG_HV1J3 GARASVLSGGELDRWEKIRLRPGGKKKYKL 1 S
GAG_HV1JR GARASVLSGGELDRWEKIRLRPGGKKKYRL 1 S
GAG_HV1LW GARASVLSGGKLDRWEKIRLRPGGKKKYKL 1 S
GAG_HV1MA GARASVLSGGKLDAWEKIRLRPGGKKKYRL 1 S
GAG_HV1MN GARASVLSGGELDRWEKIRLRPGGKKKYKL 1 . MASS
GAG_HV1N5 GARASVLSGGELDKWEKIRLRPGGKKQYKL 1 S
GAG_HV1ND GARASVLSGGKLDTWERIRLRPGGKKKYAL 1 S
GAG_HV1OY GARASVLSGGELDKWEKIRLRPGGKKKYQL 1 S
GAG_HV1PV GARASVLSGGELDRWEKIRLRPGGKKKYKL 1 S
GAG_HV1RH GARASVLSGGKLDKWEKIRLRPRGKKRYKL 1 S
GAG_HV1U4 GARASVLSGKKLDSWEKIRLRPGGNKKYRL 1 S
GAG_HV1W2 GARASVLSGGELDKWEKIRLRPGGKKKYRL 1 S
GAG_HV1Y2 GARASVLSAGELDKWEKIRLRPGGKKQYRL 1 S
GAG_HV1Z2 GARASVLSGGKLDAWEKIRLRPGGKKKYRL 1 S
GAG_MLVAB GQTVTTPLSLTLGHWKDVERIAHNQSVDVK 2 .
GAG_MLVAV GQTVTTPLSLTLEHWEDVQRIASNQSVDVK 2 .
GAG_MLVBM GQTVTTPLSLTLEHWGDVQRIASNQSVGVK 2 S
GAG_MLVCB GQTVTTPLSLTLDHWKDVERTAHNQSVDVK 2 S
GAG_MLVDE GQTITTPLSLTLEHWRDVQCIASNQSVDVK 2 S
GAG_MLVDU GQTVTTPLSLTLDHWKDVQCIASNQSVDVK 2 S
GAG_MLVF5 GQTVTTPLSLTLDHWKDVERTAHNQSVEIR 2 S
GAG_MLVFF GQAVTTPLSLTLDHWKDVERTAHNLSVEVR 2 S
GAG_MLVFP GQTATTPLSLTLDHWKDVERTAHNQSVEVR 2 S
GAG_MLVHO GQTITTPLSLTLDHWRDVQRIASNQSVDVK 2 S
GAG_MLVMO GQTVTTPLSLTLGHWKDVERIAHNQSVDVK 2 . MASS
GAG_MLVRD GQTVTTPLSLTLEHWGDVQRIASNQSVEVK 2 .
GAG_MMTVB GVSGSKGQKLFVSVLQRLLSERGLHVKESS 2 . MASS
GAG_MMTVG GVSGSKGQKLFVSVLQRLLSERGLHVKESS 2 . MASS
GAG_MSVFR GQTVTTPLSLTLEHWGDVQRIASNQSVDVK 2 S
GAG_MSVMO GQTVTTPLSLTLDHWKDVERLAHNQSVDVK 2 . MASS
GAG_MSVMT GQTVTTPLSLTLDHWKDVERIAHNQSVDVK 2 S
GB01_BOVIN GCTLSAEERAALERSKAIEKNLKEDGISAA 1 . HPLC
GB01_HUMAN GCTLSAEERAALERSKAIEKNLKEDGISAA 1 S
GB01_MOUSE GCTLSAEERAALERSKAIEKNLKEDGISAA 1 S
GB01_RAT GCTLSAEERAALERSKAIEKNLKEDGISAA 1 S 3HGE
GB02_CRILO GCTLSAEERAALERSKAIEKNLKEDGISAA 1 S
GB02_HUMAN GCTLSAEERAALERSKAIEKNLKEDGISAA 1 S
GB02_MOUSE GCTLSAEERAALERSKAIEKNLKEDGISAA 1 S
GB02_RAT GCTLSAEERAALERSKAIEKNLKEDGISAA 1 S
GB0_CAEEL GCTMSQEERAALERSRMIEKNLKEDGMQAA 1 S
GB0_HELTI GCTLSAEERAAMERSKAIEKNLKEDGMQAA 1 S
GB0_LYMST GCTLSAEERAAMERSKAIEKNLKEDGMQAA 1 S
GB0_MANSE GCASSAEERAAPSAQQADREKLKEDGIQAA 1 S
GB0_PATYE GCTMSAEDRAAAERSRDIEKKLKEDGIQAA 1 S
GB0_XENLA GCTLSAEERAALERSKQIEKNLKEDGVTAA 1 S
GBA1_CANAL GCGASVPVDDDEIDPFLQDKRINDAIEQSL 1 S
GBA1_COCHE GCGMSTEEKEGKQRNEEIENQLKRDKLMQR 1 S
GBA1_COLTR GCGMSTEDKEGKARNEEIENQLKRDKMMQR 1 S
GBA1_COPCO GCVQSTGVDDEAKARNDEIENQLKRDRVMA 1 S
GBA1_CRYNE GGCMSTPEAPKKTAETKQVPSTSTSSRPPQ 1 S SPEC
GBA1_CRYPA GCGMSTEEKEGKARNEEIENQLKRDRMQQR 1 S
GBA1_EMENI GCGMSTEDKEGKARNEEIENQLKRDKMMQR 1 S
GBA1_MAGGR GCGMSTEEKEGKARNEEIENQLKRDRLQQR 1 S
GBA1_SCHPO GCMSSKYADTSGGEVIQKKLSDTQTSNSST 1 S
GBA1_YEAST GCTVSTQTIGDESDPFLQNKRANDVIEQSL 1 . 3HGE
GBAK_CAVPO GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBAK_HUMAN GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBAK_RAT GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBI1_CAVPO GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBI1_CHICK GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBI1_HUMAN GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBI1_ORYLA GCTLSTDDKAAQERSKMIDRNLRDDGEKAA 1 S
GBI1_RAT GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBI1_XENLA GCTLSAEDKAAVERSKMIDRNLREDGEKAA 1 S
GBI2_CANFA GCTVSAEDKAAAERSKMIDKNLREDGEKAA 1 S
GBI2_CAVPO GCTVSAEDKAAAERSKMIDKNLREDGEKAA 1 S
GBI2_CHICK GCTVSAEDKAAAERSRMIDRNLREDGEKAA 1 S
GBI2_HUMAN GCTVSAEDKAAAERSKMIDKNLREDGEKAA 1 S
GBI2_MOUSE GCTVSAEDKAAAERSKMIDKNLREDGEKAA 1 S
GBI2_RAT GCTVSAEDKAAAERSKMIDKNLREDGEKAA 1 S
GBI_ASTPE GCATSAEDKAAAERSKAIDRNLRIDGEKAA 1 S
GBI_HELTI GCVTSQEDKAAVERSKQIDKSLRMDGEKAA 1 S
GBI_HOMAM GCAMSNAADKEAAERSKKIDKDLRLAGERA 1 S
GBI_LYMST GCVTSQEDKAAVERSKQIDKSLRMDGEKAA 1 S
GBT1_BOVIN GAGASAEEKHSRELEKKLKEDAEKDARTVK 1 . MASS
GBT1_CANFA GAGASAEEKHSRELEKKLKEDAEKDARTVK 1 S
GBT1_HUMAN GAGASAEEKHSRELEKKLKEDAEKDARTVK 1 S
GBT1_MOUSE GAGASAEEKHSRELEKKLKEDAEKDARTVK 1 S
GBT2_BOVIN GSGASAEDKELAKRSKELEKKLQEDADKEA 1 S
GBT2_HUMAN GSGASAEDKELAKRSKELEKKLQEDADKEA 1 S
GBT2_MOUSE GSGISAEDKELARRSKELEKKLQEDADKEA 1 S
GBT3_RAT GSGISSESKESAKRSKELEKKLQEDAERDA 1 S
GBT_XENLA GAGASAEEKHSRELEKKLKEDADKDARTVK 1 S
GCA2_BOVIN GQQFSWEEAEENGAVGAADAAQLQEWYKKF 1 P
GCAP_BOVIN GNIMDGKSVEELSSTECHQWYKKFMTECPS 1 P MASS
GCAP_HUMAN GNVMEGKSVEELSSTECHQWYKKFMTECPS 1 P
GCAP_MOUSE GNVMEGKSVEELSSTECHQWYKKFMTEVPS 1 P
HCK_HUMAN GGRSSCEDPGCPRDEERAPRMGSMKSKFLQ 2 S 3HGE
HCK_MOUSE GGRSSCEDPGCPRSEGRAPRMGCVKSRFLR 2 S 3HGE
HCK_RAT GCVKSRFLREGSKASKIEPNANQKGPVYVP 2 S
HIA1_DICDI GNRAFKSHHGHFLSAEGEAVKTHHGHHDHH 1 . MASS
HIA2_DICDI GNRAFKAHNGHYLSAEHDHVKTHHGHHDHH 1 . MASS
HIPP_HUMAN GKQNSKLRPEMLQDLRENTEFSELELQEWY 1 S
HIPP_RAT GKQNSKLRPEMLQDLRENTEFSELELQEWY 1 . 3HGE
KAPA_BOVIN GNAAAAKKGSEQESVKEFLAKAKEDFLKKW 1 . MASS
KAPA_CRIGR GNAAAAKKGSEQESVKEFLAKAKEEFLKKW 1 S
KAPA_HUMAN GNAAAAKKGSEQESVKEFLAKAKEDFLKKW 1 S
KAPA_MOUSE GNAAAAKKGSEQESVKEFLAKAKEDFLKKW 1 .
KAPA_RAT GNAAAAKKGSEQESVKEFLAKAKEDFLKKW 1 S
KAPB_BOVIN GNAATAKKGSEVESVKEFLAKAKEDFLKKW 1 S
KAPB_HUMAN GNAATAKKGSEVESVKEFLAKAKEDFLKKW 1 S
KAPC_DROME GNNATTSNKKVDAAETVKEFLEQAKEEFED 1 S
KAPG_HUMAN GNAPAKKDTEQEESVNEFLAKARGDFLYRW 1 S
KAPG_MACMU GNAAAKKDTEQETVNEFLAKARGDFLYRWG 1 S
LCK_CHICK GCCCSSDYDEDWIENIDICEHCNYPIDPDS 1 S
LCK_HUMAN GCGCSSHPEDDWMENIDVCENCHYPIVPLD 1 S
LCK_MOUSE GCVCSSNPEDDWMENIDVCENCHYPIVPLD 1 S 3HGE
LYN_HUMAN GCIKSKGKDSLSDDGVDLKTQPVRNTERTI 1 S
LYN_MOUSE GCIKSKRKDNLNDDEVDSKTQPVRNTDRTI 1 S
LYN_RAT GCIKSKRKDNLNDDGVDMKTQPVRNTDRTI 1 S
MACS_BOVIN GAQFSKTAAKGEATAERPGEAAVASSPSKA 1 . MASS
MACS_CHICK GAQFSKTAAKGEAAAEKPGEAVAASPSKAN 1 .
MACS_HUMAN GAQFSKTAAKGEAAAERPGEAAVASSPSKA 1 S
MACS_MOUSE GAQFSKTAAKGEATAERPGEAAVASSPSKA 1 S 3HGE
MACS_RAT GAQFSKTAAKGEAAAERPGEAAVASSPSKA 1 S
MRP_HUMAN GSQSSKAPRGDVTAEEAAGASPAKANGQEN 1 S
MRP_MOUSE GSQSSKAPRGDVTAEEAAGASPAKANGQEN 1 . 3HGE
MRP_RABIT GSQSSKAPRGDVTAEEAAGASPAKANGQEN 1 S 3HLC
BASP_RAT GSKLSKKKKGYNVNDEKAKDKDKKAEGAGT 1 . 3HGE
NB8M_BOVIN GAHLARRYLGDASVEPDPLRMPTFPPDYGF 1 . MASS
NB8M_HUMAN GAHLVRRYLGDASVEPDPLQMPTFPPDYGF 1 S
NC5R_BOVIN GAQLSTLGHVVLSPVWFLYSLIMKLFQRST 1 . MASS
NC5R_HUMAN GAQLSTLGHMVLFPVWFLYSLLMKLFQRST 1 S
NC5R_RAT GAQLSTLSRVVLSPVWFVYSLFMKLFQRSS 1 S
NCAH_DROME GKQNSKLKPEVLEDLKQNTEFTDAEIQEWY 1 . 3HGE
NECD_BOVIN GKQNSKLRPEVMQDLLESTDFTEHEIQEWY 1 S 3HGE
NECX_APLCA GKQNSKLKPEVLEDLRHQTQFSEEELQEWY 1 P
NEF_HV112 GGKWSKSSVVGWPAVRERMRRAEPAADGVG 2 . 3HGE
NEF_HV1A2 GGKWSKRSMGGWSAIRERMRRAEPRAEPAA 2 . 3HGE
NEF_HV1B1 GGKWSKSSVVGWPAVRERMRRAEPAADGVG 2 . 3HGE
NEF_HV1B8 GGKWSKSSVVGWPAVRERMRRAEPAADGVG 2 . 3HGE
NEF_HV1BN GGKWSKMAGWSTVRERMRRAEPARERMRRA 2 S
NEF_HV1BR GGKWSKSSVVGWPTVRERMRRAEPAADGVG 2 . 3HGE
NEF_HV1EL GGKWSKSSIVGWPAIRERIRRTNPAADGVG 2 . 3HGE
NEF_HV1H2 GGKWSKSSVIGWPTVRERMRRAEPAADRVG 2 . 3HGE
NEF_HV1JR GGKWSKHSVPGWSTVRERMRRAEPATDRVR 2 S
NEF_HV1LW GGKWSKSSVIGWPTVRERMRRAEPAADGVG 2 .
NEF_HV1MA GGKWSKSSIVGWPKIRERIRRTPPTETGVG 2 . 3HGE
NEF_HV1MN GGKWSKRVTGWPTVRERMRRAEPAELAADG 2 S
NEF_HV1ND GGKWSKSSLVGWPAIRERIRKTDPAADGVG 2 S
NEF_HV1OY GGKWSKCSMKGWPTIRERMKRAELQPPEPA 2 .
NEF_HV1PV GGKWSKSSVIGWPAVRERMRRAEPAADGVG 2 . 3HGE
NEF_HV1RH GGKWSKSKMGGWPAVRERMQKAEPAADGVG 2 S
NEF_HV1S1 GGKWSKRMSGWSAVRERMKRAEPAEPAADG 2 S
NEF_HV1S3 GGKWSKSKMGWPAVRERMKRAEPAADGVGA 2 S
NEF_HV1SC GGKWSKRSVVGWPTVRERMRKTEPAADGVG 2 S
NEF_HV1U4 GGKWSKKSRVEWPEVRKRMRETPAAAKGVG 2 S
NEF_HV1Y2 GGKWSKRSMAGWPTVRERMRRAEPAAERMR 2 S
NEF_HV1Z2 GGRWSKSSIVGWPAIRERIRRTDPAADGVG 2 S
NEF_HV1Z6 GGRWSKSSIVGWPAIRERIRRTDPRRTDPA 2 . 3HGE
NEF_HV2BE GASGSKKLSKHSRGLRERLLRARGDGYGKQ 2 S
NEF_HV2CA GASGSKKRSRPLQGLQERLLRARAGTCGEC 2 S
NEF_HV2D1 GASGSKKRSEHSQGLRERLLRARGGGYVKQ 2 S
NEF_HV2G1 GASGSKKHSKHSQRLRERLLRAHGGGYVQQ 2 S
NEF_HV2KR GASGSKKCSRSLQGLRERLLRARGETCGGQ 2 S
NEF_HV2NZ GASGSKKRSKPLQGLQERLLQARGETCGGR 2 S
NEF_HV2RO GASGSKKHSRPPRGLQERLLRARAGACGGY 2 . 3HGE
NEF_HV2SB GASGSKKRSRPSRGLQERLLRARGGACGGL 2 S
NEF_HV2ST GASGSKKRSEPSRGLRERLLQTPGEASGGH 2 S
NEF_SIVA1 GLGSSKPQHKKQLTIWRALHATRHTRYGLL 2 S
NEF_SIVAG GLGNSKPQHKKQLSLWHALHKTRATRYGLL 2 S
NEF_SIVAI GSSNSKRQQQGLLKLWRGLRGKPGADWVLL 2 S
NEF_SIVAT GSQNSKPAHKKYSKLWQALHKTHVTRYGLL 2 S
NEF_SIVCZ GTKWSKSSLVGWPEVRRRIREAPTAAEGVG 2 S
NEF_SIVGB GSSQSKKRSEAWVRYSSALRQLVGGPVTPD 2 S
NEF_SIVM1 GGAISKKRSKPPRDLRQRLLRARGENYGRL 2 S
NEF_SIVMA GGTISMRRSRSTGDLRQRLLRARGETYERL 2 S
NEF_SIVMK GGAISMRRSKPAGDLRQKLLRARGETYGRL 2 S
NEF_SIVML GGAISMRRSKPAGDLRQKLLRARGETYGRL 2 S SPEC
NEF_SIVS4 GGAISKKQYKRGGNLRERLLQARGETYGRL 2 S
NEF_SIVSP GGVTSKKQRRRGGNLYERLLQARGETYGRL 2 S
NOS3_BOVIN GNLKSVGQEPGPPCGLGLGLGLGLCGKQGP 1 . 3HGE
NOS3_HUMAN GNLKSVAQEPGPPCGLGLGLGLGLCGKQGP 1 S
POLG_BOVEV GAQLSRNTAGSHTTGTYATGGSTINYNNIN 2 . 3HTC
POLG_CXA16 GSQVSTQRSGSHENSNSASEGSTINYTTIN 2 S
POLG_CXA21 GAQVSTQKTGAHENQNVAANGSTINYTTIN 2 S
POLG_CXA24 GAQVSSQKVGAHENTNVATGGSTVNYTTIN 2 S 3HLC
POLG_CXA9 GAQVSTQKTGAHETSLSAAGNSIIHYTNIN 2 S
POLG_CXB1J GAQVSTQKTGAHETGLNASGNSIIHYTNIN 2 S
POLG_CXB3N GAQVSTQKTGAHETRLNASGNSIIHYTNIN 2 S
POLG_CXB3W GAQVSTQKTGAHETGLNASGNSIIHYTNIN 2 S
POLG_CXB4E GAQVSTQKTGAHETSLSATGNSIIHYTNIN 2 S
POLG_CXB4J GAQVSTQKTGAHETSLSASGNSIIHYTNIN 2 S
POLG_CXB5P GAQVSTQKTGAHETGLRASGNSIIHYTNIN 2 S
POLG_EC01F GAQVSTQKTGAHETSLSATGNSIIHYTNIN 2 . XRAY
POLG_EC06C GAQVSTQKTGAHETGLSASGNSIIHYTNIN 2 S
POLG_EC09B GAQVSTQKTGAHETGLNASGNSIIHYTNIN 2 S
POLG_EC09H GAQVSTQKTGAHEASLSATGSSIIHYTNIN 2 S
POLG_EC11G GAQVSTQKTGAHETGLNASGSSIIHYTNIN 2 S
POLG_EC12T GAQVSTQKTGAHETGLSASGNSIIHYTNIN 2 S
POLG_EMCV GNSTSSDKNNSSSEGNEGVIINNFYSNQYQ 68 S
POLG_EMCVB GNSTSSDKNNSSSDGNEGVIINNFYSNQYQ 68 S
POLG_EMCVD GNSTSSDKNNSSSDGNEGVIINNFYSNQYQ 68 S
POLG_ENMG3 GNSTSSDKNNSSSEGNEGVIINNFYSNQYQ 68 S
POLG_ENMGO GNSTSSDKNNSSSEGNEGVIINNFYSNQYQ 1 S
POLG_FMDV1 GAGQSSPATGSQNQSGNTGSIINNYYMQQY 202 .
POLG_FMDVA GAGQSSPATGSQNQSGNTGSIINNYYMQQY 201 .
POLG_FMDVO GAGQSSPATGSQNQSGNTGSIINNYYMQQY 202 .
POLG_FMDVT GAGQSSPATGSQNQSGNTGSIINNYYMQQY 202 S
POLG_FMDVZ GAGQSSPATGSQNQSGNTGSIINNYYMQQY 202 S
POLG_HE701 GAQVSRQQTGTHENANVATGGSSITYNQIN 2 S
POLG_HE71B GSQVSTQRSGSHENSNSATEGSTINYTTIN 2 S
POLG_HE71M GSQVSTQRSGSHENSNSATEGSTINYTTIN 2 S
POLG_HRV14 GAQVSTQKSGSHENQNILTNGSNQTFTVIN 2 S XRAY
POLG_HRV16 GAQVSRQNVGTHSTQNMVSNGSSLNYFNIN 2 .
POLG_HRV1B GAQVSRQNVGTHSTQNSVSNGSSLNYFNIN 2 S
POLG_HRV2 GAQVSRQNVGTHSTQNSVSNGSSLNYFNIN 2 S
POLG_HRV3 GAQVSTQKSGSHENQNILTNGSNQTFTVIN 2 .
POLG_HRV89 GAQVSRQNVGTHSTQNSVSNGSSLNYFNIN 2 S
POLG_PEV9U GMQMSKNTAGSHTTVTQASGGSHINYTNIN 2 S
POLG_POL1M GAQVSSQKVGAHENSNRAYGGSTINYTTIN 1 . 3HGE
POLG_POL1S GAQVSSQKVGAHENSNRAYGGSTINYTTIN 2 .
POLG_POL2L GAQVSSQKVGAHENSNRAYGGSTINYTTIN 2 .
POLG_POL2W GAQVSSQKVGAHENSNRAYGGSTINYTTIN 2 S
POLG_POL32 GAQVSSQKVGAHENSNRAYGGSTINYTTIN 2 S
POLG_POL3L GAQVSSQKVGAHENSNRAYGGSTINYTTIN 2 .
POLG_SVDVH GAQVSTQKTGAHETSLSAAGNSVIHYTNIN 2 S
POLG_SVDVU GAQVSTQKTGAHETSLSAAGNSVIHYTNIN 2 S
POLG_TMEVB GNSSSSDKSNSQSSGNEGVIINNFYSNQYQ 77 S
POLG_TMEVD GNASSSDKSNSQSSGNEGVIINNFYSNQYQ 77 S
POLG_TMEVG GNASSSDKSNSQSSGNEGVIINNFYSNQYQ 77 S
POLH_POL1M GAQVSSQKVGAHENSNRAYGGSTINYTTIN 1 .
PPZ1_YEAST GNSSSKSSKKDSHSNSSSRNPRPQVSRTET 1 P
PPZ2_YEAST GNSGSKQHTKHNSKKDDHDGDRKKTLDLPP 1 P
RAPS_CHICK GQDQTKQQIEKGLHLYQSNQTEKALQVWMR 1 S
RAPS_HUMAN GQDQTKQQIEKGLQLYQSNQTEKALQVWTK 1 S
RAPS_MOUSE GQDQTKQQIEKGLQLYQSNQTEKALQVWMK 1 . 3HLC
RAPS_TORCA GQDQTKQQIEKGLQLYQANETGKALEIWQQ 1 . MASS
RECO_BOVIN GNSKSGALSKEILEELQLNTKFTEEELSSW 1 . MASS
RECO_HUMAN GNSKSGALSKEILEELQLNTKFSEEELCSW 1 S
RECO_MOUSE GNSKSGALSKEILEELQLNTKFTEEELSAW 1 S
SIF1_DROME GNKLSCSCAPLMRKAYRYEDSPWQSSRRRD 2 P
SMOD_RANCA GNTKSGALSKEILEELQLNTKFTQEELCTW 1 S
SRC1_XENLA GATKSKPREGGPRSRSLDIVEGSHQPFTSL 1 S
SRC2_XENLA GATKSKPREGGPRSRSLDIAEGSHQPFTSL 1 S
SRC_AVIS2 GSSKSKPKDPSQRRRSLEPPDSTHHGGFPA 2 .
SRC_AVISR GSSKSKPKDPSQRRCSLEPPDSTHHGGFPA 2 .
SRC_AVISS GSSKSKPKDPSQRRRSLEPPDSTHHGGFPA 2 .
SRC_AVIST GSSKSKPKDPSQRRRSLEPPDSTHHGGFPA 2 .
SRC_CHICK GSSKSKPKDPSQRRRSLEPPDSTHHGGFPA 1 . 3HLC
SRC_HUMAN GSNKSKPKDASQRRRSLEPAENVHGAGGGA 1 S
SRC_RSVH1 GSSKSKPKDPSQRRRSLEPPDSTHHGGFPA 2 S
SRC_RSVP GSSKSKPKDPSQRRHSLEPPDSTHHGGFPA 2 .
SRC_RSVPA GSSKSKPKDPSQRRRSLEPPDSTHHGGFPA 2 .
SRC_RSVSR GSSKSKPKGPSQRRRSLEPPDSTHHGGFPA 2 .
STK_HYDAT GPCCSKQTKALNNQPDKSKSKDVVLKENTS 2 S
TIAM_HUMAN GNAESQHVEHEFYGEKHASLGRNDTSRSLR 2 P
TIAM_MOUSE GNAESQNVDHEFYGEKHASLGRKHTSRSLR 2 P
UL11_HSV11 GLSFSGARPCCCRNNVLITDDGEVVSLTAH 2 S
UL11_HSV2 GLAFSGARPCCCRHNVITTDGGEVVSLTAH 2 S
VIS1_HUMAN GKQNSKLAPEVMEDLVKSTEFNEHELKQWY 1 S
VIS1_MOUSE GKQNSKLAPEVMEDLVKSTEFNEHELKQWY 1 S
VIS2_RAT GKNNSKLAPEELEDLVQNTEFSEQELKQWY 1 S
VIS3_CHICK GKQNSKLRPEVLQDLRENTEFTDHELQEWY 1 S
VIS3_HUMAN GKQNSKLRPEVLQDLREKTEFTDHELQEWY 1 P
VIS3_MOUSE GKQNSKLRPEVLQDLREHTEFTDHELQEWY 1 S
VISI_CHICK GNSRSSALSREVLQELRASTRYTEEELSRW 1 S
VMSA_HPBDU GQHPAKSMDVRRIEGGEILLNQLAGRMIPK 2 . 3HGE
VP15_YEAST GAQLSLVVQASPSIAIFSYIDVLEEVHYVS 1 . SPEC
YES_CANFA GCIKSKEDKGPAIKYRNTPEPVSVSHYGAE 2 S
YES_CHICK GCIKSKEDKGPAMKYRTDNTPEPISSHVSH 2 S 3HGE
YES_HUMAN GCIKSKENKSPAIKYRPENTPEPVSTSVSH 2 S
YES_MOUSE GCIKSKENKSPAIKYTPENLTEPVSPSASH 2 S
YES_XENLA GCIKSKEDKGPSIKYRTEPKPDPGSQYGAD 2 S
YES_XIPHE GCVRSKEAKGPALKYQPDNSNVVPVSAHLG 2 S
YRK_CHICK GCVHCKEKISGKGQGGSGTGTPAHPPSQYD 1 S
Q04762 GKVISASRVSMEHFEFQGYRQSFPVWLRPP 1 .
Q04021 GGKWSKSSIVVWPAVRKRMRRTEPAADGVG 2 S
Q00283 GAGQSSPATGSQNQSGNTGSIINNYYMQQY 1 S
Q01119 GTGQSSPATGSQNQSGNTASIINNYYMQQY 1 S
Q01120 GAGQSSPATGSQNQSGNTGSIINNYYMQQY 1 S
Q02472 GNSTSSDKSNSQSSGNEGVIINNFYSNQYQ 77 S
Q03260 GSTSSKSQQLRSEGKYAIGWRLFGKQYTPL 2 S
Q07375 GGAISRRRSKSAGDLRQRLLRARGETYGRL 2 S
Q04022 GGATSKRRSKSPGDLRQRLLRARGETYGRL 2 S
Q07461 GSSKSKPRDPSQRRHSLEPPDSTHHGGFPA 2 S
P82797 GAQVSTQRSGSHETSNVARDGSTINFTNIN 2 S 3HGE
ARF2_ARATH GLSFAKLFSRLFAKKEMRILMVGLDAAGKT 1 P
ARF4_ARATH GARFSRIAKRFLPKSKVRILMVGLDGSGKT 1 P
ARF_SALBA GLSFTKLLGRLFSKKEMRILMVGLDAAGKT 1 P
ARF_VIGUN GLSFTKLFSRLFAKKEMRILMVGLDAAGKT 1 P
GCA2_HUMAN GQEFSWEEAEAAGEIDVAELQEWYKKFVME 1 P 3HGE
GBAZ_MOUSE GCRQSSEEKEAARRSRRIDRHLRSESQRQR 1 S
GBAZ_RAT GCRQSSEEKEAARRSRRIDRHLRSESQRQR 1 S
NEF_HV1ZH GNKWSKGWPAVRERIRQTPPAPPAAEGVGA 2 S
NOS3_MOUSE GNLKSVGQEPGPPCGLGLGLGLGLCGKQGP 1 S
NOS3_PIG GNLKSVGQEPGPPCGLGLGLGLGLCGKQGP 1 S
NOS3_RAT GNLKSVGQEPGPPCGLGLGLGLGLCGKQGP 1 S
AKA7_HUMAN GQLCCFPFSRDEGKISEKNGGEPDDAELVR 1 .
AKA7_MOUSE GQLCCFPFAREEGKICEKDRKEPEDAELVR 1 .
APKB_ARATH GICLSAQIKAVSPGASPKYMSSEANDSLGS 2 S
SRC_MOUSE GSNKSKPKDASQRRRSLEPSENVHGAGGAF 1 S
SRC_RAT GSNKSKPKDASQRRRSLEPAENVHGAGGAF 1 S
Q25326 GSSCTKDSAKEPQKSADKIKSTNETNQGGN 2 N 3HGE
O81223 GCSVSKKKKKNAMRPPGYEDPELLASVTPF 2 N 3HGE
BID_MOUSE GSQASRSFNQGRIEPDSESQEEIIHNIARH 60 N 3HGE
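The table above is whitespace-separated, and the last column (Our_Comment) is empty for many rows. The following is a minimal sketch of how such rows could be read programmatically; the names `LearningSetEntry` and `parse_entry` are hypothetical and not part of the original material.

```python
from typing import NamedTuple, Optional


class LearningSetEntry(NamedTuple):
    swissprot_id: str
    sequence: str               # 30 residues starting at the myristoylation site
    lipid_pos: int              # LIPIDPos column
    sw_comment: str             # 'N', '.', 'S' or 'P'
    our_comment: Optional[str]  # e.g. 'MASS' or '3HGE'; None if the column is empty


def parse_entry(line: str) -> LearningSetEntry:
    """Parse one whitespace-separated row of the learning-set table."""
    fields = line.split()
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}: {line!r}")
    swissprot_id, sequence, lipid_pos, sw_comment = fields[:4]
    our_comment = fields[4] if len(fields) == 5 else None
    return LearningSetEntry(swissprot_id, sequence, int(lipid_pos),
                            sw_comment, our_comment)


# Two rows copied from the table: one with and one without Our_Comment.
rows = [
    "GAG_SIVM1 GARNSVLSGKKADELEKIRLRPGGKKKYML 2 N HPLC",
    "ARF1_YEAST GLFASKLFSNLFGNKEMRILMVGLDGAGKT 1 .",
]
entries = [parse_entry(row) for row in rows]
```

Treating a missing fifth field as `None` keeps entries that lack our own annotation distinguishable from entries carrying a method code.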
Learning set statistics:
--> Statistic_confidence of all selected entries (by their SWISSPROT annotation):
number of accepted entries: 390
number of entries with certain LIPID-site: 88
number of entries with potential LIPID-site: 56
number of entries with LIPID-site by similarity: 234
12 entries had no MYRISTATE annotation by SWISSPROT at all; to read them in, I had to add new LIPID-lines (comment is set to 'N').
--> Statistic_confidence of all selected entries (OUR ANNOTATION):
Distribution of used methods
MASS 18
HPLC 2
3HLC 6
3HGE 46
3HTC 4
XRAY 2
SPEC 4
INTK 0
Annotated myristoylation (by SWISSPROT) not (!) N-terminal relative to the sequence entry:
GAG_FLV supposed myristoylation position: 76 .
GAG_FSVGA supposed myristoylation position: 79 .
GAG_FSVHZ supposed myristoylation position: 76 .
GAG_FSVMD supposed myristoylation position: 79 .
GAG_FSVST supposed myristoylation position: 76 .
POLG_EMCV supposed myristoylation position: 68 S
POLG_EMCVB supposed myristoylation position: 68 S
POLG_EMCVD supposed myristoylation position: 68 S
POLG_ENMG3 supposed myristoylation position: 68 S
POLG_FMDV1 supposed myristoylation position: 202 .
POLG_FMDVA supposed myristoylation position: 201 .
POLG_FMDVO supposed myristoylation position: 202 .
POLG_FMDVT supposed myristoylation position: 217 S
POLG_FMDVZ supposed myristoylation position: 202 S
POLG_TMEVB supposed myristoylation position: 77 S
POLG_TMEVD supposed myristoylation position: 77 S
POLG_TMEVG supposed myristoylation position: 77 S
Q02472 supposed myristoylation position: 77 S
BID_MOUSE supposed myristoylation position: 60 N
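The entries listed here coincide with the rows of the learning-set table whose LIPIDPos is larger than 2. As a hedged sketch, such entries could be picked out as follows; the rule that positions 1 and 2 count as N-terminal is an assumption inferred from the table, and `internal_sites` is a hypothetical helper name.

```python
# (ID, LIPIDPos) pairs copied from the learning-set table; a small sample only.
sample = [
    ("GAG_SIVM1", 2),
    ("ARF1_HUMAN", 1),
    ("GAG_FLV", 76),
    ("POLG_EMCV", 68),
    ("BID_MOUSE", 60),
]

# Assumption: positions 1 and 2 are treated as N-terminal relative to the
# sequence entry; any larger position lies inside the entry.
N_TERMINAL_POSITIONS = {1, 2}


def internal_sites(entries):
    """Return the (ID, position) pairs whose site is not N-terminal."""
    return [(entry_id, pos) for entry_id, pos in entries
            if pos not in N_TERMINAL_POSITIONS]


internal = internal_sites(sample)
```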
Relevance of SWISSPROT annotations:
53 of 88 entries annotated with '.' have been verified by literature
-> 39.77 % of the above were annotated too optimistically
4 of 56 entries annotated with 'P' have been verified by literature
-> 7.14 % were annotated too pessimistically, or new literature is available
13 of 234 entries annotated with 'S' have been verified by literature
-> 5.56 % were annotated too pessimistically, or new literature is available
An additional 12 entries had no MYRISTATE annotation at all, but could be validated by literature (in the table they appear with 'N').
Assuming that '.' should stand for experimental evidence (in contrast to 'P' and 'S'), 83.59 % of the SWISSPROT entries have correct annotations for myristoylation!
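The percentages above can be reproduced from the stated counts. The sketch below is a reconstruction, not the authors' own script: in particular, it assumes that the 83.59 % figure counts as incorrect every '.' entry that was not verified, every 'P' or 'S' entry that was verified, and all 12 unannotated entries. The variable names are illustrative.

```python
# SWISSPROT comment -> (number of entries, number verified by literature)
counts = {
    ".": (88, 53),
    "P": (56, 4),
    "S": (234, 13),
}
unannotated = 12  # 'N' entries, validated by literature

total = sum(n for n, _ in counts.values()) + unannotated

# '.' entries without literature support were annotated too optimistically.
too_optimistic = counts["."][0] - counts["."][1]
pct_optimistic = 100 * too_optimistic / counts["."][0]

# Verified 'P' and 'S' entries were annotated too pessimistically.
pct_p = 100 * counts["P"][1] / counts["P"][0]
pct_s = 100 * counts["S"][1] / counts["S"][0]

# Assumed definition of "correct": everything not flagged above.
misannotated = too_optimistic + counts["P"][1] + counts["S"][1] + unannotated
pct_correct = 100 * (total - misannotated) / total

print(f"{pct_optimistic:.2f} {pct_p:.2f} {pct_s:.2f} {pct_correct:.2f}")
```

Under these assumptions the four values round to the figures quoted in the text, and the total of 390 matches the number of accepted entries.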
Non-annotated entries whose N-terminal 8 residues were shown to be myristoylated in vitro (Ashrafi et al., 1998):