analysis of sequence from A55731.fa ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ >A55731 GPI-anchor biosynthesis protein PIG-A - mouse. MANRRGGGQG QPPSVSPSPG SSGNLSDDRT CTHNICMVSD FFYPNMGGVE SHIYQLSQCL IERGHKVITV THAYGNRKGV RYLTNGLKVY YLPLRVMYNQ STATTLFHSL PLLRYIFVRE RITIIHSHSS FSAMAHDALF HAKTMGLQTV FTDHSLFGFA DVSSVLTNKL LTVSLCDTNH IICVSYTSKE NTVLRAALNP EIVSVIPNAV DPTDFTPDPF RRHDSVITVV VVSRLVYRKG TDLLSGIIPE LCQKYQELHF LIGGEGPKRI ILEEVRERYQ LHDRVQLLGA LEHKDVRNVL VQGHIFLNTS LTEAFCMAIV EAASCGLQVV STKVGGIPEV LPESLIILCE PSVKSLCDGL EKAIFQVKSG TLPAPENIHN VVKTFYTWRN VAERTEKVYE RVSKETVLPM HKRLDRLISH CGPVTGYMFA LLAVLSYLFL IFLQWMTPDS FIDVAIDATG PRRAWTHQWP RDKKRDENDK ISQSR ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ sec.str. with predator > A55731 . . . . . 1 MANRRGGGQGQPPSVSPSPGSSGNLSDDRTCTHNICMVSDFFYPNMGGVE 50 _________________________________EEEEE____________ . . . . . 51 SHIYQLSQCLIERGHKVITVTHAYGNRKGVRYLTNGLKVYYLPLRVMYNQ 100 _HHHHHHHHHHH___EEEEEE_______EEEEE____EEEEEHHHHHH__ . . . . . 101 STATTLFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTV 150 __EEEEEEEHHHHHHHHHHHEEEEEE____HHHHHHHHHHHHHHHHEEEE . . . . . 151 FTDHSLFGFADVSSVLTNKLLTVSLCDTNHIICVSYTSKENTVLRAALNP 200 EE__________HHHHHHHHEEEEE______EEEE______HHHHH____ . . . . . 201 EIVSVIPNAVDPTDFTPDPFRRHDSVITVVVVSRLVYRKGTDLLSGIIPE 250 _EEEEE_________________EEEEEEEEEEEEEEE___EEEEEEEHH . . . . . 251 LCQKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVL 300 HHHHHHHHHHH_______HHHHHHHHHHHHHHHHHHHHHH_______HHH . . . . . 301 VQGHIFLNTSLTEAFCMAIVEAASCGLQVVSTKVGGIPEVLPESLIILCE 350 HHHHEEE____HHHHHHHHHHHHH___EEEE____________EEEEEE_ . . . . . 351 PSVKSLCDGLEKAIFQVKSGTLPAPENIHNVVKTFYTWRNVAERTEKVYE 400 __HHHHH__HHHHHHHHH__________EEEEEEEHHHHHHHHHHHHHHH . . . . . 401 RVSKETVLPMHKRLDRLISHCGPVTGYMFALLAVLSYLFLIFLQWMTPDS 450 HHHH_EEEEHHHHHHHHH_______HHHHHHHHHHHHHHHHHHHH_____ . . . 
451 FIDVAIDATGPRRAWTHQWPRDKKRDENDKISQSR 485
    EEEEEEE____________________________
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method         :    1
alpha-contents :    4.1 %
beta-contents  :   59.5 %
coil-contents  :   36.4 %
class          : beta

method         :    2
alpha-contents :    0.4 %
beta-contents  :   51.1 %
coil-contents  :   48.5 %
class          : beta
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-35.01 -4.96 -1.58 -1.48 -4.00 0.00 -12.00 0.00 -0.43 -12.26 -3.79 -12.00 -12.00 0.00 -12.00 0.00 -111.52
-13.99 -3.77 -3.63 -2.76 -4.00 0.00 -32.00 0.00 -0.01 -12.26 -3.79 -12.00 -12.00 0.00 0.00 0.00 -100.21
ID: A55731 AC: xxx Len: 485 1:I 476 Sc: -100.21 Pv: 9.931727e-01 NO_GPI_SITE

GPI: learning from protozoa
-31.66 -5.91 -2.73 -0.21 -4.00 0.00 -12.00 0.00 -0.06 -10.34 -12.73 -12.00 -12.00 0.00 -12.00 0.00 -115.64
-16.40 -4.43 -3.62 -3.92 -4.00 0.00 -32.00 0.00 0.00 -10.34 -12.73 -12.00 -12.00 0.00 0.00 0.00 -111.44
ID: A55731 AC: xxx Len: 485 1:I 476 Sc: -111.44 Pv: 9.724436e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name      Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
A55731      0.425 323 Y  0.454 453 Y  0.986 435 Y  0.158 N
# SignalP gram- predictions
# name      Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
A55731      0.632 136 Y  0.316 136 N  0.940 437 Y  0.183 N
# SignalP gram+ predictions
# name      Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
A55731      0.645 315 Y  0.456 452 Y  0.963 436 Y  0.221 N
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
low complexity regions: SEG 12 2.2 2.5
>A55731 GPI-anchor biosynthesis protein PIG-A - mouse.
1-5 MANRR gggqgqppsvspspgssg 6-23 24-220 NLSDDRTCTHNICMVSDFFYPNMGGVESHI YQLSQCLIERGHKVITVTHAYGNRKGVRYL TNGLKVYYLPLRVMYNQSTATTLFHSLPLL RYIFVRERITIIHSHSSFSAMAHDALFHAK TMGLQTVFTDHSLFGFADVSSVLTNKLLTV SLCDTNHIICVSYTSKENTVLRAALNPEIV SVIPNAVDPTDFTPDPF rrhdsvitvvvvsrlvyr 221-238 239-485 KGTDLLSGIIPELCQKYQELHFLIGGEGPK RIILEEVRERYQLHDRVQLLGALEHKDVRN VLVQGHIFLNTSLTEAFCMAIVEAASCGLQ VVSTKVGGIPEVLPESLIILCEPSVKSLCD GLEKAIFQVKSGTLPAPENIHNVVKTFYTW RNVAERTEKVYERVSKETVLPMHKRLDRLI SHCGPVTGYMFALLAVLSYLFLIFLQWMTP DSFIDVAIDATGPRRAWTHQWPRDKKRDEN DKISQSR low complexity regions: SEG 25 3.0 3.3 >A55731 GPI-anchor biosynthesis protein PIG-A - mouse. 1-5 MANRR gggqgqppsvspspgssg 6-23 24-204 NLSDDRTCTHNICMVSDFFYPNMGGVESHI YQLSQCLIERGHKVITVTHAYGNRKGVRYL TNGLKVYYLPLRVMYNQSTATTLFHSLPLL RYIFVRERITIIHSHSSFSAMAHDALFHAK TMGLQTVFTDHSLFGFADVSSVLTNKLLTV SLCDTNHIICVSYTSKENTVLRAALNPEIV S vipnavdptdftpdpfrrhdsvitvvvvsr 205-238 lvyr 239-485 KGTDLLSGIIPELCQKYQELHFLIGGEGPK RIILEEVRERYQLHDRVQLLGALEHKDVRN VLVQGHIFLNTSLTEAFCMAIVEAASCGLQ VVSTKVGGIPEVLPESLIILCEPSVKSLCD GLEKAIFQVKSGTLPAPENIHNVVKTFYTW RNVAERTEKVYERVSKETVLPMHKRLDRLI SHCGPVTGYMFALLAVLSYLFLIFLQWMTP DSFIDVAIDATGPRRAWTHQWPRDKKRDEN DKISQSR low complexity regions: SEG 45 3.4 3.75 >A55731 GPI-anchor biosynthesis protein PIG-A - mouse. 1-485 MANRRGGGQGQPPSVSPSPGSSGNLSDDRT CTHNICMVSDFFYPNMGGVESHIYQLSQCL IERGHKVITVTHAYGNRKGVRYLTNGLKVY YLPLRVMYNQSTATTLFHSLPLLRYIFVRE RITIIHSHSSFSAMAHDALFHAKTMGLQTV FTDHSLFGFADVSSVLTNKLLTVSLCDTNH IICVSYTSKENTVLRAALNPEIVSVIPNAV DPTDFTPDPFRRHDSVITVVVVSRLVYRKG TDLLSGIIPELCQKYQELHFLIGGEGPKRI ILEEVRERYQLHDRVQLLGALEHKDVRNVL VQGHIFLNTSLTEAFCMAIVEAASCGLQVV STKVGGIPEVLPESLIILCEPSVKSLCDGL EKAIFQVKSGTLPAPENIHNVVKTFYTWRN VAERTEKVYERVSKETVLPMHKRLDRLISH CGPVTGYMFALLAVLSYLFLIFLQWMTPDS FIDVAIDATGPRRAWTHQWPRDKKRDENDK ISQSR low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >A55731 GPI-anchor biosynthesis protein PIG-A - mouse. 
MANRRgggqgqppsvspspgssgnLSDDRTCTHNICMVSDFFYPNMGGVESHIYQLSQCL IERGHKVITVTHAYGNRKGVRYLTNGLKVYYLPLRVMYNQSTATTLFHSLPLLRYIFVRE RITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADVSSVLTNKLLTVSLCDTNH IICVSYTSKENTVLRAALNPEIVSVIPNAVDPTDFTPDPFrrhdsvitvvvvSRLVYRKG TDLLSGIIPELCQKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVL VQGHIFLNTSLTEAFCMAIVEAASCGLQVVSTKVGGIPEVLPESLIILCEPSVKSLCDGL EKAIFQVKSGTLPAPENIHNVVKTFYTWRNVAERTEKVYERVSKETVLPMHKRLDRLISH CGPVTGYmfallavlsylfliflQWMTPDSFIDVAIDATGPRRAWTHQWPRDKKRDENDK ISQSR 1 - 5 MANRR 6 - 24 gggqg qppsvspspg ssgn 25 - 220 LSDDRT CTHNICMVSD FFYPNMGGVE SHIYQLSQCL IERGHKVITV THAYGNRKGV RYLT NGLKVY YLPLRVMYNQ STATTLFHSL PLLRYIFVRE RITIIHSHSS FSAMAHDALF HAKT MGLQTV FTDHSLFGFA DVSSVLTNKL LTVSLCDTNH IICVSYTSKE NTVLRAALNP EIVS VIPNAV DPTDFTPDPF 221 - 232 rrhdsvitvv vv 233 - 427 SRLVYRKG TDLLSGIIPE LCQKYQELHF LIGGEGPKRI ILEEVRERYQ LHDRVQLLGA LE HKDVRNVL VQGHIFLNTS LTEAFCMAIV EAASCGLQVV STKVGGIPEV LPESLIILCE PS VKSLCDGL EKAIFQVKSG TLPAPENIHN VVKTFYTWRN VAERTEKVYE RVSKETVLPM HK RLDRLISH CGPVTGY 428 - 443 mfa llavlsylfl ifl 444 - 485 QWMTPDS FIDVAIDATG PRRAWTHQWP RDKKRDENDK ISQSR low complexity regions: DUST >A55731 GPI-anchor biosynthesis protein PIG-A - mouse. MANRRGGGQGQPPSVSPSPGSSGNLSDDRTCTHNICMVSDFFYPNMGGVESHIYQLSQCL IERGHKVITVTHAYGNRKGVRYLTNGLKVYYLPLRVMYNQSTATTLFHSLPLLRYIFVRE RITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADVSSVLTNKLLTVSLCDTNH IICVSYTSKENTVLRAALNPEIVSVIPNAVDPTDFTPDPFRRHDSVITVVVVSRLVYRKG TDLLSGIIPELCQKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVL VQGHIFLNTSLTEAFCMAIVEAASCGLQVVSTKVGGIPEVLPESLIILCEPSVKSLCDGL EKAIFQVKSGTLPAPENIHNVVKTFYTWRNVAERTEKVYERVSKETVLPMHKRLDRLISH CGPVTGYMFALLAVLSYLFLIFLQWMTPDSFIDVAIDATGPRRAWTHQWPRDKKRDENDK ISQSR ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for A55731 sequence: 485 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 
60 MANRRGGGQG QPPSVSPSPG SSGNLSDDRT CTHNICMVSD FFYPNMGGVE SHIYQLSQCL ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 120 IERGHKVITV THAYGNRKGV RYLTNGLKVY YLPLRVMYNQ STATTLFHSL PLLRYIFVRE ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 180 RITIIHSHSS FSAMAHDALF HAKTMGLQTV FTDHSLFGFA DVSSVLTNKL LTVSLCDTNH ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
240 IICVSYTSKE NTVLRAALNP EIVSVIPNAV DPTDFTPDPF RRHDSVITVV VVSRLVYRKG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 300 TDLLSGIIPE LCQKYQELHF LIGGEGPKRI ILEEVRERYQ LHDRVQLLGA LEHKDVRNVL ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 360 VQGHIFLNTS LTEAFCMAIV EAASCGLQVV STKVGGIPEV LPESLIILCE PSVKSLCDGL ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
420 EKAIFQVKSG TLPAPENIHN VVKTFYTWRN VAERTEKVYE RVSKETVLPM HKRLDRLISH ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~333333333 33333~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 480 CGPVTGYMFA LLAVLSYLFL IFLQWMTPDS FIDVAIDATG PRRAWTHQWP RDKKRDENDK ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . 
ISQSR ~~~~~ ----- ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** A55731.fa.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: A55731.fa.___inter___ (1 sequences) MANRRGGGQGQPPSVSPSPGSSGNLSDDRTCTHNICMVSDFFYPNMGGVE SHIYQLSQCLIERGHKVITVTHAYGNRKGVRYLTNGLKVYYLPLRVMYNQ STATTLFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTV FTDHSLFGFADVSSVLTNKLLTVSLCDTNHIICVSYTSKENTVLRAALNP EIVSVIPNAVDPTDFTPDPFRRHDSVITVVVVSRLVYRKGTDLLSGIIPE LCQKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVL VQGHIFLNTSLTEAFCMAIVEAASCGLQVVSTKVGGIPEVLPESLIILCE PSVKSLCDGLEKAIFQVKSGTLPAPENIHNVVKTFYTWRNVAERTEKVYE RVSKETVLPMHKRLDRLISHCGPVTGYMFALLAVLSYLFLIFLQWMTPDS FIDVAIDATGPRRAWTHQWPRDKKRDENDKISQSR (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 2 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 312 332 0.907 Putative 2 424 444 2.229 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 2 Loop length 311 91 41 K+R profile + 8.00 + CYT-EXT prof 0.98 - 0.63 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 8.00 Tm probability: 0.77 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 0.00 (NEG-POS)/(NEG+POS): -0.0526 NEG: 27.0000 POS: 30.0000 -> Orientation: undecided CYT-EXT difference: 0.35 -> Orientation: N-out ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 2 Loop length 423 41 K+R profile + 8.00 CYT-EXT prof 0.95 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: -8.00 Tm probability: 1.00 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): 5.00 (NEG-POS)/(NEG+POS): -0.0488 NEG: 39.0000 POS: 43.0000 -> Orientation: N-in CYT-EXT difference: 0.95 -> Orientation: N-out ---------------------------------------------------------------------- "A55731" 485 312 332 #f 0.907292 424 444 #t 2.22917 ************************************ *TOPPREDM with prokaryotic function* ************************************ A55731.fa.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: A55731.fa.___inter___ (1 sequences) MANRRGGGQGQPPSVSPSPGSSGNLSDDRTCTHNICMVSDFFYPNMGGVE SHIYQLSQCLIERGHKVITVTHAYGNRKGVRYLTNGLKVYYLPLRVMYNQ STATTLFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTV FTDHSLFGFADVSSVLTNKLLTVSLCDTNHIICVSYTSKENTVLRAALNP EIVSVIPNAVDPTDFTPDPFRRHDSVITVVVVSRLVYRKGTDLLSGIIPE LCQKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVL VQGHIFLNTSLTEAFCMAIVEAASCGLQVVSTKVGGIPEVLPESLIILCE PSVKSLCDGLEKAIFQVKSGTLPAPENIHNVVKTFYTWRNVAERTEKVYE RVSKETVLPMHKRLDRLISHCGPVTGYMFALLAVLSYLFLIFLQWMTPDS FIDVAIDATGPRRAWTHQWPRDKKRDENDKISQSR (p)rokaryotic or (e)ukaryotic: p Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 
0.6 Total of 2 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 312 332 0.907 Putative 2 424 444 2.229 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 2 Loop length 311 91 41 K+R profile + 8.00 + CYT-EXT prof 0.98 - 0.63 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 8.00 Tm probability: 0.77 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 0.00 (NEG-POS)/(NEG+POS): -0.0526 NEG: 27.0000 POS: 30.0000 -> Orientation: undecided CYT-EXT difference: 0.35 -> Orientation: N-out ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 2 Loop length 423 41 K+R profile + 8.00 CYT-EXT prof 0.95 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: -8.00 Tm probability: 1.00 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): 5.00 (NEG-POS)/(NEG+POS): -0.0488 NEG: 39.0000 POS: 43.0000 -> Orientation: N-in CYT-EXT difference: 0.95 -> Orientation: N-out ---------------------------------------------------------------------- "A55731" 485 312 332 #f 0.907292 424 444 #t 2.22917 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ SAPS. Version of April 11, 1996. Date run: Thu Nov 22 12:53:38 2001 File: /people/b_eisen/A55731.fa.___saps___ ID A55731 DE GPI-anchor biosynthesis protein PIG-A - mouse. 
number of residues: 485; molecular weight: 54.5 kdal 1 MANRRGGGQG QPPSVSPSPG SSGNLSDDRT CTHNICMVSD FFYPNMGGVE SHIYQLSQCL 61 IERGHKVITV THAYGNRKGV RYLTNGLKVY YLPLRVMYNQ STATTLFHSL PLLRYIFVRE 121 RITIIHSHSS FSAMAHDALF HAKTMGLQTV FTDHSLFGFA DVSSVLTNKL LTVSLCDTNH 181 IICVSYTSKE NTVLRAALNP EIVSVIPNAV DPTDFTPDPF RRHDSVITVV VVSRLVYRKG 241 TDLLSGIIPE LCQKYQELHF LIGGEGPKRI ILEEVRERYQ LHDRVQLLGA LEHKDVRNVL 301 VQGHIFLNTS LTEAFCMAIV EAASCGLQVV STKVGGIPEV LPESLIILCE PSVKSLCDGL 361 EKAIFQVKSG TLPAPENIHN VVKTFYTWRN VAERTEKVYE RVSKETVLPM HKRLDRLISH 421 CGPVTGYMFA LLAVLSYLFL IFLQWMTPDS FIDVAIDATG PRRAWTHQWP RDKKRDENDK 481 ISQSR -------------------------------------------------------------------------------- COMPOSITIONAL ANALYSIS (extremes relative to: swp23s) A : 24( 4.9%); C : 11( 2.3%); D : 22( 4.5%); E : 24( 4.9%); F : 20( 4.1%) G : 29( 6.0%); H : 20( 4.1%); I : 30( 6.2%); K : 21( 4.3%); L : 50(10.3%) M : 10( 2.1%); N : 18( 3.7%); P : 24( 4.9%); Q : 16( 3.3%); R : 30( 6.2%) S : 38( 7.8%); T : 33( 6.8%); V : 45( 9.3%); W : 4( 0.8%); Y : 16( 3.3%) KR : 51 ( 10.5%); ED : 46 ( 9.5%); AGP : 77 ( 15.9%); KRED : 97 ( 20.0%); KR-ED : 5 ( 1.0%); FIKMNY : 115 ( 23.7%); LVIFM : 155 ( 32.0%); ST : 71 ( 14.6%). -------------------------------------------------------------------------------- CHARGE DISTRIBUTIONAL ANALYSIS 1 000++00000 0000000000 000000--+0 000000000- 000000000- 0000000000 61 0-+00+0000 000000++00 +000000+00 0000+00000 0000000000 000+0000+- 121 +000000000 000000-000 00+0000000 00-0000000 -0000000+0 000000-000 181 00000000+- 0000+00000 -000000000 -00-000-00 ++0-000000 000+000++0 241 0-0000000- 000+00-000 0000-00++0 00--0+-+00 00-+000000 0-0+-0+000 301 0000000000 00-0000000 -000000000 00+00000-0 00-000000- 000+000-00 361 -+00000+00 00000-0000 00+00000+0 00-+0-+00- +00+-00000 0++0-+0000 421 0000000000 0000000000 00000000-0 00-000-000 0++0000000 +-+++--0-+ 481 0000+ A. CHARGE CLUSTERS. 
Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none
Negative charge clusters (cmin = 9/30 or 12/45 or 14/60): none
Mixed charge clusters (cmin = 14/30 or 19/45 or 24/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.
There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.

C. CHARGE RUNS AND PATTERNS.
pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     5 |   5 |   7 |  41 |   9 |   9 |  12 |  11 |  11 |  15 |   8 |  10 |
lmin1     6 |   6 |   8 |  50 |  11 |  11 |  15 |  14 |  14 |  18 |   9 |  12 |
lmin2     7 |   7 |  10 |  55 |  13 |  13 |  17 |  16 |  15 |  20 |  11 |  14 |
(Significance level: 0.010000; Minimal displayed length: 6)
(*) 9(1,0,0); at 471- 480: RDKKRDENDK
(4. quartile)              +-+++--0-+

Run count statistics:
+ runs >= 3: 1, at 473;
- runs >= 3: 0
* runs >= 4: 1, at 471;
0 runs >= 27: 1, at 417;
--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.
__________________________________
High scoring hydrophobic segments:
2.00 (LVIFM) 1.00 (AGYCW) 0.00 (BZX) -2.00 (PH) -4.00 (STNQ) -8.00 (KEDR)
Expected score/letter: -1.835
M_0.01= 26.94; M_0.05= 22.28
1) From 426 to 443: length= 18, score=25.00 *
   426 GYMFALLAVL SYLFLIFL
   L: 6(33.3%); A: 2(11.1%); F: 3(16.7%); Y: 2(11.1%);
____________________________________
High scoring transmembrane segments:
5.00 (LVIF) 2.00 (AGM) 0.00 (BZX) -1.00 (YCW) -2.00 (ST) -6.00 (P) -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)
Expected score/letter: -3.225
M_0.01= 67.38; M_0.05= 55.27; M_0.30= 40.87
1) From 424 to 443: length= 20, score=62.00 *
   424 VTGYMFALLA VLSYLFLIFL
   L: 6(30.0%); A: 2(10.0%); V: 2(10.0%); F: 3(15.0%); Y: 2(10.0%);

2. SPACINGS OF C.
H2N-30-C-4-C-22-C-116-C-6-C-68-C-63-C-8-C-23-C-7-C-63-C-64-COOH

2*. SPACINGS OF C and H.
(additional deluxe function for ALEX) H2N-30-C-1-H-2-C-15-H-6-C-5-H-6-H-35-H-17-H-1-H-7-H-4-H-12-H-21-C-3-H-2-C-39-H-28-C-6-H-22-H-10-H-10-H-11-C-8-C-23-C-7-C-21-H-31-H-8-H-C-45-H-18-COOH -------------------------------------------------------------------------------- REPETITIVE STRUCTURES. A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet. Repeat core block length: 4 Aligned matching blocks: [ 67- 70] VITV [ 226- 229] VITV ______________________________ [ 118- 121] VRER [ 275- 278] VRER ______________________________ [ 174- 177] SLCD [ 355- 358] SLCD B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet. (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C) Repeat core block length: 8 -------------------------------------------------------------------------------- MULTIPLETS. A. AMINO ACID ALPHABET. 1. Total number of amino acid multiplets: 32 (Expected range: 12-- 43) 2. Histogram of spacings between consecutive amino acid multiplets: (1-5) 9 (6-10) 9 (11-20) 8 (>=21) 7 3. Clusters of amino acid multiplets (cmin = 11/30 or 15/45 or 18/60): none B. CHARGE ALPHABET. 1. Total number of charge multiplets: 11 (Expected range: 0-- 18) 8 +plets (f+: 10.5%), 3 -plets (f-: 9.5%) Total number of charge altplets: 16 (Critical number: 21) 2. Histogram of spacings between consecutive charge multiplets: (1-5) 3 (6-10) 2 (11-20) 1 (>=21) 6 -------------------------------------------------------------------------------- PERIODICITY ANALYSIS. A. AMINO ACID ALPHABET (core: 4; !-core: 5) Location Period Element Copies Core Errors 14- 29 4 S... 4 4 0 74- 105 8 Y....... 4 4 0 229- 232 1 V 4 4 0 B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6) and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9) Location Period Element Copies Core Errors 157- 237 9 i.......0 8 6 /1/./././././././3/ 251- 320 10 i......... 
7 7 0 438- 443 1 i 6 6 0 471- 480 1 * 9 7 1 -------------------------------------------------------------------------------- SPACING ANALYSIS. Location (Quartile) Spacing Rank P-value Interpretation 0- 388 (2.) W( 388)W 1 of 5 0.0076 large 1. maximal spacing 388- 445 (4.) W( 57)W 2 of 5 0.9768 small 2. maximal spacing ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Pfam (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/Pfam Sequence file: A55731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: A55731 GPI-anchor biosynthesis protein PIG-A - mouse. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- Glycos_transf_1 Glycosyl transferases group 1 78.7 1.2e-20 1 Gemini_C4 Geminivirus C4 protein 1.7 54 1 PHP_N PHP domain N-terminal region -27.2 52 1 DUF59 Domain of unknown function DUF59 -41.5 96 1 DUF216 Domain of unknown function DUF -66.6 42 1 PGM_PMM_I Phosphoglucomutase/phosphomannomutase -71.5 75 1 Pico_P2A Picornavirus core protein 2A -103.0 75 1 UPF0023 Uncharacterized protein family UPF002 -123.3 26 1 HSP33 Hsp33 protein -188.6 72 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- PHP_N 1/1 126 175 .. 1 83 [] -27.2 52 Gemini_C4 1/1 179 192 .. 1 16 [. 1.7 54 DUF216 1/1 55 206 .. 1 208 [] -66.6 42 Pico_P2A 1/1 215 304 .. 1 127 [] -103.0 75 UPF0023 1/1 99 336 .. 1 256 [] -123.3 26 DUF59 1/1 283 344 .. 1 82 [] -41.5 96 Glycos_transf_1 1/1 212 364 .. 1 169 [. 78.7 1.2e-20 PGM_PMM_I 1/1 300 413 .. 1 161 [] -71.5 75 HSP33 1/1 191 435 .. 
1 294 [] -188.6 72 Alignments of top-scoring domains: PHP_N: domain 1 of 1, from 126 to 175: score -27.2, E = 52 *->vdLHvHSdySlLDGalspeelverAkelGlkaiAiTDHgnellfgav H HS +S a+ +l + Ak +Gl+ + TDH+ lfg++ A55731 126 ---HSHSSFS----AMAHDALFH-AKTMGLQTV-FTDHS---LFGFA 160 efyekeRLkQltridelnkaakkagikpIiGiEasi<-* + + + k +s+ A55731 161 DVSSV-------------LTNKL--------LTVSL 175 Gemini_C4: domain 1 of 1, from 179 to 192: score 1.7, E = 54 *->MGnlIsmcSSsSKens<-* n+I S+ SKen+ A55731 179 --NHIICVSYTSKENT 192 DUF216: domain 1 of 1, from 55 to 206: score -66.6, E = 42 *->qVFhlLlfLLAiqRFllYFfPsqRQtEksVkivQkfiqkriwylYlv q + L+ + + k+++++ + +++ ++ Yl+ A55731 55 QLSQCLIE-----------RGH-----KVITVTHAYGNRKGVR-YLT 84 FiiKdvisllviviliayrvlaFfllaslnnskkiefwltgfelleliyv +K ++l +++n s+ + l+ + +++ A55731 85 NGLKVYY---------------LPLRVMYNQSTATT--LF---HSLPLLR 114 tvfivlnlLiflsaflYiPiiisirkdflshltSaqqhlllfnkpqkYil ++f+ +++ i+ s++ + ++ + + ++++ q +f+ + + A55731 115 YIFVRERITIIHSHSSFSAMAHDALF--HAKTMGLQT---VFTDHSLFGF 159 WQtilVfifKlitspvgiififfyldsaafiiliIpitdmpSrkrivvtD + +V+ +Kl t + l ++++ii+ ++t+ k +v+ A55731 160 ADVSSVLTNKLLT---------VSLCDTNHIICV-SYTS----KENTVLR 195 ivttPliIQiS<-* + P i+ ++ A55731 196 AALNPEIVSVI 206 Pico_P2A: domain 1 of 1, from 215 to 304: score -103.0, E = 75 *->NyHLATpeDw..eNlvw..vdynRDLLVtrttAhGcD....tIAR.C +Tp+ +++++v + v +R LV+r G+D +++I+ C A55731 215 ----FTPDPFrrHDSVItvVVVSR--LVYR---KGTDllsgIIPElC 252 nCttGVYYCksrnKyYPVsfegPtiieieas.eYYPaRyQshvLlGvGfa + k+ + ++ egP i e+++e Y+ + + +L + A55731 253 Q--------KYQELHFLIGGEGPKRIILEEVrERYQLHDRVQLLGAL--- 291 EPGDCGGiLRCeHGVIGIvTAGGeGvVaFADvRDLlwlEe<-* eH DvR+ l+ + A55731 292 -----------EH----------------KDVRNVLVQGH 304 UPF0023: domain 1 of 1, from 99 to 336: score -123.3, E = 26 *->kgpdiqvsLtnvaivRlkkaGkrFEia....cyknkvad..yregqe ++ + + R+ ++r i ++++ + d + + A55731 99 NQSTATTLFHSLPLLRYIFVRERITIIhshsSFSAMAHDalFHAKT- 144 kDlDEvLqihvVFrnvSKGevAkkEeLskiFGttdvkeiikkqRRLAiil +VF + S FG dv + + + A55731 145 
------MGLQTVFTDHS------------LFGFADVSSVLTNKLLTVSLC 176 KRAsGGevqLaekerelllekvkreiitiVSrktiNPetkkpyPPsvIeK v+ + ke l+ + ei++++ + + P +p P + + A55731 177 DTNHIICVSYTSKENTVLRAALNPEIVSVI-PNAVDPTDFTPDPFRRHDS 225 ALqELkfslkinkSAkeQaleaikkLvskkvLpIrra......KmkikVa + s +++ +i+ L k + +++++K+ i A55731 226 VITVVVVSRLVYRKGTDLLSGIIPEL-CQKYQELHFLiggegpKRIILEE 274 IsEPakeAekvekalkllassPkeeekqedgSLivvglI...epgsyreL + +e + ll++ e + v g I + + ++ A55731 275 V----RERYQLHDRVQLLGAL----EHKDVRNVLVQGHIflnTSLTEAFC 316 yalvrketKGHGrvqvlslkkvve<-* +a v +qv s k v A55731 317 MAIVEAASC---GLQVVSTK-VGG 336 DUF59: domain 1 of 1, from 283 to 344: score -41.5, E = 96 *->lkeaileALktViDPElpvVdvVdLGlVydLvdvdGddGEtnVkvkm ++ ++l AL E+ V++ v v+G + + ++ A55731 283 DRVQLLGAL------EHKD--------VRN-VLVQG--H---IFLNT 309 tLTtpgCPladlIeddvreAvkeslpGvedVeVel<-* LT ++C++ + ++ ++v + ++G+ +V e A55731 310 SLTEAFCMAIVEAASCGLQVVSTKVGGIPEVLPES 344 Glycos_transf_1: domain 1 of 1, from 212 to 364: score 78.7, E = 1.2e-20 *->dreeirkklgikedkkiilfvGRlvpeKGidllieAfkkLkkkpkll + ++ ++ + + ++++ v+Rlv++KG dll +++L++k A55731 212 PTDFTPDPFRRHDSVITVVVVSRLVYRKGTDLLSGIIPELCQK---- 254 klnpnlkLvivGgpYdsedgeeedelkklaeklglednviflGfvpdedl +++l+++i G +g+++ l++ e + l d+v++lG + ++d+ A55731 255 --YQELHFLIGG------EGPKRIILEEVRERYQLHDRVQLLGALEHKDV 296 pelyksadvfvlPSryEgFGivllEAmAcGlPVIatncvgGipEvvkdge ++++ + ++f+ +S +E+F+++++EA +cGl V++t vgGipEv+ + A55731 297 RNVLVQGHIFLNTSLTEAFCMAIVEAASCGLQVVSTK-VGGIPEVLPESL 345 tGllvepgqdpealaeaiekll<-* +l+ep+ +++l +++ek++ A55731 346 -IILCEPS--VKSLCDGLEKAI 364 PGM_PMM_I: domain 1 of 1, from 300 to 413: score -71.5, E = 75 *->akarkktltiFgtydiRGkvgestlTedfayrIgrAigavlksagat ++ ++ ++ ++lTe+f+++I +A+ +l+ +++ A55731 300 LVQGHI------------FLN-TSLTEAFCMAIVEAASCGLQVVSTK 333 tvvvGgDgRlssyeleqalaagLaaaGinvldiGQdGlvpTPaiyfatrT vGg +e++ + +L + ++ l + Gl + + f+ A55731 334 ---VGG-----IPEVLPESLIILCEPSVKSLCD---GL---EKAIFQVK- 368 YNRDRslkaagGImiTASHNPGGpdedNGiKfnrsnGgpipediGekaIe s ++ P + N K + ++e++ ++++ A55731 369 
-----SGTLP---------AP--ENIHNVVKTFYTWR-NVAERT--EKVY 399 aiaeknesykvsge<-* ++++k++ +++ + A55731 400 ERVSKETVLPMHKR 413 HSP33: domain 1 of 1, from 191 to 435: score -188.6, E = 72 *->DqLvrAlakdgaVRay..vVrttntveearrrHnlspsataaLGrtm rA + V + +++V t+ + + rrH+ ++ ++ r+ A55731 191 NTVLRAALNPEIVSVIpnAVDPTDFTPDPFRRHDSVITV-VVVSRLV 236 v..AtlLLtAt...LKfdkNgkltvkIeGdGPlglivVdAnakGqVRGyV +++t LL ++L + + l I G GP + i+ + q V A55731 237 YrkGTDLLSGIipeLCQKY-QELHFLIGGEGPKRIILEEVRERYQLHDRV 285 rNPsVdtvdgniqGkKKlDvkkaVGtkGtLvVVkdlgeGepYtgvVeLvs + + g + k Dv++ +g + L A55731 286 Q------LLGALEHK---DVRNVLV-----------------QGHIFLNT 309 geiGeDltyYLvrSEQlPSAvalgVrVgeddgvpaAGGmLlQvMPgAAtd ++e + v + +l V +++g+p v+P A55731 310 -SLTEAFCMAIVEA----ASCGLQVVSTKVGGIPE-------VLP----- 342 eatkedlEhrltllepvtelelkGlpaeeIleelLge..eevailyelqd E+++ l ep + + Gl + I++ g + +++ + + A55731 343 -------ESLIILCEPSVKSLCDGLE-KAIFQVKSGTlpAPEN-IHNVVK 383 VrFkCpCSkERvkaALllLsdeEledileEdkgeaEasCdFCGeh..YlF + + +ER + s+e + + + + + CG +++Y F A55731 384 TFYTWRNVAERTEKVYERVSKETVLPM---H-KRLDRLISHCGPVtgYMF 429 dreeieel<-* ++ l A55731 430 --ALLAVL 435 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: A55731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: A55731 GPI-anchor biosynthesis protein PIG-A - mouse. 
Scores for sequence family classification (score includes all domains):
Model            Description                            Score  E-value  N
--------         -----------                            -----  ------- ---
Glycos_transf_1  Glycosyl transferases group 1           78.7  1.2e-20   1
HOK_GEF          Hok/gef family                           4.4       13   1
Gemini_C4        Geminivirus C4 protein                   1.7       54   1
7kD_coat         7kD viral coat protein                   1.4       92   1
RuvA             RuvA N terminal domain                   0.7       31   1
PARP             Poly(ADP-ribose) polymerase catalytic   -0.6       85   1

Parsed for domains:
Model            Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------         ------- ----- -----    ----- -----      -----  -------
HOK_GEF            1/1      61    78 ..    33    50 .]     4.4       13
RuvA               1/1     154   159 ..    63    68 .]     0.7       31
Gemini_C4          1/1     179   192 ..     1    16 [.     1.7       54
Glycos_transf_1    1/1     212   364 ..     1   169 [.    78.7  1.2e-20
PARP               1/1     422   429 ..   105   112 ..    -0.6       85
7kD_coat           1/1     427   450 ..     1    30 [.     1.4       92

Alignments of top-scoring domains:
HOK_GEF: domain 1 of 1, from 61 to 78: score 4.4, E = 13
*->irqGntEvaAfLAYEskk<-*
i++G+++ ++ AY +k
A55731 61 IERGHKVITVTHAYGNRK 78

RuvA: domain 1 of 1, from 154 to 159: score 0.7, E = 31
*->hlLYGF<-*
h+L+GF
A55731 154 HSLFGF 159

Gemini_C4: domain 1 of 1, from 179 to 192: score 1.7, E = 54
*->MGnlIsmcSSsSKens<-*
n+I S+ SKen+
A55731 179 --NHIICVSYTSKENT 192

Glycos_transf_1: domain 1 of 1, from 212 to 364: score 78.7, E = 1.2e-20
*->dreeirkklgikedkkiilfvGRlvpeKGidllieAfkkLkkkpkll
+ ++ ++ + + ++++ v+Rlv++KG dll +++L++k
A55731 212 PTDFTPDPFRRHDSVITVVVVSRLVYRKGTDLLSGIIPELCQK---- 254

klnpnlkLvivGgpYdsedgeeedelkklaeklglednviflGfvpdedl
+++l+++i G +g+++ l++ e + l d+v++lG + ++d+
A55731 255 --YQELHFLIGG------EGPKRIILEEVRERYQLHDRVQLLGALEHKDV 296

pelyksadvfvlPSryEgFGivllEAmAcGlPVIatncvgGipEvvkdge
++++ + ++f+ +S +E+F+++++EA +cGl V++t vgGipEv+ +
A55731 297 RNVLVQGHIFLNTSLTEAFCMAIVEAASCGLQVVSTK-VGGIPEVLPESL 345

tGllvepgqdpealaeaiekll<-*
+l+ep+ +++l +++ek++
A55731 346 -IILCEPS--VKSLCDGLEKAI 364

PARP: domain 1 of 1, from 422 to 429: score -0.6, E = 85
*->APvTGYMF<-*
+PvTGYMF
A55731 422 GPVTGYMF 429

7kD_coat: domain 1 of 1, from 427 to 450: score 1.4, E = 92
*->lilaillvlvltvaallllysldssttsne<-* ++ ++ll+++ +++++ +l+ ++t+++ A55731 427 YM-FALLAVLSYLFLI-FLQ----WMTPDS 450 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib Sequence file: A55731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: A55731 GPI-anchor biosynthesis protein PIG-A - mouse. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Prosite --------------------------------------------------------- | ppsearch (c) 1994 EMBL Data Library | | based on MacPattern (c) 1990-1994 R. 
Fuchs | --------------------------------------------------------- PROSITE pattern search started: Thu Nov 22 12:55:41 2001 Sequence file: A55731.fa ---------------------------------------- Sequence A55731 (485 residues): Matching pattern PS00001 ASN_GLYCOSYLATION: 24: NLSD 99: NQST 308: NTSL Total matches: 3 Matching pattern PS00004 CAMP_PHOSPHO_SITE: 238: RKGT Total matches: 1 Matching pattern PS00005 PKC_PHOSPHO_SITE: 167: TNK 187: TSK 331: STK 352: SVK 387: TWR 395: TEK Total matches: 6 Matching pattern PS00006 CK2_PHOSPHO_SITE: 174: SLCD 187: TSKE 310: SLTE 355: SLCD 450: SFID Total matches: 5 Matching pattern PS00008 MYRISTYL: 6: GGGQGQ 10: GQPPSV 20: GSSGNL 47: GGVESH 75: GNRKGV 359: GLEKAI Total matches: 6 Total no of hits in this sequence: 21 ======================================== 1314 pattern(s) searched in 1 sequence(s), 485 residues. Total no of hits in all sequences: 21. Search time: 00:00 min ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Profile Search ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with motif search against own library ***** bioMotif : Version V41a DB, 1999 Nov 11 ***** SeqTyp=2 : PROTEIN search; >APC D-Box is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >ER-GOLGI-traffic signal is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M minimal SH3 binding is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M minimal class II 
consensus-SH3 binding motif is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name >A55731 GPI-anchor biosynthesis protein PIG-A - mouse. ;LENGTH=485; DIRECT_SEQUENCE n 2 solutions m %_E 257-257 %_XXXL 258-261 %_I 262-262 f m %_D 283-283 %_XXXL 284-287 %_L 288-288 f >STATISTICS Total : 2 solutions in 1 sequences, 485 units; out of 1 sequences, 485 units >TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >EXTRACELL-M minimal furin protease cleavage site motif is the MOTIF name >A55731 GPI-anchor biosynthesis protein PIG-A - mouse. 
;LENGTH=485; DIRECT_SEQUENCE n 1 solutions m %_RXXR 413-416 f >STATISTICS Total : 1 solutions in 1 sequences, 485 units; out of 1 sequences, 485 units >EXTRACELL-M extended furin protease cleavage site motif is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >EXTRACELL-M zinc binding motif in MMPs is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >EXTRACELL-M g alpha binding go loco is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS SV40 LrgT experimentally determined is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS RanBP3 
experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 
units >NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out 
of 1 sequences, 485 units >NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hGlu.cort.experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name >STATISTICS 
Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS CTP experimentally determined NLS 
is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Nucloplasmin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS humKprotein 
experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 
sequences, 485 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS Max experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >PDZ domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units >WW domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 485 units ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~ ~~~ Start with HMM-search search against own library hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/own/own-hmm.lib Sequence file: A55731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: A55731 GPI-anchor biosynthesis protein PIG-A - mouse. 
Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: A55731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: A55731 GPI-anchor biosynthesis protein PIG-A - mouse.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB+ PSSM from other authors
IMPALA version 1.1 [20-December-1999]
Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
"PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= A55731 GPI-anchor biosynthesis protein PIG-A - mouse.
(485 letters)

Searching..................................done

Results from profile search

                                                               Score     E
Sequences producing significant alignments:                   (bits)  Value

AAA AAA+ ATPase Module                                            24  0.37
VPS9 VPS9 related protein which are possible RAB-type GTPase      23  0.98
CYCLIN Cyclin/TFIIB domain                                        22  1.6
SEC7 Sec7 like GDP exchange factor for ARF like GTPases           22  1.7
BRIGHT BRIGHT domain (Alpha helical DNA binding domain)           22  1.8
ARM Armadillo repeat                                              22  1.9
DAGKIN Diacyl glycerol kinase                                     21  3.6
RASGEF RAS-type GTPase GDP exchange factor                        21  4.3
LRR Leucine rich repeats                                          20  5.9
CALMO Calmodulin like EF-hand domains                             20  6.3
MYB MYB domain(HTH DNA binding domain)                            20  7.2
HOG HOG- intein(HINT) like domain                                 20  8.4
INSL Insulinase like Metallo protease domain                      20  8.7
MBL Metallo-betalactamase domain                                  20  9.4

>AAA AAA+ ATPase Module
 Length = 298

 Score = 24.1 bits (51), Expect = 0.37
 Identities = 13/86 (15%), Positives = 13/86 (15%), Gaps = 4/86 (4%)

Query: 234 RLVYRKGTDLLSGIIPELCQKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEH 293
Sbjct: 107 TFIRVSGSELVQKFIGEGARMVRELFVMAREHAPSIIFMDEIDS---IGSRLEGGS-GGD 162

Query: 294 KDVRNVLVQGHIFLNTSLTEAFCMAI 319
Sbjct: 163 SEVQRTMLELLNQLDGFEATKNIKVI 188

 Score = 20.2 bits (41), Expect = 6.5
 Identities = 9/56 (16%), Positives = 9/56 (16%), Gaps = 5/56 (8%)

Query: 250 ELCQKYQELHF--LIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVLVQG 303
Sbjct: 35  MMVEKVPDSTYEMIGGLDKQIKEIKEVIELPVKHPEHFEALGI---AQPKGVLLYG 87

>VPS9 VPS9 related protein which are possible RAB-type GTPase
 Length = 117

 Score = 22.9 bits (49), Expect = 0.98
 Identities = 10/66 (15%), Positives = 10/66 (15%), Gaps = 2/66 (3%)

Query: 66  KVITVTHAYGNRKGVRYLTNGLKVYYLPLRVMYNQSTATTLFHSLPLLRYIFVRERITII 125
Sbjct: 8   ELGKINRFKSPRDKMVCVLNASKVIFGLLKHTKLEQNGADSF--IPVLIYCILKGQVRYL 65

Query: 126 HSHSSF 131
Sbjct: 66  VSNVNY 71

>CYCLIN Cyclin/TFIIB domain
 Length = 317

 Score = 22.3 bits (47), Expect = 1.6
 Identities = 8/50 (16%), Positives = 8/50 (16%), Gaps = 6/50 (12%)

Query: 415 DRLISHCGPVTGYMFALLAVLSYLFLIF------LQWMTPDSFIDVAIDA 458
Sbjct:
118 QFVGNLRESPLGQEKALEQILEYELLLIQQLNFHLIVHNPYRPFEGFLID 167 >SEC7 Sec7 like GDP exchange factor for ARF like GTPases Length = 192 Score = 22.1 bits (47), Expect = 1.7 Identities = 13/54 (24%), Positives = 13/54 (24%), Gaps = 2/54 (3%) Query: 259 HFLIGGEGPKRI-ILEEVRERYQLHDRVQLLGALEHKDVRNVLVQGHIFLNTSL 311 Sbjct: 94 HIRVQGEAQKVERLIEAFSQRY-CICNPGVVRQFRNPDTIFILAFAIILLNTDM 146 >BRIGHT BRIGHT domain (Alpha helical DNA binding domain) Length = 172 Score = 21.9 bits (46), Expect = 1.8 Identities = 7/23 (30%), Positives = 7/23 (30%) Query: 42 FYPNMGGVESHIYQLSQCLIERG 64 Sbjct: 44 RLPIMAKSVLDLYELYNLVIARG 66 >ARM Armadillo repeat Length = 532 Score = 21.8 bits (46), Expect = 1.9 Identities = 14/77 (18%), Positives = 14/77 (18%), Gaps = 9/77 (11%) Query: 353 VKSLCDGLEKAIFQVKSGTLPA------PENIHNVVKTFYTWRNVAERTEKVYERVSKET 406 Sbjct: 225 LSNFCRGKPQPHFDQVKPALPALERLIHSDDEEVLTDACWALSYLSDGTNDKIQTVIQAG 284 Query: 407 VLPMHKRLDRLISHCGP 423 Sbjct: 285 VVP---KLVELLLHHSP 298 >DAGKIN Diacyl glycerol kinase Length = 128 Score = 20.9 bits (44), Expect = 3.6 Identities = 11/40 (27%), Positives = 11/40 (27%), Gaps = 1/40 (2%) Query: 194 LRAALNPEIVSVIPNAVDPTDFTPDPFRRHDSVITVVVVS 233 Sbjct: 22 LCWLLNPRQVFDITSLKGP-KFGLEMFRKVVTQLRILVCG 60 >RASGEF RAS-type GTPase GDP exchange factor Length = 196 Score = 20.9 bits (44), Expect = 4.3 Identities = 12/70 (17%), Positives = 12/70 (17%), Gaps = 5/70 (7%) Query: 387 TWRNVAERTEKVYERVSKETVLPM--HKRLDRLISHCGPVTGYMFALLAVLSYLFLIF-- 442 Sbjct: 124 AWRLIEPGDLLTWEEL-KKIPSLDRNYSTIRNLLNSVNPLVGCVPFIVVYLSDLSANAEK 182 Query: 443 LQWMTPDSFI 452 Sbjct: 183 KDWILEDKVV 192 >LRR Leucine rich repeats Length = 339 Score = 20.3 bits (42), Expect = 5.9 Identities = 5/18 (27%), Positives = 5/18 (27%) Query: 244 LSGIIPELCQKYQELHFL 261 Sbjct: 68 IETIPNSVCANLIDLLFL 85 >CALMO Calmodulin like EF-hand domains Length = 147 Score = 20.0 bits (41), Expect = 6.3 Identities = 9/54 (16%), Positives = 9/54 (16%), Gaps = 5/54 (9%) Query: 272 LEEVRERYQLHDRVQLLGALEHKDVRNVLVQGHIFLNTSLTEAFCMAIVEAASC 
325 Sbjct: 10 IAEFKEAFALFDKDN-NGSISSSELATVMRSLGL----SPSEAEVNDLMNEIDV 58 >MYB MYB domain(HTH DNA binding domain) Length = 122 Score = 19.8 bits (41), Expect = 7.2 Identities = 7/35 (20%), Positives = 7/35 (20%) Query: 388 WRNVAERTEKVYERVSKETVLPMHKRLDRLISHCG 422 Sbjct: 37 WRTLPKNAGTCLQRCGKSCRLRWTNYLRPDIKRGR 71 >HOG HOG- intein(HINT) like domain Length = 389 Score = 19.9 bits (41), Expect = 8.4 Identities = 12/84 (14%), Positives = 12/84 (14%), Gaps = 6/84 (7%) Query: 69 TVTHAYGNRKGVRYLTNGLKVYYLPLRVMYNQSTATTLFHSLPLLRYIFV------RERI 122 Sbjct: 182 TALLESGVRKPLGELSIGDRVLSMTANGQAVYSEVILFMDRNLEQMQNFVQLHTDGGAVL 241 Query: 123 TIIHSHSSFSAMAHDALFHAKTMG 146 Sbjct: 242 TVTPAHLVSVWQPESQKLTFVFAH 265 >INSL Insulinase like Metallo protease domain Length = 433 Score = 19.9 bits (41), Expect = 8.7 Identities = 6/14 (42%), Positives = 6/14 (42%) Query: 80 VRYLTNGLKVYYLP 93 Sbjct: 23 IRDLPNGAKLIVKP 36 >MBL Metallo-betalactamase domain Length = 256 Score = 19.7 bits (40), Expect = 9.4 Identities = 12/96 (12%), Positives = 12/96 (12%), Gaps = 1/96 (1%) Query: 334 VGGIPEVLPESLIILCEPSVKSLCDGLEKAIFQVKSGTLPAPENIHNVVKTFYTWRNVAE 393 Sbjct: 70 VGGLEYVGFSTMFDPNCGKPNLYLSQDIAADLWERSLAG-GMEAIEGGMTEVDSYFQIHA 128 Query: 394 RTEKVYERVSKETVLPMHKRLDRLISHCGPVTGYMF 429 Sbjct: 129 LGPGETFTWENVNFQLIKLNHVDTGSMLMPAYGLFF 164 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 14 Number of calls to ALIGN: 15 Length of query: 485 Total length of test sequences: 20182 Effective length of test sequences: 16536.0 Effective search space size: 7455794.4 Initial X dropoff for ALIGN: 25.0 bits Y. Wolf's SCOP PSSM IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. 
Query= A55731 GPI-anchor biosynthesis protein PIG-A - mouse.
(485 letters)

Searching.................................................done

Results from profile search

                                                                   Score     E
Sequences producing significant alignments:                       (bits)  Value

gi|131825 [58..319] Periplasmic binding protein-like I                30  0.090
gi|1536958 [63..316] Periplasmic binding protein-like I               30  0.11
gi|628933 [59..301] Periplasmic binding protein-like I                28  0.46
gi|2495406 [61..315] Periplasmic binding protein-like I               27  0.80
gi|2497466 [46..370] Purine nucleoside hydrolase                      27  0.86
gi|797337 [17..308] Periplasmic binding protein-like I                27  0.93
gi|1945664 [131..316] Periplasmic binding protein-like I              26  0.95
gi|398985 [18..447] PLP-dependent transferases                        26  1.1
gi|1742164 [73..325] Periplasmic binding protein-like I               26  1.3
gi|1708629 [67..339] Periplasmic binding protein-like I               26  1.4
gi|2414449 [42..312] alpha/beta-Hydrolases                            26  1.5
gi|1183859 [197..434] Protein kinases (PK), catalytic core            25  2.5
gi|2506561 [63..321] Periplasmic binding protein-like I               25  3.3
gi|1346563 [21..174] Periplasmic binding protein-like I               25  3.3
gi|119334 [61..336] Periplasmic binding protein-like I                25  3.6
gi|130155 [358..635] FAD-binding (C-terminal) domain of DNA ...       25  3.6
gi|77758 [5..132] N-terminal domain of enolase & muconate-la...      24  4.6
gi|2463100 [37..265] Trypsin-like serine proteases                    24  4.6
gi|2635910 [102..378] Periplasmic binding protein-like I              24  4.8
gi|1894762 [59..322] Periplasmic binding protein-like I               23  7.6
gi|586847 [59..303] Periplasmic binding protein-like I                23  8.2
gi|120586 [1..210] Class II aldolase                                  23  8.5
gi|974180 [245..629] beta/alpha (TIM)-barrel                          23  8.5
gi|729214 [63..305] Periplasmic binding protein-like I                23  8.8
gi|1652715 [5..195] NAD(P)-binding Rossmann-fold domains              23  8.8
gi|999647 [1..512] Ferritin-like                                      23  10.0

>gi|131825 [58..319] Periplasmic binding protein-like I
 Length = 262

 Score = 30.0 bits (67), Expect = 0.090
 Identities = 8/56 (14%), Positives = 8/56 (14%), Gaps = 4/56 (7%)

Query: 121 RITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADV--SSVLTNKLLTVS 174
Sbjct: 180 PPTAIITDN--DLSGDGAAMALQLRGRLSGKEAVSLVVYDGLPQDSIIELDVAAVI 233

>gi|1536958 [63..316] Periplasmic binding protein-like I
 Length = 254

 Score = 29.6 bits (66), Expect = 0.11
 Identities = 16/131 (12%), Positives = 16/131 (12%), Gaps = 17/131 (12%)

Query: 54  YQLSQCLIERGHKVITV--------THAYGNRKGVRYLTNGLKVYYLPLRVMYNQSTATT 105
Sbjct: 107 YDMTQSCIEKGYEYFLLITADTSRLSTRIERASGFV---DALTDANMRHASLTIEDKHTN 163

Query: 106 LFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADV--S 163
Sbjct: 164 LEQIKEFLQKEIDPDEKTLVFIPN--CWALPLVFTVIKELNYNLP--QVGLIGFDNTEWT 219

Query: 164 SVLTNKLLTVS 174
Sbjct: 220 CFSSPSVSTLV 230

>gi|628933 [59..301] Periplasmic binding protein-like I
 Length = 243

 Score = 27.6 bits (61), Expect = 0.46
 Identities = 9/56 (16%), Positives = 9/56 (16%), Gaps = 5/56 (8%)

Query: 121 RITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGF--ADVSSVLTNKLLTVS 174
Sbjct: 181 LPTVVITSD--TLLNHLILSVFYELKL-HIPTDIQTATFNDSYLNAFASPPQTTVD 233

>gi|2495406 [61..315] Periplasmic binding protein-like I
 Length = 255

 Score = 26.9 bits (59), Expect = 0.80
 Identities = 7/58 (12%), Positives = 7/58 (12%), Gaps = 5/58 (8%)

Query: 119 RERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADV--SSVLTNKLLTVS 174
Sbjct: 167
NEKPDGIFLSD--DMTAILTMKIANQLNIT-IPHELKIIGYDGTHFVENYYPYLTTIR 221 >gi|2497466 [46..370] Purine nucleoside hydrolase Length = 325 Score = 26.9 bits (59), Expect = 0.86 Identities = 9/44 (20%), Positives = 9/44 (20%) Query: 223 HDSVITVVVVSRLVYRKGTDLLSGIIPELCQKYQELHFLIGGEG 266 Sbjct: 198 HKAIATYKVNEMIYNEKNNSKLRELFLELFQFFAHTYKDMQGFE 241 >gi|797337 [17..308] Periplasmic binding protein-like I Length = 292 Score = 26.5 bits (58), Expect = 0.93 Identities = 25/140 (17%), Positives = 25/140 (17%), Gaps = 27/140 (19%) Query: 54 YQLSQCLIERGHKVITVTHAYGNRKGVRYLT-----------NGLKVYYLPLRV------ 96 Sbjct: 108 YEAVKSLIGQGHRNVALVSNAPDHGEQRHLISSVRERVDGYRAALHDTEIPVSSDFIVFG 167 Query: 97 MYNQSTATTLFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSL 156 Sbjct: 168 GWDPQRLALHVRAL-----CMSANRPTAFLATD--SSVALVLLAVLKDMNL-SIPDEVSL 219 Query: 157 FGFADV--SSVLTNKLLTVS 174 Sbjct: 220 ICFDDPDWTAATTPALTVIS 239 >gi|1945664 [131..316] Periplasmic binding protein-like I Length = 186 Score = 26.5 bits (58), Expect = 0.95 Identities = 13/134 (9%), Positives = 13/134 (9%), Gaps = 25/134 (18%) Query: 54 YQLSQCLIERGHK---VITVTHAYGN--------RKGVRYLTNGLKVYYLPLRVMYNQST 102 Sbjct: 24 FMATRHVMGLGEREVVFFGIDLDEPFERAREQGYIRAMNKSFKKSNMFRIDNSSKKSECL 83 Query: 103 ATTLFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGF--A 160 Sbjct: 84 ARELLKSMD---------NQAAFVCAS--DRIALGVIRAAQSLGK-RIPEDVAVTGNDGV 131 Query: 161 DVSSVLTNKLLTVS 174 Sbjct: 132 FLDRISSPRLTTVR 145 >gi|398985 [18..447] PLP-dependent transferases Length = 430 Score = 26.4 bits (57), Expect = 1.1 Identities = 9/88 (10%), Positives = 9/88 (10%), Gaps = 2/88 (2%) Query: 112 LLRYIFVRERITIIHSHSSFSAMAHDALFHA-KTMGLQTVFTDHSLFGFADVSSVLTNKL 170 Sbjct: 334 LSDFKLKQQWFKDVDFMVQRLHHVRQEMFDRLGWPDLVNFAQQHGMFYYTRFSPKQVEIL 393 Query: 171 L-TVSLCDTNHIICVSYTSKENTVLRAA 197 Sbjct: 394 RNNSFVYLTGDGRLSLSGVNDSNVDYLC 421 >gi|1742164 [73..325] Periplasmic binding protein-like I Length = 253 Score = 26.1 bits (57), Expect = 1.3 Identities = 6/56 (10%), Positives = 6/56 (10%), Gaps = 5/56 
(8%) Query: 121 RITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADV--SSVLTNKLLTVS 174 Sbjct: 172 LPEAVFATD--SLKLMSIYRAAAEKNI-AIPQQLAVVGYSNETLSFILTPAPGGID 224 >gi|1708629 [67..339] Periplasmic binding protein-like I Length = 273 Score = 26.1 bits (57), Expect = 1.4 Identities = 8/58 (13%), Positives = 8/58 (13%), Gaps = 5/58 (8%) Query: 119 RERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGF--ADVSSVLTNKLLTVS 174 Sbjct: 179 PEQKKAILALN--GLIMLKIISCMEELGL-RIPQDIGIAGFDDTEWYKLIGPGITTIA 233 >gi|2414449 [42..312] alpha/beta-Hydrolases Length = 271 Score = 25.7 bits (56), Expect = 1.5 Identities = 14/94 (14%), Positives = 14/94 (14%), Gaps = 16/94 (17%) Query: 146 GLQTVFTDHSLFGFADVSSVLTNKLLTVSLCDTNHIICVSY--------------TSKEN 191 Sbjct: 113 NYDVYVTGHSLGG--ALAGLCAPRIVHDGLRQSQKIKVVTFGEPRVGNIEFSRAYDQLVP 170 Query: 192 TVLRAALNPEIVSVIPNAVDPTDFTPDPFRRHDS 225 Sbjct: 171 YSFRVVHSGDVVPHLPGCVKDLSYTPPAGSDGSM 204 >gi|1183859 [197..434] Protein kinases (PK), catalytic core Length = 238 Score = 25.0 bits (53), Expect = 2.5 Identities = 12/105 (11%), Positives = 12/105 (11%), Gaps = 3/105 (2%) Query: 366 QVKSGTLPAPENIHNVVKTFYTWRNVAERTEKVYERVSKETVLPMHKRLDRLISHCGPVT 425 Sbjct: 13 EVRKIRSKYRKKDVFALKKLNMIYN--ETPEKFYKRCSKEFIIAKQLSHHVHITNTFLLV 70 Query: 426 -GYMFALLAVLSYLFLIFLQWMTPDSFIDVAIDATGPRRAWTHQW 469 Sbjct: 71 KVPTTVYTTRGWGFVMELGLRDLFAMIQKSGWRSVALAEKFCIFK 115 >gi|2506561 [63..321] Periplasmic binding protein-like I Length = 259 Score = 24.9 bits (54), Expect = 3.3 Identities = 12/57 (21%), Positives = 12/57 (21%), Gaps = 5/57 (8%) Query: 120 ERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADV--SSVLTNKLLTVS 174 Sbjct: 176 PQLDGVFCTN--DDLAVGAAFECQRLGL-KVPDDMAIAGFHGHDIGQVMEPRLASVL 229 >gi|1346563 [21..174] Periplasmic binding protein-like I Length = 154 Score = 24.9 bits (54), Expect = 3.3 Identities = 13/97 (13%), Positives = 13/97 (13%), Gaps = 12/97 (12%) Query: 143 KTMGLQTVFTDHSLFGFADVSSVLTNKLL-----TVSLCDTNHIICVSYTSKENTVLRAA 197 Sbjct: 1 ETIGA--SMAVFDDKFGTLLRNGMEDYAKTLDGVDLQIEDALNDV-----AKQQSQIQNF 53 Query: 198 
LNPEIVSVIPNAVDPTDFTPDPFRRHDSVITVVVVSR 234 Sbjct: 54 IAAGVDAIIVQPVDTDATTVMSKLAADAGIPLVYVNR 90 >gi|119334 [61..336] Periplasmic binding protein-like I Length = 276 Score = 24.5 bits (53), Expect = 3.6 Identities = 17/57 (29%), Positives = 17/57 (29%), Gaps = 5/57 (8%) Query: 120 ERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFADV--SSVLTNKLLTVS 174 Sbjct: 179 NPPTAIIATN--DMILEQVLIYAKNNHL-TIPNDFSLIGIDDVSFASFYNPPITTVS 232 >gi|130155 [358..635] FAD-binding (C-terminal) domain of DNA photolyase Length = 278 Score = 24.5 bits (53), Expect = 3.6 Identities = 12/49 (24%), Positives = 12/49 (24%), Gaps = 4/49 (8%) Query: 425 TGYM--FALLAVLSYLFL-IFLQW-MTPDSFIDVAIDATGPRRAWTHQW 469 Sbjct: 139 TGYMHNRLRMIVASFLAKDLLVDWRMGERYFMEHLIDGDFASNNGGWGF 187 >gi|77758 [5..132] N-terminal domain of enolase & muconate-lactonizing enzyme Length = 128 Score = 24.4 bits (53), Expect = 4.6 Identities = 5/20 (25%), Positives = 5/20 (25%) Query: 104 TTLFHSLPLLRYIFVRERIT 123 Sbjct: 7 AIIVHDLPTIRPPHKLAMHT 26 >gi|2463100 [37..265] Trypsin-like serine proteases Length = 229 Score = 24.3 bits (52), Expect = 4.6 Identities = 14/162 (8%), Positives = 14/162 (8%), Gaps = 15/162 (9%) Query: 43 YPNMGGVESHIYQLSQCLIERGHKVITVTHAYGNRK--------GVRYLTNGLKVYYLPL 94 Sbjct: 16 YPFMATVWQNDRKLCTASIVSPNYILSAGHCFVKMSEENYIILVGTVNAKLEKGNGQQFK 75 Query: 95 RVMYNQSTATTLFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDH 154 Sbjct: 76 VEKAHVYSETVFGQDIAIVK---LKNSIDFSDNATQPITLSRRSNFTKTDLAFIAGWGRI 132 Query: 155 SLFGFADVSSVLTNKLLTVSLCDTNHI----ICVSYTSKENT 192 Sbjct: 133 TDWSSPVTLQGANVLIWPKDEARCDGIMESEVCAFGEDGANV 174 >gi|2635910 [102..378] Periplasmic binding protein-like I Length = 277 Score = 24.2 bits (52), Expect = 4.8 Identities = 21/136 (15%), Positives = 21/136 (15%), Gaps = 23/136 (16%) Query: 54 YQLSQCLIERGHKVI--------TVTHA--YGNRKGVRYLTNGLKV---YYLPLRVMYNQ 100 Sbjct: 111 MMAAEHLLSLGHTHMMGIFKADDTQGVKRMNGFIQAHRE--RELFPSPDMIVTFTTEEKE 168 Query: 101 STATTLFHSLPLLRYIFVRERITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFA 160 Sbjct: 169 
SKLLEKVKATLEKNS---KHMPTAILCYN--DEIALKVIDMLREMDL-KVPEDMSIVGYD 222 Query: 161 DV--SSVLTNKLLTVS 174 Sbjct: 223 DSHFAQISEVKLTSVK 238 >gi|1894762 [59..322] Periplasmic binding protein-like I Length = 264 Score = 23.3 bits (50), Expect = 7.6 Identities = 11/56 (19%), Positives = 11/56 (19%), Gaps = 5/56 (8%) Query: 121 RITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGF--ADVSSVLTNKLLTVS 174 Sbjct: 172 ATDGVIASN--DIQAAAVLHEALRRGK-NVPEDIQIIGYDDIPQSGLLFPPLSTIK 224 >gi|586847 [59..303] Periplasmic binding protein-like I Length = 245 Score = 23.4 bits (50), Expect = 8.2 Identities = 16/172 (9%), Positives = 16/172 (9%), Gaps = 34/172 (19%) Query: 143 KTMGLQTVFTDHSLFGFAD----VSSVLTNKLLTVSLCDTNHIICVSYTSKENTVLRAAL 198 Sbjct: 2 HTVGV--ILPYSDHPCFDKIVNGITKAAFQHEYATTLLPTNYNP-----DIEIKYLELLR 54 Query: 199 NPEIVSVIPNAVDPTDFTPDPFRRHDSVITVVVVSRLV----------YRKGTDLLSGII 248 Sbjct: 55 TKKIDGLI---ITSRANHWDSILAYQEYGPVIACEDTGDIDVPCAFNDRKTAYAESFRYL 111 Query: 249 PELCQKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVL 300 Sbjct: 112 KS--RGHENIAFTCVREAD---RSPSTADKAA-----AYKAVCGRLEDRHML 153 >gi|120586 [1..210] Class II aldolase Length = 210 Score = 23.3 bits (50), Expect = 8.5 Identities = 13/76 (17%), Positives = 13/76 (17%), Gaps = 2/76 (2%) Query: 275 VRERYQLHDRVQLLGALEHKDVRNVLVQGH-IF-LNTSLTEAFCMAIVEAASCGLQVVST 332 Sbjct: 126 APYATFGTRELSEHVALALKNRKATLLQHHGLIACEVNLEKALWLAHEVEVLAQLYLTTL 185 Query: 333 KVGGIPEVLPESLIIL 348 Sbjct: 186 AITDPVPVLSDEEIAV 201 >gi|974180 [245..629] beta/alpha (TIM)-barrel Length = 385 Score = 23.3 bits (50), Expect = 8.5 Identities = 8/61 (13%), Positives = 8/61 (13%) Query: 272 LEEVRERYQLHDRVQLLGALEHKDVRNVLVQGHIFLNTSLTEAFCMAIVEAASCGLQVVS 331 Sbjct: 229 ECDATKEAGIIELINAVKEADGTRIDGFGMQGHYSVNAPTVDRIKEAIQDYSQVVDEVMI 288 Query: 332 T 332 Sbjct: 289 T 289 >gi|729214 [63..305] Periplasmic binding protein-like I Length = 243 Score = 23.4 bits (50), Expect = 8.8 Identities = 9/57 (15%), Positives = 9/57 (15%), Gaps = 6/57 (10%) Query: 121 
RITIIHSHSSFSAMAHDALFHAKTMGLQTVFTDHSLFGFAD---VSSVLTNKLLTVS 174 Sbjct: 180 DFDVLICGN--DRAAFVAYQVLLAKGVR-IPQDVAVMGFDNLVGVGHLFLPPLTTIQ 233 >gi|1652715 [5..195] NAD(P)-binding Rossmann-fold domains Length = 191 Score = 23.4 bits (49), Expect = 8.8 Identities = 5/48 (10%), Positives = 5/48 (10%), Gaps = 3/48 (6%) Query: 47 GGVESHIYQLSQCLIERGHKVITVTHAYGNRKGVRYLTNGLKVYYLPL 94 Sbjct: 15 RGIGKVL---VESFLEHGAAKVYAAVRKLESAAFLVDKYGNKIVPILI 59 >gi|999647 [1..512] Ferritin-like Length = 512 Score = 23.0 bits (49), Expect = 10.0 Identities = 10/52 (19%), Positives = 10/52 (19%), Gaps = 9/52 (17%) Query: 253 QKYQELHFLIGGEGPKRIILEEVRERYQLHDRVQLLGALEHKDVRNVLVQGH 304 Sbjct: 430 EMH---TF--SDQWGERMWLAE-PERYECQ---NIFEQYEGRELSEVIAELH 472 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 1187 Number of sequences better than 10.0: 26 Number of calls to ALIGN: 26 Length of query: 485 Total length of test sequences: 256703 Effective length of test sequences: 208388.0 Effective search space size: 92677489.4 Initial X dropoff for ALIGN: 25.0 bits ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ calculation of internal repeats with prospero ***** PROSPERO v1.3 Thu Nov 22 12:56:22 2001 ***** Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford For help see http://www.well.ox.ac.uk/ariadne For usage use -help using gap penalty 11+1k using matrix BLOSUM62 printing all alignments with eval < 0.100000 using sequence1 A55731 using self-comparison ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ TIGRFAM hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm
Sequence file:            A55731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  A55731 GPI-anchor biosynthesis protein PIG-A - mouse.

Scores for sequence family classification (score includes all domains):
Model     Description                                     Score    E-value  N
--------  -----------                                     -----    ------- ---
TIGR01088 aroQ: 3-dehydroquinate dehydratase, type II    -116.8         70   1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
TIGR01088   1/1     361   438 ..     1   144 []  -116.8       70

Alignments of top-scoring domains:
TIGR01088: domain 1 of 1, from 361 to 438: score -116.8, E = 70
*->kiLVlNGPNLNmLGlREPgvYG..sqTL...eeIeeiletfaaqlnL
E++++ +s TL+ +e I+ +tf + n
A55731 361 ----------------EKAIFQvkSGTLpapENIHNVVKTFYTWRN- 390

DvevefFQSNsEGeLidkIHealgqdydGIvINPGAyTHTSvALRDAlaa
v +e +k+ e + + +
A55731 391 -V----------AERTEKVYERVSK------------------------E 405

vslPvVEVHLSNvhaREEFRhhSyiApVAkGvIvGLGaqGYrLALrylve
lP h+R R +S+ +pV GY +AL + ++
A55731 406 TVLPM--------HKRLD-RLISHCGPV----------TGYMFALLAVLS 436

iL<-*
+L
A55731 437 YL 438
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm-f
Sequence file:            A55731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  A55731 GPI-anchor biosynthesis protein PIG-A - mouse.

Scores for sequence family classification (score includes all domains):
Model     Description                                     Score    E-value  N
--------  -----------                                     -----    ------- ---
TIGR01088 aroQ: 3-dehydroquinate dehydratase, type II       1.3         37   1
TIGR00282 TIGR00282: conserved hypothetical protein T      -0.4         45   1
TIGR01129 secD: protein-export membrane protein SecD       -0.6         23   1
TIGR00225 prc: C-terminal peptidase (prc)                  -0.9         56   1
TIGR00893 2A0114: d-galactonate transporter                -1.0         71   1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
TIGR00225   1/1      45   104 ..   204   263 ..    -0.9       56
TIGR01129   1/1     225   237 ..   510   522 .]    -0.6       23
TIGR00282   1/1     227   265 ..     1    39 [.    -0.4       45
TIGR01088   1/1     426   438 ..   132   144 .]     1.3       37
TIGR00893   1/1     429   441 ..   415   427 .]    -1.0       71

Alignments of top-scoring domains:
TIGR00225: domain 1 of 1, from 45 to 104: score -0.9, E = 56
*->npGGllqsavdlaglflpegppivstkdrngeke.ldykangralyp
n GG+ l+ + +g ++ + ++g ++++y +ng +y
A55731 45 NMGGVESHIYQLSQCLIERGHKVITVTHAYGNRKgVRYLTNGLKVY- 90

nlplvvlvnggsas<-*
lpl v+ n+++a+
A55731 91 YLPLRVMYNQSTAT 104

TIGR01129: domain 1 of 1, from 225 to 237: score -0.6, E = 23
*->SlFtalvvtRlLl<-*
S++t++vv+Rl +
A55731 225 SVITVVVVSRLVY 237

TIGR00282: domain 1 of 1, from 227 to 265: score -0.4, E = 45
*->ikvlflGdvyGkaGrkivkenlpklknkykpdlviange<-*
i v+++ ++ + G ++++ +p+l ky+ ++ ge
A55731 227 ITVVVVSRLVYRKGTDLLSGIIPELCQKYQELHFLIGGE 265

TIGR01088: domain 1 of 1, from 426 to 438: score 1.3, E = 37
*->GYrLALrylveiL<-*
GY +AL + +++L
A55731 426 GYMFALLAVLSYL 438

TIGR00893: domain 1 of 1, from 429 to 441: score -1.0, E = 71
*->laligalsvLllV<-*
+al+++ls+L+l
A55731 429 FALLAVLSYLFLI 441
//

SMART
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/iprscan/data/smart.HMMs Sequence file: A55731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: A55731 GPI-anchor biosynthesis protein PIG-A - mouse. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- POLIIIAc DNA polymerase alpha chain like domain -16.9 35 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- POLIIIAc 1/1 123 180 .. 1 75 [] -16.9 35 Alignments of top-scoring domains: POLIIIAc: domain 1 of 1, from 123 to 180: score -16.9, E = 35 *->vdLHvHsdySllDGalspeelvkrAkklGlkaiAiTDHgppyqdlnl H Hs +S ++ + + Ak +Gl+ + TDH+ l A55731 123 TIIHSHSSFS-----AMAHDALFHAKTMGLQTV-FTDHS-------L 156 fgavefykaakeiagikpIiGiEaniap<-* fg+ + + + k+++ +++++ A55731 157 FGFADVSSVLTN----KLLTVSLCDTNH 180 // COG hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/cogs/cogs.hmm Sequence file: A55731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: A55731 GPI-anchor biosynthesis protein PIG-A - mouse. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- COG0438 118.9 9.3e-32 1 COG2123 -178.2 81 1 COG0788 -244.6 69 1 COG0297 -257.2 0.069 1 COG0380 -325.8 67 1 COG2851 -386.1 66 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- COG0788 1/1 93 315 .. 1 307 [] -244.6 69 COG2851 1/1 107 353 .. 1 473 [] -386.1 66 COG0380 1/1 1 405 [. 
1 586 [] -325.8 67 COG2123 1/1 188 406 .. 1 292 [] -178.2 81 COG0438 1/1 225 407 .. 1 255 [] 118.9 9.3e-32 COG0297 1/1 24 408 .. 1 556 [] -257.2 0.069 Alignments of top-scoring domains: COG0788: domain 1 of 1, from 93 to 315: score -244.6, E = 69 *->MmspmqkrpvaenpdlaDyaiLlisCpDqpGiVaaISnfLf.ehgaN + +m++ + + ++ ++ p + +++f + ++ A55731 93 PLRVMYNQS-----------TATTLFHSLPLL-----RYIFvRERIT 123 Ivend.qfVtDpegGrFFm.RvefdlefldgadkstdlaaLqaalakvAe I++++++f+ +F +++ ++ f + + +a+ + l+ A55731 124 IIHSHsSFSAMAHDALFHAkTMGLQTVFT--DHSLFGFADVSSVLT---- 167 eFgmtwrllfasrrKriai.lVSKedHCLgDLLwRwysGeLdaeIpaVIS + + + + + i ++ SKe + L + L+ eI+ VI A55731 168 --NKLLTVSLCDTNHIICVsYTSKE----NTVLR----AALNPEIVSVIP 207 NHddLrplvErFgIPFhhiPvdkqlnraEaEaqirqleLleeygaDlvVL N v++ + + P+ + + + + + +L+++ g+Dl A55731 208 N------AVDPTD----FTPDPF-RRHDSVITVVVVSRLVYRKGTDL--- 243 ARYMQILS...PdFVrrfenkIINIHHSFLPAFiGAnPYhQAfeRGVKiI LS+ P+++ ++ H FL I A55731 244 ------LSgiiPELCQKYQE----LH--FL-------------------I 262 GATAHYVTedLDEGP..IIeQDVvrVdHrdsved.lvraG....rDvEkl G EGP++II V r+ +d+++ G +++Dv ++ A55731 263 GG----------EGPkrIIL---EEVRERYQLHDrVQLLGalehKDV-RN 298 VLARAVklHLEDRvlVygNKTVVF<-* VL+ +++++++ T F A55731 299 VLV-------QGHIFLNTSLTEAF 315 COG2851: domain 1 of 1, from 107 to 353: score -386.1, E = 66 *->meiMLtlLGFlmvlvfvwLiltkrlSPliALIiVPIvfALIaLiLER F ++ + + ++r + + + + + + ++ A55731 107 ---------FHSLPLLRYIFVRERITIIHSHSSFSAMAHDAL----- 139 LGFGvGvALGvKGGvFDikelgemmleGiksVaptaiMlmFAILYFGIMi F k++ G ++V ++ FA A55731 140 --FHA-------------KTM------GLQTVFTDHSLFGFA-------- 160 DsGLFDPligkiLsivKGdpvkvavGTavlAmlvsL.DGdGaTTYlItvp d+ +v+T l ++vsL D T I+v+ A55731 161 ------------------DVS--SVLTNKL-LTVSLcD----TNHIICVS 185 AlLPLYkRLGmnplvLallamlsaGVmNmiPWGGPTaRAAsvLgvDpael Y ++ + l a l ++++iP + vDp+++ A55731 186 -----YT----SKENTVLRAALNPEIVSVIPNA-----------VDPTDF 215 f.vPLIPvmiiGlllilaLAylLGkrErkRGlGvlnlsapGwnsitkdea ++ P ++ ++ ++ l R G ls++ A55731 216 TpDPF--RRHDSVITVVVVSRLV-----YR-KGTDLLSGI---------- 247 
daaepaeieaeelkkkkDrsLARPKLlWlNLLLtialmglLvsGilPlpv ++++ ++ el+ +L+ G+ P + A55731 248 --IPELCQKYQELH-------------------------FLIGGEGPKRI 270 lFMIAfaiALllNYPnvkeQkkRIaaHAgNaLlvvslIfAAGiFtGILsG + ++e ++R H L +G L A55731 271 I---------------LEEVRERYQLHDRVQL------------LGALEH 293 TGMvDAmaksivsliPdalGPyLplIvAilsvPFtfvlsnDAYYFGvLPI k+ ++++ +++ +l++ A55731 294 --------KDVRNVL---VQGHI-------------FLNT---------- 309 veeaasaYGidpveiArAsiiGqpVghllSPLVPsTyLlvGLAkvdfGdH + + a+ + ve+A + q V sT + G +v A55731 310 --SLTEAFCMAIVEAASCGL--QVV---------STKVG-GIPEV----- 340 qRFslkWAvgiSlViliiAllaGIipll<-* l +l+++ p A55731 341 ---------------LPESLIILCEPSV 353 COG0380: domain 1 of 1, from 1 to 405: score -325.8, E = 67 *->mtdtaqdntpkkrqrisdletdeakkqlvesfgdYSNkaKLknsklv + + +s++ + + + ++ ++ A55731 1 -MANRRGGGQGQPPSVSPSPGSSGNLSDDRTCTH----------NIC 36 vvSneLFsRlPvsiekendgtgeykknavGstGlvtaLerllkrreakEk vS +++ n g++ + + L l +r A55731 37 MVSD---------FFYPNMGGV--ESHIYQ-------LSQCLIER----- 63 PqDLDDDPLYgtWvgwpgvttdelpsdkeekdkdgkilkdrfnvhPViLs g +v + vt ++ ++ + +++lk+ + +V+ A55731 64 ----------GHKVIT--VT--HAYGNRKGVRYLTNGLKVYYLPLRVMYN 99 dedfkgyYnnySnaiLWPlfHYfintnpdnvnskafernwWdgYvkvNqk ++ lfH+ ++ + + f r+ ++ A55731 100 QSTATT------------LFHSL---PLLRYI---FVRE------RITII 125 FAdkivkvlkkgDKDSlIWiHDYhLlLvPqmLRvkimkkrlpnakIGFFl + +++++++ + + + F+ A55731 126 HSHSSFSAMAHDA-----------------LFH-----AKTMGLQTVFTD 153 HiPFPSsEiFriLPeRmReeIleGLlgaDlvGFHterYarnFlsscrrLl H F ++ L + + + +L++ + + ++++ A55731 154 HSLFGFADVSSVLTN---KLLTVSLCDTNHI------------ICVSYTS 188 dvdtrrrkvssrsdtyaGtALAEPELttivryggrlvrVdafPIGiDpee ++t+ r + + V ++P +Dp++ A55731 189 KENTVLRA----ALNPE--------------------IVSVIPNAVDPTD 214 ltkqaakgsvqekvqeIKsealknkKlIvsvDRlDyiKGivekLlA..yE +t + + ++ + I v v Rl y KG++ Ll++ + A55731 215 FTPDPF----RRHDSVIT---------VVVVSRLVYRKGTD--LLSgiIP 249 elLeehPElrgkvvlvqiatpsredveYqnlrqeveelVgrINgeyGnls el ++ El+ l+ p+r e +++r ++ + V + G l+ A55731 250 ELCQKYQELH---FLIGGEGPKRIILEEVRERYQLHDRVQ----LLGALE 292 
wsQPv........hylhrpipfeeLialfkaaDvaLVtPLRDGMNLVakE ++ v++ +++ +l+ ++ + +a+ aa ++L A55731 293 HK-DVrnvlvqghIFLNTSLTEAFCMAIVEAASCGL-------------- 327 YVacsseknnFLCYgGpLILSEFaGaaneLkegAiiVNPwDlkevadAId ++s + + G +L+e+ ii + k + d+++ A55731 328 -QVVSTKVG---------------GIPEVLPESLIILCEPSVKSLCDGLE 361 eALkMskeekrkrwekLkkeVlkrDidhWankllrEefldslkgekpss< +A + k + e +V + + W n e++ + ++++ A55731 362 KAIFQVKSGTLPAPENIHNVV--KTFYTWRNV---AERTEKVYERVSKE 405 -* A55731 - - COG2123: domain 1 of 1, from 188 to 406: score -178.2, E = 81 *->maksdivaeisrkyilsllregkRiDGRlfdeFRd....ieIetgvI + ++ + i s + + + + +d FR +++ i+ + A55731 188 SKENTVLRAALNPEIVSVIPNAVDPTDFTPDPFRRhdsvITVVVV-- 232 eKAeGSalVKLGnGTqVivGvKsqiGePfpDtPnqGVltvnaELlPlAsP ++ +GT + G+ +++ +++ + +++ E A55731 233 ------SRLVYRKGTDLLSGIIPELCQKYQELH----FLIGGE------- 265 tFEpGNPPDElaiElaRVvDRgiReSgaldlEkLvI..veGkKVWvvFvD +p R+++ +Re l + + + +e k V+ v v A55731 266 --GPK-----------RIILEEVRERYQLHDRVQLLgaLEHKDVRNVLVQ 302 vhvLDhDGNLiDAsslAaiAALlntkvPkLasefdegevvievereyepL h+ L A + A++ a+ + + + A55731 303 GHIF-LNTSLTEAFCMAIV---------E-AASCGLQV---------VST 332 pVeriPPisVTfakiGPQDTEENiKGETNSnilvvDPsleEelVadgrLt +V iP + S i+++ Ps + +++dg A55731 333 KVGGIP------EVLP------------ESLIILCEPSVK--SLCDGL-- 360 ittdenghivamqKggg.galtvkdvkkHavkqalekveklreklleslk ++ i ++ +g ++ + + v k + +++e + +++ k A55731 361 -----EKAIFQVKSGTLpAPENIHNVVK-TFYTWRNVAERTEKVYERVSK 404 pl<-* + A55731 405 ET 406 COG0438: domain 1 of 1, from 225 to 407: score 118.9, E = 9.3e-32 *->dkpvilfvGRlvpeKgldllieafaklkeeipellpdlklvivGgts +++++v+Rlv++Kg dll ++++l++++ ++l+++i G A55731 225 SVITVVVVSRLVYRKGTDLLSGIIPELCQKY----QELHFLIGG--- 264 yiaaeacdGpeeerlrlleklakklglednVeflGfvpdprvldeelpel +Gp++ le++ +++ l d+V++lG ++ +++++++ A55731 265 -------EGPKRII---LEEVRERYQLHDRVQLLGALE-----HKDVRNV 299 lkaadvfvlPSrysekrgedrEgfglvllEAmAaGtPViatdvgslelga l + ++f+ +S++ E+f+++++EA ++G+ V++t vg A55731 300 LVQGHIFLNTSLT--------EAFCMAIVEAASCGLQVVSTKVG------ 335 
neereladkGipEvvedgarylfgenGregkrrlnlllvdpgdeddidsi GipEv+ + +l++p+ d A55731 336 ---------GIPEVLPE--------SL--------IILCEPSVKSLCD-- 358 ealaeaierlledpelreregvsllgrearrrvaerfswekiakrllkly +l +ai ++ + + + + + v+ ++w+++a+r++k+y A55731 359 -GLEKAIFQVKSGTLPAPE--------NIHNVVKTFYTWRNVAERTEKVY 399 eellekre<-* e++ ++ + A55731 400 ERVSKETV 407 COG0297: domain 1 of 1, from 24 to 408: score -257.2, E = 0.069 *->lnsqdryserMkILfvasEvtPfvKvGGLADVlgaLPkaLkklGhdV n d +I v+ ++P GG ++ L++ L+ +Gh+V A55731 24 -NLSDDRTCTHNICMVSDFFYP--NMGGVESHIYQLSQCLIERGHKV 67 rVlLPkYgriqgepieqlykvsegetvavvgreqqfdvlesyldGt.vgl ++ +Yg +g + + + ++++ + l++ ++ ++ ++ A55731 68 ITVTHAYGNRKGVRYLTN----GLKVYYLP--------LRVMYNQStATT 105 ylid.K.ndyyfnregnPYhDanlygypDnaeRFafFsaAalelldgldp ++++ + y+f re+ + + F++ ++ al +++ + A55731 106 LFHSlPlLRYIFVRER--------ITIIHSHSSFSAMAHDALFHAKTMG- 146 fwqPDiVHaHDWhTGLvpalLKteyrklPFfervKtVFTIHNLaYQGEmI L tVFT A55731 147 --------------------LQ-------------TVFT----------- 152 EYGEVmTFLifpahylhel.lglplylfhyeglefpGqinflKaGivfaD +++ +++++ +l l ++++ + + A55731 153 ---------DHSLFGFADVsSVLTNKLL---------TVSLCDT-----N 179 hVTTVSPTYAqEIqTpeygygLeglLkarssegklsGILNGIDyeiWnPe h+ VS y + +L+a + +s I N +D P A55731 180 HIICVS-----------YTSKENTVLRAALNPEIVSVIPNAVD-----PT 213 tDpylaanYdagsledpvlFkkKaeNKtaLqeelGLpedddaPligiVsR ++++++ d + + +VsR A55731 214 -------DFTPDPFRR----------------------HDSVITVVVVSR 234 LteQKGvdLlleiideLlekEFqdaqlViLGtGd.PeLEnafrnlaerhp L+ KG dLl ii+eL +k +q+ + i+G G ++ + ++ r+ + h+ A55731 235 LVYRKGTDLLSGIIPELCQK-YQELHFLIGGEGPkRIILEEVRERYQLHD 283 dsgnvavligfdepLArriYAGaDfilMPSrFEPCGLtQLiaMrYGTvPI +v +l + +r + +++l S E++ +a G ++ A55731 284 ---RVQLLGALEHKDVRNVLVQGHIFLNTSLTEAFCMAIVEAASCGLQVV 330 VReTGGLaDTVvdldydeenleekgtGflFkepdaeallnalsRAlalYr +++GG +++ +e+ + ep +l ++l++A+ + A55731 331 STKVGG----IPE--VLPES------LIILCEPSVKSLCDGLEKAIFQVK 368 qelNEICmFmQYIRY.CPHpdewqnlvtraMaNCYYHVFadfSWdkSPAk + P p+ n+v++ + W++ A+ A55731 369 S-------------GtLPAPENIHNVVKTFYT-----------WRNV-AE 393 
eYvelYegllaktrd<-*
+ ++Ye++ t +
A55731 394 RTEKVYERVSKETVL 408
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm-f
Sequence file:            A55731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  A55731 GPI-anchor biosynthesis protein PIG-A - mouse.

Scores for sequence family classification (score includes all domains):
Model    Description                                      Score    E-value  N
-------- -----------                                      -----    ------- ---
COG0438                                                   117.2    1.5e-35   1
COG0297                                                     9.9      0.038   1
COG1751                                                     1.3         33   1
COG3080                                                     0.3         59   1
COG0664                                                     0.0         96   1
COG0333                                                     0.0         67   1
COG1102                                                    -0.2         85   1
COG1620                                                    -1.5         77   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG0333    1/1      63    71 ..    56    64 .]     0.0       67
COG1751    1/1      60    73 ..    86    99 ..     1.3       33
COG0664    1/1      60    81 ..   202   223 .]     0.0       96
COG0297    1/1     229   266 ..   338   376 ..     9.9    0.038
COG1102    1/1     269   284 ..   182   197 .]    -0.2       85
COG0438    1/1     225   407 ..     1   255 []   117.2  1.5e-35
COG1620    1/1     429   449 ..   580   601 .]    -1.5       77
COG3080    1/1     441   453 ..     1    13 [.     0.3       59

Alignments of top-scoring domains:
COG0333: domain 1 of 1, from 63 to 71: score 0.0, E = 67
*->rGkkVitke<-*
rG+kVit++
A55731 63 RGHKVITVT 71

COG1751: domain 1 of 1, from 60 to 73: score 1.3, E = 33
*->LkeRGakVlrgSHA<-*
L eRG+kV + +HA
A55731 60 LIERGHKVITVTHA 73

COG0664: domain 1 of 1, from 60 to 81: score 0.0, E = 96
*->lisvegktievldlaaLrrlag<-*
li++++k i+v++ + r+ ++
A55731 60 LIERGHKVITVTHAYGNRKGVR 81

COG0297: domain 1 of 1, from 229 to 266: score 9.9, E = 0.038
*->igiVsRLteQKGvdLlleiideLlekEFqdaqlViLGtG<-*
+ +VsRL+ KG dLl ii+eL +k +q+ + i+G G
A55731 229 VVVVSRLVYRKGTDLLSGIIPELCQK-YQELHFLIGGEG 266

COG1102: domain 1 of 1, from 269 to 284: score -0.2, E = 85
*->eiIleaiDarklkkde<-*
+iIle++++r+ d
A55731 269 RIILEEVRERYQLHDR 284

COG0438: domain 1 of 1, from 225 to 407: score 117.2, E = 1.5e-35
*->dkpvilfvGRlvpeKgldllieafaklkeeipellpdlklvivGgts
+++++v+Rlv++Kg dll ++++l++++ ++l+++i G
A55731 225 SVITVVVVSRLVYRKGTDLLSGIIPELCQKY----QELHFLIGG--- 264

yiaaeacdGpeeerlrlleklakklglednVeflGfvpdprvldeelpel
+Gp++ le++ +++ l d+V++lG ++ +++++++
A55731 265 -------EGPKRII---LEEVRERYQLHDRVQLLGALE-----HKDVRNV 299

lkaadvfvlPSrysekrgedrEgfglvllEAmAaGtPViatdvgslelga
l + ++f+ +S++ E+f+++++EA ++G+ V++t vg
A55731 300 LVQGHIFLNTSLT--------EAFCMAIVEAASCGLQVVSTKVG------ 335

neereladkGipEvvedgarylfgenGregkrrlnlllvdpgdeddidsi
GipEv+ + +l++p+ d
A55731 336 ---------GIPEVLPE--------SL--------IILCEPSVKSLCD-- 358

ealaeaierlledpelreregvsllgrearrrvaerfswekiakrllkly
+l +ai ++ + + + + + v+ ++w+++a+r++k+y
A55731 359 -GLEKAIFQVKSGTLPAPE--------NIHNVVKTFYTWRNVAERTEKVY 399

eellekre<-*
e++ ++ +
A55731 400 ERVSKETV 407

COG1620: domain 1 of 1, from 429 to 449: score -1.5, E = 77
*->yliivGvityliaylftgmipt<-*
+++++ v+ yl++ ++++m p
A55731 429 FALLA-VLSYLFLIFLQWMTPD 449

COG3080: domain 1 of 1, from 441 to 453: score 0.3, E = 59
*->mFLLWmtpsimin<-*
+FL Wmtp i+
A55731 441 IFLQWMTPDSFID 453
//