analysis of sequence from tem25
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
>gi|1245357|gb|AAA93462.1| procollagen C-proteinase=
>gi|5453579|ref|NP_006120.1| bone morphogenetic protein 1, isoform 4, precursor; PCP [Homo sapiens] MPGVARLPLLLGLLLLPRPGRPLDLADYTYDLAEEDDSEPLNYKDPCKAAAFLGDIALDEEDLRAFQVQQ AVDLRRHTARKSSIKAAVPGNTSTPSCQSTNGQPQRGACGRWRGRSRSRRAATSRPERVWPDGVIPFVIG GNFTGSQRAVFRQAMRHWEKHTCVTFLERTDEDSYIVFTYRPCGCCSYVGRRGGGPQAISIGKNCDKFGI VVHELGHVVGFWHEHTRPDRDRHVSIVRENIQPGQEYNFLKMEPQEVESLGETYDFDSIMHYARNTFSRG IFLDTIVPKYEVNGVKPPIGQRTRLSKGDIAQARKLYKCPACGETLQDSTGNFSSPEYPNGYSAHMHCVW RISVTPGEKIILNFTSLDLYRSRLCWYDYVEVRDGFWRKAPLRGRFCGSKLPEPIVSTDSRLWVEFRSSS NWVGKGFFAVYEAICGGDVKKDYGHIQSPNYPDDYRPSKVCIWRIQVSEGFHVGLTFQSFEIERHDSCAY DYLEVRDGHSESSTLIGRYCGYEKPDDIKSTSSRLWLKFVSDGSINKAGFAVNFFKEVDECSRPNRGGCE QRCLNTLGSYKCSCDPGYELAPDKRRCEAACGGFLTKLNGSITSPGWPKEYPPNKNCIWQLVAPTQYRIS LQFDFFETEGNDVCKYDFVEVRSGLTADSKLHGKFCGSEKPEVITSQYNNMRVEFKSDNTVSKKGFKAHF FSDKDECSKDNGGCQQDCVNTFGSYECQCRSGFVLHDNKHDCKEAGCDHKVTSTSGTITSPNWPDKYPSK KECTWAISSTPGHRVKLTFMEMDIESQPECAYDHLEVFDGRDAKAPVLGRFCGSKKPEPVLATGSRMFLR FYSDNSVQRKGFQASHATECGGQVRADVKTKDLYSHAQFGDNNYPGGVDCEWVIVAEEGYGVELVFQTFE VEEETDCGYDYMELFDGYDSTAPRLGRYCGSGPPEEVYSAGDSVLVKFHSDDTITKKGFHLRYTSTKFQD TLHSRK ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ sec.str. with predator > gi|1245357|gb|AAA93462.1| . . . . . 1 MPGVARLPLLLGLLLLPRPGRPLDLADYTYDLAEEDDSEPLNYKDPCKAA 50 ___EEEHHHHHHHH____________HHHHHHH______________HHH . . . . . 51 AFLGDIALDEEDLRAFQVQQAVDLRRHTARKSSIKAAVPGNTSTPSCQST 100 HHH___HHHHHHHHHHHHHHHHHHHHH___HHHHHH______________ . . . . . 101 NGQPQRGACGRWRGRSRSRRAATSRPERVWPDGVIPFVIGGNFTGSQRAV 150 _________________HHHHHH__________EEEEEE_____HHHHHH . . . . . 151 FRQAMRHWEKHTCVTFLERTDEDSYIVFTYRPCGCCSYVGRRGGGPQAIS 200 HHHHHHHHHHHEEEEEEE______EEEEEE_________________EEE . . . . . 201 IGKNCDKFGIVVHELGHVVGFWHEHTRPDRDRHVSIVRENIQPGQEYNFL 250 E______EEEEEEE__EEEEEEE_________EEEEEEE_________EE . . . . . 251 KMEPQEVESLGETYDFDSIMHYARNTFSRGIFLDTIVPKYEVNGVKPPIG 300 E___________EEE__HHHHHH______EEEE___EEEEE_________ . . . . . 
301 QRTRLSKGDIAQARKLYKCPACGETLQDSTGNFSSPEYPNGYSAHMHCVW 350 _________HHHHHHHH__________________________EEEEEEE . . . . . 351 RISVTPGEKIILNFTSLDLYRSRLCWYDYVEVRDGFWRKAPLRGRFCGSK 400 EEE____EEEEEE___________EEEEEEEE___________EEE____ . . . . . 401 LPEPIVSTDSRLWVEFRSSSNWVGKGFFAVYEAICGGDVKKDYGHIQSPN 450 ____________EEEEE_________EEEEEEEE________________ . . . . . 451 YPDDYRPSKVCIWRIQVSEGFHVGLTFQSFEIERHDSCAYDYLEVRDGHS 500 ________EEEEEEEEE____EEEEEE_EEEE________EEEEE_____ . . . . . 501 ESSTLIGRYCGYEKPDDIKSTSSRLWLKFVSDGSINKAGFAVNFFKEVDE 550 ___EEEEEEE__________HHHHHEEEEE_______EEEEEEEE_____ . . . . . 551 CSRPNRGGCEQRCLNTLGSYKCSCDPGYELAPDKRRCEAACGGFLTKLNG 600 _________________________________HHHHHHHH__EEEE___ . . . . . 601 SITSPGWPKEYPPNKNCIWQLVAPTQYRISLQFDFFETEGNDVCKYDFVE 650 _________________EEEEE____EEEEHHHHH________EEEEEEE . . . . . 651 VRSGLTADSKLHGKFCGSEKPEVITSQYNNMRVEFKSDNTVSKKGFKAHF 700 E____________________EEEE______EEEEE___________EEE . . . . . 701 FSDKDECSKDNGGCQQDCVNTFGSYECQCRSGFVLHDNKHDCKEAGCDHK 750 E________________________EEEE_____________________ . . . . . 751 VTSTSGTITSPNWPDKYPSKKECTWAISSTPGHRVKLTFMEMDIESQPEC 800 _____________________EEEEEEE_____EEEEEE___________ . . . . . 801 AYDHLEVFDGRDAKAPVLGRFCGSKKPEPVLATGSRMFLRFYSDNSVQRK 850 ___EEEE_________EEEEEE_______EEE____EEEEE______HHH . . . . . 851 GFQASHATECGGQVRADVKTKDLYSHAQFGDNNYPGGVDCEWVIVAEEGY 900 HHHHHHH______EEEEE_____________________EEEEEEEE___ . . . . . 901 GVELVFQTFEVEEETDCGYDYMELFDGYDSTAPRLGRYCGSGPPEEVYSA 950 _EEEEEE_EEEEEE_______EEEE____________________EEE__ . . . 
951 GDSVLVKFHSDDTITKKGFHLRYTSTKFQDTLHSRK 986
    __EEEEEEE__________EEEEE____________
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method         :    1
alpha-contents :  4.0 %
beta-contents  : 23.8 %
coil-contents  : 72.2 %
class          : beta

method         :    2
alpha-contents :  0.0 %
beta-contents  : 26.9 %
coil-contents  : 73.1 %
class          : beta
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-13.14 -0.19 -0.06 0.00 -4.00 0.00 0.00 0.00 -0.61 -8.78 -4.02 -12.00 -12.00 0.00 -12.00 0.00 -66.81
-2.41 -0.16 -0.25 -0.47 0.00 0.00 0.00 -0.55 -0.12 -10.03 -4.02 -12.00 -12.00 0.00 -12.00 0.00 -54.02
ID: gi|1245357|gb|AAA93462.1| AC: xxx Len: 980 1:I 949 Sc: -54.02 Pv: 2.768261e-01 NO_GPI_SITE

GPI: learning from protozoa
-21.71 -0.16 -0.02 0.00 -4.00 0.00 0.00 0.00 -0.13 -7.51 -13.36 -12.00 -12.00 0.00 -12.00 0.00 -82.89
-17.37 -0.84 -0.58 -1.10 0.00 0.00 0.00 0.00 0.00 -8.28 -13.36 -12.00 -12.00 0.00 -12.00 0.00 -77.53
ID: gi|1245357|gb|AAA93462.1| AC: xxx Len: 980 1:I 960 Sc: -77.53 Pv: 4.653755e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
gi|1245357|  0.724  23 Y  0.748  23 Y  0.967   3 Y  0.839 Y
# SignalP gram- predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
gi|1245357|  0.350 274 N  0.321  21 N  0.955   8 Y  0.612 Y
# SignalP gram+ predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
gi|1245357| 0.399 913 N 0.380 24 Y 0.948 116 N 0.801 Y ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ low complexity regions: SEG 12 2.2 2.5 >gi|1245357|gb|AAA93462.1| procollagen C-proteinase 1-5 MPGVA rlplllgllllprpgrpl 6-23 24-105 DLADYTYDLAEEDDSEPLNYKDPCKAAAFL GDIALDEEDLRAFQVQQAVDLRRHTARKSS IKAAVPGNTSTPSCQSTNGQPQ rgacgrwrgrsrsrraatsr 106-125 126-986 PERVWPDGVIPFVIGGNFTGSQRAVFRQAM RHWEKHTCVTFLERTDEDSYIVFTYRPCGC CSYVGRRGGGPQAISIGKNCDKFGIVVHEL GHVVGFWHEHTRPDRDRHVSIVRENIQPGQ EYNFLKMEPQEVESLGETYDFDSIMHYARN TFSRGIFLDTIVPKYEVNGVKPPIGQRTRL SKGDIAQARKLYKCPACGETLQDSTGNFSS PEYPNGYSAHMHCVWRISVTPGEKIILNFT SLDLYRSRLCWYDYVEVRDGFWRKAPLRGR FCGSKLPEPIVSTDSRLWVEFRSSSNWVGK GFFAVYEAICGGDVKKDYGHIQSPNYPDDY RPSKVCIWRIQVSEGFHVGLTFQSFEIERH DSCAYDYLEVRDGHSESSTLIGRYCGYEKP DDIKSTSSRLWLKFVSDGSINKAGFAVNFF KEVDECSRPNRGGCEQRCLNTLGSYKCSCD PGYELAPDKRRCEAACGGFLTKLNGSITSP GWPKEYPPNKNCIWQLVAPTQYRISLQFDF FETEGNDVCKYDFVEVRSGLTADSKLHGKF CGSEKPEVITSQYNNMRVEFKSDNTVSKKG FKAHFFSDKDECSKDNGGCQQDCVNTFGSY ECQCRSGFVLHDNKHDCKEAGCDHKVTSTS GTITSPNWPDKYPSKKECTWAISSTPGHRV KLTFMEMDIESQPECAYDHLEVFDGRDAKA PVLGRFCGSKKPEPVLATGSRMFLRFYSDN SVQRKGFQASHATECGGQVRADVKTKDLYS HAQFGDNNYPGGVDCEWVIVAEEGYGVELV FQTFEVEEETDCGYDYMELFDGYDSTAPRL GRYCGSGPPEEVYSAGDSVLVKFHSDDTIT KKGFHLRYTSTKFQDTLHSRK low complexity regions: SEG 25 3.0 3.3 >gi|1245357|gb|AAA93462.1| procollagen C-proteinase 1-1 M pgvarlplllgllllprpgrpldladytyd 2-43 laeeddseplny 44-88 KDPCKAAAFLGDIALDEEDLRAFQVQQAVD LRRHTARKSSIKAAV pgntstpscqstngqpqrgacgrwrgrsrs 89-126 rraatsrp 127-886 ERVWPDGVIPFVIGGNFTGSQRAVFRQAMR HWEKHTCVTFLERTDEDSYIVFTYRPCGCC SYVGRRGGGPQAISIGKNCDKFGIVVHELG HVVGFWHEHTRPDRDRHVSIVRENIQPGQE YNFLKMEPQEVESLGETYDFDSIMHYARNT FSRGIFLDTIVPKYEVNGVKPPIGQRTRLS KGDIAQARKLYKCPACGETLQDSTGNFSSP EYPNGYSAHMHCVWRISVTPGEKIILNFTS LDLYRSRLCWYDYVEVRDGFWRKAPLRGRF CGSKLPEPIVSTDSRLWVEFRSSSNWVGKG FFAVYEAICGGDVKKDYGHIQSPNYPDDYR PSKVCIWRIQVSEGFHVGLTFQSFEIERHD SCAYDYLEVRDGHSESSTLIGRYCGYEKPD DIKSTSSRLWLKFVSDGSINKAGFAVNFFK EVDECSRPNRGGCEQRCLNTLGSYKCSCDP 
GYELAPDKRRCEAACGGFLTKLNGSITSPG WPKEYPPNKNCIWQLVAPTQYRISLQFDFF ETEGNDVCKYDFVEVRSGLTADSKLHGKFC GSEKPEVITSQYNNMRVEFKSDNTVSKKGF KAHFFSDKDECSKDNGGCQQDCVNTFGSYE CQCRSGFVLHDNKHDCKEAGCDHKVTSTSG TITSPNWPDKYPSKKECTWAISSTPGHRVK LTFMEMDIESQPECAYDHLEVFDGRDAKAP VLGRFCGSKKPEPVLATGSRMFLRFYSDNS VQRKGFQASHATECGGQVRADVKTKDLYSH AQFGDNNYPG gvdcewvivaeegygvelvfqtfeveeetd 887-929 cgydymelfdgyd 930-986 STAPRLGRYCGSGPPEEVYSAGDSVLVKFH SDDTITKKGFHLRYTSTKFQDTLHSRK low complexity regions: SEG 45 3.4 3.75 >gi|1245357|gb|AAA93462.1| procollagen C-proteinase 1-1 M pgvarlplllgllllprpgrpldladytyd 2-65 laeeddseplnykdpckaaaflgdialdee dlra 66-69 FQVQ qavdlrrhtarkssikaavpgntstpscqs 70-156 tngqpqrgacgrwrgrsrsrraatsrperv wpdgvipfviggnftgsqravfrqamr 157-986 HWEKHTCVTFLERTDEDSYIVFTYRPCGCC SYVGRRGGGPQAISIGKNCDKFGIVVHELG HVVGFWHEHTRPDRDRHVSIVRENIQPGQE YNFLKMEPQEVESLGETYDFDSIMHYARNT FSRGIFLDTIVPKYEVNGVKPPIGQRTRLS KGDIAQARKLYKCPACGETLQDSTGNFSSP EYPNGYSAHMHCVWRISVTPGEKIILNFTS LDLYRSRLCWYDYVEVRDGFWRKAPLRGRF CGSKLPEPIVSTDSRLWVEFRSSSNWVGKG FFAVYEAICGGDVKKDYGHIQSPNYPDDYR PSKVCIWRIQVSEGFHVGLTFQSFEIERHD SCAYDYLEVRDGHSESSTLIGRYCGYEKPD DIKSTSSRLWLKFVSDGSINKAGFAVNFFK EVDECSRPNRGGCEQRCLNTLGSYKCSCDP GYELAPDKRRCEAACGGFLTKLNGSITSPG WPKEYPPNKNCIWQLVAPTQYRISLQFDFF ETEGNDVCKYDFVEVRSGLTADSKLHGKFC GSEKPEVITSQYNNMRVEFKSDNTVSKKGF KAHFFSDKDECSKDNGGCQQDCVNTFGSYE CQCRSGFVLHDNKHDCKEAGCDHKVTSTSG TITSPNWPDKYPSKKECTWAISSTPGHRVK LTFMEMDIESQPECAYDHLEVFDGRDAKAP VLGRFCGSKKPEPVLATGSRMFLRFYSDNS VQRKGFQASHATECGGQVRADVKTKDLYSH AQFGDNNYPGGVDCEWVIVAEEGYGVELVF QTFEVEEETDCGYDYMELFDGYDSTAPRLG RYCGSGPPEEVYSAGDSVLVKFHSDDTITK KGFHLRYTSTKFQDTLHSRK low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >gi|1245357|gb|AAA93462.1| procollagen C-proteinase MPGVARLPLLLGLLLLPRPGRPLDLADYTYDLAEEDDSEPLNYKDPCKAAAFLGDIALDE EDLRAFQVQQAVDLRRHTARKSSIKAAVPGNTSTPSCQSTNGQPQRGACgrwrgrsrsrR AATSRPERVWPDGVIPFVIGGNFTGSQRAVFRQAMRHWEKHTCVTFLERTDEDSYIVFTY 
RPCGCCSYVGRRGGGPQAISIGKNCDKFGIVVHELGHVVGFWHEHTRPDRDRHVSIVREN IQPGQEYNFLKMEPQEVESLGETYDFDSIMHYARNTFSRGIFLDTIVPKYEVNGVKPPIG QRTRLSKGDIAQARKLYKCPACGETLQDSTGNFSSPEYPNGYSAHMHCVWRISVTPGEKI ILNFTSLDLYRSRLCWYDYVEVRDGFWRKAPLRGRFCGSKLPEPIVSTDSRLWVEFRSSS NWVGKGFFAVYEAICGGDVKKDYGHIQSPNYPDDYRPSKVCIWRIQVSEGFHVGLTFQSF EIERHDSCAYDYLEVRDGHSESSTLIGRYCGYEKPDDIKSTSSRLWLKFVSDGSINKAGF AVNFFKEVDECSRPNRGGCEQRCLNTLGSYKCSCDPGYELAPDKRRCEAACGGFLTKLNG SITSPGWPKEYPPNKNCIWQLVAPTQYRISLQFDFFETEGNDVCKYDFVEVRSGLTADSK LHGKFCGSEKPEVITSQYNNMRVEFKSDNTVSKKGFKAHFFSDKDECSKDNGGCQQDCVN TFGSYECQCRSGFVLHDNKHDCKEAGCDHKVTSTSGTITSPNWPDKYPSKKECTWAISST PGHRVKLTFMEMDIESQPECAYDHLEVFDGRDAKAPVLGRFCGSKKPEPVLATGSRMFLR FYSDNSVQRKGFQASHATECGGQVRADVKTKDLYSHAQFGDNNYPGGVDCEWVIVAEEGY GVELVFQTFEVEEETDCGYDYMELFDGYDSTAPRLGRYCGSGPPEEVYSAGDSVLVKFHS DDTITKKGFHLRYTSTKFQDTLHSRK 1 - 109 MPGVARLPLL LGLLLLPRPG RPLDLADYTY DLAEEDDSEP LNYKDPCKAA AFLGDIALDE EDLRAFQVQQ AVDLRRHTAR KSSIKAAVPG NTSTPSCQST NGQPQRGAC 110 - 119 g rwrgrsrsr 120 - 986 R AATSRPERVW PDGVIPFVIG GNFTGSQRAV FRQAMRHWEK HTCVTFLERT DEDSYIVFT Y RPCGCCSYVG RRGGGPQAIS IGKNCDKFGI VVHELGHVVG FWHEHTRPDR DRHVSIVRE N IQPGQEYNFL KMEPQEVESL GETYDFDSIM HYARNTFSRG IFLDTIVPKY EVNGVKPPI G QRTRLSKGDI AQARKLYKCP ACGETLQDST GNFSSPEYPN GYSAHMHCVW RISVTPGEK I ILNFTSLDLY RSRLCWYDYV EVRDGFWRKA PLRGRFCGSK LPEPIVSTDS RLWVEFRSS S NWVGKGFFAV YEAICGGDVK KDYGHIQSPN YPDDYRPSKV CIWRIQVSEG FHVGLTFQS F EIERHDSCAY DYLEVRDGHS ESSTLIGRYC GYEKPDDIKS TSSRLWLKFV SDGSINKAG F AVNFFKEVDE CSRPNRGGCE QRCLNTLGSY KCSCDPGYEL APDKRRCEAA CGGFLTKLN G SITSPGWPKE YPPNKNCIWQ LVAPTQYRIS LQFDFFETEG NDVCKYDFVE VRSGLTADS K LHGKFCGSEK PEVITSQYNN MRVEFKSDNT VSKKGFKAHF FSDKDECSKD NGGCQQDCV N TFGSYECQCR SGFVLHDNKH DCKEAGCDHK VTSTSGTITS PNWPDKYPSK KECTWAISS T PGHRVKLTFM EMDIESQPEC AYDHLEVFDG RDAKAPVLGR FCGSKKPEPV LATGSRMFL R FYSDNSVQRK GFQASHATEC GGQVRADVKT KDLYSHAQFG DNNYPGGVDC EWVIVAEEG Y GVELVFQTFE VEEETDCGYD YMELFDGYDS TAPRLGRYCG SGPPEEVYSA GDSVLVKFH S DDTITKKGFH LRYTSTKFQD TLHSRK low complexity regions: DUST 
>gi|1245357|gb|AAA93462.1| procollagen C-proteinase MPGVARLPLLLGLLLLPRPGRPLDLADYTYDLAEEDDSEPLNYKDPCKAAAFLGDIALDE EDLRAFQVQQAVDLRRHTARKSSIKAAVPGNTSTPSCQSTNGQPQRGACGRWRGRSRSRR AATSRPERVWPDGVIPFVIGGNFTGSQRAVFRQAMRHWEKHTCVTFLERTDEDSYIVFTY RPCGCCSYVGRRGGGPQAISIGKNCDKFGIVVHELGHVVGFWHEHTRPDRDRHVSIVREN IQPGQEYNFLKMEPQEVESLGETYDFDSIMHYARNTFSRGIFLDTIVPKYEVNGVKPPIG QRTRLSKGDIAQARKLYKCPACGETLQDSTGNFSSPEYPNGYSAHMHCVWRISVTPGEKI ILNFTSLDLYRSRLCWYDYVEVRDGFWRKAPLRGRFCGSKLPEPIVSTDSRLWVEFRSSS NWVGKGFFAVYEAICGGDVKKDYGHIQSPNYPDDYRPSKVCIWRIQVSEGFHVGLTFQSF EIERHDSCAYDYLEVRDGHSESSTLIGRYCGYEKPDDIKSTSSRLWLKFVSDGSINKAGF AVNFFKEVDECSRPNRGGCEQRCLNTLGSYKCSCDPGYELAPDKRRCEAACGGFLTKLNG SITSPGWPKEYPPNKNCIWQLVAPTQYRISLQFDFFETEGNDVCKYDFVEVRSGLTADSK LHGKFCGSEKPEVITSQYNNMRVEFKSDNTVSKKGFKAHFFSDKDECSKDNGGCQQDCVN TFGSYECQCRSGFVLHDNKHDCKEAGCDHKVTSTSGTITSPNWPDKYPSKKECTWAISST PGHRVKLTFMEMDIESQPECAYDHLEVFDGRDAKAPVLGRFCGSKKPEPVLATGSRMFLR FYSDNSVQRKGFQASHATECGGQVRADVKTKDLYSHAQFGDNNYPGGVDCEWVIVAEEGY GVELVFQTFEVEEETDCGYDYMELFDGYDSTAPRLGRYCGSGPPEEVYSAGDSVLVKFHS DDTITKKGFHLRYTSTKFQDTLHSRK ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for gi|1245357|gb|AAA93462.1| sequence: 980 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 60 MPGVARLPLL LGLLLLPRPG RPLDLADYTY DLAEEDDSEP LNYKDPCKAA AFLGDIALDE ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
120 EDLRAFQVQQ AVDLRRHTAR KSSIKAAVPG NTSTPSCQST NGQPQRGACG RWRGRSRSRR ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 180 AATSRPERVW PDGVIPFVIG GNFTGSQRAV FRQAMRHWEK HTCVTFLERT DEDSYIVFTY ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 240 RPCGCCSYVG RRGGGPQAIS IGKNCDKFGI VVHELGHVVG FWHEHTRPDR DRHVSIVREN ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
300 IQPGQEYNFL KMEPQEVESL GETYDFDSIM HYARNTFSRG IFLDTIVPKY EVNGVKPPIG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 360 QRTRLSKGDI AQARKLYKCP ACGETLQDST GNFSSPEYPN GYSAHMHCVW RISVTPGEKI ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 420 ILNFTSLDLY RSRLCWYDYV EVRDGFWRKA PLRGRFCGSK LPEPIVSTDS RLWVEFRSSS ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
480 NWVGKGFFAV YEAICGGDVK KDYGHIQSPN YPDDYRPSKV CIWRIQVSEG FHVGLTFQSF ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 540 EIERHDSCAY DYLEVRDGHS ESSTLIGRYC GYEKPDDIKS TSSRLWLKFV SDGSINKAGF ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 600 AVNFFKEVDE CSRPNRGGCE QRCLNTLGSY KCSCDPGYEL APDKRRCEAA CGGFLTKLNG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
660 SITSPGWPKE YPPNKNCIWQ LVAPTQYRIS LQFDFFETEG NDVCKYDFVE VRSGLTADSK ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 720 LHGKFCGSEK PEVITSQYNN MRVEFKSDNT VSKKGFKAHF FSDKDECSKD NGGCQQDCVN ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 780 TFGSYECQCR SGFVLHDNKH DCKEAGCDHK VTSTSGTITS PNWPDKYPSK KECTWAISST ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
840 PGHRVKLTFM EMDIESQPEC AYDHLEVFDG RDAKAPVLGR FCGSKKPEPV LATGSRMFLR ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 900 FYSDNSVQRK GFQASHATEC GGQVRADVKT KDLYSHAQFG DNNYPGGVDC EWVIVAEEGY ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 960 GVELVFQTFE VEEETDCGYD YMELFDGYDS TAPRLGRYCG SGPPEEVYSA GDSVLVKFHS ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . 
| DDTITKKGFH LRYTSTKFQD ~~~~~~~~~~ ~~~~~~~~~~ ---------- ---------- ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** tem25.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem25.___inter___ (1 sequences) MPGVARLPLLLGLLLLPRPGRPLDLADYTYDLAEEDDSEPLNYKDPCKAA AFLGDIALDEEDLRAFQVQQAVDLRRHTARKSSIKAAVPGNTSTPSCQST NGQPQRGACGRWRGRSRSRRAATSRPERVWPDGVIPFVIGGNFTGSQRAV FRQAMRHWEKHTCVTFLERTDEDSYIVFTYRPCGCCSYVGRRGGGPQAIS IGKNCDKFGIVVHELGHVVGFWHEHTRPDRDRHVSIVRENIQPGQEYNFL KMEPQEVESLGETYDFDSIMHYARNTFSRGIFLDTIVPKYEVNGVKPPIG QRTRLSKGDIAQARKLYKCPACGETLQDSTGNFSSPEYPNGYSAHMHCVW RISVTPGEKIILNFTSLDLYRSRLCWYDYVEVRDGFWRKAPLRGRFCGSK LPEPIVSTDSRLWVEFRSSSNWVGKGFFAVYEAICGGDVKKDYGHIQSPN YPDDYRPSKVCIWRIQVSEGFHVGLTFQSFEIERHDSCAYDYLEVRDGHS ESSTLIGRYCGYEKPDDIKSTSSRLWLKFVSDGSINKAGFAVNFFKEVDE CSRPNRGGCEQRCLNTLGSYKCSCDPGYELAPDKRRCEAACGGFLTKLNG SITSPGWPKEYPPNKNCIWQLVAPTQYRISLQFDFFETEGNDVCKYDFVE VRSGLTADSKLHGKFCGSEKPEVITSQYNNMRVEFKSDNTVSKKGFKAHF FSDKDECSKDNGGCQQDCVNTFGSYECQCRSGFVLHDNKHDCKEAGCDHK VTSTSGTITSPNWPDKYPSKKECTWAISSTPGHRVKLTFMEMDIESQPEC AYDHLEVFDGRDAKAPVLGRFCGSKKPEPVLATGSRMFLRFYSDNSVQRK GFQASHATECGGQVRADVKTKDLYSHAQFGDNNYPGGVDCEWVIVAEEGY GVELVFQTFEVEEETDCGYDYMELFDGYDSTAPRLGRYCGSGPPEEVYSA GDSVLVKFHSDDTITKKGFHLRYTSTKFQDTLHSRK (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score 
Certainity ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment Loop length 986 K+R profile + CYT-EXT prof 0.37 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 0.00 Tm probability: 1.00 -> Orientation: undecided Charge-difference over N-terminal Tm (+-15 residues): 3.00 (NEG-POS)/(NEG+POS): 0.0312 NEG: 132.0000 POS: 124.0000 -> Orientation: N-in CYT-EXT difference: 0.37 -> Orientation: N-out ---------------------------------------------------------------------- "tem25" 986 ************************************ *TOPPREDM with prokaryotic function* ************************************ tem25.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem25.___inter___ (1 sequences) MPGVARLPLLLGLLLLPRPGRPLDLADYTYDLAEEDDSEPLNYKDPCKAA AFLGDIALDEEDLRAFQVQQAVDLRRHTARKSSIKAAVPGNTSTPSCQST NGQPQRGACGRWRGRSRSRRAATSRPERVWPDGVIPFVIGGNFTGSQRAV FRQAMRHWEKHTCVTFLERTDEDSYIVFTYRPCGCCSYVGRRGGGPQAIS IGKNCDKFGIVVHELGHVVGFWHEHTRPDRDRHVSIVRENIQPGQEYNFL KMEPQEVESLGETYDFDSIMHYARNTFSRGIFLDTIVPKYEVNGVKPPIG QRTRLSKGDIAQARKLYKCPACGETLQDSTGNFSSPEYPNGYSAHMHCVW RISVTPGEKIILNFTSLDLYRSRLCWYDYVEVRDGFWRKAPLRGRFCGSK LPEPIVSTDSRLWVEFRSSSNWVGKGFFAVYEAICGGDVKKDYGHIQSPN YPDDYRPSKVCIWRIQVSEGFHVGLTFQSFEIERHDSCAYDYLEVRDGHS ESSTLIGRYCGYEKPDDIKSTSSRLWLKFVSDGSINKAGFAVNFFKEVDE CSRPNRGGCEQRCLNTLGSYKCSCDPGYELAPDKRRCEAACGGFLTKLNG SITSPGWPKEYPPNKNCIWQLVAPTQYRISLQFDFFETEGNDVCKYDFVE VRSGLTADSKLHGKFCGSEKPEVITSQYNNMRVEFKSDNTVSKKGFKAHF FSDKDECSKDNGGCQQDCVNTFGSYECQCRSGFVLHDNKHDCKEAGCDHK VTSTSGTITSPNWPDKYPSKKECTWAISSTPGHRVKLTFMEMDIESQPEC AYDHLEVFDGRDAKAPVLGRFCGSKKPEPVLATGSRMFLRFYSDNSVQRK GFQASHATECGGQVRADVKTKDLYSHAQFGDNNYPGGVDCEWVIVAEEGY GVELVFQTFEVEEETDCGYDYMELFDGYDSTAPRLGRYCGSGPPEEVYSA GDSVLVKFHSDDTITKKGFHLRYTSTKFQDTLHSRK (p)rokaryotic or (e)ukaryotic: p Charge-pair 
energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment Loop length 986 K+R profile + CYT-EXT prof 0.37 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 0.00 Tm probability: 1.00 -> Orientation: undecided Charge-difference over N-terminal Tm (+-15 residues): 3.00 (NEG-POS)/(NEG+POS): 0.0312 NEG: 132.0000 POS: 124.0000 -> Orientation: N-in CYT-EXT difference: 0.37 -> Orientation: N-out ---------------------------------------------------------------------- "tem25" 986 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ NOW EXECUTING: /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem25.___saps___ SAPS. Version of April 11, 1996. 
Date run: Mon Nov 13 22:07:08 2000
File: /people/maria/tem25.___saps___
ID   gi|1245357|gb|AAA93462.1|
DE   procollagen C-proteinase

number of residues: 986; molecular weight: 111.3 kdal

    1  MPGVARLPLL LGLLLLPRPG RPLDLADYTY DLAEEDDSEP LNYKDPCKAA AFLGDIALDE
   61  EDLRAFQVQQ AVDLRRHTAR KSSIKAAVPG NTSTPSCQST NGQPQRGACG RWRGRSRSRR
  121  AATSRPERVW PDGVIPFVIG GNFTGSQRAV FRQAMRHWEK HTCVTFLERT DEDSYIVFTY
  181  RPCGCCSYVG RRGGGPQAIS IGKNCDKFGI VVHELGHVVG FWHEHTRPDR DRHVSIVREN
  241  IQPGQEYNFL KMEPQEVESL GETYDFDSIM HYARNTFSRG IFLDTIVPKY EVNGVKPPIG
  301  QRTRLSKGDI AQARKLYKCP ACGETLQDST GNFSSPEYPN GYSAHMHCVW RISVTPGEKI
  361  ILNFTSLDLY RSRLCWYDYV EVRDGFWRKA PLRGRFCGSK LPEPIVSTDS RLWVEFRSSS
  421  NWVGKGFFAV YEAICGGDVK KDYGHIQSPN YPDDYRPSKV CIWRIQVSEG FHVGLTFQSF
  481  EIERHDSCAY DYLEVRDGHS ESSTLIGRYC GYEKPDDIKS TSSRLWLKFV SDGSINKAGF
  541  AVNFFKEVDE CSRPNRGGCE QRCLNTLGSY KCSCDPGYEL APDKRRCEAA CGGFLTKLNG
  601  SITSPGWPKE YPPNKNCIWQ LVAPTQYRIS LQFDFFETEG NDVCKYDFVE VRSGLTADSK
  661  LHGKFCGSEK PEVITSQYNN MRVEFKSDNT VSKKGFKAHF FSDKDECSKD NGGCQQDCVN
  721  TFGSYECQCR SGFVLHDNKH DCKEAGCDHK VTSTSGTITS PNWPDKYPSK KECTWAISST
  781  PGHRVKLTFM EMDIESQPEC AYDHLEVFDG RDAKAPVLGR FCGSKKPEPV LATGSRMFLR
  841  FYSDNSVQRK GFQASHATEC GGQVRADVKT KDLYSHAQFG DNNYPGGVDC EWVIVAEEGY
  901  GVELVFQTFE VEEETDCGYD YMELFDGYDS TAPRLGRYCG SGPPEEVYSA GDSVLVKFHS
  961  DDTITKKGFH LRYTSTKFQD TLHSRK

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  : 48( 4.9%); C+ : 41( 4.2%); D  : 68( 6.9%); E  : 64( 6.5%); F  : 52( 5.3%)
G  : 86( 8.7%); H  : 27( 2.7%); I  : 36( 3.7%); K  : 58( 5.9%); L  : 58( 5.9%)
M  : 10( 1.0%); N  : 32( 3.2%); P  : 53( 5.4%); Q  : 33( 3.3%); R  : 67( 6.8%)
S  : 80( 8.1%); T  : 51( 5.2%); V  : 63( 6.4%); W  : 16( 1.6%); Y  : 43( 4.4%)

KR     : 125 ( 12.7%);  ED     : 132 ( 13.4%);  AGP    : 187 ( 19.0%);
KRED   : 257 ( 26.1%);  KR-ED  :  -7 ( -0.7%);  FIKMNY : 231 ( 23.4%);
LVIFM  : 219 ( 22.2%);  ST     : 131 ( 13.3%).
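
The counts and percentages in the compositional analysis are plain residue tallies divided by the sequence length, and the grouped figures (KR, ED, KR-ED and so on) are sums and differences of those tallies. A minimal Python sketch of that arithmetic is given here for readers who want to recheck the numbers. It is not SAPS itself: the function names composition and group_total are hypothetical, the example string is only the first 30 residues of the sequence, and the comparison against the swp23s reference set is not attempted.

```python
from collections import Counter


def composition(seq):
    """Return {residue: (count, percent)} for a protein sequence.

    Percentages are rounded to one decimal place, as in the report.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    n = len(seq)
    return {aa: (c, round(100.0 * c / n, 1)) for aa, c in sorted(counts.items())}


def group_total(seq, residues):
    """Return (count, percent) for a residue group such as 'KR' or 'ED'."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    total = sum(seq.count(aa) for aa in residues)
    return total, round(100.0 * total / len(seq), 1)


# Toy input (assumption for illustration): residues 1-30 of the query.
toy = "MPGVARLPLLLGLLLLPRPGRPLDLADYTY"

kr = group_total(toy, "KR")      # basic residues
ed = group_total(toy, "ED")      # acidic residues
net_charge = kr[0] - ed[0]       # the KR-ED figure

if __name__ == "__main__":
    for aa, (count, pct) in composition(toy).items():
        print(f"{aa} : {count:3d} ({pct:4.1f}%)")
    print("KR    :", kr)
    print("ED    :", ed)
    print("KR-ED :", net_charge)
```

Running the same functions on the full 986-residue sequence should give the per-residue counts listed in the table, provided the sequence is pasted without the position numbers and spaces.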
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

    1  00000+0000 0000000+00 +00-00-000 -00----0-0 000+-00+00 0000-000--
   61  --0+000000 00-0++000+ +000+00000 0000000000 00000+0000 +0+0+0+0++
  121  0000+0-+00 0-00000000 0000000+00 0+000+00-+ 0000000-+0 ---0000000
  181  +000000000 ++00000000 00+00-+000 000-000000 000-00+0-+ -+00000+-0
  241  00000-0000 +0-00-0-00 0-00-0-000 000+0000+0 000-0000+0 -0000+0000
  301  0+0+00+0-0 000++00+00 000-000-00 000000-000 0000000000 +000000-+0
  361  0000000-00 +0+0000-00 -0+-000++0 00+0+0000+ 00-00000-0 +000-0+000
  421  0000+00000 0-00000-0+ +-00000000 00--0+00+0 000+0000-0 0000000000
  481  -0-+0-0000 -00-0+-000 -000000+00 00-+0--0+0 000+000+00 0-0000+000
  541  00000+-0-- 00+00+000- 0+00000000 +000-000-0 00-+++0-00 000000+000
  601  00000000+- 0000+00000 0000000+00 000-00-0-0 0-00+0-00- 0+00000-0+
  661  000+0000-+ 0-00000000 0+0-0+0-00 00++00+000 00-+--00+- 000000-000
  721  00000-000+ 000000-0+0 -0+-000-0+ 0000000000 0000-+000+ +-00000000
  781  000+0+0000 -0-0-000-0 00-00-00-0 +-0+00000+ 0000++0-00 00000+000+
  841  000-0000++ 00000000-0 0000+0-0+0 +-00000000 -0000000-0 -00000--00
  901  00-000000- 0---0-000- 00-00-00-0 000+00+000 0000--0000 0-0000+000
  961  --000++000 0+0000+00- 0000++

A. CHARGE CLUSTERS.

Positive charge clusters (cmin = 12/30 or 15/45 or 19/60): none
Negative charge clusters (cmin = 12/30 or 16/45 or 19/60): none
Mixed charge clusters (cmin = 18/30 or 24/45 or 30/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.

C. CHARGE RUNS AND PATTERNS.
pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     5 |   6 |   8 |  34 |  10 |  11 |  14 |  12 |  12 |  17 |   7 |   9 |
lmin1     7 |   7 |  10 |  41 |  13 |  13 |  17 |  15 |  15 |  20 |   9 |  12 |
lmin2     8 |   8 |  12 |  45 |  14 |  14 |  19 |  17 |  17 |  22 |  10 |  13 |
(Significance level: 0.010000; Minimal displayed length: 6)

(+0)  10(0,0,0); at 110- 119:  GRWRGRSRSR
      (1. quartile)            0+0+0+0+0+
(-00) 16(0,1,0); at 915- 931:  TDCGYDYMELFDGYDST
      (4. quartile)            0-000-00-00-00-00
(*00) 24(0,1,1); at 915- 939:  TDCGYDYMELFDGYDSTAPRLGRYC
      (4. quartile)            0-000-00-00-00-0000+00+00

Run count statistics:

  + runs >=  4: 0
  - runs >=  4: 2, at 34; 59;
  * runs >=  6: 0
  0 runs >= 22: 0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.

There are no high scoring hydrophobic segments.
There are no high scoring transmembrane segments.

2. SPACINGS OF C.

H2N-46-C-49-C-11-C-53-C-19-C-1-C-C-18-C-113-C-2-C-25-C-26-C-21-C-37-C-25-C-26-C-21-C-40-C-7-C-3-C-8-C-1-C-12-C-3-C-25-C-26-C-21-C-40-C-6-C-3-C-8-C-1-C-12-C-4-C-25-C-26-C-21-C-37-C-29-C-26-C-21-C-47-COOH

2*. SPACINGS OF C and H. (additional deluxe function for ALEX)

H2N-46-C-29-H-19-C-11-C-47-H-3-H-1-C-19-C-1-C-C-18-C-7-H-3-H-5-H-1-H-7-H-37-H-47-C-2-C-22-H-1-H-C-26-C-21-C-37-C-9-H-15-C-10-H-12-H-2-C-10-H-10-C-40-C-7-C-3-C-8-C-1-C-12-C-3-C-25-C-26-C-17-H-3-C-32-H-7-C-6-C-3-C-8-C-1-C-6-H-3-H-1-C-4-C-1-H-23-C-9-H-16-C-3-H-17-C-33-H-3-C-15-H-13-C-26-C-21-C-19-H-10-H-12-H-3-COOH

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 5

Aligned matching blocks:

[ 377- 385]   YDYVEVRDG
[ 490- 498]   YDYLEVRDG

______________________________

[ 377- 385]-( 3)-[ 389- 404]
[ 802- 810]-( 3)-[ 814- 829]

[ 377- 385]   YDYVEVRDG
[ 802- 810]   YDHLEVFDG

[ 389- 404]   KAPLRGRFCGSKLPEP
[ 814- 829]   KAPVLGRFCGSKKPEP

______________________________

[ 507- 511]   GRYCG
[ 936- 940]   GRYCG

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
   (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)

Repeat core block length: 9

Aligned matching blocks:

[ 477- 503]   inoi-i-+h-ocsa-ai-i+-sho-oo
[ 906- 931]   inoi-i---o-csa-am-ii-s_a-oo

with superset:

[ 377- 385]   a-ai-i+-s
[ 490- 498]   a-ai-i+-s
[ 646- 654]   a-ii-i+os
[ 802- 810]   a-hi-ii-s
[ 919- 927]   a-am-ii-s

--------------------------------------------------------------------------------
MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets: 53 (Expected range: 32-- 75)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 15   (6-10) 10   (11-20) 14   (>=21) 15

3. Clusters of amino acid multiplets (cmin = 10/30 or 13/45 or 16/60): none

B. CHARGE ALPHABET.

1. Total number of charge multiplets: 25 (Expected range: 13-- 46)
   14 +plets (f+: 12.7%), 11 -plets (f-: 13.4%)
   Total number of charge altplets: 24 (Critical number: 51)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 3   (6-10) 0   (11-20) 7   (>=21) 16

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core: 4; !-core: 5)

Location        Period  Element         Copies  Core    Errors
   7-  16       1       L               8       4       2
  14-  49       9       L...D....       4       4       /0/./././1/././././
 111- 120       2       R.              5       5 !     0
 195- 222       7       G......         4       4       0
 519- 554       9       K........       4       4       0

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 7)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9)

Location        Period  Element         Copies  Core    Errors
   1-  18       3       i0.
                                        6       6       /0/2/./
 111- 120       2       +0              5       5       /0/1/
 368- 397       5       *.0.0           6       6       /0/./2/./2/
 952- 981       5       *0.0.           6       6       /0/2/./2/./

--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location   (Quartile)  Spacing    Rank        P-value   Interpretation
  92-  94  (1.)        T( 2)T     52 of 52    0.0034    large minimal spacing
 752- 754  (4.)        T( 2)T     48 of 52    0.0034    matching minimum
 757- 759  (4.)        T( 2)T     49 of 52    0.0034    matching minimum
 963- 965  (4.)        T( 2)T     50 of 52    0.0034    matching minimum
 974- 976  (4.)        T( 2)T     51 of 52    0.0034    matching minimum

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Pfam (from /data/patterns/pfam)

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/Pfam
Sequence file:            tem25
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  gi|1245357|gb|AAA93462.1|  procollagen C-proteinase

Scores for sequence family classification (score includes all domains):
Model           Description                              Score    E-value  N
--------        -----------                              -----    ------- ---
CUB             CUB domain                              1005.1   1.6e-298   5
Astacin         Astacin (Peptidase family M12A)          396.1   3.5e-115   1
EGF             EGF-like domain                           56.0      8e-13   2
LIM             LIM domain containing proteins             4.0         12   2
HCV_capsid      Hepatitis C virus capsid protein           1.4         46   1
AT_hook         DNA binding domain with preference fo      1.2         94   1
neur            Neuraminidases                            -2.5         92   1
Somatomedin_B   Somatomedin B domain                     -16.8         73   1
TIL             Trypsin Inhibitor like cysteine rich     -18.8         19   1
Metallothio_PEC Plant PEC family metallothionein         -49.6         63   1
DUF35           Domain of unknown function DUF35         -53.7         42   1
NAM             No apical meristem (NAM) protein         -75.1         60   1
A4_EXTRA        Amyloid A4 extracellular domain         -104.7         19   1

Parsed for domains:
Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
HCV_capsid 1/1 99 106 .. 1 9 [. 1.4 46 AT_hook 1/1 110 122 .. 1 13 [] 1.2 94 neur 1/1 130 139 .. 477 487 .] -2.5 92 NAM 1/1 74 183 .. 1 134 [] -75.1 60 A4_EXTRA 1/1 46 187 .. 1 180 [] -104.7 19 Astacin 1/1 128 321 .. 1 200 [] 396.1 3.5e-115 CUB 1/5 322 431 .. 1 116 [] 205.5 7.9e-58 DUF35 1/1 383 487 .. 1 131 [] -53.7 42 CUB 2/5 435 544 .. 1 116 [] 228.5 9.6e-65 EGF 1/2 551 587 .. 1 45 [] 35.0 1.7e-06 Somatomedin_B 1/1 555 596 .. 1 46 [] -16.8 73 LIM 1/2 634 653 .. 40 61 .] 2.7 28 CUB 3/5 591 700 .. 1 116 [] 210.8 2.1e-59 LIM 2/2 714 723 .. 52 61 .] 1.6 56 TIL 1/1 707 742 .. 1 67 [] -18.8 19 EGF 2/2 707 742 .. 1 45 [] 29.0 0.00011 Metallothio_PEC 1/1 713 760 .. 1 80 [] -49.6 63 CUB 4/5 747 856 .. 1 116 [] 207.1 2.7e-58 CUB 5/5 860 973 .. 1 116 [] 174.5 1.7e-48 Alignments of top-scoring domains: HCV_capsid: domain 1 of 1, from 99 to 106: score 1.4, E = 46 *->mstnpKPqR<-* stn++PqR gi|1245357 99 -STNGQPQR 106 AT_hook: domain 1 of 1, from 110 to 122: score 1.2, E = 94 *->kRkRGRPrKakta<-* +R RGR r ++a gi|1245357 110 GRWRGRSRSRRAA 122 neur: domain 1 of 1, from 130 to 139: score -2.5, E = 92 *->WpDGAdlpffi<-* WpDG+ +pf i gi|1245357 130 WPDGV-IPFVI 139 NAM: domain 1 of 1, from 74 to 183: score -75.1, E = 60 *->lPpGFRFhPTDEELvvhYLrnKvlgkslshvvevIpeiDlykfdPWd Lr+ ++ks+ gi|1245357 74 ------------------LRRHTARKSSI------------------ 84 LPekakigekDqEWYFFsprdrKYpnG.............dRtnRaTksG + a +g+ s++ +nG+++++ ++ ++++R++Ra++s+ gi|1245357 85 --KAAVPGNT-------STPSCQSTNGqpqrgacgrwrgrSRSRRAATSR 125 ..............YWKaTGkDRpImrksgnrrliGmKKtLVFYkGRapk +++ +++ + ++ TG R ++r + +K t V + gi|1245357 126 pervwpdgvipfviGGNFTGSQRAVFR---QAMRHWEKHTCVTF------ 166 GqkTd...WvMHEYRLt<-* ++Td+++++ +YR + gi|1245357 167 LERTDedsYIVFTYRPC 183 A4_EXTRA: domain 1 of 1, from 46 to 187: score -104.7, E = 19 *->gaaeakaePQiAmlCGrlnlhmnlqtseeGrWetDpsrtktGPtCLr a + iA + l ++ +q D+ r + gi|1245357 46 PCKAAAFLGDIALDEEDLRAFQVQQA-------VDLRRHTA------ 79 dKedvLqYCrkaYPelqITNvvEasqpvk....IedWCrrgrsnAAqCkg K + + a P T +++ 
+++++ ++W +r+rs A gi|1245357 80 RKSSI----KAAVPGNTSTPSCQSTNGQPqrgaCGRWRGRSRSRRAATSR 125 hhhsViPFrCLvGEFvSdALLVPegCqFlHqermdqCedhqrWhqeakea + V P + +Fv g F r+ +W + gi|1245357 126 P-ERVWPDGVI--PFV-------IGGNFTGSQRAVFRQAMRHWEKHTCVT 165 CsekkskGnkgmilrsfgMLLPCGiDkFrGVEFVCCP<-* +e+ + i+ ++ PCG CC gi|1245357 166 FLER--TDEDSYIVFTY---RPCG----------CCS 187 Astacin: domain 1 of 1, from 128 to 321: score 396.1, E = 3.5e-115 *->rrWpngsGiVvIPYvisssysgrersliraAmrewenkTCirFvprt r Wp+g vIP+vi ++++g +r+++r+Amr+we++TC++F +rt gi|1245357 128 RVWPDG----VIPFVIGGNFTGSQRAVFRQAMRHWEKHTCVTFLERT 170 sagendylrffsgdGCwSyVGrrggGkeQevSlganGCiyfGiivHElmH ++ +++++++++++GC+SyVGrrggG+ Q +S+g ++C++fGi+vHEl+H gi|1245357 171 DE-DSYIVFTYRPCGCCSYVGRRGGGP-QAISIG-KNCDKFGIVVHELGH 217 ALGFwHEQsRpDRDdyVsInwqNIdpgqeynFdKydpdqvdslGvpYDYg ++GFwHE++RpDRD++VsI+++NI+pgqeynF K+ p +v+slG++YD++ gi|1245357 218 VVGFWHEHTRPDRDRHVSIVRENIQPGQEYNFLKMEPQEVESLGETYDFD 267 SiMHYgpyaFSkngskpTIvpkdnk.vyqatiGQReglSflDikkiNklY SiMHY++++FS+ + ++TIvpk + ++ +++iGQR +lS++Di+++ klY gi|1245357 268 SIMHYARNTFSRGIFLDTIVPKYEVnGVKPPIGQRTRLSKGDIAQARKLY 317 nCpe<-* +Cp+ gi|1245357 318 KCPA 321 CUB: domain 1 of 5, from 322 to 431: score 205.5, E = 7.9e-58 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt CG+t l++s+G++sSP+YPn Y +++CvWrI+++pg++ + L+ gi|1245357 322 CGET--LQDSTGNFSSPEYPN--GYSAHMHCVWRISVTPGEK-IILN 363 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr F+++dl++++ C+yDyvE+rDG++++ pl+GrfCGs++Pe+i+St++r gi|1245357 364 FTSLDLYRSRL--CWYDYVEVRDGFWRKaPLRGRFCGSKLPEPIVSTDSR 411 mlikFvsDasvskrGFkAty<-* ++++F s++++ ++GF+A y gi|1245357 412 LWVEFRSSSNWVGKGFFAVY 431 DUF35: domain 1 of 1, from 383 to 487: score -53.7, E = 42 *->aktkffrklkkegkliGqkCkkCGrvffPPRaiC...peCgsktell ++ ++++ + +g++ G+k + P+ ++++ ++ + gi|1245357 383 RDGFWRKAPL-RGRFCGSKLPE------PIVSTDsrlWVEFRSS-S- 420 EwVElSgrGkVetFTvvylpppg.....kagfedeeePyviAvveLdggp +wV g+G F vy + g++ ++ +g + + y+ d+++ gi|1245357 421 NWV---GKG----FFAVYEAICGgdvkkDYGHI-QSPNYP------DDYR 456 
eglrvlgqlvdvdpeeVkiGmeVeavwrkvkeeedegkityi<-* + + + ++ +V +G+ V +++ ++e ++++ gi|1245357 457 P--SKVCIWR----IQVSEGFHVGLTFQ-----SFEIERHDS 487 CUB: domain 2 of 5, from 435 to 544: score 228.5, E = 9.6e-65 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt CGg+ +++ +G+i+SPnYP+ dY+p+k C+WrI++ +g++ V Lt gi|1245357 435 CGGD--VKKDYGHIQSPNYPD--DYRPSKVCIWRIQVSEGFH-VGLT 476 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr Fq+F++E+hd+ C+yDy+E+rDG+++s++l+Gr+CG++kP+di+Sts+r gi|1245357 477 FQSFEIERHDS--CAYDYLEVRDGHSESsTLIGRYCGYEKPDDIKSTSSR 524 mlikFvsDasvskrGFkAty<-* +++kFvsD+s++k+GF+ ++ gi|1245357 525 LWLKFVSDGSINKAGFAVNF 544 EGF: domain 1 of 2, from 551 to 587: score 35.0, E = 1.7e-06 *->Capnn..pCsngGtCvntpggssdnfggytCeCppGdyylsytGkrC C+ +n+++C + +C+nt g +y+C C pG y+l+ + +rC gi|1245357 551 CSRPNrgGCEQ--RCLNTLG-------SYKCSCDPG-YELAPDKRRC 587 <-* gi|1245357 - - Somatomedin_B: domain 1 of 1, from 555 to 596: score -16.8, E = 73 *->dqwSCkGfRCgEgfnaglkkCrCd...dlCksygdCCtDYeevCkge + C+ RC + + kC+Cd++ +l ++ C e+ C g gi|1245357 555 NRGGCEQ-RCLNTLGSY--KCSCDpgyELAPDKRRC----EAACGGF 594 vs<-* ++ gi|1245357 595 LT 596 LIM: domain 1 of 2, from 634 to 653: score 2.7, E = 28 *->defyekdgkelYCkhDyyklfg<-* d f+e +g +Ck+D+ + ++ gi|1245357 634 D-FFETEGN-DVCKYDFVEVRS 653 CUB: domain 3 of 5, from 591 to 700: score 210.8, E = 2.1e-59 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt CGg lt+ +Gsi+SP++P+ +Yppnk+C+W+++ap +yr ++L+ gi|1245357 591 CGGF--LTKLNGSITSPGWPK--EYPPNKNCIWQLVAPTQYR-ISLQ 632 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr F+ F++E++d C+yD+vE+r G+ ++++l+G+fCGs+kPe+i+S+ n+ gi|1245357 633 FDFFETEGNDV--CKYDFVEVRSGLTADsKLHGKFCGSEKPEVITSQYNN 680 mlikFvsDasvskrGFkAty<-* m+++F+sD++vsk+GFkA++ gi|1245357 681 MRVEFKSDNTVSKKGFKAHF 700 LIM: domain 2 of 2, from 714 to 723: score 1.6, E = 56 *->CkhDyyklfg<-* C++D+ + fg gi|1245357 714 CQQDCVNTFG 723 TIL: domain 1 of 1, from 707 to 742: score -18.8, E = 19 *->Cpa.neqyteCgpsCepsCsnpdgplettppCegtSpkvPstCkeg. 
C ++n g+ C+ C n+ g + + gi|1245357 707 CSKdN------GG-CQQDCVNTFG--------------------SYe 726 CvCqpGyVrnndgdkCVprseC<-* C+C++G+V++++ + +C gi|1245357 727 CQCRSGFVLHDN------KHDC 742 EGF: domain 2 of 2, from 707 to 742: score 29.0, E = 0.00011 *->Capnn.pCsngGtCvntpggssdnfggytCeCppGdyylsytGkrC< C+++n++C++ Cvnt g +y+C+C G ++l+ + ++C gi|1245357 707 CSKDNgGCQQ--DCVNTFG-------SYECQCRSG-FVLHDNKHDC 742 -* gi|1245357 - - Metallothio_PEC: domain 1 of 1, from 713 to 760: score -49.6, E = 63 *->gCDDkCGCpsPCPGGkaCRCtsggaaeAsaGdqEHttCpCGEHCGCN gC C + G C C+sg d H C gi|1245357 713 GCQQDC---VNTFGSYECQCRSG----FVLHDNKH---------DCK 743 PCtCpksetptgrkgrRAnCsCGagCtCasCAS<-* C t t++ S gi|1245357 744 EAGCDHKVTSTSGTI----------------TS 760 CUB: domain 4 of 5, from 747 to 856: score 207.1, E = 2.7e-58 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt C + +t++sG+i+SPn+P+ +Yp +keC+W I+ +pg+r V+Lt gi|1245357 747 CDHK--VTSTSGTITSPNWPD--KYPSKKECTWAISSTPGHR-VKLT 788 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr F ++d+E++++ C+yD++E++DG++++ p+lGrfCGs+kPe++ t++r gi|1245357 789 FMEMDIESQPE--CAYDHLEVFDGRDAKaPVLGRFCGSKKPEPVLATGSR 836 mlikFvsDasvskrGFkAty<-* m+++F+sD+sv+++GF+A++ gi|1245357 837 MFLRFYSDNSVQRKGFQASH 856 CUB: domain 5 of 5, from 860 to 973: score 174.5, E = 1.7e-48 *->CGgtl..dltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVe CGg+++ d+++++ + ++ + n +Yp ++C+W+I+a++gy Ve gi|1245357 860 CGGQVraDVKTKDLYSHAQFGDN--NYPGGVDCEWVIVAEEGYG-VE 903 LtFqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirSts L Fq F++E++++ C+yDy E +DG++s+ p lGr+CGsg+Pe ++S + gi|1245357 904 LVFQTFEVEEETD--CGYDYMELFDGYDSTaPRLGRYCGSGPPEEVYSAG 951 nrmlikFvsDasvskrGFkAty<-* +++l+kF+sD++++k+GF+++y gi|1245357 952 DSVLVKFHSDDTITKKGFHLRY 973 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/PfamFrag
Sequence file:            tem25
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  gi|1245357|gb|AAA93462.1|  procollagen C-proteinase

Scores for sequence family classification (score includes all domains):
Model           Description                              Score    E-value  N
--------        -----------                              -----    ------- ---
CUB             CUB domain                               995.4   1.4e-295   5
Astacin         Astacin (Peptidase family M12A)          394.1     1e-120   1
EGF             EGF-like domain                           52.5      4e-13   2
TIL             Trypsin Inhibitor like cysteine rich       5.3        2.7   1
LIM             LIM domain containing proteins             4.0         12   2
HypB_UreG       HypB/UreG nucleotide-binding domain        3.1         13   1
laminin_EGF     Laminin EGF-like (Domains III and V)       1.6         61   1
PseudoU_synth_2 RNA pseudouridylate synthase               1.6         22   1
zf-AN1          AN1-like Zinc finger                       1.6         55   1
HCV_capsid      Hepatitis C virus capsid protein           1.4         46   1
TFIIS           Transcription factor S-II (TFIIS)          1.4         86   1
BAH             BAH domain                                 0.7         62   1
SAPA            Saposin A-type domain                      0.7         47   1
neur            Neuraminidases                            -2.5         92   1

Parsed for domains:
Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
HCV_capsid        1/1      99   106 ..     1     9 [.     1.4       46
neur              1/1     130   139 ..   477   487 .]    -2.5       92
zf-AN1            1/1     156   171 ..    28    43 .]     1.6       55
Astacin           1/1     128   321 ..     1   200 []   394.1   1e-120
TFIIS             1/1     317   324 ..     1     8 [.     1.4       86
HypB_UreG         1/1     352   360 ..   139   147 .]     3.1       13
CUB               1/5     322   431 ..     1   116 []   203.6    1e-58
PseudoU_synth_2   1/1     430   462 ..    78   112 ..     1.6       22
CUB               2/5     435   544 ..     1   116 []   226.6  1.8e-65
EGF               1/2     551   587 ..     1    45 []    33.3  1.1e-07
LIM               1/2     634   653 ..    40    61 .]     2.7       28
CUB               3/5     591   700 ..     1   116 []   208.8    3e-60
LIM               2/2     714   723 ..    52    61 .]     1.6       56
TIL               1/1     727   742 ..    46    61 ..     5.3      2.7
laminin_EGF       1/1     728   742 ..    42    59 .]     1.6       61
EGF               2/2     707   742 ..     1    45 []    27.2  5.6e-06
SAPA              1/1     769   775 ..     1     7 [.     0.7       47
CUB               4/5     747   856 ..     1   116 []   205.2  3.6e-59
BAH               1/1     946   956 ..     1    11 [.     0.7       62
CUB               5/5     860   973 ..
1 116 [] 172.6 1.4e-49 Alignments of top-scoring domains: HCV_capsid: domain 1 of 1, from 99 to 106: score 1.4, E = 46 *->mstnpKPqR<-* stn++PqR gi|1245357 99 -STNGQPQR 106 neur: domain 1 of 1, from 130 to 139: score -2.5, E = 92 *->WpDGAdlpffi<-* WpDG+ +pf i gi|1245357 130 WPDGV-IPFVI 139 zf-AN1: domain 1 of 1, from 156 to 171: score 1.6, E = 55 *->RLpekHdCpgdykteg<-* R+ ekH+C++ ++ gi|1245357 156 RHWEKHTCVTFLERTD 171 Astacin: domain 1 of 1, from 128 to 321: score 394.1, E = 1e-120 *->rrWpngsGiVvIPYvisssysgrersliraAmrewenkTCirFvprt r Wp+g vIP+vi ++++g +r+++r+Amr+we++TC++F +rt gi|1245357 128 RVWPDG----VIPFVIGGNFTGSQRAVFRQAMRHWEKHTCVTFLERT 170 sagendylrffsgdGCwSyVGrrggGkeQevSlganGCiyfGiivHElmH ++ +++++++++++GC+SyVGrrggG+ Q +S+g ++C++fGi+vHEl+H gi|1245357 171 DE-DSYIVFTYRPCGCCSYVGRRGGGP-QAISIG-KNCDKFGIVVHELGH 217 ALGFwHEQsRpDRDdyVsInwqNIdpgqeynFdKydpdqvdslGvpYDYg ++GFwHE++RpDRD++VsI+++NI+pgqeynF K+ p +v+slG++YD++ gi|1245357 218 VVGFWHEHTRPDRDRHVSIVRENIQPGQEYNFLKMEPQEVESLGETYDFD 267 SiMHYgpyaFSkngskpTIvpkdnk.vyqatiGQReglSflDikkiNklY SiMHY++++FS+ + ++TIvpk + ++ +++iGQR +lS++Di+++ klY gi|1245357 268 SIMHYARNTFSRGIFLDTIVPKYEVnGVKPPIGQRTRLSKGDIAQARKLY 317 nCpe<-* +Cp+ gi|1245357 318 KCPA 321 TFIIS: domain 1 of 1, from 317 to 324: score 1.4, E = 86 *->fpCPkCKs<-* ++CP C++ gi|1245357 317 YKCPACGE 324 HypB_UreG: domain 1 of 1, from 352 to 360: score 3.1, E = 13 *->iSVtEGEDk<-* iSVt GE++ gi|1245357 352 ISVTPGEKI 360 CUB: domain 1 of 5, from 322 to 431: score 203.6, E = 1e-58 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt CG+t l++s+G++sSP+YPn Y +++CvWrI+++pg++ + L+ gi|1245357 322 CGET--LQDSTGNFSSPEYPN--GYSAHMHCVWRISVTPGEK-IILN 363 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr F+++dl++++ C+yDyvE+rDG++++ pl+GrfCGs++Pe+i+St++r gi|1245357 364 FTSLDLYRSRL--CWYDYVEVRDGFWRKaPLRGRFCGSKLPEPIVSTDSR 411 mlikFvsDasvskrGFkAty<-* ++++F s++++ ++GF+A y gi|1245357 412 LWVEFRSSSNWVGKGFFAVY 431 PseudoU_synth_2: domain 1 of 1, from 430 to 462: score 1.6, E = 22 
*->tYlAlVeGppedleeegtidapigrdpknrkkqiv<-* +Y A++ G ++ +++g i+ p d++++ k+++ gi|1245357 430 VYEAICGGDVK--KDYGHIQSPNYPDDYRPSKVCI 462 CUB: domain 2 of 5, from 435 to 544: score 226.6, E = 1.8e-65 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt CGg+ +++ +G+i+SPnYP+ dY+p+k C+WrI++ +g++ V Lt gi|1245357 435 CGGD--VKKDYGHIQSPNYPD--DYRPSKVCIWRIQVSEGFH-VGLT 476 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr Fq+F++E+hd+ C+yDy+E+rDG+++s++l+Gr+CG++kP+di+Sts+r gi|1245357 477 FQSFEIERHDS--CAYDYLEVRDGHSESsTLIGRYCGYEKPDDIKSTSSR 524 mlikFvsDasvskrGFkAty<-* +++kFvsD+s++k+GF+ ++ gi|1245357 525 LWLKFVSDGSINKAGFAVNF 544 EGF: domain 1 of 2, from 551 to 587: score 33.3, E = 1.1e-07 *->Capnn..pCsngGtCvntpggssdnfggytCeCppGdyylsytGkrC C+ +n+++C + +C+nt g +y+C C pG y+l+ + +rC gi|1245357 551 CSRPNrgGCEQ--RCLNTLG-------SYKCSCDPG-YELAPDKRRC 587 <-* gi|1245357 - - LIM: domain 1 of 2, from 634 to 653: score 2.7, E = 28 *->defyekdgkelYCkhDyyklfg<-* d f+e +g +Ck+D+ + ++ gi|1245357 634 D-FFETEGN-DVCKYDFVEVRS 653 CUB: domain 3 of 5, from 591 to 700: score 208.8, E = 3e-60 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt CGg lt+ +Gsi+SP++P+ +Yppnk+C+W+++ap +yr ++L+ gi|1245357 591 CGGF--LTKLNGSITSPGWPK--EYPPNKNCIWQLVAPTQYR-ISLQ 632 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr F+ F++E++d C+yD+vE+r G+ ++++l+G+fCGs+kPe+i+S+ n+ gi|1245357 633 FDFFETEGNDV--CKYDFVEVRSGLTADsKLHGKFCGSEKPEVITSQYNN 680 mlikFvsDasvskrGFkAty<-* m+++F+sD++vsk+GFkA++ gi|1245357 681 MRVEFKSDNTVSKKGFKAHF 700 LIM: domain 2 of 2, from 714 to 723: score 1.6, E = 56 *->CkhDyyklfg<-* C++D+ + fg gi|1245357 714 CQQDCVNTFG 723 TIL: domain 1 of 1, from 727 to 742: score 5.3, E = 2.7 *->CvCqpGyVrnndgdkC<-* C+C++G+V++++ C gi|1245357 727 CQCRSGFVLHDNKHDC 742 laminin_EGF: domain 1 of 1, from 728 to 742: score 1.6, E = 61 *->rCkpGyyglpsgdpgqgC<-* +C+ G+ d+ ++C gi|1245357 728 QCRSGFVLH---DNKHDC 742 EGF: domain 2 of 2, from 707 to 742: score 27.2, E = 5.6e-06 *->Capnn.pCsngGtCvntpggssdnfggytCeCppGdyylsytGkrC< C+++n++C++ Cvnt g +y+C+C G 
++l+ + ++C gi|1245357 707 CSKDNgGCQQ--DCVNTFG-------SYECQCRSG-FVLHDNKHDC 742 -* gi|1245357 - - SAPA: domain 1 of 1, from 769 to 775: score 0.7, E = 47 *->gpkkCaw<-* ++k+C+w gi|1245357 769 SKKECTW 775 CUB: domain 4 of 5, from 747 to 856: score 205.2, E = 3.6e-59 *->CGgtldltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVeLt C + +t++sG+i+SPn+P+ +Yp +keC+W I+ +pg+r V+Lt gi|1245357 747 CDHK--VTSTSGTITSPNWPD--KYPSKKECTWAISSTPGHR-VKLT 788 FqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirStsnr F ++d+E++++ C+yD++E++DG++++ p+lGrfCGs+kPe++ t++r gi|1245357 789 FMEMDIESQPE--CAYDHLEVFDGRDAKaPVLGRFCGSKKPEPVLATGSR 836 mlikFvsDasvskrGFkAty<-* m+++F+sD+sv+++GF+A++ gi|1245357 837 MFLRFYSDNSVQRKGFQASH 856 BAH: domain 1 of 1, from 946 to 956: score 0.7, E = 62 *->etisvGDfVlv<-* e +s GD+Vlv gi|1245357 946 EVYSAGDSVLV 956 CUB: domain 5 of 5, from 860 to 973: score 172.6, E = 1.4e-49 *->CGgtl..dltessGsisSPnYPnrsdYppnkeCvWrIrappgyrvVe CGg+++ d+++++ + ++ + n +Yp ++C+W+I+a++gy Ve gi|1245357 860 CGGQVraDVKTKDLYSHAQFGDN--NYPGGVDCEWVIVAEEGYG-VE 903 LtFqdFdlEdhdgapCryDyvEirDGdpss.pllGrfCGsgkPedirSts L Fq F++E++++ C+yDy E +DG++s+ p lGr+CGsg+Pe ++S + gi|1245357 904 LVFQTFEVEEETD--CGYDYMELFDGYDSTaPRLGRYCGSGPPEEVYSAG 951 nrmlikFvsDasvskrGFkAty<-* +++l+kF+sD++++k+GF+++y gi|1245357 952 DSVLVKFHSDDTITKKGFHLRY 973 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file:            tem25
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  gi|1245357|gb|AAA93462.1|  procollagen C-proteinase

Scores for sequence family classification (score includes all domains):
Model           Description                              Score    E-value  N
--------        -----------                              -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Prosite

---------------------------------------------------------
|          ppsearch (c) 1994 EMBL Data Library          |
|      based on MacPattern (c) 1990-1994 R. Fuchs       |
---------------------------------------------------------

PROSITE pattern search started: Mon Nov 13 22:09:25 2000

Sequence file: tem25

----------------------------------------
Sequence gi|1245357|gb|AAA93462.1| (986 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
   91: NTST
  142: NFTG
  332: NFSS
  363: NFTS
  599: NGSI
Total matches: 5

Matching pattern PS00004 CAMP_PHOSPHO_SITE:
   75: RRHT
   80: RKSS
Total matches: 2

Matching pattern PS00005 PKC_PHOSPHO_SITE:
   78: TAR
   83: SIK
  118: SRR
  123: TSR
  146: SQR
  179: TYR
  522: SSR
  569: SYK
  668: SEK
  692: SKK
  702: SDK
  769: SKK
  824: SKK
  965: TKK
  975: STK
  984: SRK
Total matches: 16

Matching pattern PS00006 CK2_PHOSPHO_SITE:
  124: SRPE
  165: TFLE
  170: TDED
  226: TRPD
  259: SLGE
  306: SKGD
  325: TLQD
  334: SSPE
  355: TPGE
  365: TSLD
  702: SDKD
  769: SKKE
  788: TFME
  796: SQPE
  949: SAGD
Total matches: 15

Matching pattern PS00008 MYRISTYL:
  140: GGNFTG
  141: GNFTGS
  145: GSQRAV
  194: GGPQAI
  244: GQEYNF
  568: GSYKCS
  592: GGFLTK
  640: GNDVCK
  723: GSYECQ
  862: GQVRAD
  886: GGVDCE
Total matches: 11

Matching pattern PS00009 AMIDATION:
  189: VGRR
Total matches: 1

Matching pattern PS00010 ASX_HYDROXYL:
  563: CLNTLGSYKCSC
  718: CVNTFGSYECQC
Total matches: 2

Matching pattern PS01186 EGF_2:
  572: CSCDPGYELAPDKRRC
  727: CQCRSGFVLHDNKHDC
Total matches: 2

Matching pattern PS01187 EGF_CA:
  547: EVDECSRPNRGGCEQRCLNTLGSYKC
  703: DKDECSKDNGGCQQDCVNTFGSYEC
Total matches: 2

Matching pattern PS00142 ZINC_PROTEASE:
  210: IVVHELGHVV
Total matches: 1

Total no of hits in this sequence: 57

========================================

1314 pattern(s) searched in 1 sequence(s), 986 residues.
Total no of hits in all sequences: 57.
Search time: 00:00 min

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Profile Search

L=0 2054 pos. 322 - 434 PS01180|CUB CUB domain profile.
#
# P    1 CGGTLRBTSGRISSPNYPNRDYPNNTECVWTIQVPPGFQYRVELQFQDPFDLEDHDECRY -54
# S  322 CGETLQDSTGNFSSPEYPNG-YSAHMHCVWRISVTPGE--KIILNFTS-LDLYRSRLCWY -610
#
# P   61 DYVEIRDG-----PILGRFCGNETPPPIISSQSNRMTIKFRSDSSHQKRGFKATYYAV -1
# S  378 DYVEVRDGfwrkaPLRGRFCGSKLPEPIVST-DSRLWVEFRSSSNWVGKGFFAVYEAI -553
#
L=0 2197 pos. 435 - 547 PS01180|CUB CUB domain profile.
#
# P    1 CGGTLRBTSGRISSPNYPNRDYPNNTECVWTIQVPPGFQYRVELQFQDPFDLEDHDECRY -54
# S  435 CGGDVKKDYGHIQSPNYPD-DYRPSKVCIWRIQVSEG--FHVGLTFQS-FEIERHDSCAY -497
#
# P   61 DYVEIRDG-----PILGRFCGNETPPPIISSQSNRMTIKFRSDSSHQKRGFKATYYAV -1
# S  491 DYLEVRDGhsessTLIGRYCGYEKPDD-IKSTSSRLWLKFVSDGSINKAGFAVNFFKE -440
#
L=0 2136 pos. 591 - 703 PS01180|CUB CUB domain profile.
#
# P    1 CGGTLRBTSGRISSPNYPNRDYPNNTECVWTIQVPPGFQYRVELQFQDPFDLEDHDECRY -54
# S  591 CGGFLTKLNGSITSPGWPK-EYPPNKNCIWQLVAPTQ--YRISLQFDF-FETEGNDVCKY -341
#
# P   61 DYVEIRDG-----PILGRFCGNETPPPIISSQSNRMTIKFRSDSSHQKRGFKATYYAV -1
# S  647 DFVEVRSGltadsKLHGKFCGSEKPE-VITSQYNNMRVEFKSDNTVSKKGFKAHFFSD -284
#
L=0 2140 pos. 747 - 859 PS01180|CUB CUB domain profile.
#
# P    1 CGGTLRBTSGRISSPNYPNRDYPNNTECVWTIQVPPGFQYRVELQFQDPFDLEDHDECRY -54
# S  747 CDHKVTSTSGTITSPNWPDK-YPSKKECTWAISSTPG--HRVKLTFME-MDIESQPECAY -185
#
# P   61 DYVEIRDG-----PILGRFCGNETPPPIISSQSNRMTIKFRSDSSHQKRGFKATYYAV -1
# S  803 DHLEVFDGrdakaPVLGRFCGSKKPEPVLAT-GSRMFLRFYSDNSVQRKGFQASHATE -128
#
L=0 1861 pos. 860 - 976 PS01180|CUB CUB domain profile.
#
# P    1 CGGTLRBT---SGRISSPNYPNRDYPNNTECVWTIQVPPGFQYRVELQFQDPFDLEDHDE -57
# S  860 CGGQVRADvktKDLYSHAQFGDNNYPGGVDCEWVIVAEEG--YGVELVFQT-FEVEEETD -71
#
# P   58 CRYDYVEIRDG-----PILGRFCGNETPPPIISSQSNRMTIKFRSDSSHQKRGFKATYYA -2
# S  917 CGYDYMELFDGydstaPRLGRYCGSGPPEEVYSA-GDSVLVKFHSDDTITKKGFHLRYTS -12
#
# P  113 V -1
# S  976 T -11
#

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with motif search against own library

***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P
argv[2]=-m /data/patterns/own/motif.fa
argv[4]=-seq tem25
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 986 units

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~ ~~~

Start with HMM-search search against own library

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm.lib
Sequence file:            tem25
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  gi|1245357|gb|AAA93462.1|  procollagen C-proteinase

Scores for sequence family classification (score includes all domains):
Model           Description                              Score    E-value  N
--------        -----------                              -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm-f.lib
Sequence file:            tem25
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  gi|1245357|gb|AAA93462.1|  procollagen C-proteinase

Scores for sequence family classification (score includes all domains):
Model           Description                              Score    E-value  N
--------        -----------                              -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model           Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------        ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB

IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
"PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.
Query= gi|1245357|gb|AAA93462.1| procollagen C-proteinase (986 letters) Searching..................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value HECT A ubiquitin conjugating enzyme domain 26 0.19 S1 S1 RNA binding domain 25 0.70 CYCLIN Cyclin/TFIIB domain 23 1.9 AP2 A plant specific DNA binding domain (Apetala 2 like) 22 3.5 ARM Armadillo repeat 22 4.2 BRIGHT BRIGHT domain (Alpha helical DNA binding domain) 22 4.7 POZ Pox virus Zinc finger domain (Also called BTB domain; a... 21 7.7 SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chrom... 21 9.6 >HECT A ubiquitin conjugating enzyme domain Length = 255 Score = 26.3 bits (58), Expect = 0.19 Identities = 8/45 (17%), Positives = 16/45 (34%), Gaps = 1/45 (2%) Query: 514 KPDDIKSTSSRLWLKFVSDGSINKAGFAVNFFKEV-DECSRPNRG 557 D++ S +K V +G+ + G + E+ E Sbjct: 3 NASDLRLPSRAWKVKLVGEGADDAGGVFDDTITEMCQELETGIVD 47 >S1 S1 RNA binding domain Length = 305 Score = 24.5 bits (53), Expect = 0.70 Identities = 9/44 (20%), Positives = 13/44 (29%), Gaps = 3/44 (6%) Query: 203 KNCDKFGIVVHELGHVVGFWHEHTRPDRDRHVSIVRENIQPGQE 246 +G+ V E+ V G H V + GQ Sbjct: 202 AKIQPYGVFV-EIEGVTGLLHVSQV--SGTRVDSLNTLFAFGQA 242 Score = 23.0 bits (49), Expect = 2.0 Identities = 2/19 (10%), Positives = 6/19 (31%) Query: 944 PEEVYSAGDSVLVKFHSDD 962 E+ + + S+ Sbjct: 69 IEDSFPLDSAWDFLVTSEQ 87 Score = 23.0 bits (49), Expect = 2.1 Identities = 3/24 (12%), Positives = 9/24 (37%) Query: 944 PEEVYSAGDSVLVKFHSDDTITKK 967 +++ G ++ V D + Sbjct: 233 LNTLFAFGQAISVYVQEIDEYKNR 256 >CYCLIN Cyclin/TFIIB domain Length = 317 Score = 23.0 bits (49), Expect = 1.9 Identities = 10/27 (37%), Positives = 13/27 (48%) Query: 827 PEPVLATGSRMFLRFYSDNSVQRKGFQ 853 P V+ T F RFY +NSV + Sbjct: 70 PRSVVGTACMYFKRFYLNNSVMEYHPR 96 >AP2 A plant specific DNA binding domain (Apetala 2 like) Length = 218 Score = 22.0 bits (46), Expect = 3.5 Identities = 10/43 (23%), Positives = 13/43 (29%), Gaps = 8/43 (18%) Query: 92 
TSTPSCQSTNGQPQRGA----CGRWRGRSRSRRAATSRPERVW 130 P S RG G+W R R + R+W Sbjct: 23 KKKPVKDSGKHPVYRGVRKRNWGKWVSEIREPRKKS----RIW 61 >ARM Armadillo repeat Length = 532 Score = 21.8 bits (46), Expect = 4.2 Identities = 6/18 (33%), Positives = 9/18 (49%) Query: 47 CKAAAFLGDIALDEEDLR 64 +A LG++A D R Sbjct: 176 EQAVWALGNVAGDSPRCR 193 Score = 21.4 bits (45), Expect = 5.9 Identities = 7/13 (53%), Positives = 9/13 (68%) Query: 767 YPSKKECTWAISS 779 + KKE WAIS+ Sbjct: 383 FDIKKEAAWAISN 395 >BRIGHT BRIGHT domain (Alpha helical DNA binding domain) Length = 172 Score = 21.9 bits (46), Expect = 4.7 Identities = 10/45 (22%), Positives = 18/45 (39%), Gaps = 5/45 (11%) Query: 350 WRISVTPGEKI-ILNFTSLDLYRSRLCWYD---YVEV-RDGFWRK 389 + TP ++ I+ + LDLY V+V W++ Sbjct: 35 MQKRGTPINRLPIMAKSVLDLYELYNLVIARGGLVDVINKKLWQE 79 >POZ Pox virus Zinc finger domain (Also called BTB domain; a protein-protein interaction domain) Length = 229 Score = 20.9 bits (43), Expect = 7.7 Identities = 4/40 (10%), Positives = 8/40 (20%) Query: 703 DKDECSKDNGGCQQDCVNTFGSYECQCRSGFVLHDNKHDC 742 D C G + G + + + Sbjct: 2 SGDTCLCPASGAKPKLSGFKGGGLGNKYVQLNVGGSLYYT 41 >SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chromatin associated domain) Length = 219 Score = 20.7 bits (43), Expect = 9.6 Identities = 7/36 (19%), Positives = 11/36 (30%), Gaps = 3/36 (8%) Query: 239 ENIQPGQ---EYNFLKMEPQEVESLGETYDFDSIMH 271 E+I EY + Q + Y F + Sbjct: 105 ESIPAWSYIGEYTGILRRRQALWLDENDYCFRYPVP 140 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 8 Number of calls to ALIGN: 11 Length of query: 986 Total length of test sequences: 20182 Effective length of test sequences: 16637.0 Effective search space size: 15842987.8 Initial X dropoff for ALIGN: 25.0 bits Y. Wolf's SCOP PSSM IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. 
Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= gi|1245357|gb|AAA93462.1| procollagen C-proteinase (986 letters) Searching.................................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value gi|1731806 [90..289] Metzincin-like 231 3e-62 gi|1616941 [10..247] Metzincin-like 149 2e-37 gi|1174715 [19..356] Thiamin-binding 28 0.76 gi|97186 [16..330] S-adenosyl-L-methionine-dependent methylt... 28 0.94 gi|128081 [13..98] Neurophysin II 27 1.9 gi|2497794 [4..293] Creatinase/methionine aminopeptidase 26 3.1 gi|1815643 [227..431] Metzincin-like 26 3.4 gi|2809057 [242..451] Metzincin-like 26 4.1 gi|2650282 [2..154] P-loop containing nucleotide triphosphat... 25 4.7 gi|2829864 [79..325] Metzincin-like 25 6.0 gi|2370487 [513..702] Gln-tRNA synthetase (GlnRS), C-termina... 25 7.2 gi|1708972 [46..317] FAD/NAD(P)-binding domain 25 9.0 >gi|1731806 [90..289] Metzincin-like Length = 200 Score = 231 bits (585), Expect = 3e-62 Identities = 73/207 (35%), Positives = 107/207 (51%), Gaps = 13/207 (6%) Query: 121 AATSRPERVWPDG-----VIPFVIGGNFTGSQRAVFRQAMRHWEKHTCVTFLERTDEDSY 175 +A + +WP +P+ + ++ Q A+F++A++ +E TCV F+ T E ++ Sbjct: 1 SAINDARFLWPKSADGIVPVPYNLSYSYNADQLALFKKAIQEFEALTCVRFVPWTTEVNF 60 Query: 176 IVFTYRPCGCCSYVGRRGGGPQAISIGKNCDKFGIVVHELGHVVGFWHEHTRPDRDRHVS 235 + GC S +G+ GG + C GI+ HEL H +GF+HE R DRD +V Sbjct: 61 LNIMS-NGGCGSLIGKNGGAQRLELDANGCMNMGIIQHELNHALGFYHEQNRSDRDDYVI 119 Query: 236 IVRENIQPGQEYNFLKMEPQEVESLGETYDFDSIMHYARNTFSRGIFLDTIVPKYEVNGV 295 I ENI P F K +LG YD+ S+MHY+R +S TI PK + N Sbjct: 120 IHTENIIPDFLKMFEKY---NTNNLGIEYDYASVMHYSRYHYSINGD-ITIEPKPDPN-- 173 Query: 296 KPPIGQRTRLSKGDIAQARKLYKCPAC 322 PIGQR LS DI++ KLY+C C Sbjct: 174 -VPIGQRDGLSILDISKINKLYECNVC 199 >gi|1616941 [10..247] Metzincin-like Length = 238 Score = 149 bits (374), Expect = 2e-37 Identities = 18/191 
(9%), Positives = 40/191 (20%), Gaps = 38/191 (19%)

Query: 116 SRSRRAATSRPERVWPDGVIPFVIGG--NFTGSQRAVFRQAMRHWEKHTCVTFLERTDED 173
W + +G F+ Q+A + +++ W T + F++
Sbjct: 59 YSFLTKPNDFFNTPWKYVSDIYSLGKFSAFSAQQQAQAKLSLQSWSDVTNIHFVDAGQGH 118

Query: 174 DS---YIVFTYRPCGCCSYVGRRGGG---------------PQAISIGKNCDKFGIVVHEL 215
G ++ ++ + HE+
Sbjct: 119 QGDLTFGNFSSSVGGAAFAFLPDVPDALKGQSWYLINSSYSANVNPANANYGRQTLTHEI 178

Query: 216 GHVVGFWHEHTRPDRDRHVSIVRENIQPGQEYNFLKMEPQEVESLGETYDFDSIMH-YAR 274
GH +G + + Y Y S
Sbjct: 179 GHTLGLSTPDYN-AGEGDPTY------ADATYAEDT----------RAYSVMSYWEEQNT 221

Query: 275 NTFSRGIFLDT 285
+G +
Sbjct: 222 GQDFKGAYSSA 232

>gi|1174715 [19..356] Thiamin-binding
Length = 338

Score = 28.1 bits (62), Expect = 0.76
Identities = 5/56 (8%), Positives = 16/56 (27%), Gaps = 2/56 (3%)

Query: 265 DFDSIMHYARNTFSRGIFLDTIVPKYE--VNGVKPPIGQRTRLSKGDIAQARKLYK 318
+ + A+ R F+ + L + ++A +++
Sbjct: 227 GIEEAIANAKAATDRPSFISLRTIIGYPAPTLINTGKAHGAALGEDEVAATKRILG 282

>gi|97186 [16..330] S-adenosyl-L-methionine-dependent methyltransferases
Length = 315

Score = 27.6 bits (61), Expect = 0.94
Identities = 14/117 (11%), Positives = 32/117 (26%), Gaps = 20/117 (17%)

Query: 62 DLRAFQVQQAVDLRRHTARKSS------IKAAVPGNTSTPSCQSTNGQPQRGACGRWRGR 115
+L + + + L+ H R + +K G ++ + + +
Sbjct: 198 NLPSIESGKHSGLKWHYGRGHTEQQIEWMKHTPTGKSAFENLVHYPRKANGEKVKGYHSS 257

Query: 116 SRSRRAATSRPERVWPDGVIPFVIGGNFTGSQRAVFR----QAMRHWEKHTCVTFLE 168
R R+ D P + N S + + ++ LE
Sbjct: 258 YR----------RIRWDEPAPTITIRNDAISSQRNVHPGRPLLDGTYSDARVLSVLE 304

>gi|128081 [13..98] Neurophysin II
Length = 86

Score = 26.6 bits (57), Expect = 1.9
Identities = 7/69 (10%), Positives = 11/69 (15%)

Query: 662 HGKFCGSEKPEVITSQYNNMRVEFKSDNTVSKKGFKAHFFSDKDECSKDNGGCQQDCVNT 721
G+ G E + C +
Sbjct: 18 QGRCFGPSICCADALGCFVGTAEALRCQEENYLPSPCQSGQKPCGSGGRCAANGVCCNDE 77

Query: 722 FGSYECQCR 730
E +CR
Sbjct: 78 SCVIEPECR 86

>gi|2497794 [4..293] Creatinase/methionine aminopeptidase
Length = 290

Score = 26.0 bits (57), Expect = 3.1
Identities = 19/96 (19%), Positives =
29/96 (29%), Gaps = 19/96 (19%)

Query: 207 KFGIVVHELGHVVGFWHEHTRPDRDRHVSIVREN----IQPGQEYNFLKMEPQEVESLGE 262
+ + + GHV+ + HT + V E I G +EP + G
Sbjct: 143 GYKPISNLSGHVMHRYELHTGI----SIPNVYERTNQYIDVGDLVA---IEPFATDGFGM 195

Query: 263 TYDFDSIMHYARNT-------FSRGIFLDTIVPKYE 291
D + Y +R LD I Y
Sbjct: 196 VKDGNLGNIYKFLAKRPIRLPQARK-LLDVISKNYP 230

>gi|1815643 [227..431] Metzincin-like
Length = 205

Score = 25.7 bits (56), Expect = 3.4
Identities = 5/17 (29%), Positives = 11/17 (64%)

Query: 210 IVVHELGHVVGFWHEHT 226
++ +GH++G H+ T
Sbjct: 134 LLAQSIGHLLGLEHDTT 150

>gi|2809057 [242..451] Metzincin-like
Length = 210

Score = 25.7 bits (56), Expect = 4.1
Identities = 7/16 (43%), Positives = 8/16 (49%)

Query: 210 IVVHELGHVVGFWHEH 225
HELGHV H+
Sbjct: 141 TTAHELGHVFNMPHDD 156

>gi|2650282 [2..154] P-loop containing nucleotide triphosphate hydrolases
Length = 153

Score = 25.2 bits (54), Expect = 4.7
Identities = 16/148 (10%), Positives = 37/148 (24%), Gaps = 16/148 (10%)

Query: 173 DSYIVFTYRPCGCCSYVGRRGGGPQ--AISIGKNCDKFGIVVHELGHVVGFWHEHTRPDR 230
D+ + + VG+ G G + K G + +
Sbjct: 9 DAILGGGIPEGHIVAVVGQYGTGKTTLGLHFIYEGLKNG------EACMIISFDEDEESI 62

Query: 231 DRHVSIVRENIQPGQEYNFLKMEPQEVESLGETYDFDSIMHYARNTFSRGIFLDTIVPKY 290
V ++ + + + R+ + +D+I
Sbjct: 63 IGDAKSVGMDLTAFGDKVHIVRLEASEVKKSLEKLESDLPEIVRSLGVSRMLVDSISVLE 122

Query: 291 EVNGVKPPIGQRTRLSKGDIAQARKLYK 318
+ G+ + +A RK+ K
Sbjct: 123 ---TLFDDAGRYSM-----LAAFRKMLK 142

>gi|2829864 [79..325] Metzincin-like
Length = 247

Score = 24.9 bits (54), Expect = 6.0
Identities = 7/16 (43%), Positives = 11/16 (68%)

Query: 208 FGIVVHELGHVVGFWH 223
+ VHE+GH++G H
Sbjct: 195 ESVAVHEIGHLLGLGH 210

>gi|2370487 [513..702] Gln-tRNA synthetase (GlnRS), C-terminal (anticodon-binding) domain
Length = 190

Score = 24.8 bits (53), Expect = 7.2
Identities = 10/24 (41%), Positives = 15/24 (61%)

Query: 22 PLDLADYTYDLAEEDDSEPLNYKD 45
P+DL D+ Y + ++ E NYKD
Sbjct: 117 PVDLVDFDYLITKDKLEEGENYKD 140

>gi|1708972 [46..317] FAD/NAD(P)-binding domain
Length = 272

Score = 24.6 bits
(52), Expect = 9.0
Identities = 12/98 (12%), Positives = 20/98 (20%), Gaps = 4/98 (4%)

Query: 156 RHWEKHTCVTFLERTDEDSYIVFTYRPCGCC---SYVGRRGGGPQAISIGKNCDKFGIVV 212
+ W+ +LE + G S G + + D + V
Sbjct: 134 QTWQTVIGTAYLEAGILPNNGFSVDHLAGTRLTGSTFDNNGTRHASDELLNKGDPNNLRV 193

Query: 213 HELGHVVGFWHEHTRP-DRDRHVSIVRENIQPGQEYNF 249
V V N Q +
Sbjct: 194 AVQAAVEKIIFSSNTSGVTAIGVIYTDSNGTTHQAFVR 231

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 12
Number of calls to ALIGN: 12
Length of query: 986
Total length of test sequences: 256703
Effective length of test sequences: 209547.0
Effective search space size: 198333793.6
Initial X dropoff for ALIGN: 25.0 bits
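
Note on reading the Expect values: as a rough cross-check, an E-value can be estimated from a bit score S and the effective search space N with the standard relation E ≈ N · 2^(−S). The sketch below is not part of the IMPALA output; the function name `expected_value` is hypothetical, and the estimate is only approximate because the printed bit scores are rounded and the program may apply its own corrections.

```python
def expected_value(bit_score: float, search_space: float) -> float:
    """Approximate E-value from a bit score: E ~= N * 2**(-S)."""
    return search_space * 2.0 ** (-bit_score)


# Effective search space size copied from the statistics block above.
SEARCH_SPACE = 198333793.6

# (hit, bit score, reported Expect) copied from the alignments above.
hits = [
    ("gi|1174715 Thiamin-binding", 28.1, 0.76),
    ("gi|1708972 FAD/NAD(P)-binding domain", 24.6, 9.0),
]

for name, bits, reported in hits:
    estimate = expected_value(bits, SEARCH_SPACE)
    print(f"{name}: estimated E = {estimate:.2g}, reported E = {reported}")
```

The estimates should land close to, but not exactly on, the reported values; every additional bit of score halves the expected number of chance hits.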