analysis of sequence from tem10
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
>tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens]
MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGERGPPGPPGRDGEDGPTGPPGPPGPPGP
PGLGGNFAAQYDGKGVGLGPGPMGLMGPRGPPGAAGAPGPQGFQGPAGEPGEPGQTGPAGARGPAGPPGK
AGEDGHPGKPGRPGERGVVGPQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGAPGVKGEPGAPGENGTP
GQTGARGLPGERGRVGAPGPAGARGSDGSVGPVGPAGPIGSAGPPGFPGAPGPKGEIGAVGNAGPAGPAG
PRGEVGLPGLSGPVGPPGNPGANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVGAAGATGARGLVGEPGP
AGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGLRGSPGSRGLPGADGRAGVMGPP
GSRGASGPAGVRGPNGDAGRPGEPGLMGPRGLPGSPGNIGPAGKEGPVGLPGIDGRPGPIGPAGARGEPG
NIGFPGPKGPTGDPGKNGDKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQGGKGEQGPAGPPGFQGLPGP
SGPAGEVGKPGERGLHGEFGLPGPAGPRGERGPPGESGAAGPTGPIGSRGPSGPPGPDGNKGEPGVVGAV
GTAGPSGPSGLPGERGAAGIPGGKGEKGEPGLRGEIGNPGRDGARGAHGAVGAPGPAGATGDRGEAGAAG
PAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGAKGERGAKGPKGENGVVGPTGPVGAAGPAGPNGP
PGPAGSRGDGGPPGMTGFPGAAGRTGPPGPSGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPP
GFAGEKGPSGEAGTAGPPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIAGPPGARGPPG
AVGSPGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGYPGNIGPVGAAGAPGPHGPVGPAGKHGNRGE
TGPSGPVGPAGAVGPRGPSGPQGIRGDKGEPGEKGPRGLPGLKGHNGLQGLPGIAGHHGDQGAPGSVGPA
GPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGPAGPPGPPGPPGPPGVSGGGYDFGYDGDFYRAD
QPRSAPSLRPKDYEVDATLKSLNNQIETLLTPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDA
IKVYCDFSTGETCIRAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLL
ANYASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKKTNEWGKTIIE
YKTNKPSRLPFLDIAPLDIGGADHEFFVDIGPVCFK
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
sec.str. with predator

> tem10gi|1418930|emb|CAA98969.1|
              .         .         .         .         .
1    MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGERGPPGPP   50
     _____HHHHHHHHHHHHHHHHHHHHHHHH_____________________
              .         .         .         .         .
51   GRDGEDGPTGPPGPPGPPGPPGLGGNFAAQYDGKGVGLGPGPMGLMGPRG   100
     __________________________EEEE____________________
              .         .         .         .         .
101  PPGAAGAPGPQGFQGPAGEPGEPGQTGPAGARGPAGPPGKAGEDGHPGKP   150
     __________________________________________________
              .         .         .         .         .
151  GRPGERGVVGPQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGAPGVKGE   200
     __________________________________________________
              .         .         .         .         .
201  PGAPGENGTPGQTGARGLPGERGRVGAPGPAGARGSDGSVGPVGPAGPIG   250
     ______________________EEE_________________________
              .         .         .         .         .
251  SAGPPGFPGAPGPKGEIGAVGNAGPAGPAGPRGEVGLPGLSGPVGPPGNP   300
     ________________EEEE_____________EEE______________
              .         .         .         .         .
301  GANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVGAAGATGARGLVGEPGP   350
     _________________________________________EEE______
              .         .         .         .         .
351  AGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGLRG   400
     __________________________________________________
              .         .         .         .         .
401  SPGSRGLPGADGRAGVMGPPGSRGASGPAGVRGPNGDAGRPGEPGLMGPR   450
     __________________________________________________
              .         .         .         .         .
451  GLPGSPGNIGPAGKEGPVGLPGIDGRPGPIGPAGARGEPGNIGFPGPKGP   500
     __________________________________________________
              .         .         .         .         .
501  TGDPGKNGDKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQGGKGEQGPAG   550
     __________________________________________________
              .         .         .         .         .
551  PPGFQGLPGPSGPAGEVGKPGERGLHGEFGLPGPAGPRGERGPPGESGAA   600
     __________________________________________________
              .         .         .         .         .
601  GPTGPIGSRGPSGPPGPDGNKGEPGVVGAVGTAGPSGPSGLPGERGAAGI   650
     _________________________EEEEEE___________________
              .         .         .         .         .
651  PGGKGEKGEPGLRGEIGNPGRDGARGAHGAVGAPGPAGATGDRGEAGAAG   700
     __________________________________________________
              .         .         .         .         .
701  PAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGAKGERGAKGPKGEN   750
     __________________________________________________
              .         .         .         .         .
751  GVVGPTGPVGAAGPAGPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGPPGP   800
     __________________________________________________
              .         .         .         .         .
801  SGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPPGFAGEKGPSG   850
     __________________________________________________
              .         .         .         .         .
851  EAGTAGPPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIA   900
     ______________________________________EEE_________
              .         .         .         .         .
901  GPPGARGPPGAVGSPGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGY   950
     __________________________________________________
              .         .         .         .         .
951  PGNIGPVGAAGAPGPHGPVGPAGKHGNRGETGPSGPVGPAGAVGPRGPSG   1000
     __________________________________________________
              .         .         .         .         .
1001 PQGIRGDKGEPGEKGPRGLPGLKGHNGLQGLPGIAGHHGDQGAPGSVGPA   1050
     __________________________________________________
              .         .         .         .         .
1051 GPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGPAGPPGPPGPPGP   1100
     __________________________________________________
              .         .         .         .         .
1101 PGVSGGGYDFGYDGDFYRADQPRSAPSLRPKDYEVDATLKSLNNQIETLL   1150
     ________________EEE________________HHHHHHHHHHHHHH_
              .         .         .         .         .
1151 TPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTG   1200
     ___________HHHHHHH________EEEE_________EEEEEEE____
              .         .         .         .         .
1201 ETCIRAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEM   1250
     _EEEE____________________EEE________EEEEE_____HHHH
              .         .         .         .         .
1251 ATQLAFMRLLANYASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVEL   1300
     HHHHHHHHHHHHHH___EEEE_______________EEEEEEE____EEE
              .         .         .         .         .
1301 VAEGNSRFTYTVLVDGCSKKTNEWGKTIIEYKTNKPSRLPFLDIAPLDIG   1350
     EEE____EEEEEEE____________EEEEEE________EEEEE_____
              .
1351 GADHEFFVDIGPVCFK   1366
     _____EEEEEEEE___
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method         : 1
alpha-contents : 0.0 %
beta-contents  : 0.0 %
coil-contents  : 100.0 %
class          : irregular

method         : 2
alpha-contents : 0.0 %
beta-contents  : 0.0 %
coil-contents  : 100.0 %
class          : irregular
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa

-16.81  -0.07   0.00  -0.30  -4.00   0.00  -8.00   0.00  -0.11  -6.57  -1.82 -12.00 -12.00   0.00   0.00   0.00 -61.68
 -7.25  -0.13  -0.04  -0.06  -4.00   0.00 -16.00   0.00  -0.32  -6.57  -1.82 -12.00 -12.00   0.00   0.00   0.00 -60.19
ID: tem10gi|1418930|emb|CAA98969.1| AC: xxx Len: 1330 1:I 1317 Sc: -60.19 Pv: 3.730853e-01 NO_GPI_SITE

GPI: learning from protozoa

-17.64   0.00   0.00  -0.09  -4.00   0.00 -12.00   0.00  -0.78  -5.70  -7.14 -12.00 -12.00   0.00   0.00   0.00 -71.34
-16.82  -0.04   0.00   0.00  -4.00   0.00  -8.00   0.00   0.00  -5.70  -7.14 -12.00 -12.00   0.00   0.00   0.00 -65.70
ID: tem10gi|1418930|emb|CAA98969.1| AC: xxx Len: 1330 1:I 1315 Sc: -65.70 Pv: 2.885468e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name        Cmax  pos ?   Ymax  pos ?   Smax  pos ?  Smean ?
tem10gi|141  0.538   23 Y  0.650   23 Y  0.984    9 Y  0.939 Y
# SignalP gram- predictions
# name        Cmax  pos ?   Ymax  pos ?   Smax  pos ?  Smean ?
tem10gi|141  0.693  732 Y  0.368   27 Y  0.988    9 Y  0.795 Y
# SignalP gram+ predictions
# name        Cmax  pos ?   Ymax  pos ?   Smax  pos ?  Smean ?
tem10gi|141 0.643 486 Y 0.392 321 Y 0.996 12 Y 0.143 N ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ low complexity regions: SEG 12 2.2 2.5 >tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] 1-8 MLSFVDTR tllllavtlclatc 9-22 23-32 QSLQEETVRK gpagdrgprgergppgppgrdgedgptgpp 33-75 gppgppgppglgg 76-82 NFAAQYD gkgvglgpgpmglmgprgppgaagapgpqg 83-163 fqgpagepgepgqtgpagargpagppgkag edghpgkpgrpgergvvgpqg 164-218 ARGFPGTPGLPGFKGIRGHNGLDGLKGQPG APGVKGEPGAPGENGTPGQTGARGL pgergrvgapgpagargsdgsvgpvgpagp 219-265 igsagppgfpgapgpkg 266-267 EI gavgnagpagpagprgevglpglsgpvgpp 268-346 gnpgangltgakgaaglpgvagapglpgpr gipgpvgaagatgarglvg 347-360 EPGPAGSKGESGNK gepgsagpqgppgpsgeegkrgpngeagsa 361-410 gppgppglrgspgsrglpga 411-417 DGRAGVM gppgsrgasgpagvrgp 418-434 435-468 NGDAGRPGEPGLMGPRGLPGSPGNIGPAGK EGPV glpgidgrpgpigpagargepg 469-490 491-492 NI gfpgpkgptgdpgkngdkghaglagargap 493-553 gpdgnngaqgppgpqgvqggkgeqgpagpp g 554-579 FQGLPGPSGPAGEVGKPGERGLHGEF glpgpagprgergppgesgaagptgpigsr 580-640 gpsgppgpdgnkgepgvvgavgtagpsgps g 641-642 LP gergaagipggkgekgepglrge 643-665 666-668 IGN pgrdgargahgavgapgpagatgdrgeaga 669-739 agpagpagprgspgergevgpagpngfagp agaagqpgakg 740-750 ERGAKGPKGEN gvvgptgpvgaagpagpngppgpag 751-775 776-785 SRGDGGPPGM tgfpgaagrtgppgpsgisgppgppgpag 786-814 815-846 KEGLRGPRGDQGPVGRTGEVGAVGPPGFAG EK gpsgeagtagppgtpgpqgllgapgilglp 847-877 g 878-882 SRGER glpgvagavgepgplgiagppgargppgav 883-922 gspgvngapg 923-924 EA grdgnpgndgppgrdgqpg 925-943 944-954 HKGERGYPGNI gpvgaagapgphgpvgpagkhg 955-976 977-981 NRGET gpsgpvgpagavgprgpsgp 982-1001 1002-1041 QGIRGDKGEPGEKGPRGLPGLKGHNGLQGL PGIAGHHGDQ gapgsvgpagprgpagpsgpag 1042-1063 1064-1080 KDGRTGHPGTVGPAGIR gpqghqgpagppgppgppgppgvsggg 1081-1107 1108-1366 YDFGYDGDFYRADQPRSAPSLRPKDYEVDA TLKSLNNQIETLLTPEGSRKNPARTCRDLR LSHPEWSSGYYWIDPNQGCTMDAIKVYCDF STGETCIRAQPENIPAKNWYRSSKDKKHVW LGETINAGSQFEYNVEGVTSKEMATQLAFM RLLANYASQNITYHCKNSIAYMDEETGNLK KAVILQGSNDVELVAEGNSRFTYTVLVDGC 
SKKTNEWGKTIIEYKTNKPSRLPFLDIAPL DIGGADHEFFVDIGPVCFK low complexity regions: SEG 25 3.0 3.3 >tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] 1-6 MLSFVD trtllllavtlclatcqslqeetvr 7-31 32-32 K gpagdrgprgergppgppgrdgedgptgpp 33-1107 gppgppgppglggnfaaqydgkgvglgpgp mglmgprgppgaagapgpqgfqgpagepge pgqtgpagargpagppgkagedghpgkpgr pgergvvgpqgargfpgtpglpgfkgirgh ngldglkgqpgapgvkgepgapgengtpgq tgarglpgergrvgapgpagargsdgsvgp vgpagpigsagppgfpgapgpkgeigavgn agpagpagprgevglpglsgpvgppgnpga ngltgakgaaglpgvagapglpgprgipgp vgaagatgarglvgepgpagskgesgnkge pgsagpqgppgpsgeegkrgpngeagsagp pgppglrgspgsrglpgadgragvmgppgs rgasgpagvrgpngdagrpgepglmgprgl pgspgnigpagkegpvglpgidgrpgpigp agargepgnigfpgpkgptgdpgkngdkgh aglagargapgpdgnngaqgppgpqgvqgg kgeqgpagppgfqglpgpsgpagevgkpge rglhgefglpgpagprgergppgesgaagp tgpigsrgpsgppgpdgnkgepgvvgavgt agpsgpsglpgergaagipggkgekgepgl rgeignpgrdgargahgavgapgpagatgd rgeagaagpagpagprgspgergevgpagp ngfagpagaagqpgakgergakgpkgengv vgptgpvgaagpagpngppgpagsrgdggp pgmtgfpgaagrtgppgpsgisgppgppgp agkeglrgprgdqgpvgrtgevgavgppgf agekgpsgeagtagppgtpgpqgllgapgi lglpgsrgerglpgvagavgepgplgiagp pgargppgavgspgvngapgeagrdgnpgn dgppgrdgqpghkgergypgnigpvgaaga pgphgpvgpagkhgnrgetgpsgpvgpaga vgprgpsgpqgirgdkgepgekgprglpgl kghnglqglpgiaghhgdqgapgsvgpagp rgpagpsgpagkdgrtghpgtvgpagirgp qghqgpagppgppgppgppgvsggg 1108-1366 YDFGYDGDFYRADQPRSAPSLRPKDYEVDA TLKSLNNQIETLLTPEGSRKNPARTCRDLR LSHPEWSSGYYWIDPNQGCTMDAIKVYCDF STGETCIRAQPENIPAKNWYRSSKDKKHVW LGETINAGSQFEYNVEGVTSKEMATQLAFM RLLANYASQNITYHCKNSIAYMDEETGNLK KAVILQGSNDVELVAEGNSRFTYTVLVDGC SKKTNEWGKTIIEYKTNKPSRLPFLDIAPL DIGGADHEFFVDIGPVCFK low complexity regions: SEG 45 3.4 3.75 >tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] 1-30 MLSFVDTRTLLLLAVTLCLATCQSLQEETV rkgpagdrgprgergppgppgrdgedgptg 31-1107 ppgppgppgppglggnfaaqydgkgvglgp gpmglmgprgppgaagapgpqgfqgpagep gepgqtgpagargpagppgkagedghpgkp grpgergvvgpqgargfpgtpglpgfkgir ghngldglkgqpgapgvkgepgapgengtp gqtgarglpgergrvgapgpagargsdgsv 
gpvgpagpigsagppgfpgapgpkgeigav gnagpagpagprgevglpglsgpvgppgnp gangltgakgaaglpgvagapglpgprgip gpvgaagatgarglvgepgpagskgesgnk gepgsagpqgppgpsgeegkrgpngeagsa gppgppglrgspgsrglpgadgragvmgpp gsrgasgpagvrgpngdagrpgepglmgpr glpgspgnigpagkegpvglpgidgrpgpi gpagargepgnigfpgpkgptgdpgkngdk ghaglagargapgpdgnngaqgppgpqgvq ggkgeqgpagppgfqglpgpsgpagevgkp gerglhgefglpgpagprgergppgesgaa gptgpigsrgpsgppgpdgnkgepgvvgav gtagpsgpsglpgergaagipggkgekgep glrgeignpgrdgargahgavgapgpagat gdrgeagaagpagpagprgspgergevgpa gpngfagpagaagqpgakgergakgpkgen gvvgptgpvgaagpagpngppgpagsrgdg gppgmtgfpgaagrtgppgpsgisgppgpp gpagkeglrgprgdqgpvgrtgevgavgpp gfagekgpsgeagtagppgtpgpqgllgap gilglpgsrgerglpgvagavgepgplgia gppgargppgavgspgvngapgeagrdgnp gndgppgrdgqpghkgergypgnigpvgaa gapgphgpvgpagkhgnrgetgpsgpvgpa gavgprgpsgpqgirgdkgepgekgprglp glkghnglqglpgiaghhgdqgapgsvgpa gprgpagpsgpagkdgrtghpgtvgpagir gpqghqgpagppgppgppgppgvsggg 1108-1366 YDFGYDGDFYRADQPRSAPSLRPKDYEVDA TLKSLNNQIETLLTPEGSRKNPARTCRDLR LSHPEWSSGYYWIDPNQGCTMDAIKVYCDF STGETCIRAQPENIPAKNWYRSSKDKKHVW LGETINAGSQFEYNVEGVTSKEMATQLAFM RLLANYASQNITYHCKNSIAYMDEETGNLK KAVILQGSNDVELVAEGNSRFTYTVLVDGC SKKTNEWGKTIIEYKTNKPSRLPFLDIAPL DIGGADHEFFVDIGPVCFK low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAgdrgprgergppgppgrdgedgptg ppgppgppgppgLGGNFAAQYDGKGVGLGPgpmglmgprgppgaagapgpqgfqgpagep gepgqtgpagargpagppgkagedghpgkpgrpgergvvgpqgargfpgtpglpgfkgir ghngldglkgqpgapgvkgepgapgengtpgqtgarglpgergrvgapgpagargsdgsv gpvgpagpigsagppgfpgapgpkgeigavgnagpagpagprgevglpglsgpvgppgnp gangltgakgaaglpgvagapglpgprgipgpvgaagatgarglvgepgpagskgesgnk gepgsagpqgppgpsgeegkrgpngeagsagppgppglrgspgsrglpgadgragvmgpp gsrgasgpagvrgpngdagrpgepglmgprglpgspgnigpagkegpvglpgidgrpgpi gpagargepgnigfpgpkgptgdpgkngdkghaglagargapgpdgnngaqgppgpqgvq 
ggkgeqgpagppgfqglpgpsgpagevgkpgerglhgefglpgpagprgergppgesgaa gptgpigsrgpsgppgpdgnkgepgvvgavgtagpsgpsglpgergaagipggkgekgep glrgeignpgrdgargahgavgapgpagatgdrgeagaagpagpagprgspgergevgpa gpngfagpagaagqpgakgergakgpkgengvvgptgpvgaagpagpngppgpagsrgdg gppgmtgfpgaagrtgppgpsgisgppgppgpagkeglrgprgdqgpvgrtgevgavgpp gfagekgpsgeagtagppgtpgpqgllgapgilglpgsrgerglpgvagavgepgplgia gppgargppgavgspgvngapgeagrdgnpgndgppgrdgqpghkgergypgnigpvgaa gapgphgpvgpagkhgnrgetgpsgpvgpagavgprgpsgpqgirgdkgepgekgprglp glkghnglqglpgiaghhgdqgapgsvgpagprgpagpsgpagkdgrtghpgtvgpagir gpqghqgpagppgppgppgppgvsgggydfgydGDFYRADQPRSAPSLRPKDYEVDATLK SLNNQIETLLTPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTG ETCIRAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLL ANYASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKK TNEWGKTIIEYKTNKPSRLPFLDIAPLDIGGADHEFFVDIGPVCFK 1 - 35 MLSFVDTRTL LLLAVTLCLA TCQSLQEETV RKGPA 36 - 72 gdrgp rgergppgpp grdgedgptg ppgppgppgp pg 73 - 90 LGGNFAAQ YDGKGVGLGP 91 - 1113 gpmglmgprg ppgaagapgp qgfqgpagep gepgqtgpag argpagppgk agedghpgkp grpgergvvg pqgargfpgt pglpgfkgir ghngldglkg qpgapgvkge pgapgengtp gqtgarglpg ergrvgapgp agargsdgsv gpvgpagpig sagppgfpga pgpkgeigav gnagpagpag prgevglpgl sgpvgppgnp gangltgakg aaglpgvaga pglpgprgip gpvgaagatg arglvgepgp agskgesgnk gepgsagpqg ppgpsgeegk rgpngeagsa gppgppglrg spgsrglpga dgragvmgpp gsrgasgpag vrgpngdagr pgepglmgpr glpgspgnig pagkegpvgl pgidgrpgpi gpagargepg nigfpgpkgp tgdpgkngdk ghaglagarg apgpdgnnga qgppgpqgvq ggkgeqgpag ppgfqglpgp sgpagevgkp gerglhgefg lpgpagprge rgppgesgaa gptgpigsrg psgppgpdgn kgepgvvgav gtagpsgpsg lpgergaagi pggkgekgep glrgeignpg rdgargahga vgapgpagat gdrgeagaag pagpagprgs pgergevgpa gpngfagpag aagqpgakge rgakgpkgen gvvgptgpvg aagpagpngp pgpagsrgdg gppgmtgfpg aagrtgppgp sgisgppgpp gpagkeglrg prgdqgpvgr tgevgavgpp gfagekgpsg eagtagppgt pgpqgllgap gilglpgsrg erglpgvaga vgepgplgia gppgargppg avgspgvnga pgeagrdgnp gndgppgrdg qpghkgergy pgnigpvgaa gapgphgpvg pagkhgnrge tgpsgpvgpa gavgprgpsg pqgirgdkge pgekgprglp 
glkghnglqg lpgiaghhgd qgapgsvgpa gprgpagpsg pagkdgrtgh pgtvgpagir gpqghqgpag ppgppgppgp pgvsgggydf gyd 1114 - 1366 GDFYRAD QPRSAPSLRP KDYEVDATLK SLNNQIETLL TPEGSRKNPA RTCRDLRLSH PEW SSGYYWI DPNQGCTMDA IKVYCDFSTG ETCIRAQPEN IPAKNWYRSS KDKKHVWLGE TIN AGSQFEY NVEGVTSKEM ATQLAFMRLL ANYASQNITY HCKNSIAYMD EETGNLKKAV ILQ GSNDVEL VAEGNSRFTY TVLVDGCSKK TNEWGKTIIE YKTNKPSRLP FLDIAPLDIG GAD HEFFVDI GPVCFK low complexity regions: DUST >tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGERGPPGPPGRDGEDGPTG PPGPPGPPGPPGLGGNFAAQYDGKGVGLGPGPMGLMGPRGPPGAAGAPGPQGFQGPAGEP GEPGQTGPAGARGPAGPPGKAGEDGHPGKPGRPGERGVVGPQGARGFPGTPGLPGFKGIR GHNGLDGLKGQPGAPGVKGEPGAPGENGTPGQTGARGLPGERGRVGAPGPAGARGSDGSV GPVGPAGPIGSAGPPGFPGAPGPKGEIGAVGNAGPAGPAGPRGEVGLPGLSGPVGPPGNP GANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVGAAGATGARGLVGEPGPAGSKGESGNK GEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGLRGSPGSRGLPGADGRAGVMGPP GSRGASGPAGVRGPNGDAGRPGEPGLMGPRGLPGSPGNIGPAGKEGPVGLPGIDGRPGPI GPAGARGEPGNIGFPGPKGPTGDPGKNGDKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQ GGKGEQGPAGPPGFQGLPGPSGPAGEVGKPGERGLHGEFGLPGPAGPRGERGPPGESGAA GPTGPIGSRGPSGPPGPDGNKGEPGVVGAVGTAGPSGPSGLPGERGAAGIPGGKGEKGEP GLRGEIGNPGRDGARGAHGAVGAPGPAGATGDRGEAGAAGPAGPAGPRGSPGERGEVGPA GPNGFAGPAGAAGQPGAKGERGAKGPKGENGVVGPTGPVGAAGPAGPNGPPGPAGSRGDG GPPGMTGFPGAAGRTGPPGPSGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPP GFAGEKGPSGEAGTAGPPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIA GPPGARGPPGAVGSPGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGYPGNIGPVGAA GAPGPHGPVGPAGKHGNRGETGPSGPVGPAGAVGPRGPSGPQGIRGDKGEPGEKGPRGLP GLKGHNGLQGLPGIAGHHGDQGAPGSVGPAGPRGPAGPSGPAGKDGRTGHPGTVGPAGIR GPQGHQGPAGPPGPPGPPGPPGVSGGGYDFGYDGDFYRADQPRSAPSLRPKDYEVDATLK SLNNQIETLLTPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTG ETCIRAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLL ANYASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKK TNEWGKTIIEYKTNKPSRLPFLDIAPLDIGGADHEFFVDIGPVCFK ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for 
tem10gi|1418930|emb|CAA98969.1| sequence: 1330 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 60 MLSFVDTRTL LLLAVTLCLA TCQSLQEETV RKGPAGDRGP RGERGPPGPP GRDGEDGPTG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 120 PPGPPGPPGP PGLGGNFAAQ YDGKGVGLGP GPMGLMGPRG PPGAAGAPGP QGFQGPAGEP ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 180 GEPGQTGPAG ARGPAGPPGK AGEDGHPGKP GRPGERGVVG PQGARGFPGT PGLPGFKGIR ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
240 GHNGLDGLKG QPGAPGVKGE PGAPGENGTP GQTGARGLPG ERGRVGAPGP AGARGSDGSV ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 300 GPVGPAGPIG SAGPPGFPGA PGPKGEIGAV GNAGPAGPAG PRGEVGLPGL SGPVGPPGNP ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 360 GANGLTGAKG AAGLPGVAGA PGLPGPRGIP GPVGAAGATG ARGLVGEPGP AGSKGESGNK ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
420 GEPGSAGPQG PPGPSGEEGK RGPNGEAGSA GPPGPPGLRG SPGSRGLPGA DGRAGVMGPP ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 480 GSRGASGPAG VRGPNGDAGR PGEPGLMGPR GLPGSPGNIG PAGKEGPVGL PGIDGRPGPI ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 540 GPAGARGEPG NIGFPGPKGP TGDPGKNGDK GHAGLAGARG APGPDGNNGA QGPPGPQGVQ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
600 GGKGEQGPAG PPGFQGLPGP SGPAGEVGKP GERGLHGEFG LPGPAGPRGE RGPPGESGAA ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 660 GPTGPIGSRG PSGPPGPDGN KGEPGVVGAV GTAGPSGPSG LPGERGAAGI PGGKGEKGEP ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 720 GLRGEIGNPG RDGARGAHGA VGAPGPAGAT GDRGEAGAAG PAGPAGPRGS PGERGEVGPA ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
780 GPNGFAGPAG AAGQPGAKGE RGAKGPKGEN GVVGPTGPVG AAGPAGPNGP PGPAGSRGDG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 840 GPPGMTGFPG AAGRTGPPGP SGISGPPGPP GPAGKEGLRG PRGDQGPVGR TGEVGAVGPP ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 900 GFAGEKGPSG EAGTAGPPGT PGPQGLLGAP GILGLPGSRG ERGLPGVAGA VGEPGPLGIA ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
960 GPPGARGPPG AVGSPGVNGA PGEAGRDGNP GNDGPPGRDG QPGHKGERGY PGNIGPVGAA ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 1020 GAPGPHGPVG PAGKHGNRGE TGPSGPVGPA GAVGPRGPSG PQGIRGDKGE PGEKGPRGLP ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 1080 GLKGHNGLQG LPGIAGHHGD QGAPGSVGPA GPRGPAGPSG PAGKDGRTGH PGTVGPAGIR ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
1140 GPQGHQGPAG PPGPPGPPGP PGVSGGGYDF GYDGDFYRAD QPRSAPSLRP KDYEVDATLK ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~8888888 * 14 M'95 -w local . | . | . | . | . | . 1200 SLNNQIETLL TPEGSRKNPA RTCRDLRLSH PEWSSGYYWI DPNQGCTMDA IKVYCDFSTG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. 8888888888 6~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 1260 ETCIRAQPEN IPAKNWYRSS KDKKHVWLGE TINAGSQFEY NVEGVTSKEM ATQLAFMRLL ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
1320 ANYASQNITY HCKNSIAYMD EETGNLKKAV ILQGSNDVEL VAEGNSRFTY TVLVDGCSKK ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | TNEWGKTIIE ~~~~~~~~~~ ---------- ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** tem10.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem10.___inter___ (1 sequences) MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGERGPPGPP GRDGEDGPTGPPGPPGPPGPPGLGGNFAAQYDGKGVGLGPGPMGLMGPRG PPGAAGAPGPQGFQGPAGEPGEPGQTGPAGARGPAGPPGKAGEDGHPGKP GRPGERGVVGPQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGAPGVKGE PGAPGENGTPGQTGARGLPGERGRVGAPGPAGARGSDGSVGPVGPAGPIG SAGPPGFPGAPGPKGEIGAVGNAGPAGPAGPRGEVGLPGLSGPVGPPGNP GANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVGAAGATGARGLVGEPGP AGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGLRG SPGSRGLPGADGRAGVMGPPGSRGASGPAGVRGPNGDAGRPGEPGLMGPR GLPGSPGNIGPAGKEGPVGLPGIDGRPGPIGPAGARGEPGNIGFPGPKGP TGDPGKNGDKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQGGKGEQGPAG PPGFQGLPGPSGPAGEVGKPGERGLHGEFGLPGPAGPRGERGPPGESGAA GPTGPIGSRGPSGPPGPDGNKGEPGVVGAVGTAGPSGPSGLPGERGAAGI PGGKGEKGEPGLRGEIGNPGRDGARGAHGAVGAPGPAGATGDRGEAGAAG PAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGAKGERGAKGPKGEN GVVGPTGPVGAAGPAGPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGPPGP 
SGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPPGFAGEKGPSG
EAGTAGPPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIA
GPPGARGPPGAVGSPGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGY
PGNIGPVGAAGAPGPHGPVGPAGKHGNRGETGPSGPVGPAGAVGPRGPSG
PQGIRGDKGEPGEKGPRGLPGLKGHNGLQGLPGIAGHHGDQGAPGSVGPA
GPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGPAGPPGPPGPPGP
PGVSGGGYDFGYDGDFYRADQPRSAPSLRPKDYEVDATLKSLNNQIETLL
TPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTG
ETCIRAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEM
ATQLAFMRLLANYASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVEL
VAEGNSRFTYTVLVDGCSKKTNEWGKTIIEYKTNKPSRLPFLDIAPLDIG
GADHEFFVDIGPVCFK

(p)rokaryotic or (e)ukaryotic: e
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 16 structures are to be tested

Candidate membrane-spanning segments:

Helix  Begin    End   Score  Certainty
    1      7     27   1.247  Certain
    2    238    258   1.042  Certain
    3    306    326   0.850  Putative
    4    623    643   1.052  Certain
    5    750    770   0.813  Putative
    6    858    878   1.015  Certain
    7    883    903   0.743  Putative
    8    953    973   0.713  Putative

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment         1         2         3         4         6         7         8
 Loop length    6       210        47       296       214         4        49       393
 K+R profile 2.00                3.00                   +                5.00
                          +                   +                2.00                   +
CYT-EXT prof    -                   -                0.59                   -
                       0.52                0.67                   -                0.90
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 8.00 Tm probability: 0.06 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.49 -> Orientation: N-in ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 7 Loop length 6 210 47 296 106 87 4 463 K+R profile 2.00 3.00 + 2.00 + + + + CYT-EXT prof - - 0.25 - 0.52 0.67 0.70 0.94 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 7.00 Tm probability: 0.12 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -2.57 -> Orientation: N-in ---------------------------------------------------------------------- Structure 3 Transmembrane segments included in this structure: Segment 1 2 4 5 6 7 8 Loop length 6 210 364 106 87 4 49 393 K+R profile 2.00 + + 5.00 + + 2.00 + CYT-EXT prof - 0.72 0.70 - 0.52 0.25 - 0.90 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 5.00 Tm probability: 0.05 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.26 -> Orientation: N-in ---------------------------------------------------------------------- Structure 4 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 8 Loop length 6 210 47 296 106 87 74 393 K+R profile 2.00 3.00 + + + + + + CYT-EXT prof - - 0.25 0.69 0.52 0.67 0.70 0.90 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 5.00 Tm probability: 0.09 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.84 -> Orientation: N-in ---------------------------------------------------------------------- Structure 5 Transmembrane segments included in this structure: Segment 1 2 3 4 6 8 Loop length 6 210 47 296 214 74 393 K+R profile 2.00 3.00 + + + + + CYT-EXT prof - - 0.59 0.90 0.52 0.67 0.69 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 5.00 Tm probability: 0.18 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.38 -> Orientation: N-in ---------------------------------------------------------------------- Structure 6 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 Loop length 6 210 47 296 106 87 488 K+R profile 2.00 3.00 + + + + + CYT-EXT prof - - 0.25 0.92 0.52 0.67 0.70 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 5.00 Tm probability: 0.33 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.71 -> Orientation: N-in ---------------------------------------------------------------------- Structure 7 Transmembrane segments included in this structure: Segment 1 2 3 4 6 Loop length 6 210 47 296 214 488 K+R profile 2.00 3.00 + + + + CYT-EXT prof - - 0.59 0.52 0.67 0.92 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 5.00 Tm probability: 0.63 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.51 -> Orientation: N-in ---------------------------------------------------------------------- Structure 8 Transmembrane segments included in this structure: Segment 1 2 4 6 7 Loop length 6 210 364 214 4 463 K+R profile 2.00 + 2.00 + + + CYT-EXT prof - 0.72 - 0.52 0.59 0.94 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 4.00 Tm probability: 0.36 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.33 -> Orientation: N-in ---------------------------------------------------------------------- Structure 9 Transmembrane segments included in this structure: Segment 1 2 3 4 6 7 Loop length 6 210 47 296 214 4 463 K+R profile 2.00 3.00 + + + + 2.00 CYT-EXT prof - - 0.59 0.94 0.52 0.67 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 3.00 Tm probability: 0.22 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.35 -> Orientation: N-out ---------------------------------------------------------------------- Structure 10 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 7 8 Loop length 6 210 47 296 106 87 4 49 393 K+R profile 2.00 3.00 + 2.00 + + + + 5.00 CYT-EXT prof - - 0.25 - 0.90 0.52 0.67 0.70 - For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 2.00 Tm probability: 0.03 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.73 -> Orientation: N-in ---------------------------------------------------------------------- Structure 11 Transmembrane segments included in this structure: Segment 1 2 4 5 6 8 Loop length 6 210 364 106 87 74 393 K+R profile 2.00 + + + + + + CYT-EXT prof - 0.72 0.70 0.90 0.52 0.25 0.69 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 2.00 Tm probability: 0.15 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.85 -> Orientation: N-out ---------------------------------------------------------------------- Structure 12 Transmembrane segments included in this structure: Segment 1 2 4 6 8 Loop length 6 210 364 214 74 393 K+R profile 2.00 + + + + + CYT-EXT prof - 0.72 0.69 0.52 0.59 0.90 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 2.00 Tm probability: 0.28 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.60 -> Orientation: N-in ---------------------------------------------------------------------- Structure 13 Transmembrane segments included in this structure: Segment 1 2 4 5 6 Loop length 6 210 364 106 87 488 K+R profile 2.00 + + + + + CYT-EXT prof - 0.72 0.70 0.52 0.25 0.92 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 2.00 Tm probability: 0.53 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.28 -> Orientation: N-in ---------------------------------------------------------------------- Structure 14 Transmembrane segments included in this structure: Segment 1 2 4 6 Loop length 6 210 364 214 488 K+R profile 2.00 + + + + CYT-EXT prof - 0.72 0.92 0.52 0.59 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 2.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.53 -> Orientation: N-out ---------------------------------------------------------------------- Structure 15 Transmembrane segments included in this structure: Segment 1 2 4 6 7 8 Loop length 6 210 364 214 4 49 393 K+R profile 2.00 + 2.00 + + + 5.00 CYT-EXT prof - 0.72 - 0.90 0.52 0.59 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: -1.00 Tm probability: 0.10 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.51 -> Orientation: N-out ---------------------------------------------------------------------- Structure 16 Transmembrane segments included in this structure: Segment 1 2 4 5 6 7 Loop length 6 210 364 106 87 4 463 K+R profile 2.00 + + + + + 2.00 CYT-EXT prof - 0.72 0.70 0.94 0.52 0.25 - For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 0.00 Tm probability: 0.19 -> Orientation: undecided Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 1.58 -> Orientation: N-out ---------------------------------------------------------------------- "tem10" 1366 7 27 #t 1.24687 238 258 #t 1.04167 306 326 #f 0.85 623 643 #t 1.05208 750 770 #f 0.8125 858 878 #t 1.01458 883 903 #f 0.742708 953 973 #f 0.7125 ************************************ *TOPPREDM with prokaryotic function* ************************************ tem10.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem10.___inter___ (1 sequences) MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGERGPPGPP GRDGEDGPTGPPGPPGPPGPPGLGGNFAAQYDGKGVGLGPGPMGLMGPRG PPGAAGAPGPQGFQGPAGEPGEPGQTGPAGARGPAGPPGKAGEDGHPGKP GRPGERGVVGPQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGAPGVKGE PGAPGENGTPGQTGARGLPGERGRVGAPGPAGARGSDGSVGPVGPAGPIG SAGPPGFPGAPGPKGEIGAVGNAGPAGPAGPRGEVGLPGLSGPVGPPGNP GANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVGAAGATGARGLVGEPGP AGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGLRG SPGSRGLPGADGRAGVMGPPGSRGASGPAGVRGPNGDAGRPGEPGLMGPR GLPGSPGNIGPAGKEGPVGLPGIDGRPGPIGPAGARGEPGNIGFPGPKGP TGDPGKNGDKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQGGKGEQGPAG PPGFQGLPGPSGPAGEVGKPGERGLHGEFGLPGPAGPRGERGPPGESGAA GPTGPIGSRGPSGPPGPDGNKGEPGVVGAVGTAGPSGPSGLPGERGAAGI PGGKGEKGEPGLRGEIGNPGRDGARGAHGAVGAPGPAGATGDRGEAGAAG PAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGAKGERGAKGPKGEN GVVGPTGPVGAAGPAGPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGPPGP SGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPPGFAGEKGPSG EAGTAGPPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIA GPPGARGPPGAVGSPGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGY PGNIGPVGAAGAPGPHGPVGPAGKHGNRGETGPSGPVGPAGAVGPRGPSG PQGIRGDKGEPGEKGPRGLPGLKGHNGLQGLPGIAGHHGDQGAPGSVGPA GPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGPAGPPGPPGPPGP PGVSGGGYDFGYDGDFYRADQPRSAPSLRPKDYEVDATLKSLNNQIETLL 
TPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTG ETCIRAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEM ATQLAFMRLLANYASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVEL VAEGNSRFTYTVLVDGCSKKTNEWGKTIIEYKTNKPSRLPFLDIAPLDIG GADHEFFVDIGPVCFK (p)rokaryotic or (e)ukaryotic: p Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 16 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 7 27 1.247 Certain 2 238 258 1.042 Certain 3 306 326 0.850 Putative 4 623 643 1.052 Certain 5 750 770 0.813 Putative 6 858 878 1.015 Certain 7 883 903 0.743 Putative 8 953 973 0.713 Putative ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 2 3 4 6 7 8 Loop length 6 210 47 296 214 4 49 393 K+R profile 1.00 3.00 + 5.00 + + 2.00 + CYT-EXT prof - - 0.59 - 0.52 0.67 - 0.90 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 7.00 Tm probability: 0.06 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.49 -> Orientation: N-in ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 7 Loop length 6 210 47 296 106 87 4 463 K+R profile 1.00 3.00 + 2.00 + + + + CYT-EXT prof - - 0.25 - 0.52 0.67 0.70 0.94 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 6.00 Tm probability: 0.12 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -2.57 -> Orientation: N-in ---------------------------------------------------------------------- Structure 3 Transmembrane segments included in this structure: Segment 1 2 4 5 6 7 8 Loop length 6 210 364 106 87 4 49 393 K+R profile 1.00 + + 5.00 + + 2.00 + CYT-EXT prof - 0.72 0.70 - 0.52 0.25 - 0.90 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 4.00 Tm probability: 0.05 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.26 -> Orientation: N-in ---------------------------------------------------------------------- Structure 4 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 8 Loop length 6 210 47 296 106 87 74 393 K+R profile 1.00 3.00 + + + + + + CYT-EXT prof - - 0.25 0.69 0.52 0.67 0.70 0.90 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 4.00 Tm probability: 0.09 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.84 -> Orientation: N-in ---------------------------------------------------------------------- Structure 5 Transmembrane segments included in this structure: Segment 1 2 3 4 6 8 Loop length 6 210 47 296 214 74 393 K+R profile 1.00 3.00 + + + + + CYT-EXT prof - - 0.59 0.90 0.52 0.67 0.69 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 4.00 Tm probability: 0.18 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.38 -> Orientation: N-in ---------------------------------------------------------------------- Structure 6 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 Loop length 6 210 47 296 106 87 488 K+R profile 1.00 3.00 + + + + + CYT-EXT prof - - 0.25 0.92 0.52 0.67 0.70 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 4.00 Tm probability: 0.33 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.71 -> Orientation: N-in ---------------------------------------------------------------------- Structure 7 Transmembrane segments included in this structure: Segment 1 2 3 4 6 Loop length 6 210 47 296 214 488 K+R profile 1.00 3.00 + + + + CYT-EXT prof - - 0.59 0.52 0.67 0.92 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 4.00 Tm probability: 0.63 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.51 -> Orientation: N-in ---------------------------------------------------------------------- Structure 8 Transmembrane segments included in this structure: Segment 1 2 4 6 7 Loop length 6 210 364 214 4 463 K+R profile 1.00 + 2.00 + + + CYT-EXT prof - 0.72 - 0.52 0.59 0.94 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 3.00 Tm probability: 0.36 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -1.33 -> Orientation: N-in ---------------------------------------------------------------------- Structure 9 Transmembrane segments included in this structure: Segment 1 2 4 6 7 8 Loop length 6 210 364 214 4 49 393 K+R profile 1.00 + 2.00 + + + 5.00 CYT-EXT prof - 0.72 - 0.90 0.52 0.59 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: -2.00 Tm probability: 0.10 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.51 -> Orientation: N-out ---------------------------------------------------------------------- Structure 10 Transmembrane segments included in this structure: Segment 1 2 3 4 6 7 Loop length 6 210 47 296 214 4 463 K+R profile 1.00 3.00 + + + + 2.00 CYT-EXT prof - - 0.59 0.94 0.52 0.67 - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 2.00 Tm probability: 0.22 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.35 -> Orientation: N-out ---------------------------------------------------------------------- Structure 11 Transmembrane segments included in this structure: Segment 1 2 3 4 5 6 7 8 Loop length 6 210 47 296 106 87 4 49 393 K+R profile 1.00 3.00 + 2.00 + + + + 5.00 CYT-EXT prof - - 0.25 - 0.90 0.52 0.67 0.70 - For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 1.00 Tm probability: 0.03 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.73 -> Orientation: N-in ---------------------------------------------------------------------- Structure 12 Transmembrane segments included in this structure: Segment 1 2 4 5 6 8 Loop length 6 210 364 106 87 74 393 K+R profile 1.00 + + + + + + CYT-EXT prof - 0.72 0.70 0.90 0.52 0.25 0.69 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 1.00 Tm probability: 0.15 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.85 -> Orientation: N-out ---------------------------------------------------------------------- Structure 13 Transmembrane segments included in this structure: Segment 1 2 4 6 8 Loop length 6 210 364 214 74 393 K+R profile 1.00 + + + + + CYT-EXT prof - 0.72 0.69 0.52 0.59 0.90 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 1.00 Tm probability: 0.28 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.60 -> Orientation: N-in ---------------------------------------------------------------------- Structure 14 Transmembrane segments included in this structure: Segment 1 2 4 5 6 7 Loop length 6 210 364 106 87 4 463 K+R profile 1.00 + + + + + 2.00 CYT-EXT prof - 0.72 0.70 0.94 0.52 0.25 - For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: -1.00 Tm probability: 0.19 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 1.58 -> Orientation: N-out ---------------------------------------------------------------------- Structure 15 Transmembrane segments included in this structure: Segment 1 2 4 5 6 Loop length 6 210 364 106 87 488 K+R profile 1.00 + + + + + CYT-EXT prof - 0.72 0.70 0.52 0.25 0.92 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 1.00 Tm probability: 0.53 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: -0.28 -> Orientation: N-in ---------------------------------------------------------------------- Structure 16 Transmembrane segments included in this structure: Segment 1 2 4 6 Loop length 6 210 364 214 488 K+R profile 1.00 + + + + CYT-EXT prof - 0.72 0.92 0.52 0.59 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 1.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -1.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 1.0000 POS: 1.0000 -> Orientation: N-out CYT-EXT difference: 0.53 -> Orientation: N-out ---------------------------------------------------------------------- "tem10" 1366 7 27 #t 1.24687 238 258 #t 1.04167 306 326 #f 0.85 623 643 #t 1.05208 750 770 #f 0.8125 858 878 #t 1.01458 883 903 #f 0.742708 953 973 #f 0.7125 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ NOW EXECUTING: /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem10.___saps___ SAPS. Version of April 11, 1996. 
Date run: Tue Oct 31 14:25:59 2000
File: /people/maria/tem10.___saps___
ID   tem10gi|1418930|emb|CAA98969.1|
DE   prepro-alpha2(I) collagen [Homo sapiens]

number of residues: 1366;   molecular weight: 129.3 kdal

   1  MLSFVDTRTL LLLAVTLCLA TCQSLQEETV RKGPAGDRGP RGERGPPGPP GRDGEDGPTG
  61  PPGPPGPPGP PGLGGNFAAQ YDGKGVGLGP GPMGLMGPRG PPGAAGAPGP QGFQGPAGEP
 121  GEPGQTGPAG ARGPAGPPGK AGEDGHPGKP GRPGERGVVG PQGARGFPGT PGLPGFKGIR
 181  GHNGLDGLKG QPGAPGVKGE PGAPGENGTP GQTGARGLPG ERGRVGAPGP AGARGSDGSV
 241  GPVGPAGPIG SAGPPGFPGA PGPKGEIGAV GNAGPAGPAG PRGEVGLPGL SGPVGPPGNP
 301  GANGLTGAKG AAGLPGVAGA PGLPGPRGIP GPVGAAGATG ARGLVGEPGP AGSKGESGNK
 361  GEPGSAGPQG PPGPSGEEGK RGPNGEAGSA GPPGPPGLRG SPGSRGLPGA DGRAGVMGPP
 421  GSRGASGPAG VRGPNGDAGR PGEPGLMGPR GLPGSPGNIG PAGKEGPVGL PGIDGRPGPI
 481  GPAGARGEPG NIGFPGPKGP TGDPGKNGDK GHAGLAGARG APGPDGNNGA QGPPGPQGVQ
 541  GGKGEQGPAG PPGFQGLPGP SGPAGEVGKP GERGLHGEFG LPGPAGPRGE RGPPGESGAA
 601  GPTGPIGSRG PSGPPGPDGN KGEPGVVGAV GTAGPSGPSG LPGERGAAGI PGGKGEKGEP
 661  GLRGEIGNPG RDGARGAHGA VGAPGPAGAT GDRGEAGAAG PAGPAGPRGS PGERGEVGPA
 721  GPNGFAGPAG AAGQPGAKGE RGAKGPKGEN GVVGPTGPVG AAGPAGPNGP PGPAGSRGDG
 781  GPPGMTGFPG AAGRTGPPGP SGISGPPGPP GPAGKEGLRG PRGDQGPVGR TGEVGAVGPP
 841  GFAGEKGPSG EAGTAGPPGT PGPQGLLGAP GILGLPGSRG ERGLPGVAGA VGEPGPLGIA
 901  GPPGARGPPG AVGSPGVNGA PGEAGRDGNP GNDGPPGRDG QPGHKGERGY PGNIGPVGAA
 961  GAPGPHGPVG PAGKHGNRGE TGPSGPVGPA GAVGPRGPSG PQGIRGDKGE PGEKGPRGLP
1021  GLKGHNGLQG LPGIAGHHGD QGAPGSVGPA GPRGPAGPSG PAGKDGRTGH PGTVGPAGIR
1081  GPQGHQGPAG PPGPPGPPGP PGVSGGGYDF GYDGDFYRAD QPRSAPSLRP KDYEVDATLK
1141  SLNNQIETLL TPEGSRKNPA RTCRDLRLSH PEWSSGYYWI DPNQGCTMDA IKVYCDFSTG
1201  ETCIRAQPEN IPAKNWYRSS KDKKHVWLGE TINAGSQFEY NVEGVTSKEM ATQLAFMRLL
1261  ANYASQNITY HCKNSIAYMD EETGNLKKAV ILQGSNDVEL VAEGNSRFTY TVLVDGCSKK
1321  TNEWGKTIIE YKTNKPSRLP FLDIAPLDIG GADHEFFVDI GPVCFK

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  :130( 9.5%); C  :  9( 0.7%); D  : 43( 3.1%); E  : 66( 4.8%); F  : 22( 1.6%)
G++:381(27.9%); H  : 17( 1.2%); I  : 32( 2.3%); K  : 50( 3.7%); L- : 61( 4.5%)
M- : 10( 0.7%); N  : 41( 3.0%); P++:230(16.8%); Q  : 32( 2.3%); R  : 72( 5.3%)
S  : 52( 3.8%); T- : 42( 3.1%); V  : 55( 4.0%); W  :  5( 0.4%); Y  : 16( 1.2%)

KR      :  122 (  8.9%);   ED      :  109 (  8.0%);   AGP   ++:  741 ( 54.2%);
KRED    :  231 ( 16.9%);   KR-ED   :   13 (  1.0%);   FIKMNY- :  171 ( 12.5%);
LVIFM - :  180 ( 13.2%);   ST    - :   94 (  6.9%).

--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

   1  00000-0+00 0000000000 000000--00 ++0000-+00 +0-+000000 0+-0--0000
  61  0000000000 0000000000 0-0+000000 00000000+0 0000000000 00000000-0
 121  0-00000000 0+0000000+ 00--0000+0 0+00-+0000 0000+00000 000000+00+
 181  00000-00+0 0000000+0- 00000-0000 00000+0000 -+0+000000 000+00-000
 241  0000000000 0000000000 000+0-0000 0000000000 0+0-000000 0000000000
 301  00000000+0 0000000000 000000+000 0000000000 0+0000-000 000+0-000+
 361  0-00000000 000000--0+ +0000-0000 00000000+0 0000+00000 -0+0000000
 421  00+0000000 0+0000-00+ 00-000000+ 0000000000 000+-00000 000-0+0000
 481  00000+0-00 0000000+00 00-00+00-+ 00000000+0 0000-00000 0000000000
 541  00+0-00000 0000000000 00000-00+0 0-+0000-00 0000000+0- +0000-0000
 601  00000000+0 0000000-00 +0-0000000 0000000000 000-+00000 000+0-+0-0
 661  00+0-00000 +-00+00000 0000000000 0-+0-00000 0000000+00 00-+0-0000
 721  0000000000 0000000+0- +00+00+0-0 0000000000 0000000000 000000+0-0
 781  0000000000 000+000000 0000000000 0000+-00+0 0+0-00000+ 00-0000000
 841  0000-+0000 -000000000 0000000000 00000000+0 -+00000000 00-0000000
 901  00000+0000 0000000000 00-00+-000 00-0000+-0 0000+0-+00 0000000000
 961  0000000000 000+000+0- 0000000000 00000+0000 0000+0-+0- 00-+00+000
1021  00+0000000 000000000- 0000000000 00+0000000 000+-0+000 000000000+
1081  0000000000 0000000000 00000000-0 00-0-00+0- 00+00000+0 +-0-0-000+
1141  000000-000 00-00++000 +00+-0+000 0-00000000 -0000000-0 0+000-0000
1201  -000+000-0 000+000+00 +-++00000- 00000000-0 00-0000+-0 0000000+00
1261  0000000000 00+000000- --0000++00 000000-0-0 00-000+000 0000-000++
1321  00-00+000- 0+00+00+00 00-0000-00 00-0-000-0 00000+

A. CHARGE CLUSTERS.

Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none
Negative charge clusters (cmin = 9/30 or 11/45 or 14/60): none
Mixed charge clusters (cmin = 14/30 or 18/45 or 23/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.

C. CHARGE RUNS AND PATTERNS.

pattern   (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0      5 |   5 |   7 |  54 |  10 |  10 |  13 |  12 |  12 |  16 |   6 |   8 |
lmin1      6 |   6 |   8 |  65 |  12 |  12 |  15 |  15 |  14 |  19 |   8 |  10 |
lmin2      7 |   7 |  10 |  73 |  14 |  13 |  17 |  17 |  16 |  21 |   9 |  11 |
(Significance level: 0.010000; Minimal displayed length: 6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  +  runs >=  3:   0
  -  runs >=  3:   1, at 1280;
  *  runs >=  4:   1, at 1221;
  0  runs >= 36:   0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.

There are no high scoring hydrophobic segments.
There are no high scoring transmembrane segments.

2. SPACINGS OF C.

H2N-17-C-3-C-1140-C-22-C-8-C-7-C-68-C-44-C-46-C-2-COOH

2*. SPACINGS OF C and H. (additional deluxe function for ALEX)

H2N-17-C-3-C-123-H-35-H-329-H-63-H-101-H-265-H-21-H-8-H-49-H-11-H-H-31-H-14-H-77-C-6-H-15-C-8-C-7-C-21-H-45-H-C-44-C-36-H-9-C-2-COOH

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 5 Aligned matching blocks: [ 38- 46] RGPRGERGP [ 819- 827] RGPRGDQGP ______________________________ [ 39- 48]-( -5)-[ 44- 48]--------[ ] [ ]--------[ 99- 103]-( -5)-[ 99- 110] [ 586- 595]-( -5)-[ 591- 595]-( -5)-[ 591- 602] [ ]--------[ 906- 910]--------[ ] [ 39- 48] GPRGERGPPG [ 586- 595] GPRGERGPPG with superset: [ 39- 43] GPRGE [ 280- 284] GPRGE [ 586- 590] GPRGE [ 44- 48] RGPPG [ 99- 103] RGPPG [ 591- 595] RGPPG [ 906- 910] RGPPG [ 99- 110] RGPPGAAGAPGP [ 591- 602] RGPPGESGAAGP ______________________________ [ 35- 40]-( 3)-[ 44- 54] [ 600- 605]-( 3)-[ 609- 619] [ 35- 40] AGDRGP [ 600- 605] AGPTGP [ 44- 54] RGPPGPPGRDG [ 609- 619] RGPSGPPGPDG with superset: [ 44- 49] RGPPGP [ 132- 137] RGPAGP [ 609- 614] RGPSGP [ 996-1001] RGPSGP [1053-1058] RGPAGP and: [ 44- 51] RGPPGPPG [ 132- 139] RGPAGPPG [ 609- 616] RGPSGPPG ______________________________ [ 48- 54] GPPGRDG [ 613- 619] GPPGPDG [ 667- 673] GNPGRDG [ 934- 940] GPPGRDG ______________________________ [ 57- 61] GPTGP [ 601- 605] GPTGP [ 754- 758] GPTGP ______________________________ [ 59- 67] TGPPGPPGP [ 981- 989] TGPSGPVGP with superset: [ 59- 64] TGPPGP [ 795- 800] TGPPGP [ 981- 986] TGPSGP ______________________________ [ 94- 107] GLMGPRGPPGAAGA [ 322- 335] GLPGPRGIPGPVGA with superset: [ 94- 100] GLMGPRG [ 322- 328] GLPGPRG [ 445- 451] GLMGPRG [ 817- 823] GLRGPRG and: [ 94- 103] GLMGPRGPPG [ 322- 331] GLPGPRGIPG [ 445- 454] GLMGPRGLPG ______________________________ [ 97- 104] GPRGPPGA [ 214- 221] GARGLPGE [ 325- 331] GPRGIPG [ 403- 410] GSRGLPGA [ 448- 454] GPRGLPG [ 466- 472] GPVGLPG [ 637- 644] GPSGLPGE [ 706- 713] GPRGSPGE [ 880- 886] GERGLPG [1015-1021] GPRGLPG ______________________________ [ 100- 110] GPPGAAGAPGP [ 955- 965] GPVGAAGAPGP with superset: [ 103- 107] GAAGA [ 334- 338] GAAGA [ 958- 962] GAAGA ______________________________ [ 100- 104] GPPGA [ 901- 905] GPPGA [ 907- 911] GPPGA ______________________________ [ 100- 109] GPPGAAGAPG [ 313- 322] GLPGVAGAPG [ 913- 922] 
GSPGVNGAPG with superset: [ 100- 107] GPPGAAGA [ 226- 233] GAPGPAGA [ 313- 320] GLPGVAGA [ 622- 629] GEPGVVGA [ 640- 647] GLPGERGA [ 682- 689] GAPGPAGA [ 883- 890] GLPGVAGA [ 913- 920] GSPGVNGA ______________________________ [ 102- 110] PGAAGAPGP [ 789- 797] PGAAGRTGP ______________________________ [ 105- 128] AGAPGPQGFQGPAGEPGEPGQTGP [1077-1100] AGIRGPQGHQGPAGPPGPPGPPGP ______________________________ [ 97- 98]-( 4)-[ 103- 104]-( 3)-[ 108- 119] [ 523- 524]-( 4)-[ 529- 530]-( 3)-[ 534- 545] [ 108- 119] PGPQGFQGPAGE [ 534- 545] PGPQGVQGGKGE with superset: [ 108- 112] PGPQG [ 534- 538] PGPQG [ 861- 865] PGPQG ______________________________ [ 115- 119] GPAGE [ 562- 566] GPAGE ______________________________ [ 117- 128] AGEPGEPGQTGP [ 438- 449] AGRPGEPGLMGP ______________________________ [ 118- 122] GEPGE [1009-1013] GEPGE ______________________________ [ 118- 119]-( 3)-[ 123- 127] [ 205- 206]-( 3)-[ 210- 214] [ 123- 127] PGQTG [ 210- 214] PGQTG ______________________________ [ 127- 134] GPAGARGP [ 901- 908] GPPGARGP ______________________________ [ 127- 133] GPAGARG [ 229- 235] GPAGARG [ 481- 487] GPAGARG with superset: [ 127- 131] GPAGA [ 229- 233] GPAGA [ 481- 485] GPAGA [ 685- 689] GPAGA [ 727- 731] GPAGA [ 988- 992] GPAGA ______________________________ [ 151- 155] GRPGE [ 439- 443] GRPGE ______________________________ [ 154- 161] GERGVVGP [ 748- 755] GENGVVGP ______________________________ [ 168- 172] PGTPG [ 858- 862] PGTPG ______________________________ [ 171- 179] PGLPGFKGI [ 321- 329] PGLPGPRGI ______________________________ [ 181- 188] GHNGLDGL [1024-1031] GHNGLQGL ______________________________ [ 190- 194] GQPGA [ 733- 737] GQPGA ______________________________ [ 192- 196]-( -5)-[ 192- 203] [ 201- 205]--------[ ] [ 258- 262]-( -5)-[ 258- 269] [ 192- 196] PGAPG [ 201- 205] PGAPG [ 258- 262] PGAPG [ 192- 203] PGAPGVKGEPGA [ 258- 269] PGAPGPKGEIGA ______________________________ [ 192- 206] PGAPGVKGEPGAPGE [ 651- 665] PGGKGEKGEPGLRGE with superset: [ 196- 202] 
GVKGEPG [ 358- 364] GNKGEPG [ 619- 625] GNKGEPG [ 655- 661] GEKGEPG [1006-1012] GDKGEPG ______________________________ [ 202- 206] GAPGE [ 919- 923] GAPGE ______________________________ [ 213- 221] TGARGLPGE [ 339- 347] TGARGLVGE ______________________________ [ 216- 224] RGLPGERGR [ 405- 413] RGLPGADGR with superset: [ 216- 220] RGLPG [ 405- 409] RGLPG [ 450- 454] RGLPG [ 882- 886] RGLPG [1017-1021] RGLPG ______________________________ [ 216- 230] RGLPGERGRVGAPGP [ 708- 722] RGSPGERGEVGPAGP with superset: [ 151- 160] GRPGERGVVG [ 217- 226] GLPGERGRVG [ 709- 718] GSPGERGEVG ______________________________ [ 225- 233] VGAPGPAGA [ 345- 352] VGEPGPAG [ 681- 689] VGAPGPAGA with superset: [ 225- 230] VGAPGP [ 345- 350] VGEPGP [ 681- 686] VGAPGP [ 891- 896] VGEPGP ______________________________ [ 228- 242] PGPAGARGSDGSVGP [ 771- 782] PGPAGSRG_DG__GP with superset: [ 228- 232] PGPAG [ 348- 352] PGPAG [ 582- 586] PGPAG [ 684- 688] PGPAG [ 771- 775] PGPAG [ 810- 814] PGPAG and: [ 228- 235] PGPAGARG [ 582- 589] PGPAGPRG [ 771- 778] PGPAGSRG ______________________________ [ 231- 235] AGARG [ 483- 487] AGARG [ 516- 520] AGARG ______________________________ [ 238- 248] GSVGPVGPAGP [1045-1055] GSVGPAGPRGP ______________________________ [ 241- 247] GPVGPAG [ 967- 973] GPVGPAG [ 985- 991] GPVGPAG ______________________________ [ 244- 254] GPAGPIGSAGP [ 601- 611] GPTGPIGSRGP ______________________________ [ 250- 254]--------[ ]--------[ 250- 259] [ 364- 368]-( -8)-[ 361- 373]--------[ ] [ 388- 392]-( -8)-[ 385- 397]-( -10)-[ 388- 397] [ 250- 254] GSAGP [ 364- 368] GSAGP [ 388- 392] GSAGP [ 361- 373] GEPGSAGPQGPPG [ 385- 397] GEAGSAGPPGPPG [ 250- 259] GSAGPPGFPG [ 388- 397] GSAGPPGPPG ______________________________ [ 252- 263] AGPPGFPGAPGP [1089-1100] AGPPGPPGPPGP with superset: [ 135- 139] AGPPG [ 252- 256] AGPPG [ 390- 394] AGPPG [ 549- 553] AGPPG [ 855- 859] AGPPG [ 900- 904] AGPPG [1089-1093] AGPPG and: [ 252- 259] AGPPGFPG [ 390- 397] AGPPGPPG [ 855- 862] AGPPGTPG [1089-1096] 
AGPPGPPG ______________________________ [ 253- 257]--------[ ] [ 550- 554]-( 33)-[ 588- 599] [ 838- 842]-( 36)-[ 879- 890] [ 253- 257] GPPGF [ 550- 554] GPPGF [ 838- 842] GPPGF [ 588- 599] RGERGPPGESGA [ 879- 890] RGERGLPGVAGA with superset: [ 41- 48] RGERGPPG [ 588- 595] RGERGPPG [ 879- 886] RGERGLPG ______________________________ [ 256- 260] GFPGA [ 787- 791] GFPGA ______________________________ [ 259- 266] GAPGPKGE [ 742- 749] GAKGPKGE ______________________________ [ 259- 263] GAPGP [ 520- 524] GAPGP [ 682- 686] GAPGP [ 961- 965] GAPGP ______________________________ [ 255- 256]-( 4)-[ 261- 265] [ 489- 490]-( 4)-[ 495- 499] [ 261- 265] PGPKG [ 495- 499] PGPKG ______________________________ [ 273- 283] AGPAGPAGPRG [ 699- 709] AGPAGPAGPRG ______________________________ [ 273- 279] AGPAGPA [ 366- 372] AGPQGPP [ 390- 396] AGPPGPP [ 600- 605] AGPTGP [ 633- 638] AGPSGP [ 699- 705] AGPAGPA [ 762- 767] AGPAGP [1050-1056] AGPRGPA [1056-1062] AGPSGPA [1089-1095] AGPPGPP ______________________________ [ 282- 286] RGEVG [ 714- 718] RGEVG ______________________________ [ 277- 280]-( 4)-[ 285- 289] [ 460- 463]-( 4)-[ 468- 472] [ 277- 280] GPAG [ 460- 463] GPAG [ 285- 289] VGLPG [ 468- 472] VGLPG ______________________________ [ 286- 290] GLPGL [1018-1022] GLPGL ______________________________ [ 291- 298] SGPVGPPG [ 804- 811] SGPPGPPG ______________________________ [ 294- 298] VGPPG [ 837- 841] VGPPG ______________________________ [ 318- 326] AGAPGLPGP [ 960- 968] AGAPGPHGP with superset: [ 105- 109] AGAPG [ 318- 322] AGAPG [ 960- 964] AGAPG ______________________________ [ 336- 343] AGATGARG [ 687- 694] AGATGDRG ______________________________ [ 360- 364]-( -5)-[ 360- 368]-( 27)-[ 396- 400] [ 621- 625]--------[ ]--------[ 660- 664] [ 657- 661]--------[ ]--------[ ] [1008-1012]-( -5)-[1008-1016]--------[ ] [ 360- 364] KGEPG [ 621- 625] KGEPG [ 657- 661] KGEPG [1008-1012] KGEPG [ 360- 368] KGEPGSAGP [1008-1016] KGEPGEKGP [ 396- 400] PGLRG [ 660- 664] PGLRG 
______________________________ [ 369- 376] QGPPGPSG [ 555- 562] QGLPGPSG with superset: [ 369- 374] QGPPGP [ 531- 536] QGPPGP [ 546- 551] QGPAGP [ 555- 560] QGLPGP [1086-1091] QGPAGP ______________________________ [ 372- 376] PGPSG [ 558- 562] PGPSG [ 798- 802] PGPSG ______________________________ [ 375- 376]-( 4)-[ 381- 388] [ 426- 427]-( 4)-[ 432- 439] [ 381- 388] RGPNGEAG [ 432- 439] RGPNGDAG ______________________________ [ 390- 397] AGPPGPPG [ 855- 863] AGPPGTPGP [1089-1097] AGPPGPPGP ______________________________ [ 385- 395]-( 3)-[ 399- 406] [ 694- 704]-( 3)-[ 708- 715] [ 385- 395] GEAGSAGPPGP [ 694- 704] GEAGAAGPAGP [ 399- 406] RGSPGSRG [ 708- 715] RGSPGERG ______________________________ [ 373- 377]-( 24)-[ 402- 406] [ ]--------[ 420- 424] [ 847- 851]-( 24)-[ 876- 880] [ 373- 377] GPSGE [ 847- 851] GPSGE [ 402- 406] PGSRG [ 420- 424] PGSRG [ 876- 880] PGSRG ______________________________ [ 426- 430] SGPAG [ 561- 565] SGPAG [1059-1063] SGPAG ______________________________ [ 442- 446] GEPGL [ 658- 662] GEPGL ______________________________ [ 444- 445]-( 4)-[ 450- 454] [ 876- 877]-( 4)-[ 882- 886] [1011-1012]-( 4)-[1017-1021] [ 450- 454] RGLPG [ 882- 886] RGLPG [1017-1021] RGLPG ______________________________ [ 451- 455] GLPGS [ 874- 878] GLPGS ______________________________ [ 456- 463] PGNIGPAG [ 477- 484] PGPIGPAG ______________________________ [ 456- 461]-( 21)-[ 483- 490] [ 489- 493]-( 22)-[ 516- 523] [ 951- 956]--------[ ] [ 456- 461] PGNIGP [ 489- 493] PGNIG [ 951- 956] PGNIGP [ 483- 490] AGARGEPG [ 516- 523] AGARGAPG ______________________________ [ 460- 464]-( -9)-[ 456- 466]--------[ ] [ 811- 815]-( -9)-[ 807- 817]-( 17)-[ 835- 839] [ 970- 974]--------[ ]--------[ 991- 995] [1060-1064]--------[ ]--------[ ] [ 460- 464] GPAGK [ 811- 815] GPAGK [ 970- 974] GPAGK [1060-1064] GPAGK [ 456- 466] PGNIGPAGKEG [ 807- 817] PGPPGPAGKEG [ 835- 839] GAVGP [ 991- 995] GAVGP ______________________________ [ 469- 473] GLPGI [1030-1034] GLPGI 
______________________________ [ 517- 521] GARGA [ 673- 677] GARGA ______________________________ [ 522- 527] PGPDGN [ 615- 620] PGPDGN ______________________________ [ 531- 541] QGPPGPQGVQG [ 546- 556] QGPAGPPGFQG with superset: [ 114- 121] QGPAGEPG [ 369- 376] QGPPGPSG [ 531- 538] QGPPGPQG [ 546- 553] QGPAGPPG [1086-1093] QGPAGPPG ______________________________ [ 541- 545] GGKGE [ 652- 656] GGKGE ______________________________ [ 549- 553]--------[ ]--------[ ]--------[ 571- 575] [ 855- 859]--------[ ]--------[ 855- 863]-( 16)-[ 880- 884] [ 900- 904]-( -5)-[ 900- 910]-( -11)-[ 900- 908]--------[ ] [1089-1093]-( -5)-[1089-1099]-( -11)-[1089-1097]--------[ ] [ 549- 553] AGPPG [ 855- 859] AGPPG [ 900- 904] AGPPG [1089-1093] AGPPG [ 900- 910] AGPPGARGPPG [1089-1099] AGPPGPPGPPG [ 855- 863] AGPPGTPGP [ 900- 908] AGPPGARGP [1089-1097] AGPPGPPGP [ 571- 575] GERGL [ 880- 884] GERGL ______________________________ [ 556- 566] GLPGPSGPAGE [ 580- 590] GLPGPAGPRGE ______________________________ [ 559- 565] GPSGPAG [1057-1063] GPSGPAG ______________________________ [ 582- 595] PGPAGPRGERGPPG [ 771- 784] PGPAGSRGDGGPPG ______________________________ [ 598- 605]--------[ ]--------[ 643- 647] [ 697- 704]-( -8)-[ 697- 707]-( 31)-[ 739- 743] [ 760- 767]-( -8)-[ 760- 770]--------[ ] [ 598- 605] GAAGPTGP [ 697- 704] GAAGPAGP [ 760- 767] GAAGPAGP [ 697- 707] GAAGPAGPAGP [ 760- 770] GAAGPAGPNGP [ 643- 647] GERGA [ 739- 743] GERGA ______________________________ [ 610- 617] GPSGPPGP [ 982- 989] GPSGPVGP with superset: [ 610- 614] GPSGP [ 634- 638] GPSGP [ 982- 986] GPSGP [ 997-1001] GPSGP [1057-1061] GPSGP ______________________________ [ 627- 631] VGAVG [ 834- 838] VGAVG ______________________________ [ 631- 635] GTAGP [ 853- 857] GTAGP ______________________________ [ 684- 692] PGPAGATGD [ 771- 779] PGPAGSRGD ______________________________ [ 685- 689] GPAGA [ 727- 731] GPAGA [ 988- 992] GPAGA ______________________________ [ ]--------[ 702- 707] [ 696- 700]-( 25)-[ 726- 730] [ 729- 
733]-( 28)-[ 762- 767] [ 696- 700] AGAAG [ 729- 733] AGAAG [ 702- 707] AGPAGP [ 726- 730] AGPAG [ 762- 767] AGPAGP ______________________________ [ 727- 728]-( 4)-[ 733- 742] [ 934- 935]-( 4)-[ 940- 949] [ 733- 742] GQPGAKGERG [ 940- 949] GQPGHKGERG ______________________________ [ 757- 775] GPVGAAGPAGPNGPPGPAG [ 955- 973] GPVGAAGAPGPHGPVGPAG with superset: [ 100- 106] GPPGAAG [ 241- 247] GPVGPAG [ 331- 337] GPVGAAG [ 727- 733] GPAGAAG [ 757- 763] GPVGAAG [ 955- 961] GPVGAAG [ 967- 973] GPVGPAG [ 985- 991] GPVGPAG and: [ 100- 110] GPPGAAGAPGP [ 757- 767] GPVGAAGPAGP [ 955- 965] GPVGAAGAPGP [ 985- 995] GPVGPAGAVGP and: [ 757- 773] GPVGAAGPAGPNGPPGP [ 955- 971] GPVGAAGAPGPHGPVGP [ 985-1001] GPVGPAGAVGPRGPSGP ______________________________ [ 844- 848] GEKGP [1012-1016] GEKGP ______________________________ [ 888- 896] AGAVGEPGP [ 990- 998] AGAVGPRGP ______________________________ [ 987-1001] VGPAGAVGPRGPSGP [1047-1061] VGPAGPRGPAGPSGP Highly repetitive regions: From 45 to 1101 with major motif GPPGPP. From 133 to 1092 with major motif GPAGPR. B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet. 
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C) Repeat core block length: 9 Aligned matching blocks: [ 247- 265] spisossppsipsspsp+s [ 694- 709] s_-ssssspsspssp__+s with superset: [ 247- 256] spisosspps [ 361- 370] s-psosspns [ 385- 394] s-ssosspps [ 544- 553] s-nspsspps [ 694- 703] s-ssssspss [ 850- 859] s-ssosspps and: [ 247- 259] spisossppsips [ 361- 373] s-psosspnspps [ 385- 397] s-ssossppspps [ 694- 705] s-ss_ssspssps [ 850- 862] s-ssossppsops ______________________________ [ 573- 592] +sihs-isipspssp+s-+s [ 675- 694] +sshssisspspsssos-+s with superset: [ 348- 356] pspsso+s- [ 582- 590] pspssp+s- [ 615- 623] psp-sn+s- [ 684- 692] pspsssos- [ 771- 779] pspsso+s- ______________________________ [1044-1063] psoispssp+spsspospss [1071-1090] psoispssi+spnshnspss with superset: [ 33- 42] spss-+sp+s [ 127- 136] spsss+spss [ 427- 436] spssi+spns [1048-1057] spssp+spss [1075-1084] spssi+spns and: [ 33- 46] spss-+sp+s-+sp [1048-1061] spssp+spssposp [1075-1088] spssi+spnshnsp -------------------------------------------------------------------------------- MULTIPLETS. A. AMINO ACID ALPHABET. 1. Total number of amino acid multiplets: 70 (Expected range: 107--175) low 1 .........L LLL....... ......EE.. .......... .....PP.PP .......... 61 PP.PP.PP.P P..GG..AA. .......... .......... PP.AA..... .......... 121 .......... ......PP.. .......... .......VV. .......... .......... 181 .......... .......... .......... .......... .......... .......... 241 .......... ...PP..... .......... .......... .......... .....PP... 301 .......... AA........ .......... ....AA.... .......... .......... 361 .......... PP....EE.. .......... .PP.PP.... .......... ........PP 421 .......... .......... .......... .......... .......... .......... 481 .......... .......... .......... .......... ......NN.. ..PP...... 541 GG........ PP........ .......... .......... .......... ..PP....AA 601 .......... ...PP..... .....VV... .......... ......AA.. .GG....... 661 .......... .......... 
.......... .......AA. .......... .......... 721 .......... AA........ .......... .VV....... AA.......P P........G 781 GPP....... AA....PP.. .....PP.PP .......... .......... ........PP 841 .......... ......PP.. .....LL... .......... .......... .......... 901 .PP....PP. .......... .......... ....PP.... .......... ........AA 961 .......... .......... .......... .......... .......... .......... 1021 .......... ......HH.. .......... .......... .......... .......... 1081 .......... PP.PP.PP.P P...GGG... .......... .......... .......... 1141 ..NN....LL .......... .......... ...SS.YY.. .......... .......... 1201 .......... ........SS ..KK...... .......... .......... ........LL 1261 .......... .......... EE....KK.. .......... .......... ........KK 1321 .......II. .......... .........G G....FF... ...... 2. Histogram of spacings between consecutive amino acid multiplets: (1-5) 25 (6-10) 11 (11-20) 11 (>=21) 24 3. Clusters of amino acid multiplets (cmin = 10/30 or 13/45 or 15/60): none 4. Significant specific amino acid altplet counts: Letters Observed (Critical number) AG 98 (89) at 35 (l= 2) 103 (l= 2) 105 (l= 3) 117 (l= 2) 129 (l= 3) 135 (l= 2) 141 (l= 2) 163 (l= 2) 193 (l= 2) 202 (l= 2) 214 (l= 2) 226 (l= 2) 231 (l= 3) 246 (l= 2) 252 (l= 2) 259 (l= 2) 268 (l= 2) 273 (l= 2) 276 (l= 2) 279 (l= 2) 301 (l= 2) 307 (l= 2) 310 (l= 2) 312 (l= 2) 318 (l= 3) 334 (l= 2) 336 (l= 3) 340 (l= 2) 351 (l= 2) 366 (l= 2) 387 (l= 2) 390 (l= 2) 409 (l= 2) 414 (l= 2) 424 (l= 2) 429 (l= 2) 438 (l= 2) 462 (l= 2) 483 (l= 3) 513 (l= 2) 516 (l= 3) 520 (l= 2) 529 (l= 2) 549 (l= 2) 564 (l= 2) 585 (l= 2) 598 (l= 2) 600 (l= 2) 628 (l= 2) 633 (l= 2) 646 (l= 2) 648 (l= 2) 673 (l= 2) 676 (l= 2) 679 (l= 2) 682 (l= 2) 687 (l= 3) 696 (l= 3) 699 (l= 2) 702 (l= 2) 705 (l= 2) 720 (l= 2) 726 (l= 2) 729 (l= 3) 732 (l= 2) 736 (l= 2) 742 (l= 2) 760 (l= 2) 762 (l= 2) 765 (l= 2) 774 (l= 2) 790 (l= 2) 792 (l= 2) 813 (l= 2) 835 (l= 2) 843 (l= 2) 852 (l= 2) 855 (l= 2) 868 (l= 2) 888 (l= 3) 900 (l= 2) 904 (l= 2) 910 (l= 
2) 919 (l= 2) 924 (l= 2) 958 (l= 2) 960 (l= 3) 972 (l= 2) 990 (l= 3) 1035 (l= 2) 1042 (l= 2) 1050 (l= 2) 1056 (l= 2) 1062 (l= 2) 1077 (l= 2) 1089 (l= 2) 1234 (l= 2) 1351 (l= 2) GP 187 (139) at 33 (l= 2) 39 (l= 2) 45 (l= 2) 47 (l= 3) 50 (l= 2) 57 (l= 2) 60 (l= 2) 62 (l= 3) 65 (l= 3) 68 (l= 3) 71 (l= 2) 89 (l= 4) 97 (l= 2) 100 (l= 2) 102 (l= 2) 108 (l= 3) 115 (l= 2) 120 (l= 2) 123 (l= 2) 127 (l= 2) 133 (l= 2) 136 (l= 2) 138 (l= 2) 147 (l= 2) 150 (l= 2) 153 (l= 2) 160 (l= 2) 168 (l= 2) 171 (l= 2) 174 (l= 2) 192 (l= 2) 195 (l= 2) 201 (l= 2) 204 (l= 2) 210 (l= 2) 219 (l= 2) 228 (l= 3) 241 (l= 2) 244 (l= 2) 247 (l= 2) 253 (l= 2) 255 (l= 2) 258 (l= 2) 261 (l= 3) 274 (l= 2) 277 (l= 2) 280 (l= 2) 288 (l= 2) 292 (l= 2) 295 (l= 2) 297 (l= 2) 300 (l= 2) 315 (l= 2) 321 (l= 2) 324 (l= 3) 330 (l= 3) 348 (l= 3) 363 (l= 2) 367 (l= 2) 370 (l= 2) 372 (l= 3) 382 (l= 2) 391 (l= 2) 393 (l= 3) 396 (l= 2) 402 (l= 2) 408 (l= 2) 418 (l= 2) 420 (l= 2) 427 (l= 2) 433 (l= 2) 441 (l= 2) 444 (l= 2) 448 (l= 2) 453 (l= 2) 456 (l= 2) 460 (l= 2) 466 (l= 2) 471 (l= 2) 477 (l= 3) 481 (l= 2) 489 (l= 2) 495 (l= 3) 499 (l= 2) 504 (l= 2) 522 (l= 3) 532 (l= 2) 534 (l= 3) 547 (l= 2) 550 (l= 2) 552 (l= 2) 558 (l= 3) 562 (l= 2) 570 (l= 2) 582 (l= 3) 586 (l= 2) 592 (l= 2) 594 (l= 2) 601 (l= 2) 604 (l= 2) 610 (l= 2) 613 (l= 2) 615 (l= 3) 624 (l= 2) 634 (l= 2) 637 (l= 2) 642 (l= 2) 651 (l= 2) 660 (l= 2) 669 (l= 2) 684 (l= 3) 700 (l= 2) 703 (l= 2) 706 (l= 2) 711 (l= 2) 718 (l= 2) 721 (l= 2) 727 (l= 2) 735 (l= 2) 745 (l= 2) 754 (l= 2) 757 (l= 2) 763 (l= 2) 766 (l= 2) 769 (l= 2) 771 (l= 3) 781 (l= 2) 783 (l= 2) 789 (l= 2) 796 (l= 2) 798 (l= 3) 805 (l= 2) 807 (l= 3) 810 (l= 3) 820 (l= 2) 826 (l= 2) 838 (l= 2) 840 (l= 2) 847 (l= 2) 856 (l= 2) 858 (l= 2) 861 (l= 3) 870 (l= 2) 876 (l= 2) 885 (l= 2) 894 (l= 3) 901 (l= 2) 903 (l= 2) 907 (l= 2) 909 (l= 2) 915 (l= 2) 921 (l= 2) 930 (l= 2) 934 (l= 2) 936 (l= 2) 942 (l= 2) 951 (l= 2) 955 (l= 2) 963 (l= 3) 967 (l= 2) 970 (l= 2) 982 (l= 2) 985 (l= 2) 988 (l= 2) 994 (l= 2) 997 
(l= 2) 1000 (l= 2) 1011 (l= 2) 1015 (l= 2) 1020 (l= 2) 1032 (l= 2) 1044 (l= 2) 1048 (l= 2) 1051 (l= 2) 1054 (l= 2) 1057 (l= 2) 1060 (l= 2) 1071 (l= 2) 1075 (l= 2) 1081 (l= 2) 1087 (l= 2) 1090 (l= 2) 1092 (l= 3) 1095 (l= 3) 1098 (l= 3) 1101 (l= 2) 1361 (l= 2)

B. CHARGE ALPHABET.

1. Total number of charge multiplets: 11 (Expected range: 5-- 31)
   6 +plets (f+: 8.9%), 5 -plets (f-: 8.0%)
   Total number of charge altplets: 28 (Critical number: 35)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 3   (6-10) 0   (11-20) 0   (>=21) 9

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core: 4; !-core: 5)

Location    Period  Element  Copies  Core   Errors
  10-  13   1       L          4       4    0
  33-  77   3       G..       15      15 !  0
  83-  92   2       G.         5       5 !  0
  91-1116   3       G..      341     339 !  1

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6) and
   HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 8)

Location    Period  Element  Copies  Core   Errors
There are no periodicities of the prescribed length.

--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location (Quartile)  Spacing    Rank       P-value  Interpretation
   0-1173 (2.)       W(1173)W    1 of   6  0.0003   large 1. maximal spacing
  22-1163 (2.)       C(1141)C    1 of  10  0.0000   large 1. maximal spacing
  24- 236 (1.)       S( 212)S    1 of  53  0.0068   large maximal spacing
  81- 950 (2.)       Y( 869)Y    1 of  17  0.0000   large maximal spacing
 222- 224 (1.)       R(   2)R   73 of  73  0.0003   large minimal spacing
 519- 573 (2.)       R(  54)R    1 of  73  0.9955   small 1. maximal spacing
1114-1154 (4.)       G(  40)G    2 of 382  0.0000   large 2. maximal spacing
1203-1272 (4.)       C(  69)C    2 of  10  1.0000   small 2. maximal spacing
1212-1336 (4.)       P( 124)P    1 of 231  0.0000   large maximal spacing
1227-1324 (4.)       W(  97)W    2 of   6  0.9987   small 2. maximal spacing
1244-1284 (4.)       G(  40)G    1 of 382  0.0006   large 1. maximal spacing
1258-1307 (4.)       R(  49)R    2 of  73  0.9931   small 2.
maximal spacing ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Pfam (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/Pfam Sequence file: tem10 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- Collagen Collagen triple helix repeat (20 copi 730.0 1.1e-215 18 COLFI Fibrillar collagen C-terminal domain 538.4 6.3e-210 1 adenylatekinase Adenylate kinase 1.8 23 1 G6PD Glucose-6-phosphate dehydrogenase -2.3 59 1 prion Prion protein -107.8 63 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- Collagen 1/18 25 83 .. 1 60 [] 12.6 0.00069 Collagen 2/18 89 147 .. 1 60 [] 39.8 6.3e-08 Collagen 3/18 148 207 .. 1 60 [] 61.9 1.4e-14 Collagen 4/18 208 267 .. 1 60 [] 50.5 3.6e-11 Collagen 5/18 268 327 .. 1 60 [] 54.4 2.5e-12 Collagen 6/18 328 387 .. 1 60 [] 56.7 4.9e-13 adenylatekinase 1/1 416 422 .. 1 7 [. 1.8 23 Collagen 7/18 388 447 .. 1 60 [] 47.2 3.6e-10 Collagen 8/18 448 507 .. 1 60 [] 45.4 1.3e-09 Collagen 9/18 508 567 .. 1 60 [] 50.4 3.9e-11 G6PD 1/1 605 611 .. 475 481 .] -2.3 59 Collagen 10/18 568 627 .. 1 60 [] 49.1 1e-10 Collagen 11/18 628 687 .. 1 60 [] 51.2 2.3e-11 Collagen 12/18 691 750 .. 1 60 [] 63.7 4e-15 Collagen 13/18 751 810 .. 1 60 [] 47.7 2.6e-10 Collagen 14/18 811 870 .. 1 60 [] 44.8 1.9e-09 Collagen 15/18 871 930 .. 1 60 [] 49.4 8.1e-11 Collagen 16/18 931 990 .. 1 60 [] 49.4 7.7e-11 Collagen 17/18 991 1050 .. 
1 60 [] 53.5 4.5e-12 Collagen 18/18 1051 1110 .. 1 60 [] 38.2 1.9e-07 prion 1/1 1059 1315 .. 1 244 [] -107.8 63 COLFI 1/1 1149 1365 .. 1 226 [] 538.4 6.3e-210 Alignments of top-scoring domains: Collagen: domain 1 of 18, from 25 to 83: score 12.6, E = 0.00069 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp + ++Gp+G +Gp+G+ Gp+G+pG G G+ GpPGppGppGp tem10gi|14 25 -LQEETVRKGPAGDRGPRGERGPPGPPGRDGEDGPTGPPGPPGPPGP 70 pGppGapGapGpp<-* pG G+ a tem10gi|14 71 PGLGGNFAAQYDG 83 Collagen: domain 2 of 18, from 89 to 147: score 39.8, E = 6.3e-08 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp pGp+G Gp+GppG++G+pGp+G++Gp+G pGepG+ Gp+G+ Gp tem10gi|14 89 -GPGPMGLMGPRGPPGAAGAPGPQGFQGPAGEPGEPGQTGPAGARGP 134 pGppGapGapGpp<-* +GppG++G+ G p tem10gi|14 135 AGPPGKAGEDGHP 147 Collagen: domain 3 of 18, from 148 to 207: score 61.9, E = 1.4e-14 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+pG+pG++G Gp G++G+pG +G pG +G +G +G +G +G+pG+ tem10gi|14 148 GKPGRPGERGVVGPQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGA 194 pGppGapGapGpp<-* pG +G+pGapG++ tem10gi|14 195 PGVKGEPGAPGEN 207 Collagen: domain 4 of 18, from 208 to 267: score 50.5, E = 3.6e-11 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G pG+ G++G+pG++G+ G+pGpaGa+G+ G G+ Gp Gp G +Gp tem10gi|14 208 GTPGQTGARGLPGERGRVGAPGPAGARGSDGSVGPVGPAGPIGSAGP 254 pGppGapGapGpp<-* pG pGapG++G+ tem10gi|14 255 PGFPGAPGPKGEI 267 Collagen: domain 5 of 18, from 268 to 327: score 54.4, E = 2.5e-12 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+ G +Gp+Gp+Gp+G+ G pG +G+ GppG+pG++G G++G++G tem10gi|14 268 GAVGNAGPAGPAGPRGEVGLPGLSGPVGPPGNPGANGLTGAKGAAGL 314 pGppGapGapGpp<-* pG +GapG pGp+ tem10gi|14 315 PGVAGAPGLPGPR 327 Collagen: domain 6 of 18, from 328 to 387: score 56.7, E = 4.9e-13 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G pGp G++G++G++G G+pGpaG++G+ G++GepG+ Gp+GppGp tem10gi|14 328 GIPGPVGAAGATGARGLVGEPGPAGSKGESGNKGEPGSAGPQGPPGP 374 pGppGapGapGpp<-* G++G+ G+ G++ tem10gi|14 375 SGEEGKRGPNGEA 387 adenylatekinase: domain 1 of 1, from 416 to 422: score 
1.8, E = 23 *->LlGpPGa<-* ++GpPG+ tem10gi|14 416 VMGPPGS 422 Collagen: domain 7 of 18, from 388 to 447: score 47.2, E = 3.6e-10 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G +GppGppG++G pG +G pG++G +G GppG +G+ Gp+G+ Gp tem10gi|14 388 GSAGPPGPPGLRGSPGSRGLPGADGRAGVMGPPGSRGASGPAGVRGP 434 pGppGapGapGpp<-* G +G+pG+pG tem10gi|14 435 NGDAGRPGEPGLM 447 Collagen: domain 8 of 18, from 448 to 507: score 45.4, E = 1.3e-09 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp Gp+G+pG+pG Gp+G+ Gp G +G G pGp G++G+ G pG G+ tem10gi|14 448 GPRGLPGSPGNIGPAGKEGPVGLPGIDGRPGPIGPAGARGEPGNIGF 494 pGppGapGapGpp<-* pGp+G+ G+pG++ tem10gi|14 495 PGPKGPTGDPGKN 507 Collagen: domain 9 of 18, from 508 to 567: score 50.4, E = 3.9e-11 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G +G +G +G++G+pGp G G++G+pGp+G +G +G++Gp+GppG+ tem10gi|14 508 GDKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQGGKGEQGPAGPPGF 554 pGppGapGapGpp<-* +G pG++G++G++ tem10gi|14 555 QGLPGPSGPAGEV 567 G6PD: domain 1 of 1, from 605 to 611: score -2.3, E = 59 *->eyGSrGP<-* + GSrGP tem10gi|14 605 PIGSRGP 611 Collagen: domain 10 of 18, from 568 to 627: score 49.1, E = 1e-10 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+pG++G G+ G pGp+Gp G++G+pG+ G++G+ Gp G++Gp Gp tem10gi|14 568 GKPGERGLHGEFGLPGPAGPRGERGPPGESGAAGPTGPIGSRGPSGP 614 pGppGapGapGpp<-* pGp+G++G+pG + tem10gi|14 615 PGPDGNKGEPGVV 627 Collagen: domain 11 of 18, from 628 to 687: score 51.2, E = 2.3e-11 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+ G +Gp Gp G pG++G++G +G++G++G pG +G+ G pG++G+ tem10gi|14 628 GAVGTAGPSGPSGLPGERGAAGIPGGKGEKGEPGLRGEIGNPGRDGA 674 pGppGapGapGpp<-* +G Ga GapGp+ tem10gi|14 675 RGAHGAVGAPGPA 687 Collagen: domain 12 of 18, from 691 to 750: score 63.7, E = 4e-15 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G +G++G++Gp+Gp+Gp+G pG++G+ Gp+Gp G++Gp G++G+pG+ tem10gi|14 691 GDRGEAGAAGPAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGA 737 pGppGapGapGpp<-* +G++Ga+G++G++ tem10gi|14 738 KGERGAKGPKGEN 750 Collagen: domain 13 of 18, from 751 to 810: score 47.7, E = 
2.6e-10 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G Gp Gp G++Gp+Gp GppGpaG++G+ GppG G+PG++G+ Gp tem10gi|14 751 GVVGPTGPVGAAGPAGPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGP 797 pGppGapGapGpp<-* pGp G +G+pGpp tem10gi|14 798 PGPSGISGPPGPP 810 Collagen: domain 14 of 18, from 811 to 870: score 44.8, E = 1.9e-09 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp Gp+G+ G +Gp+G Gp G +G+ Ga GppG +Ge+Gp G +G +Gp tem10gi|14 811 GPAGKEGLRGPRGDQGPVGRTGEVGAVGPPGFAGEKGPSGEAGTAGP 857 pGppGapGapGpp<-* pG pG+ G G+p tem10gi|14 858 PGTPGPQGLLGAP 870 Collagen: domain 15 of 18, from 871 to 930: score 49.4, E = 8.1e-11 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G G+pG++G++G pG +G+ G++G+ G +GppG++GpPG+ G pG tem10gi|14 871 GILGLPGSRGERGLPGVAGAVGEPGPLGIAGPPGARGPPGAVGSPGV 917 pGppGapGapGpp<-* G pG++G G p tem10gi|14 918 NGAPGEAGRDGNP 930 Collagen: domain 16 of 18, from 931 to 990: score 49.4, E = 7.7e-11 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G GppG+ G pG +G++G pG+ G+ G +G+pG+ Gp Gp+G+ G tem10gi|14 931 GNDGPPGRDGQPGHKGERGYPGNIGPVGAAGAPGPHGPVGPAGKHGN 977 pGppGapGapGpp<-* +G++G++G+ Gp+ tem10gi|14 978 RGETGPSGPVGPA 990 Collagen: domain 17 of 18, from 991 to 1050: score 53.5, E = 4.5e-12 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+ Gp+Gp Gp+G +G++G+pG++G++G pG +G +G +G pG +G tem10gi|14 991 GAVGPRGPSGPQGIRGDKGEPGEKGPRGLPGLKGHNGLQGLPGIAGH 1037 pGppGapGapGpp<-* G +GapG+ Gp+ tem10gi|14 1038 HGDQGAPGSVGPA 1050 Collagen: domain 18 of 18, from 1051 to 1110: score 38.2, E = 1.9e-07 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp Gp+Gp+Gp Gp+G +G+ G+pG G++G +Gp+G Gp GppGppGp tem10gi|14 1051 GPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGPAGPPGPPGP 1097 pGppGapGapGpp<-* pGppG +G tem10gi|14 1098 PGPPGVSGGGYDF 1110 prion: domain 1 of 1, from 1059 to 1315: score -107.8, E = 63 *->KKRPKPGGgWntGGsRYPGqgsPGGnRYPPqggggwgqphG..GtWG P G G PG P G R Pqg g + p G++G G tem10gi|14 1059 ---SGPAG--KDGRTGHPGTVGPAGIR-GPQGHQGPAGPPGppGPPG 1099 qPHG.gGGWGqPHGGGW...GqPHGGGWGqPHGgGGWGqGGGtHnqWnKP P +gGG+ G + + 
qP P nq tem10gi|14 1100 PPGVsGGGYDFGYDGDFyraDQPRSAPSLRPKDYEVDATLKSLNNQIETL 1149 sKPKtnmKHvAGAAAAGAvvG...GLGGYmL....GsAmsRPliHFGnDy P K A ++ G Y +++++G m + tem10gi|14 1150 LTPEGSRKNPARTCRDLRLSHpewSSGYYWIdpnqGCTMDAIKVYCDFST 1199 EDRYYREnmyRYPnqvYYRPvDqYsnqnnFvHDCvnitiK.qHtvttttK + R + P + +YR + + + + +n + + v t tem10gi|14 1200 GETCIRAQPENIPAKNWYRSSK--DKKHVWLGETINAGSQfEYNVEGVTS 1247 GEnFtEtDiKimERvvEqmCitqYqkEsqAYYqgrRgs...svvLFssPP E t + it k s AY + g +++v+L s tem10gi|14 1248 KEMATQLAFMRLLANYASQNITYHCKNSIAYMDEETGNlkkAVILQGSND 1297 viLLis....FLiflivG<-* v L+ ++++F ++v tem10gi|14 1298 VELVAEgnsrFTYTVLVD 1315 COLFI: domain 1 of 1, from 1149 to 1365: score 538.4, E = 6.3e-210 *->lksPeGksrknPARtCkDLfLchpefksGeYWiDPNqGCikDAikVf l++PeG srknPARtC+DL+L+hpe++sG+YWiDPNqGC++DAikV+ tem10gi|14 1149 LLTPEG-SRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVY 1194 CnkrfetGvgeTCisptpksvpkRiksWykgkskdkKhvWFgetmegGfk C+ f tG eTCi+++p+++p+ k+Wy+++ kdkKhvW get+++G++ tem10gi|14 1195 CD--FSTG--ETCIRAQPENIPA--KNWYRSS-KDKKHVWLGETINAGSQ 1237 fsYiddelnpeisnvQlTFLRLLSteAsQNiTYhCKNSvAYmDeatGNlk f+Y++++++++++++Ql+F+RLL+++AsQNiTYhCKNS+AYmDe+tGNlk tem10gi|14 1238 FEYNVEGVTSKEMATQLAFMRLLANYASQNITYHCKNSIAYMDEETGNLK 1287 kAlilmgSnDvElsadgnskFtYtvlGeDGCssrtgewgKTViEyeTkKt kA+il+gSnDvEl+a+gns+FtYtvl +DGCs++t+ewgKT+iEy+T+K+ tem10gi|14 1288 KAVILQGSNDVELVAEGNSRFTYTVL-VDGCSKKTNEWGKTIIEYKTNKP 1336 tRLPIvDiApsDiGgedQeFGveiGPVCF<-* +RLP++DiAp+DiGg+d+eF+v+iGPVCF tem10gi|14 1337 SRLPFLDIAPLDIGGADHEFFVDIGPVCF 1365 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: tem10 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- Collagen Collagen triple helix repeat (20 copi 728.4 3.1e-215 19 COLFI Fibrillar collagen C-terminal domain 538.4 6.3e-210 1 PPR PPR repeat 6.5 3.8 1 adenylatekinase Adenylate kinase 1.8 23 1 Mu_DNA_bind Mu DNA-binding domain -0.1 91 1 G6PD Glucose-6-phosphate dehydrogenase -2.3 59 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- Collagen 1/19 33 72 .. 1 40 [. 26.5 8.5e-06 Collagen 2/19 90 147 .. 3 60 .] 42.7 3.1e-10 Collagen 3/19 148 207 .. 1 60 [] 59.9 5.6e-15 Collagen 4/19 208 267 .. 1 60 [] 48.5 7.5e-12 Mu_DNA_bind 1/1 309 317 .. 1 9 [. -0.1 91 Collagen 5/19 268 327 .. 1 60 [] 52.4 6.6e-13 Collagen 6/19 328 387 .. 1 60 [] 54.7 1.5e-13 Collagen 7/19 388 415 .. 1 28 [. 19.4 0.00075 adenylatekinase 1/1 416 422 .. 1 7 [. 1.8 23 Collagen 8/19 418 465 .. 13 60 .] 40.7 1.1e-09 Collagen 9/19 469 528 .. 1 60 [] 55.7 8.3e-14 Collagen 10/19 529 567 .. 22 60 .] 30.1 8.6e-07 G6PD 1/1 605 611 .. 475 481 .] -2.3 59 Collagen 11/19 568 627 .. 1 60 [] 47.1 1.9e-11 Collagen 12/19 628 687 .. 1 60 [] 49.2 4.9e-12 Collagen 13/19 691 750 .. 1 60 [] 61.7 1.8e-15 Collagen 14/19 751 810 .. 1 60 [] 45.7 4.5e-11 Collagen 15/19 811 870 .. 1 60 [] 42.9 2.7e-10 Collagen 16/19 874 933 .. 1 60 [] 60.0 5.3e-15 Collagen 17/19 934 990 .. 4 60 .] 41.1 8.1e-10 Collagen 18/19 991 1050 .. 1 60 [] 51.6 1.1e-12 Collagen 19/19 1051 1105 .. 1 55 [. 40.4 1.3e-09 PPR 1/1 1308 1320 .. 1 13 [. 6.5 3.8 COLFI 1/1 1149 1365 .. 
1 226 [] 538.4 6.3e-210 Alignments of top-scoring domains: Collagen: domain 1 of 19, from 33 to 72: score 26.5, E = 8.5e-06 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPG<-* Gp+G +Gp+G++GppGppG G++G+ GppGppG+pGpPG tem10gi|14 33 GPAGDRGPRGERGPPGPPGRDGEDGPTGPPGPPGPPGPPG 72 Collagen: domain 2 of 19, from 90 to 147: score 42.7, E = 3.1e-10 *->pGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGppG pGp+G Gp+GppG++G+pGp+G++Gp+G pGepG+ Gp+G+ Gp+G tem10gi|14 90 PGPMGLMGPRGPPGAAGAPGPQGFQGPAGEPGEPGQTGPAGARGPAG 136 ppGapGapGpp<-* ppG++G+ G p tem10gi|14 137 PPGKAGEDGHP 147 Collagen: domain 3 of 19, from 148 to 207: score 59.9, E = 5.6e-15 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+pG+pG++G Gp G++G+pG +G pG +G +G +G +G +G+pG+ tem10gi|14 148 GKPGRPGERGVVGPQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGA 194 pGppGapGapGpp<-* pG +G+pGapG++ tem10gi|14 195 PGVKGEPGAPGEN 207 Collagen: domain 4 of 19, from 208 to 267: score 48.5, E = 7.5e-12 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G pG+ G++G+pG++G+ G+pGpaGa+G+ G G+ Gp Gp G +Gp tem10gi|14 208 GTPGQTGARGLPGERGRVGAPGPAGARGSDGSVGPVGPAGPIGSAGP 254 pGppGapGapGpp<-* pG pGapG++G+ tem10gi|14 255 PGFPGAPGPKGEI 267 Mu_DNA_bind: domain 1 of 1, from 309 to 317: score -0.1, E = 91 *->kelaGlPGl<-* k +aGlPG+ tem10gi|14 309 KGAAGLPGV 317 Collagen: domain 5 of 19, from 268 to 327: score 52.4, E = 6.6e-13 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+ G +Gp+Gp+Gp+G+ G pG +G+ GppG+pG++G G++G++G tem10gi|14 268 GAVGNAGPAGPAGPRGEVGLPGLSGPVGPPGNPGANGLTGAKGAAGL 314 pGppGapGapGpp<-* pG +GapG pGp+ tem10gi|14 315 PGVAGAPGLPGPR 327 Collagen: domain 6 of 19, from 328 to 387: score 54.7, E = 1.5e-13 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G pGp G++G++G++G G+pGpaG++G+ G++GepG+ Gp+GppGp tem10gi|14 328 GIPGPVGAAGATGARGLVGEPGPAGSKGESGNKGEPGSAGPQGPPGP 374 pGppGapGapGpp<-* G++G+ G+ G++ tem10gi|14 375 SGEEGKRGPNGEA 387 Collagen: domain 7 of 19, from 388 to 415: score 19.4, E = 0.00075 *->GppGppGppGppGppGppGppGpaGapG<-* G +GppGppG++G pG +G pG++G +G tem10gi|14 
388 GSAGPPGPPGLRGSPGSRGLPGADGRAG 415 adenylatekinase: domain 1 of 1, from 416 to 422: score 1.8, E = 23 *->LlGpPGa<-* ++GpPG+ tem10gi|14 416 VMGPPGS 422 Collagen: domain 8 of 19, from 418 to 465: score 40.7, E = 1.1e-09 *->GppGppGppGpaGapGppGppGepGpPGppGppGppGppGapGapGp GppG +G+ GpaG +Gp+G++G pG+PG Gp G pG pG++G++G+ tem10gi|14 418 GPPGSRGASGPAGVRGPNGDAGRPGEPGLMGPRGLPGSPGNIGPAGK 464 p<-* tem10gi|14 465 E 465 Collagen: domain 9 of 19, from 469 to 528: score 55.7, E = 8.3e-14 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G pG G+pGp Gp+G++G+pG+ G+pGp+Gp G+pG G +G++G tem10gi|14 469 GLPGIDGRPGPIGPAGARGEPGNIGFPGPKGPTGDPGKNGDKGHAGL 515 pGppGapGapGpp<-* +G +GapG+ G + tem10gi|14 516 AGARGAPGPDGNN 528 Collagen: domain 10 of 19, from 529 to 567: score 30.1, E = 8.6e-07 *->GpaGapGppGppGepGpPGppGppGppGppGapGapGpp<-* G++G+pGp+G +G +G++Gp+GppG++G pG++G++G++ tem10gi|14 529 GAQGPPGPQGVQGGKGEQGPAGPPGFQGLPGPSGPAGEV 567 G6PD: domain 1 of 1, from 605 to 611: score -2.3, E = 59 *->eyGSrGP<-* + GSrGP tem10gi|14 605 PIGSRGP 611 Collagen: domain 11 of 19, from 568 to 627: score 47.1, E = 1.9e-11 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+pG++G G+ G pGp+Gp G++G+pG+ G++G+ Gp G++Gp Gp tem10gi|14 568 GKPGERGLHGEFGLPGPAGPRGERGPPGESGAAGPTGPIGSRGPSGP 614 pGppGapGapGpp<-* pGp+G++G+pG + tem10gi|14 615 PGPDGNKGEPGVV 627 Collagen: domain 12 of 19, from 628 to 687: score 49.2, E = 4.9e-12 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+ G +Gp Gp G pG++G++G +G++G++G pG +G+ G pG++G+ tem10gi|14 628 GAVGTAGPSGPSGLPGERGAAGIPGGKGEKGEPGLRGEIGNPGRDGA 674 pGppGapGapGpp<-* +G Ga GapGp+ tem10gi|14 675 RGAHGAVGAPGPA 687 Collagen: domain 13 of 19, from 691 to 750: score 61.7, E = 1.8e-15 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G +G++G++Gp+Gp+Gp+G pG++G+ Gp+Gp G++Gp G++G+pG+ tem10gi|14 691 GDRGEAGAAGPAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGA 737 pGppGapGapGpp<-* +G++Ga+G++G++ tem10gi|14 738 KGERGAKGPKGEN 750 Collagen: domain 14 of 19, from 751 to 810: score 45.7, E = 4.5e-11 
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G Gp Gp G++Gp+Gp GppGpaG++G+ GppG G+PG++G+ Gp tem10gi|14 751 GVVGPTGPVGAAGPAGPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGP 797 pGppGapGapGpp<-* pGp G +G+pGpp tem10gi|14 798 PGPSGISGPPGPP 810 Collagen: domain 15 of 19, from 811 to 870: score 42.9, E = 2.7e-10 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp Gp+G+ G +Gp+G Gp G +G+ Ga GppG +Ge+Gp G +G +Gp tem10gi|14 811 GPAGKEGLRGPRGDQGPVGRTGEVGAVGPPGFAGEKGPSGEAGTAGP 857 pGppGapGapGpp<-* pG pG+ G G+p tem10gi|14 858 PGTPGPQGLLGAP 870 Collagen: domain 16 of 19, from 874 to 933: score 60.0, E = 5.3e-15 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G pG++G++G+pG +G+ G+pGp G +GppG++G+pG+ G+pG++G+ tem10gi|14 874 GLPGSRGERGLPGVAGAVGEPGPLGIAGPPGARGPPGAVGSPGVNGA 920 pGppGapGapGpp<-* pG++G++G pG tem10gi|14 921 PGEAGRDGNPGND 933 Collagen: domain 17 of 19, from 934 to 990: score 41.1, E = 8.1e-10 *->GppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGppGp GppG+ G pG +G++G pG+ G+ G +G+pG+ Gp Gp+G+ G +G+ tem10gi|14 934 GPPGRDGQPGHKGERGYPGNIGPVGAAGAPGPHGPVGPAGKHGNRGE 980 pGapGapGpp<-* +G++G+ Gp+ tem10gi|14 981 TGPSGPVGPA 990 Collagen: domain 18 of 19, from 991 to 1050: score 51.6, E = 1.1e-12 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp G+ Gp+Gp Gp+G +G++G+pG++G++G pG +G +G +G pG +G tem10gi|14 991 GAVGPRGPSGPQGIRGDKGEPGEKGPRGLPGLKGHNGLQGLPGIAGH 1037 pGppGapGapGpp<-* G +GapG+ Gp+ tem10gi|14 1038 HGDQGAPGSVGPA 1050 Collagen: domain 19 of 19, from 1051 to 1105: score 40.4, E = 1.3e-09 *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp Gp+Gp+Gp Gp+G +G+ G+pG G++G +Gp+G Gp GppGppGp tem10gi|14 1051 GPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGPAGPPGPPGP 1097 pGppGapG<-* pGppG +G tem10gi|14 1098 PGPPGVSG 1105 PPR: domain 1 of 1, from 1308 to 1320: score 6.5, E = 3.8 *->vtYntlIsgycka<-* +tY++l++g++k+ tem10gi|14 1308 FTYTVLVDGCSKK 1320 COLFI: domain 1 of 1, from 1149 to 1365: score 538.4, E = 6.3e-210 *->lksPeGksrknPARtCkDLfLchpefksGeYWiDPNqGCikDAikVf l++PeG srknPARtC+DL+L+hpe++sG+YWiDPNqGC++DAikV+ tem10gi|14 
1149 LLTPEG-SRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVY 1194 CnkrfetGvgeTCisptpksvpkRiksWykgkskdkKhvWFgetmegGfk C+ f tG eTCi+++p+++p+ k+Wy+++ kdkKhvW get+++G++ tem10gi|14 1195 CD--FSTG--ETCIRAQPENIPA--KNWYRSS-KDKKHVWLGETINAGSQ 1237 fsYiddelnpeisnvQlTFLRLLSteAsQNiTYhCKNSvAYmDeatGNlk f+Y++++++++++++Ql+F+RLL+++AsQNiTYhCKNS+AYmDe+tGNlk tem10gi|14 1238 FEYNVEGVTSKEMATQLAFMRLLANYASQNITYHCKNSIAYMDEETGNLK 1287 kAlilmgSnDvElsadgnskFtYtvlGeDGCssrtgewgKTViEyeTkKt kA+il+gSnDvEl+a+gns+FtYtvl +DGCs++t+ewgKT+iEy+T+K+ tem10gi|14 1288 KAVILQGSNDVELVAEGNSRFTYTVL-VDGCSKKTNEWGKTIIEYKTNKP 1336 tRLPIvDiApsDiGgedQeFGveiGPVCF<-* +RLP++DiAp+DiGg+d+eF+v+iGPVCF tem10gi|14 1337 SRLPFLDIAPLDIGGADHEFFVDIGPVCF 1365 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib Sequence file: tem10 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Prosite --------------------------------------------------------- | ppsearch (c) 1994 EMBL Data Library | | based on MacPattern (c) 1990-1994 R. 
Fuchs |
---------------------------------------------------------

PROSITE pattern search started: Tue Oct 31 14:29:31 2000

Sequence file: tem10

----------------------------------------
Sequence tem10gi|1418930|emb|CAA98969.1| (1366 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
 1267: NITY
Total matches: 1

Matching pattern PS00005 PKC_PHOSPHO_SITE:
   29: TVR
 1127: SLR
 1138: TLK
 1155: SRK
 1162: TCR
 1219: SSK
 1246: TSK
 1318: SKK
 1333: TNK
Total matches: 9

Matching pattern PS00006 CK2_PHOSPHO_SITE:
    3: SFVD
   24: SLQE
  353: SKGE
  375: SGEE
  710: SPGE
  776: SRGD
  878: SRGE
 1162: TCRD
 1169: SHPE
 1198: STGE
 1219: SSKD
 1236: SQFE
 1246: TSKE
 1327: TIIE
Total matches: 14

Matching pattern PS00008 MYRISTYL:
   72: GLGGNF
   74: GGNFAA
   75: GNFAAQ
  211: GQTGAR
  232: GARGSD
  235: GSDGSV
  268: GAVGNA
  298: GNPGAN
  304: GLTGAK
  307: GAKGAA
  334: GAAGAT
  337: GATGAR
  400: GSPGSR
  406: GLPGAD
  421: GSRGAS
  454: GSPGNI
  514: GLAGAR
  526: GNNGAQ
  538: GVQGGK
  625: GVVGAV
  628: GAVGTA
  649: GIPGGK
  673: GARGAH
  676: GAHGAV
  733: GQPGAK
  780: GGPPGM
  874: GLPGSR
  886: GVAGAV
  928: GNPGND
 1042: GAPGSV
 1102: GVSGGG
Total matches: 31

Matching pattern PS00009 AMIDATION:
  378: EGKR
Total matches: 1

Matching pattern PS00016 RGD:
  777: RGD
  822: RGD
 1005: RGD
Total matches: 3

Total no of hits in this sequence: 59

========================================

1314 pattern(s) searched in 1 sequence(s), 1366 residues.
Total no of hits in all sequences: 59.
Search time: 00:00 min

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library

***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P
argv[2]=-m /data/patterns/own/motif.fa
argv[4]=-seq tem10
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 1366 units

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with HMM-search search against own library

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm.lib
Sequence file: tem10
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value   N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: tem10
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value   N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
L. Aravind's signalling DB

IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
"PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens]
(1366 letters)

Searching..................................done

Results from profile search

                                                                Score     E
Sequences producing significant alignments:                     (bits)  Value

DNASE1 DNASE-1/Sphingomyelinase like domain                        25   0.72
BRIGHT BRIGHT domain (Alpha helical DNA binding domain)            24   1.4
AAA AAA+ ATPase Module                                             24   1.5
SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chrom...
24 1.6 UBA Ubiquitin pathway associated domain 23 2.9 ARR Arrestin domain 23 3.1 MATH The Meprin associated TRAF homology domain 22 5.0 14-3-3 14-3-3 protein alpha Helical domain 22 5.5 DNAJ DNAJ co-chaperone domain (Hsp70 cofactors) 22 6.1 >DNASE1 DNASE-1/Sphingomyelinase like domain Length = 388 Score = 24.8 bits (53), Expect = 0.72 Identities = 10/91 (10%), Positives = 21/91 (22%), Gaps = 7/91 (7%) Query: 1114 GDFYRA--DQPRSAPSLRPKDYEVDATLKSLNNQIETLLT--PEGSRKNPARTCRDLRLS 1169 GDF ++ + + L S + + E T + T D Sbjct: 280 GDFNADPTEEVYKRFASSSLNLNSAYKLLSEDGESEPPYTTWKIRTTGESCHTL-DYIWY 338 Query: 1170 HPEWSSGYYWIDPNQGCTMDA--IKVYCDFS 1198 + + + + S Sbjct: 339 SQHALRVNAALGLPTEEQIGPNRLPSFNYPS 369 >BRIGHT BRIGHT domain (Alpha helical DNA binding domain) Length = 172 Score = 23.8 bits (51), Expect = 1.4 Identities = 8/29 (27%), Positives = 8/29 (27%) Query: 41 RGERGPPGPPGRDGEDGPTGPPGPPGPPG 69 G R G P P P PG Sbjct: 131 EGRRSSYGQYEAMHNQMPMTPISRPSLPG 159 Score = 21.5 bits (45), Expect = 7.0 Identities = 7/28 (25%), Positives = 8/28 (28%) Query: 793 GRTGPPGPSGISGPPGPPGPAGKEGLRG 820 GR G P P + L G Sbjct: 132 GRRSSYGQYEAMHNQMPMTPISRPSLPG 159 Score = 21.5 bits (45), Expect = 8.2 Identities = 7/30 (23%), Positives = 10/30 (33%) Query: 377 EEGKRGPNGEAGSAGPPGPPGLRGSPGSRG 406 EG+R G+ + P P G Sbjct: 130 REGRRSSYGQYEAMHNQMPMTPISRPSLPG 159 >AAA AAA+ ATPase Module Length = 298 Score = 23.7 bits (50), Expect = 1.5 Identities = 20/138 (14%), Positives = 41/138 (29%), Gaps = 12/138 (8%) Query: 1222 DKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLLANYASQNITYHCKNSIAYMDE 1281 D V A Y + + L + ++ + TY I +D+ Sbjct: 1 DINDV-TPNCRVALRNDSYTLHKI-LPNKVDPLVSLMMVEKVP--DSTY---EMIGGLDK 53 Query: 1282 ETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKKTNEWGKTIIEYKTNKPSRLPF 1341 + +K+ + L E VL+ G T GKT++ + F Sbjct: 54 QIKEIKEVIELP-VKHPEHFEALGIAQPKGVLLYG-PPGT---GKTLLARAVAHHTDCTF 108 Query: 1342 LDIAPLDIGGADHEFFVD 1359 + ++ ++ Sbjct: 109 IRVSGSELVQKFIGEGAR 126 Score = 22.5 bits (47), Expect = 3.9 Identities = 8/25 (32%), Positives = 
10/25 (40%) Query: 380 KRGPNGEAGSAGPPGPPGLRGSPGS 404 K + EA P L G PG+ Sbjct: 67 KHPEHFEALGIAQPKGVLLYGPPGT 91 Score = 21.8 bits (45), Expect = 6.2 Identities = 5/11 (45%), Positives = 7/11 (63%) Query: 894 PGPLGIAGPPG 904 P + + GPPG Sbjct: 80 PKGVLLYGPPG 90 >SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chromatin associated domain) Length = 219 Score = 23.8 bits (51), Expect = 1.6 Identities = 11/72 (15%), Positives = 19/72 (26%), Gaps = 9/72 (12%) Query: 1217 YRSSKDKKH-VWLGETINAGSQF-EYNVEGVTSKEMATQLAFMRLLANYASQNITYHCKN 1274 + + V+ E+I A S EY + +Y + Sbjct: 91 CWINSYVGYGVFARESIPAWSYIGEYTGILRRRQA------LWLDENDYCFRYPVPRYSF 144 Query: 1275 SIAYMD-EETGN 1285 +D GN Sbjct: 145 RYFTIDSGMQGN 156 >UBA Ubiquitin pathway associated domain Length = 255 Score = 22.7 bits (48), Expect = 2.9 Identities = 15/47 (31%), Positives = 20/47 (41%) Query: 729 AGAAGQPGAKGERGAKGPKGENGVVGPTGPVGAAGPAGPNGPPGPAG 775 A + A+ + A+ +G N G G G A A GPPG G Sbjct: 83 AATTAEQPAEDDLFAQAAQGGNASSGALGTTGGATDAAQGGPPGSIG 129 Score = 21.6 bits (45), Expect = 6.9 Identities = 16/52 (30%), Positives = 22/52 (41%) Query: 347 EPGPAGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGL 398 +P A + E + + + QG SG G G +A GPPG GL Sbjct: 79 QPSTAATTAEQPAEDDLFAQAAQGGNASSGALGTTGGATDAAQGGPPGSIGL 130 Score = 21.2 bits (44), Expect = 8.3 Identities = 12/33 (36%), Positives = 15/33 (45%) Query: 674 ARGAHGAVGAPGPAGATGDRGEAGAAGPAGPAG 706 A+ A G + G G TG +A GP G G Sbjct: 97 AQAAQGGNASSGALGTTGGATDAAQGGPPGSIG 129 Score = 21.2 bits (44), Expect = 9.9 Identities = 12/40 (30%), Positives = 15/40 (37%) Query: 225 VGAPGPAGARGSDGSVGPVGPAGPIGSAGPPGFPGAPGPK 264 + A G S G++G G A GPPG G Sbjct: 95 LFAQAAQGGNASSGALGTTGGATDAAQGGPPGSIGLTVED 134 >ARR Arrestin domain Length = 454 Score = 22.9 bits (49), Expect = 3.1 Identities = 5/21 (23%), Positives = 6/21 (27%), Gaps = 4/21 (19%) Query: 551 PPGFQGLPGPSGPAGEVGKPG 571 P P + GKP Sbjct: 149 ASSVTLQPAPG----DTGKPC 165 >MATH The Meprin associated TRAF homology domain Length = 209 Score = 22.1 bits 
(47), Expect = 5.0 Identities = 6/16 (37%), Positives = 7/16 (43%) Query: 803 ISGPPGPPGPAGKEGL 818 +S P PP PA Sbjct: 1 MSRVPSPPPPAEMSSG 16 Score = 22.1 bits (47), Expect = 5.2 Identities = 6/14 (42%), Positives = 7/14 (49%) Query: 1095 PGPPGPPGVSGGGY 1108 P PP P +S G Sbjct: 5 PSPPPPAEMSSGPV 18 Score = 21.4 bits (45), Expect = 7.4 Identities = 9/71 (12%), Positives = 22/71 (30%), Gaps = 17/71 (23%) Query: 1148 TLLTPEGSRKNPARTCRDLRLSHPEWSSGYYW---------IDPNQGCTM-DAIKVYCDF 1197 ++L +G + R R + + + +D G D + ++C+ Sbjct: 105 SILNAKGEETKAMESQRAYRFVQGK---DWGFKKFIRRDFLLDEANGLLPDDKLTLFCEV 161 Query: 1198 S----TGETCI 1204 S + Sbjct: 162 SVVQDSVNISG 172 Score = 21.0 bits (44), Expect = 9.7 Identities = 4/18 (22%), Positives = 6/18 (33%) Query: 855 AGPPGTPGPQGLLGAPGI 872 + P P P + P Sbjct: 2 SRVPSPPPPAEMSSGPVA 19 >14-3-3 14-3-3 protein alpha Helical domain Length = 270 Score = 21.9 bits (46), Expect = 5.5 Identities = 6/34 (17%), Positives = 10/34 (28%) Query: 831 TGEVGAVGPPGFAGEKGPSGEAGTAGPPGTPGPQ 864 T + +G A + P G P+ Sbjct: 234 TSDAEYSAAAAGGNTEGAQENAPSNAPEGEAEPK 267 >DNAJ DNAJ co-chaperone domain (Hsp70 cofactors) Length = 75 Score = 21.7 bits (46), Expect = 6.1 Identities = 7/28 (25%), Positives = 10/28 (35%) Query: 1126 PSLRPKDYEVDATLKSLNNQIETLLTPE 1153 P D E +A K + E L + Sbjct: 33 PDRNQGDKEAEAKFKEIKEAYEVLTDSQ 60 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 9 Number of calls to ALIGN: 19 Length of query: 1366 Total length of test sequences: 20182 Effective length of test sequences: 16637.0 Effective search space size: 22175247.6 Initial X dropoff for ALIGN: 25.0 bits Y. Wolf's SCOP PSSM IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. 
Query= tem10gi|1418930|emb|CAA98969.1| prepro-alpha2(I) collagen [Homo sapiens] (1366 letters) Searching.................................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value gi|1815643 [227..431] Metzincin-like 27 1.6 gi|2780210 [404..848] beta/alpha (TIM)-barrel 26 3.2 gi|137178 [233..498] Ligand-binding domain of nuclear recept... 26 3.9 gi|585620 [104..580] Periplasmic binding protein-like II 26 4.0 gi|538698 [52..502] Periplasmic binding protein-like II 26 4.4 gi|227130 [11..85] Cytochrome c oxidase subunit h 26 4.9 >gi|1815643 [227..431] Metzincin-like Length = 205 Score = 27.2 bits (60), Expect = 1.6 Identities = 9/43 (20%), Positives = 15/43 (33%), Gaps = 2/43 (4%) Query: 73 LGGNFAAQYDGKGVGLGPGPMGLMGPR--GPPGAAGAPGPQGF 113 +G ++D P P +M + G G+P F Sbjct: 139 IGHLLGLEHDTTACSCEPSPECVMRQQPGRVGGGGGSPFSWQF 181 >gi|2780210 [404..848] beta/alpha (TIM)-barrel Length = 445 Score = 26.4 bits (57), Expect = 3.2 Identities = 18/163 (11%), Positives = 31/163 (18%), Gaps = 17/163 (10%) Query: 1111 GYDGDFYRADQPRSAPSLRPKDYEVDATLKSLNNQIETLLTPEGSRKNPARTCRDLRLSH 1170 GY + P + N I + D + Sbjct: 155 GYTLYDCQWPSPT---------SGMFPKALYHNCWIGNWEGEDSRSCWLHEDLADFNTEN 205 Query: 1171 PE-----WSSGYYWID-PNQGCTMDAIKVYCDFSTGETCIRAQPENIPAKNWYRSSKDKK 1224 P+ + +ID G +D + + K Sbjct: 206 PQVQNYLIGAYDKYIDMGVDGFRVDTAVHIPRT--TWNRRFLPAIQERVAQQHGAEAAKN 263 Query: 1225 HVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLLANYASQN 1267 GE + S + T A+ A Sbjct: 264 FFVFGEVAAFVNDKWNRGSVTHSAQFYTWKERKEYSADDAKAA 306 >gi|137178 [233..498] Ligand-binding domain of nuclear receptor Length = 266 Score = 26.3 bits (57), Expect = 3.9 Identities = 12/30 (40%), Positives = 16/30 (53%), Gaps = 5/30 (16%) Query: 1102 GVSGGGYDFGYDGDFYRADQPRSAPSLRPK 1131 G GGG G+DG F R +P L+P+ Sbjct: 106 GAGGGGGGLGHDGSF-----ERRSPGLQPQ 130 >gi|585620 [104..580] Periplasmic binding protein-like II Length = 477 Score = 26.1 bits (57), Expect = 4.0 Identities = 20/148 (13%), Positives = 
44/148 (29%), Gaps = 28/148 (18%) Query: 1213 AKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLLANYASQNITYHC 1272 A + K+ K + T+ ++ E VT+K+ + Y S T Sbjct: 5 AADV-ALDKESKTATI--TLRKDLKWSDGSE-VTAKDYEFTYETI-ANPAYGSDRWTDS- 58 Query: 1273 KNSIAYMDEETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKK-------TNEWG 1325 +I + + + + + + + V K + Sbjct: 59 LANIVGLSDY--------HTGKAKTISGITFPDGENGKVIKVQFKEMKPGMTQSGNGYFL 110 Query: 1326 KTIIEYKTNKPSRLPFLDIAPLDIGGAD 1353 +T+ Y+ D+AP D+ + Sbjct: 111 ETVAPYQ-------YLKDVAPKDLASSP 131 >gi|538698 [52..502] Periplasmic binding protein-like II Length = 451 Score = 26.1 bits (57), Expect = 4.4 Identities = 16/105 (15%), Positives = 26/105 (24%), Gaps = 13/105 (12%) Query: 1088 PAGPPGPPGPPG-PPGVSGGGYD----------FGYDGDFYRADQPRSAPSLRPKDYE-V 1135 PAG P P Y+ GYDG + + + Sbjct: 276 PAGFNFPNYGETFDPKRKAMEYNVEEAKRLVKESGYDGTPITYHTMGNYYANAMPALMMM 335 Query: 1136 DATLKSLNNQIETLLTPEGSRKNPARTCRDLRLSHPEWSSGYYWI 1180 K + + + T P S+ +W + Y Sbjct: 336 IEMWKQIGVNVV-MKTYAPGSFPPDNQTWMRNWSNGQWMTDAYAT 379 >gi|227130 [11..85] Cytochrome c oxidase subunit h Length = 75 Score = 25.6 bits (55), Expect = 4.9 Identities = 12/37 (32%), Positives = 18/37 (48%) Query: 1182 PNQGCTMDAIKVYCDFSTGETCIRAQPENIPAKNWYR 1218 PNQ T + + Y DF E + A+ ++ WYR Sbjct: 11 PNQNQTRNCWQNYLDFHRCEKAMTAKGGDVSVCEWYR 47 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 1187 Number of sequences better than 10.0: 6 Number of calls to ALIGN: 6 Length of query: 1366 Total length of test sequences: 256703 Effective length of test sequences: 210706.0 Effective search space size: 279679373.4 Initial X dropoff for ALIGN: 25.0 bits
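
The IMPALA summary tables above give a bit score and an E-value for each profile hit. As a minimal sketch of how such rows could be screened by E-value, the Python below hard-codes the six rows of the SCOP PSSM table. The names `Hit`, `SCOP_HITS` and `filter_hits`, and the cutoff of 0.01, are illustrative assumptions and are not part of IMPALA or of the pipeline that produced this log.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One row of an IMPALA summary table (hypothetical container)."""
    name: str
    bits: float
    evalue: float


# Rows copied from the SCOP PSSM summary table in the log above.
SCOP_HITS: List[Hit] = [
    Hit("gi|1815643 [227..431] Metzincin-like", 27, 1.6),
    Hit("gi|2780210 [404..848] beta/alpha (TIM)-barrel", 26, 3.2),
    Hit("gi|137178 [233..498] Ligand-binding domain of nuclear receptor", 26, 3.9),
    Hit("gi|585620 [104..580] Periplasmic binding protein-like II", 26, 4.0),
    Hit("gi|538698 [52..502] Periplasmic binding protein-like II", 26, 4.4),
    Hit("gi|227130 [11..85] Cytochrome c oxidase subunit h", 26, 4.9),
]


def filter_hits(hits: List[Hit], max_evalue: float = 0.01) -> List[Hit]:
    """Keep hits whose E-value is at or below max_evalue.

    The default cutoff is an assumed, commonly used value, not one
    taken from the log.
    """
    return [h for h in hits if h.evalue <= max_evalue]


if __name__ == "__main__":
    kept = filter_hits(SCOP_HITS)
    best = min(SCOP_HITS, key=lambda h: h.evalue)
    print(f"{len(kept)} of {len(SCOP_HITS)} hits pass the cutoff")
    print(f"lowest E-value: {best.evalue} ({best.name})")
```

With these rows, no hit passes a cutoff of 0.01; the lowest E-value in that table is 1.6, for the Metzincin-like profile.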