analysis of sequence from tem38
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRICVCDNGKVLC
DDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGL
PGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGE
PGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDA
GPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG
AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGP
GSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAG
QDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGE
RGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN
GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIG
PPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPR
GETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSG
EPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPAAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA
PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPP
GPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPG
PPSAGFDFSFLPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESM
TDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLKGSNEIEIRA
EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
sec.str. with predator
> tem38_gi|1418928|emb|CAA98968.1|
. . . . .
1 MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDR 50
___HHHHHHHHHHHHHHHHH______________________________
. . . . .
51 DVWKPEPCRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSE 100
________EEEEE_____EEE_EEE_________________________
. . . . .
101 SPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPP 150
__________________________________________________
. . . . .
151 GPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQG 200
__________________________________________________
. . . . .
201 FQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQ 250
__________________________________________________
. . . . .
251 GARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ 300
__________________________________________________
. . . . .
301 MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG 350
__________________________________________________
. . . . .
351 AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGAN 400
__________________________________________________
. . . . .
401 GAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGE 450
__________________________________________________
. . . . .
451 PGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADG 500
__________________________________________________
. . . . .
501 VAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPD 550
__________________________________________________
. . . . .
551 GKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV 600
__________________________________________________
. . . . .
601 PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAG 650
__________________________________________________
. . . . .
651 PPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN 700
__________________________________________________
. . . . .
701 GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGP 750
__________________________________________________
. . . . .
751 KGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPG 800
__________________________________________________
. . . . .
801 DRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPP 850
__________________________________________________
. . . . .
851 GPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP 900
__________________________________________________
. . . . .
901 AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPG 950
__________________________________________________
. . . . .
951 PQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPM 1000
________EEEE______________________________________
. . . . .
1001 GPPGLAGPPGESGREGAPAAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA 1050
__________________________________________________
. . . . .
1051 PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETG 1100
__________________________________________________
. . . . .
1101 EQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAP 1150
__________________________________________________
. . . . .
1151 GKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF 1200
__________________________________________________
. . . . .
1201 LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEG 1250
______________EEE_____EEE____HHHHHHHHHHHHHHHH_____
. . . . .
1251 SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCV 1300
_______HHHHHHH_________EEE_________EEEEEEE_____EEE
. . . . .
1301 YPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVAI 1350
E_______EEEEEE_______EEEEE________EEE_________HHHH
. . . . .
1351 QLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLKGSNEIEIRA 1400
HHHHHHHHHHHHHHEEEEEE_____________HHHHHHH____EEEEEE
. . . . .
1401 EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPIIDVAPLDVGAP 1450
_____EEEEEEE____________EEEEEE_______EEEEEE_______
.
1451 DQEFGFDVGPVCFL 1464
______________
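note: the per-residue state strings above (H = helix, E = strand, _ = coil) can be tallied into overall fractions. The following is a minimal sketch, not part of the predator output; the function name state_fractions and the short example string are hypothetical, and it assumes the state lines have already been extracted from the listing.

```python
from collections import Counter


def state_fractions(states: str) -> dict:
    """Return the fraction of residues in each predicted state.

    Assumes the three-symbol alphabet seen in the listing above:
    'H' (helix), 'E' (strand) and '_' (coil). Whitespace is ignored so
    that several state lines can be passed in as one joined string.
    """
    cleaned = "".join(states.split())
    if not cleaned:
        raise ValueError("empty state string")
    counts = Counter(cleaned)
    total = len(cleaned)
    return {
        "helix": counts.get("H", 0) / total,
        "strand": counts.get("E", 0) / total,
        "coil": counts.get("_", 0) / total,
    }


if __name__ == "__main__":
    # Hypothetical short example, not taken from the prediction above.
    print(state_fractions("___HHHH___EEE"))
```

To apply it to the full prediction, join all state lines from the listing into one string before calling the function.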
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method : 1
alpha-contents : 0.0 %
beta-contents : 0.0 %
coil-contents : 100.0 %
class : irregular
method : 2
alpha-contents : 0.0 %
beta-contents : 0.0 %
coil-contents : 100.0 %
class : irregular
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-16.14 -1.94 -1.37 -2.18 0.00 0.00 0.00 0.00 -0.48 -2.13 -1.80 -12.00 -12.00 0.00 0.00 0.00 -50.05
-18.84 -0.22 -0.33 0.00 0.00 0.00 0.00 0.00 -0.85 -2.10 -1.80 -12.00 -12.00 0.00 0.00 0.00 -48.15
ID: tem38_gi|1418928|emb|CAA98968.1| AC: xxx Len: 1400 1:I 1373 Sc: -48.15 Pv: 2.046476e-01 NO_GPI_SITE
GPI: learning from protozoa
-26.23 -2.20 -1.13 -0.72 -4.00 0.00 0.00 0.00 -0.08 -2.00 -7.07 -12.00 -12.00 0.00 0.00 0.00 -67.42
-24.64 -1.30 -1.78 -0.22 -4.00 0.00 0.00 0.00 -0.04 -2.20 -7.07 -12.00 -12.00 0.00 0.00 0.00 -65.26
ID: tem38_gi|1418928|emb|CAA98968.1| AC: xxx Len: 1400 1:I 1371 Sc: -65.26 Pv: 2.831094e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem38_gi|14 0.931 23 Y 0.884 23 Y 0.990 10 Y 0.921 Y
# SignalP gram- predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem38_gi|14 0.574 589 Y 0.485 23 Y 0.995 9 Y 0.789 Y
# SignalP gram+ predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem38_gi|14 0.683 382 Y 0.334 1366 N 0.998 10 Y 0.083 N
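note: each SignalP data row above carries twelve whitespace-separated fields, in the order given by the "# name Cmax pos ? ..." header. The following is a minimal parsing sketch under that assumption; SignalPRow and parse_signalp_row are hypothetical names, not part of SignalP itself.

```python
from typing import NamedTuple


class SignalPRow(NamedTuple):
    """One data row of the SignalP summary, in header order."""
    name: str
    cmax: float
    cmax_pos: int
    cmax_call: bool
    ymax: float
    ymax_pos: int
    ymax_call: bool
    smax: float
    smax_pos: int
    smax_call: bool
    smean: float
    smean_call: bool


def parse_signalp_row(line: str) -> SignalPRow:
    """Split one summary row into typed fields; 'Y'/'N' become booleans."""
    f = line.split()
    if len(f) != 12:
        raise ValueError(f"expected 12 fields, got {len(f)}")

    def yes(flag: str) -> bool:
        return flag == "Y"

    return SignalPRow(
        f[0],
        float(f[1]), int(f[2]), yes(f[3]),
        float(f[4]), int(f[5]), yes(f[6]),
        float(f[7]), int(f[8]), yes(f[9]),
        float(f[10]), yes(f[11]),
    )


if __name__ == "__main__":
    # The eukaryotic row from the listing above.
    row = parse_signalp_row(
        "tem38_gi|14 0.931 23 Y 0.884 23 Y 0.990 10 Y 0.921 Y"
    )
    print(row.ymax, row.ymax_pos, row.ymax_call)
```

Lines starting with "#" are headers and should be skipped before calling the parser.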
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
low complexity regions: SEG 12 2.2 2.5
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
1-6 MFSFVD
lrlllllaatallt 7-20
21-21 H
gqeegqvegq 22-31
32-111 DEDIPPITCVQNGLRYHDRDVWKPEPCRIC
VCDNGKVLCDDVICDETKNCPGAEVPEGEC
CPVCPDGSESPTDQETTGVE
gpkgdtgprgprgpagppgrdgipgqpglp 112-157
gppgppgppgppglgg
158-177 NFAPQLSYGYDEKSTGGISV
pgpmgpsgprglpgppgapgpqgfqgppge 178-230
pgepgasgpmgprgppgppgkng
231-232 DD
geagkpgrpgergppgpqgarglpg 233-257
258-271 TAGLPGMKGHRGFS
gldgakgdagpagpkgepgspgeng 272-296
297-301 APGQM
gprglpgergrpgapgpagargndgatgaa 302-353
gppgptgpagppgfpgavgakg
354-364 EAGPQGPRGSE
gpqgvrgepgppgpagaagpagnpgadgqp 365-437
gakgangapgiagapgfpgargpsgpqgpg
gppgpkgnsgepg
438-447 APGSKGDTGA
kgepgpvgvqgppgpageegkrgargepgp 448-497
tglpgppgerggpgsrgfpg
498-511 ADGVAGPKGPAGER
gspgpagpkgspgeag 512-527
528-537 RPGEAGLPGA
kgltgspgspgpdgktgppgpagqdgrpgp 538-578
pgppgargqag
579-582 VMGF
pgpkgaagepgkage 583-597
598-598 R
gvpgppgavgpag 599-611
612-613 KD
geagaqgppgpagpagergeqgpag 614-638
639-639 S
pgfqglpgpagppgeagkpgeqg 640-662
663-685 VPGDLGAPGPSGARGERGFPGER
gvqgppgpagprgangapgndgakgdagap 686-725
gapgsqgapg
726-766 LQGMPGERGAAGLPGPKGDRGDAGPKGADG
SPGKDGVRGLT
gpigppgpagapg 767-779
780-781 DK
gesgpsgpagptgargapgdrgepgppgpa 782-861
gfagppgadgqpgakgepgdagakgdagpp
gpagpagppgpignvgapga
862-865 KGAR
gsagppgatgfpgaagrvgppgpsgnagpp 866-956
gppgpagkeggkgprgetgpagrpgevgpp
gppgpagekgspgadgpagapgtpgpqgia
g
957-991 QRGVVGLPGQRGERGFPGLPGPSGEPGKQG
PSGAS
gergppgpmgppglagppg 992-1010
1011-1039 ESGREGAPAAEGSPGRDGSPGAKGDRGET
gpagppgapgapgapgpvgpag 1040-1061
1062-1069 KSGDRGET
gpagpagpvgpvgargpagpqgprg 1070-1094
1095-1117 DKGETGEQGDRGIKGHRGFSGLQ
gppgppgspgeqgpsgasgpagprgppgsa 1118-1151
gapg
1152-1152 K
dglnglpgpigppgprgrtgdagpvgppgp 1153-1192
pgppgppgpp
1193-1216 SAGFDFSFLPQPPQEKAHDGGRYY
raddanvvrdrd 1217-1228
1229-1464 LEVDTTLKSLSQQIENIRSPEGSRKNPART
CRDLKMCHSDWKSGEYWIDPNQGCNLDAIK
VFCNMETGETCVYPTQPSVAQKNWYISKNP
KDKRHVWFGESMTDGFQFEYGGQGSDPADV
AIQLTFLRLMSTEASQNITYHCKNSVAYMD
QQTGNLKKALLLKGSNEIEIRAEGNSRFTY
SVTVDGCTSHTGAWGKTVIEYKTTKSSRLP
IIDVAPLDVGAPDQEFGFDVGPVCFL
low complexity regions: SEG 25 3.0 3.3
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
1-6 MFSFVD
lrlllllaatallt 7-20
21-81 HGQEEGQVEGQDEDIPPITCVQNGLRYHDR
DVWKPEPCRICVCDNGKVLCDDVICDETKN
C
pgaevpegeccpvcpdgsesptdqettgve 82-166
gpkgdtgprgprgpagppgrdgipgqpglp
gppgppgppgppglggnfapqlsyg
167-172 YDEKST
ggisvpgpmgpsgprglpgppgapgpqgfq 173-1195
gppgepgepgasgpmgprgppgppgkngdd
geagkpgrpgergppgpqgarglpgtaglp
gmkghrgfsgldgakgdagpagpkgepgsp
gengapgqmgprglpgergrpgapgpagar
gndgatgaagppgptgpagppgfpgavgak
geagpqgprgsegpqgvrgepgppgpagaa
gpagnpgadgqpgakgangapgiagapgfp
gargpsgpqgpggppgpkgnsgepgapgsk
gdtgakgepgpvgvqgppgpageegkrgar
gepgptglpgppgerggpgsrgfpgadgva
gpkgpagergspgpagpkgspgeagrpgea
glpgakgltgspgspgpdgktgppgpagqd
grpgppgppgargqagvmgfpgpkgaagep
gkagergvpgppgavgpagkdgeagaqgpp
gpagpagergeqgpagspgfqglpgpagpp
geagkpgeqgvpgdlgapgpsgargergfp
gergvqgppgpagprgangapgndgakgda
gapgapgsqgapglqgmpgergaaglpgpk
gdrgdagpkgadgspgkdgvrgltgpigpp
gpagapgdkgesgpsgpagptgargapgdr
gepgppgpagfagppgadgqpgakgepgda
gakgdagppgpagpagppgpignvgapgak
gargsagppgatgfpgaagrvgppgpsgna
gppgppgpagkeggkgprgetgpagrpgev
gppgppgpagekgspgadgpagapgtpgpq
giagqrgvvglpgqrgergfpglpgpsgep
gkqgpsgasgergppgpmgppglagppges
gregapaaegspgrdgspgakgdrgetgpa
gppgapgapgapgpvgpagksgdrgetgpa
gpagpvgpvgargpagpqgprgdkgetgeq
gdrgikghrgfsglqgppgppgspgeqgps
gasgpagprgppgsagapgkdglnglpgpi
gppgprgrtgdagpvgppgppgppgppgpp
sag
1196-1464 FDFSFLPQPPQEKAHDGGRYYRADDANVVR
DRDLEVDTTLKSLSQQIENIRSPEGSRKNP
ARTCRDLKMCHSDWKSGEYWIDPNQGCNLD
AIKVFCNMETGETCVYPTQPSVAQKNWYIS
KNPKDKRHVWFGESMTDGFQFEYGGQGSDP
ADVAIQLTFLRLMSTEASQNITYHCKNSVA
YMDQQTGNLKKALLLKGSNEIEIRAEGNSR
FTYSVTVDGCTSHTGAWGKTVIEYKTTKSS
RLPIIDVAPLDVGAPDQEFGFDVGPVCFL
low complexity regions: SEG 45 3.4 3.75
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
1-95 MFSFVDLRLLLLLAATALLTHGQEEGQVEG
QDEDIPPITCVQNGLRYHDRDVWKPEPCRI
CVCDNGKVLCDDVICDETKNCPGAEVPEGE
CCPVC
pdgsesptdqettgvegpkgdtgprgprgp 96-1195
agppgrdgipgqpglpgppgppgppgppgl
ggnfapqlsygydekstggisvpgpmgpsg
prglpgppgapgpqgfqgppgepgepgasg
pmgprgppgppgkngddgeagkpgrpgerg
ppgpqgarglpgtaglpgmkghrgfsgldg
akgdagpagpkgepgspgengapgqmgprg
lpgergrpgapgpagargndgatgaagppg
ptgpagppgfpgavgakgeagpqgprgseg
pqgvrgepgppgpagaagpagnpgadgqpg
akgangapgiagapgfpgargpsgpqgpgg
ppgpkgnsgepgapgskgdtgakgepgpvg
vqgppgpageegkrgargepgptglpgppg
erggpgsrgfpgadgvagpkgpagergspg
pagpkgspgeagrpgeaglpgakgltgspg
spgpdgktgppgpagqdgrpgppgppgarg
qagvmgfpgpkgaagepgkagergvpgppg
avgpagkdgeagaqgppgpagpagergeqg
pagspgfqglpgpagppgeagkpgeqgvpg
dlgapgpsgargergfpgergvqgppgpag
prgangapgndgakgdagapgapgsqgapg
lqgmpgergaaglpgpkgdrgdagpkgadg
spgkdgvrgltgpigppgpagapgdkgesg
psgpagptgargapgdrgepgppgpagfag
ppgadgqpgakgepgdagakgdagppgpag
pagppgpignvgapgakgargsagppgatg
fpgaagrvgppgpsgnagppgppgpagkeg
gkgprgetgpagrpgevgppgppgpagekg
spgadgpagapgtpgpqgiagqrgvvglpg
qrgergfpglpgpsgepgkqgpsgasgerg
ppgpmgppglagppgesgregapaaegspg
rdgspgakgdrgetgpagppgapgapgapg
pvgpagksgdrgetgpagpagpvgpvgarg
pagpqgprgdkgetgeqgdrgikghrgfsg
lqgppgppgspgeqgpsgasgpagprgppg
sagapgkdglnglpgpigppgprgrtgdag
pvgppgppgppgppgppsag
1196-1464 FDFSFLPQPPQEKAHDGGRYYRADDANVVR
DRDLEVDTTLKSLSQQIENIRSPEGSRKNP
ARTCRDLKMCHSDWKSGEYWIDPNQGCNLD
AIKVFCNMETGETCVYPTQPSVAQKNWYIS
KNPKDKRHVWFGESMTDGFQFEYGGQGSDP
ADVAIQLTFLRLMSTEASQNITYHCKNSVA
YMDQQTGNLKKALLLKGSNEIEIRAEGNSR
FTYSVTVDGCTSHTGAWGKTVIEYKTTKSS
RLPIIDVAPLDVGAPDQEFGFDVGPVCFL
low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
MFSFVDLRlllllaatallTHgqeegqvegqdeDIPPITCVQNGLRYHDRDVWKPEPCRI
CVCDNGKVLCDDVICDETKNCPGAEVPEGECcpvcpdgsesptdqettgvegpkgdtgpr
gprgpagppgrdgipgqpglpgppgppgppgppgLGGNFAPQLSYGYDEKSTGGISVPgp
mgpsgprglpgppgapgpqgfqgppgepgepgasgpmgprgppgppgkngddgeagkpgr
pgergppgpqgarglpgtaglpgmkghrgfsgldgakgdagpagpkgepgspgengapgq
mgprglpgergrpgapgpagargndgatgaagppgptgpagppgfpgavgakgeagpqgp
rgsegpqgvrgepgppgpagaagpagnpgadgqpgakgangapgiagapgfpgargpsgp
qgpggppgpkgnsgepgapgskgdtgakgepgpvgvqgppgpageegkrgargepgptgl
pgppgerggpgsrgfpgadgvagpkgpagergspgpagpkgspgeagrpgeaglpgakgl
tgspgspgpdgktgppgpagqdgrpgppgppgargqagvmgfpgpkgaagepgkagergv
pgppgavgpagkdgeagaqgppgpagpagergeqgpagspgfqglpgpagppgeagkpge
qgvpgdlgapgpsgargergfpgergvqgppgpagprgangapgndgakgdagapgapgs
qgapglqgmpgergaaglpgpkgdrgdagpkgadgspgkdgvrgltgpigppgpagapgd
kgesgpsgpagptgargapgdrgepgppgpagfagppgadgqpgakgepgdagakgdagp
pgpagpagppgpignvgapgakgargsagppgatgfpgaagrvgppgpsgnagppgppgp
agkeggkgprgetgpagrpgevgppgppgpagekgspgadgpagapgtpgpqgiagqrgv
vglpgqrgergfpglpgpsgepgkqgpsgasgergppgpmgppglagppgesgregapaa
egspgrdgspgakgdrgetgpagppgapgapgapgpvgpagksgdrgetgpagpagpvgp
vgargpagpqgprgdkgetgeqgdrgikghrgfsglqgppgppgspgeqgpsgasgpagp
rgppgsagapgkdglnglpgpigppgprgrtgdagpvgppgppgppgppgppsagfdfsf
LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKD
KRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQ
TGNLKKALLLKGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPII
DVAPLDVGAPDQEFGFDVGPVCFL
1 - 8 MFSFVDLR
9 - 19 ll lllaatall
20 - 21 T H
22 - 33 gqeegqveg qde
34 - 91 DIPPITC VQNGLRYHDR DVWKPEPCRI CVCDNGKVLC DDVICDETKN CPGAEVPEGE C
92 - 154 cpvcpdgse sptdqettgv egpkgdtgpr gprgpagppg rdgipgqpgl pgppgppgpp g
ppg
155 - 178 LGGNFA PQLSYGYDEK STGGISVP
179 - 1200 gp mgpsgprglp gppgapgpqg fqgppgepge pgasgpmgpr gppgppgkng ddgeagkp
gr pgergppgpq garglpgtag lpgmkghrgf sgldgakgda gpagpkgepg spgengap
gq mgprglpger grpgapgpag argndgatga agppgptgpa gppgfpgavg akgeagpq
gp rgsegpqgvr gepgppgpag aagpagnpga dgqpgakgan gapgiagapg fpgargps
gp qgpggppgpk gnsgepgapg skgdtgakge pgpvgvqgpp gpageegkrg argepgpt
gl pgppgerggp gsrgfpgadg vagpkgpage rgspgpagpk gspgeagrpg eaglpgak
gl tgspgspgpd gktgppgpag qdgrpgppgp pgargqagvm gfpgpkgaag epgkager
gv pgppgavgpa gkdgeagaqg ppgpagpage rgeqgpagsp gfqglpgpag ppgeagkp
ge qgvpgdlgap gpsgargerg fpgergvqgp pgpagprgan gapgndgakg dagapgap
gs qgapglqgmp gergaaglpg pkgdrgdagp kgadgspgkd gvrgltgpig ppgpagap
gd kgesgpsgpa gptgargapg drgepgppgp agfagppgad gqpgakgepg dagakgda
gp pgpagpagpp gpignvgapg akgargsagp pgatgfpgaa grvgppgpsg nagppgpp
gp agkeggkgpr getgpagrpg evgppgppgp agekgspgad gpagapgtpg pqgiagqr
gv vglpgqrger gfpglpgpsg epgkqgpsga sgergppgpm gppglagppg esgregap
aa egspgrdgsp gakgdrgetg pagppgapga pgapgpvgpa gksgdrgetg pagpagpv
gp vgargpagpq gprgdkgetg eqgdrgikgh rgfsglqgpp gppgspgeqg psgasgpa
gp rgppgsagap gkdglnglpg pigppgprgr tgdagpvgpp gppgppgppg ppsagfdf
sf
1201 - 1464 LPQPPQEKAH DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR
DLKMCHSDWK SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPSVAQK NWYISKNPKD
KRHVWFGESM TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ
TGNLKKALLL KGSNEIEIRA EGNSRFTYSV TVDGCTSHTG AWGKTVIEYK TTKSSRLPII
DVAPLDVGAP DQEFGFDVGP VCFL
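note: in the masked listings above, lowercase letters mark residues flagged as low complexity and uppercase letters mark the rest. The following is a minimal sketch that recovers 1-based, inclusive ranges of the lowercase runs from such a mixed-case sequence; masked_ranges is a hypothetical helper, not part of SEG or XNU.

```python
import re


def masked_ranges(masked: str) -> list:
    """Return (start, end) pairs, 1-based and inclusive, for each
    run of lowercase (masked) residues in a mixed-case sequence.

    Whitespace is removed first, so a multi-line listing can be
    passed in directly.
    """
    seq = "".join(masked.split())
    # m.start() is 0-based; m.end() is exclusive, so it already equals
    # the 1-based inclusive end position.
    return [(m.start() + 1, m.end()) for m in re.finditer(r"[a-z]+", seq)]


if __name__ == "__main__":
    # First 40 residues of the XNU-masked sequence above.
    head = "MFSFVDLRlllllaatallTHgqeegqvegqdeDIPPITC"
    print(masked_ranges(head))  # [(9, 19), (22, 33)]
```

For the 40-residue prefix used here, the two ranges agree with the first two masked segments in the XNU table (9 - 19 and 22 - 33).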
low complexity regions: DUST
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI
CVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPR
GPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGP
MGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGR
PGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ
MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGP
RGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGL
PGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGL
TGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV
PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGE
QGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPGAPGS
QGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGD
KGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP
AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGV
VGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPAA
EGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPAGPVGP
VGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGP
RGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF
LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKD
KRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQ
TGNLKKALLLKGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPII
DVAPLDVGAPDQEFGFDVGPVCFL
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
coiled coil prediction for tem38_gi|1418928|emb|CAA98968.1|
sequence: 1400 amino acids, 0 residue(s) in coiled coil state
. | . | . | . | . | . 60
MFSFVDLRLL LLLAATALLT HGQEEGQVEG QDEDIPPITC VQNGLRYHDR DVWKPEPCRI
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 120
CVCDNGKVLC DDVICDETKN CPGAEVPEGE CCPVCPDGSE SPTDQETTGV EGPKGDTGPR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 180
GPRGPAGPPG RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA PQLSYGYDEK STGGISVPGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 240
MGPSGPRGLP GPPGAPGPQG FQGPPGEPGE PGASGPMGPR GPPGPPGKNG DDGEAGKPGR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 300
PGERGPPGPQ GARGLPGTAG LPGMKGHRGF SGLDGAKGDA GPAGPKGEPG SPGENGAPGQ
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 360
MGPRGLPGER GRPGAPGPAG ARGNDGATGA AGPPGPTGPA GPPGFPGAVG AKGEAGPQGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 420
RGSEGPQGVR GEPGPPGPAG AAGPAGNPGA DGQPGAKGAN GAPGIAGAPG FPGARGPSGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 480
QGPGGPPGPK GNSGEPGAPG SKGDTGAKGE PGPVGVQGPP GPAGEEGKRG ARGEPGPTGL
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 540
PGPPGERGGP GSRGFPGADG VAGPKGPAGE RGSPGPAGPK GSPGEAGRPG EAGLPGAKGL
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 600
TGSPGSPGPD GKTGPPGPAG QDGRPGPPGP PGARGQAGVM GFPGPKGAAG EPGKAGERGV
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 660
PGPPGAVGPA GKDGEAGAQG PPGPAGPAGE RGEQGPAGSP GFQGLPGPAG PPGEAGKPGE
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 720
QGVPGDLGAP GPSGARGERG FPGERGVQGP PGPAGPRGAN GAPGNDGAKG DAGAPGAPGS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 780
QGAPGLQGMP GERGAAGLPG PKGDRGDAGP KGADGSPGKD GVRGLTGPIG PPGPAGAPGD
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 840
KGESGPSGPA GPTGARGAPG DRGEPGPPGP AGFAGPPGAD GQPGAKGEPG DAGAKGDAGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 900
PGPAGPAGPP GPIGNVGAPG AKGARGSAGP PGATGFPGAA GRVGPPGPSG NAGPPGPPGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 960
AGKEGGKGPR GETGPAGRPG EVGPPGPPGP AGEKGSPGAD GPAGAPGTPG PQGIAGQRGV
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 1020
VGLPGQRGER GFPGLPGPSG EPGKQGPSGA SGERGPPGPM GPPGLAGPPG ESGREGAPAA
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 1080
EGSPGRDGSP GAKGDRGETG PAGPPGAPGA PGAPGPVGPA GKSGDRGETG PAGPAGPVGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 1140
VGARGPAGPQ GPRGDKGETG EQGDRGIKGH RGFSGLQGPP GPPGSPGEQG PSGASGPAGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 1200
RGPPGSAGAP GKDGLNGLPG PIGPPGPRGR TGDAGPVGPP GPPGPPGPPG PPSAGFDFSF
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 1260
LPQPPQEKAH DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~4 4467777777 7777777~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 1320
DLKMCHSDWK SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPSVAQK NWYISKNPKD
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 1380
KRHVWFGESM TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . |
TGNLKKALLL KGSNEIEIRA
~~~~~~~~~~ ~~~~~~~~~~
---------- ----------
~~~~~~~~~~ ~~~~~~~~~~
~~~~~~~~~~ ~~~~~~~~~~
~~~~~~~~~~ ~~~~~~~~~~
~~~~~~~~~~ ~~~~~~~~~~
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
prediction of transmembrane regions with toppred2
***********************************
*TOPPREDM with eukaryotic function*
***********************************
tem38.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: tem38.___inter___
(1 sequences)
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDR
DVWKPEPCRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSE
SPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPP
GPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQG
FQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQ
GARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ
MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG
AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGAN
GAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGE
PGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADG
VAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPD
GKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV
PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAG
PPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN
GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGP
KGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPG
DRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPP
GPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP
AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPG
PQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPM
GPPGLAGPPGESGREGAPAAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA
PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETG
EQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAP
GKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF
LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEG
SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCV
YPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVAI
QLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLKGSNEIEIRA
EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPIIDVAPLDVGAP
DQEFGFDVGPVCFL
(p)rokaryotic or (e)ukaryotic: e
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 8 structures are to be tested
Candidate membrane-spanning segments:
Helix Begin End Score Certainty
1 2 22 0.700 Putative
2 331 351 0.844 Putative
3 1041 1061 0.758 Putative
----------------------------------------------------------------------
Structure 1
Transmembrane segments included in this structure:
Segment 1 2 3
Loop length 1 308 689 403
K+R profile 1.00 +
+ +
CYT-EXT prof - 0.61
0.33 0.81
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 1.00
Tm probability: 0.06
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): undefined (NEG + POS = 0)
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: -0.54
-> Orientation: N-in
----------------------------------------------------------------------
Structure 2
Transmembrane segments included in this structure:
Segment 1 3
Loop length 1 1018 403
K+R profile 1.00 +
+
CYT-EXT prof - 0.81
0.56
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 1.00
Tm probability: 0.10
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): undefined (NEG + POS = 0)
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: 0.26
-> Orientation: N-out
----------------------------------------------------------------------
Structure 3
Transmembrane segments included in this structure:
Segment 1 2
Loop length 1 308 1113
K+R profile 1.00 +
+
CYT-EXT prof - 0.67
0.33
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 1.00
Tm probability: 0.15
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): undefined (NEG + POS = 0)
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: 0.34
-> Orientation: N-out
----------------------------------------------------------------------
Structure 4
Transmembrane segments included in this structure:
Segment 1
Loop length 1 1442
K+R profile 1.00
+
CYT-EXT prof -
0.61
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 1.00
Tm probability: 0.25
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): 0.0000
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: -0.61
-> Orientation: N-in
----------------------------------------------------------------------
Structure 5
Transmembrane segments included in this structure:
Segment 2 3
Loop length 330 689 403
K+R profile + +
+
CYT-EXT prof 0.31 0.81
0.61
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.24
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): 0.1818
NEG: 39.0000
POS: 27.0000
-> Orientation: undecided
CYT-EXT difference: 0.51
-> Orientation: N-out
----------------------------------------------------------------------
Structure 6
Transmembrane segments included in this structure:
Segment 3
Loop length 1040 403
K+R profile +
+
CYT-EXT prof 0.55
0.81
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.40
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): 0.0492
NEG: 96.0000
POS: 87.0000
-> Orientation: undecided
CYT-EXT difference: -0.26
-> Orientation: N-in
----------------------------------------------------------------------
Structure 7
Transmembrane segments included in this structure:
Segment 2
Loop length 330 1113
K+R profile +
+
CYT-EXT prof 0.31
0.67
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.61
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): 0.1818
NEG: 39.0000
POS: 27.0000
-> Orientation: undecided
CYT-EXT difference: -0.37
-> Orientation: N-in
----------------------------------------------------------------------
Structure 8
Transmembrane segments included in this structure:
Segment
Loop length 1464
K+R profile +
CYT-EXT prof 0.61
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 1.00
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): -4.00
(NEG-POS)/(NEG+POS): 0.0444
NEG: 141.0000
POS: 129.0000
-> Orientation: N-out
CYT-EXT difference: 0.61
-> Orientation: N-out
----------------------------------------------------------------------
"tem38" 1464
2 22 #f 0.7
331 351 #f 0.84375
1041 1061 #f 0.758333
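Note on the (NEG-POS)/(NEG+POS) values listed for the structures above: the ratio is only defined when at least one charged residue was counted. The following is a minimal Python sketch of that calculation with a guard for the NEG = POS = 0 case. The function name `charge_ratio` is hypothetical and is not part of TOPPREDM; the counts are copied from the structures above.

```python
def charge_ratio(neg, pos):
    """Return (NEG-POS)/(NEG+POS), or None when no charged residues were counted."""
    total = neg + pos
    if total == 0:
        return None  # 0/0 is undefined, so no number is reported
    return (neg - pos) / total


# NEG/POS counts taken from the structures listed above.
for neg, pos in [(39, 27), (96, 87), (141, 129), (0, 0)]:
    ratio = charge_ratio(neg, pos)
    print(neg, pos, "undefined" if ratio is None else f"{ratio:.4f}")
```

For the three non-zero pairs this reproduces the reported 0.1818, 0.0492 and 0.0444; the last pair is reported as undefined rather than as a number.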
************************************
*TOPPREDM with prokaryotic function*
************************************
tem38.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: tem38.___inter___
(1 sequences)
(p)rokaryotic or (e)ukaryotic: p
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 8 structures are to be tested
Candidate membrane-spanning segments:
Helix Begin End Score Certainty
1 2 22 0.700 Putative
2 331 351 0.844 Putative
3 1041 1061 0.758 Putative
----------------------------------------------------------------------
Structure 1
Transmembrane segments included in this structure:
Segment 1 2 3
Loop length 1 308 689 403
K+R profile 0.00 +
+ +
CYT-EXT prof - 0.61
0.33 0.81
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.06
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): undefined (NEG + POS = 0)
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: -0.54
-> Orientation: N-in
----------------------------------------------------------------------
Structure 2
Transmembrane segments included in this structure:
Segment 2 3
Loop length 330 689 403
K+R profile + +
+
CYT-EXT prof 0.31 0.81
0.61
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.24
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): 0.1818
NEG: 39.0000
POS: 27.0000
-> Orientation: undecided
CYT-EXT difference: 0.51
-> Orientation: N-out
----------------------------------------------------------------------
Structure 3
Transmembrane segments included in this structure:
Segment 1 3
Loop length 1 1018 403
K+R profile 0.00 +
+
CYT-EXT prof - 0.81
0.56
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.10
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): undefined (NEG + POS = 0)
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: 0.26
-> Orientation: N-out
----------------------------------------------------------------------
Structure 4
Transmembrane segments included in this structure:
Segment 3
Loop length 1040 403
K+R profile +
+
CYT-EXT prof 0.55
0.81
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.40
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): 0.0492
NEG: 96.0000
POS: 87.0000
-> Orientation: undecided
CYT-EXT difference: -0.26
-> Orientation: N-in
----------------------------------------------------------------------
Structure 5
Transmembrane segments included in this structure:
Segment 1 2
Loop length 1 308 1113
K+R profile 0.00 +
+
CYT-EXT prof - 0.67
0.33
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.15
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): undefined (NEG + POS = 0)
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: 0.34
-> Orientation: N-out
----------------------------------------------------------------------
Structure 6
Transmembrane segments included in this structure:
Segment 2
Loop length 330 1113
K+R profile +
+
CYT-EXT prof 0.31
0.67
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.61
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): 0.1818
NEG: 39.0000
POS: 27.0000
-> Orientation: undecided
CYT-EXT difference: -0.37
-> Orientation: N-in
----------------------------------------------------------------------
Structure 7
Transmembrane segments included in this structure:
Segment 1
Loop length 1 1442
K+R profile 0.00
+
CYT-EXT prof -
0.61
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 0.25
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): 6.00
(NEG-POS)/(NEG+POS): 0.0000
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: -0.61
-> Orientation: N-in
----------------------------------------------------------------------
Structure 8
Transmembrane segments included in this structure:
Segment
Loop length 1464
K+R profile +
CYT-EXT prof 0.61
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 0.00
Tm probability: 1.00
-> Orientation: undecided
Charge-difference over N-terminal Tm (+-15 residues): -4.00
(NEG-POS)/(NEG+POS): 0.0444
NEG: 141.0000
POS: 129.0000
-> Orientation: N-out
CYT-EXT difference: 0.61
-> Orientation: N-out
----------------------------------------------------------------------
"tem38" 1464
2 22 #f 0.7
331 351 #f 0.84375
1041 1061 #f 0.758333
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
NOW EXECUTING: /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem38.___saps___
SAPS. Version of April 11, 1996.
Date run: Tue Oct 31 18:34:55 2000
File: /people/maria/tem38.___saps___
ID tem38_gi|1418928|emb|CAA98968.1|
DE prepro-alpha1(I) collagen [Homo sapiens]
number of residues: 1464; molecular weight: 138.9 kdal
1 MFSFVDLRLL LLLAATALLT HGQEEGQVEG QDEDIPPITC VQNGLRYHDR DVWKPEPCRI
61 CVCDNGKVLC DDVICDETKN CPGAEVPEGE CCPVCPDGSE SPTDQETTGV EGPKGDTGPR
121 GPRGPAGPPG RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA PQLSYGYDEK STGGISVPGP
181 MGPSGPRGLP GPPGAPGPQG FQGPPGEPGE PGASGPMGPR GPPGPPGKNG DDGEAGKPGR
241 PGERGPPGPQ GARGLPGTAG LPGMKGHRGF SGLDGAKGDA GPAGPKGEPG SPGENGAPGQ
301 MGPRGLPGER GRPGAPGPAG ARGNDGATGA AGPPGPTGPA GPPGFPGAVG AKGEAGPQGP
361 RGSEGPQGVR GEPGPPGPAG AAGPAGNPGA DGQPGAKGAN GAPGIAGAPG FPGARGPSGP
421 QGPGGPPGPK GNSGEPGAPG SKGDTGAKGE PGPVGVQGPP GPAGEEGKRG ARGEPGPTGL
481 PGPPGERGGP GSRGFPGADG VAGPKGPAGE RGSPGPAGPK GSPGEAGRPG EAGLPGAKGL
541 TGSPGSPGPD GKTGPPGPAG QDGRPGPPGP PGARGQAGVM GFPGPKGAAG EPGKAGERGV
601 PGPPGAVGPA GKDGEAGAQG PPGPAGPAGE RGEQGPAGSP GFQGLPGPAG PPGEAGKPGE
661 QGVPGDLGAP GPSGARGERG FPGERGVQGP PGPAGPRGAN GAPGNDGAKG DAGAPGAPGS
721 QGAPGLQGMP GERGAAGLPG PKGDRGDAGP KGADGSPGKD GVRGLTGPIG PPGPAGAPGD
781 KGESGPSGPA GPTGARGAPG DRGEPGPPGP AGFAGPPGAD GQPGAKGEPG DAGAKGDAGP
841 PGPAGPAGPP GPIGNVGAPG AKGARGSAGP PGATGFPGAA GRVGPPGPSG NAGPPGPPGP
901 AGKEGGKGPR GETGPAGRPG EVGPPGPPGP AGEKGSPGAD GPAGAPGTPG PQGIAGQRGV
961 VGLPGQRGER GFPGLPGPSG EPGKQGPSGA SGERGPPGPM GPPGLAGPPG ESGREGAPAA
1021 EGSPGRDGSP GAKGDRGETG PAGPPGAPGA PGAPGPVGPA GKSGDRGETG PAGPAGPVGP
1081 VGARGPAGPQ GPRGDKGETG EQGDRGIKGH RGFSGLQGPP GPPGSPGEQG PSGASGPAGP
1141 RGPPGSAGAP GKDGLNGLPG PIGPPGPRGR TGDAGPVGPP GPPGPPGPPG PPSAGFDFSF
1201 LPQPPQEKAH DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR
1261 DLKMCHSDWK SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPSVAQK NWYISKNPKD
1321 KRHVWFGESM TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ
1381 TGNLKKALLL KGSNEIEIRA EGNSRFTYSV TVDGCTSHTG AWGKTVIEYK TTKSSRLPII
1441 DVAPLDVGAP DQEFGFDVGP VCFL
--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)
A :141( 9.6%); C : 18( 1.2%); D : 66( 4.5%); E : 75( 5.1%); F : 27( 1.8%)
G++:390(26.6%); H : 9( 0.6%); I- : 24( 1.6%); K : 58( 4.0%); L--: 48( 3.3%)
M : 13( 0.9%); N : 28( 1.9%); P++:278(19.0%); Q : 48( 3.3%); R : 71( 4.8%)
S : 61( 4.2%); T- : 43( 2.9%); V- : 47( 3.2%); W : 6( 0.4%); Y- : 13( 0.9%)
KR : 129 ( 8.8%); ED : 141 ( 9.6%); AGP ++: 809 ( 55.3%);
KRED : 270 ( 18.4%); KR-ED : -12 ( -0.8%); FIKMNY- : 163 ( 11.1%);
LVIFM --: 159 ( 10.9%); ST - : 104 ( 7.1%).
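The per-residue and grouped counts above (for example KR, ED, KR-ED, AGP) can be recomputed from the sequence. The following is a minimal Python sketch; the helper names `composition` and `group_count` are hypothetical and are not SAPS functions. It does not reproduce the +/- significance flags, which SAPS derives from a reference set (swp23s).

```python
from collections import Counter


def composition(seq):
    """Map each residue to (count, percent of sequence length)."""
    counts = Counter(seq)
    return {aa: (n, 100.0 * n / len(seq)) for aa, n in counts.items()}


def group_count(seq, group):
    """Count residues that belong to a group such as 'KR' or 'AGP'."""
    return sum(1 for aa in seq if aa in group)


# First 50 residues of tem38, used only as a short illustration.
fragment = "MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDR"
kr = group_count(fragment, "KR")
ed = group_count(fragment, "ED")
print("KR:", kr, "ED:", ed, "KR-ED:", kr - ed)
```

Applied to the full 1464-residue sequence, the same grouping should give the KR, ED and AGP totals in the table above (129, 141 and 809).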
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS
1 00000-0+00 0000000000 000--000-0 0---000000 00000+00-+ -00+0-00+0
61 000-00+000 --000--0+0 0000-00-0- 000000-00- 000-0-0000 -00+0-000+
121 00+0000000 +-00000000 0000000000 0000000000 0000000--+ 0000000000
181 000000+000 0000000000 000000-00- 000000000+ 0000000+00 --0-00+00+
241 00-+000000 00+0000000 0000+00+00 000-00+0-0 00000+0-00 000-000000
301 000+0000-+ 0+00000000 0+00-00000 0000000000 0000000000 0+0-000000
361 +00-00000+ 0-00000000 0000000000 -00000+000 0000000000 0000+00000
421 000000000+ 0000-00000 0+0-000+0- 0000000000 0000--0++0 0+0-000000
481 00000-+000 00+00000-0 0000+0000- +00000000+ 0000-00+00 -000000+00
541 000000000- 0+00000000 0-0+000000 000+000000 00000+0000 -00+00-+00
601 0000000000 0+-0-00000 000000000- +0-0000000 0000000000 000-00+00-
661 00000-0000 00000+0-+0 000-+00000 000000+000 00000-00+0 -000000000
721 0000000000 0-+0000000 0+0-+0-000 +00-0000+- 00+0000000 000000000-
781 +0-0000000 00000+0000 -+0-000000 000000000- 00000+0-00 -000+0-000
841 0000000000 0000000000 0+00+00000 0000000000 0+00000000 0000000000
901 00+-00+00+ 0-00000+00 -000000000 00-+00000- 0000000000 0000000+00
961 000000+0-+ 0000000000 -00+000000 00-+000000 0000000000 -00+-00000
1021 -0000+-000 00+0-+0-00 0000000000 0000000000 0+00-+0-00 0000000000
1081 000+000000 00+0-+0-00 -00-+00+00 +000000000 0000000-00 0000000000
1141 +000000000 0+-0000000 0000000+0+ 00-0000000 0000000000 000000-000
1201 000000-+00 -00+00+0-- 0000+-+-0- 0-000+0000 00-00+00-0 0++000+00+
1261 -0+0000-0+ 00-000-000 0000-00+00 000-00-000 000000000+ 00000+00+-
1321 ++00000-00 0-0000-000 000-00-000 00000+0000 -000000000 +000000-00
1381 0000++0000 +000-0-0+0 -000+00000 00-0000000 000+000-0+ 00+00+0000
1441 -0000-0000 -0-000-000 0000
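The charge string above encodes K and R as '+', D and E as '-', and every other residue (including H) as '0', printed in blocks of ten. The following is a minimal Python sketch of that encoding, assuming this mapping, which is consistent with the first row of the listing; the function names are hypothetical.

```python
def charge_string(seq):
    """Encode K/R as '+', D/E as '-', everything else (including H) as '0'."""
    symbols = {"K": "+", "R": "+", "D": "-", "E": "-"}
    return "".join(symbols.get(aa, "0") for aa in seq)


def blocks_of_ten(text):
    """Split a string into space-separated blocks of ten characters."""
    return " ".join(text[i:i + 10] for i in range(0, len(text), 10))


# Residues 1-30 of tem38.
print(blocks_of_ten(charge_string("MFSFVDLRLLLLLAATALLTHGQEEGQVEG")))
```

This prints `00000-0+00 0000000000 000--000-0`, matching the first three blocks of the listing.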
A. CHARGE CLUSTERS.
Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none
Negative charge clusters (cmin = 10/30 or 13/45 or 16/60): none
Mixed charge clusters (cmin = 15/30 or 20/45 or 24/60): none
B. HIGH SCORING (UN)CHARGED SEGMENTS.
There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.
C. CHARGE RUNS AND PATTERNS.
pattern (+)| (-)| (*)| (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0 5 | 5 | 7 | 50 | 10 | 10 | 13 | 12 | 12 | 16 | 6 | 7 |
lmin1 6 | 6 | 9 | 60 | 12 | 12 | 16 | 15 | 15 | 20 | 7 | 9 |
lmin2 7 | 8 | 10 | 67 | 13 | 14 | 18 | 17 | 17 | 22 | 8 | 10 |
(Significance level: 0.010000; Minimal displayed length: 6)
There are no charge runs or patterns exceeding the given minimal lengths.
Run count statistics:
+ runs >= 3: 0
- runs >= 3: 1, at 32;
* runs >= 5: 0
0 runs >= 33: 1, at 133;
--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES
1. HIGH SCORING SEGMENTS.
There are no high scoring hydrophobic segments.
There are no high scoring transmembrane segments.
2. SPACINGS OF C.
H2N-39-C-17-C-2-C-1-C-6-C-4-C-5-C-9-C-C-2-C-1163-C-5-C-16-C-8-C-7-C-70-C-44-C-46-C-2-COOH
2*. SPACINGS OF C and H. (additional deluxe function for ALEX)
H2N-20-H-18-C-7-H-9-C-2-C-1-C-6-C-4-C-5-C-9-C-C-2-C-171-H-842-H-99-H-48-C-5-C-H-15-C-8-C-7-C-23-H-45-H-C-44-C-2-H-43-C-2-COOH
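The spacing strings above list the number of residues between successive occurrences of C (or of C and H), from the N-terminus to the C-terminus; adjacent occurrences are written without a number between them. The following is a minimal Python sketch of that notation. The function name `residue_spacings` is hypothetical, and the handling of a tracked residue at the very first position is an assumption, since that case does not occur here.

```python
def residue_spacings(seq, residues="C"):
    """Build an 'H2N-<gap>-X-<gap>-X-...-COOH' string for the tracked residues."""
    parts = ["H2N"]
    gap = 0
    for aa in seq:
        if aa in residues:
            if gap > 0:
                parts.append(str(gap))  # zero-length gaps are not printed
            parts.append(aa)
            gap = 0
        else:
            gap += 1
    if gap > 0:
        parts.append(str(gap))
    parts.append("COOH")
    return "-".join(parts)


# Residues 1-70 of tem38.
fragment = ("MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDR"
            "DVWKPEPCRICVCDNGKVLC")
print(residue_spacings(fragment, "C"))
print(residue_spacings(fragment, "CH"))
```

For this fragment the leading part of each string (39-C-17-C-2-C-1-C-6-C and 20-H-18-C-7-H-9-C-2-C-1-C-6-C) agrees with the two listings above.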
--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.
A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 5
Aligned matching blocks:
[ 112- 116] GPKGD
[ 740- 744] GPKGD
______________________________
[ 114- 118] KGDTG
[ 442- 446] KGDTG
______________________________
[ ]--------[ ]--------[ 118- 122]-( -5)-[ 118- 125]-( -8)-
[ ]--------[ ]--------[ 121- 125]-( -5)-[ 121- 128]--------
[ 206- 207]-( 4)-[ 212- 224]-( -7)-[ 218- 222]-( -5)-[ 218- 225]-( -11)-
[1127-1128]-( 4)-[1133-1145]-( -7)-[1139-1143]--------[ ]--------
[ 118- 130]
[ ]
[ 215- 227]
[ ]
[ 212- 224] GASGPMGPRGPPG
[1133-1145] GASGPAGPRGPPG
[ 118- 122] GPRGP
[ 121- 125] GPRGP
[ 218- 222] GPRGP
[1139-1143] GPRGP
[ 118- 125] GPRGPRGP
[ 121- 128] GPRGPAGP
[ 218- 225] GPRGPPGP
[ 118- 130] GPRGPRGPAGPPG
[ 215- 227] GPMGPRGPPGPPG
______________________________
[ 123- 130] RGPAGPPG
[ 220- 227] RGPPGPPG
[ 415- 424] RGPSGPQGPG
[ 865- 872] RGSAGP__PG
with superset:
[ 123- 128] RGPAGP
[ 220- 225] RGPPGP
[ 244- 249] RGPPGP
[ 415- 420] RGPSGP
[ 745- 750] RGDAGP
[ 865- 870] RGSAGP
[ 994- 999] RGPPGP
[1084-1089] RGPAGP
______________________________
[ 126- 130] AGPPG
[ 331- 336] AGPPGP
[ 340- 344] AGPPG
[ 649- 654] AGPPGE
[ 814- 819] AGPPGA
[ 838- 843] AGPPGP
[ 847- 852] AGPPGP
[ 868- 873] AGPPGA
[ 892- 897] AGPPGP
[1006-1011] AGPPGE
[1042-1047] AGPPGA
______________________________
[ 129- 136] PGRDGIPG
[1024-1031] PGRDGSPG
______________________________
[ 139- 149] GLPGPPGPPGP
[ 188- 198] GLPGPPGAPGP
with superset:
[ 139- 143] GLPGP
[ 188- 192] GLPGP
[ 479- 483] GLPGP
[ 644- 648] GLPGP
[ 737- 741] GLPGP
[ 974- 978] GLPGP
[1157-1161] GLPGP
and:
[ 139- 145] GLPGPPG
[ 188- 194] GLPGPPG
[ 479- 485] GLPGPPG
______________________________
[ 145- 155] GPPGPPGPPGL
[ 995-1005] GPPGPMGPPGL
______________________________
[ 179- 186] GPMGPSGP
[ 215- 222] GPMGPRGP
with superset:
[ 179- 183] GPMGP
[ 215- 219] GPMGP
[ 998-1002] GPMGP
______________________________
[ 185- 194] GPRGLPGPPG
[ 218- 227] GPRGPPGPPG
[ 476- 485] GPTGLPGPPG
with superset:
[ 185- 191] GPRGLPG
[ 218- 224] GPRGPPG
[ 251- 257] GARGLPG
[ 302- 308] GPRGLPG
[ 476- 482] GPTGLPG
[1139-1145] GPRGPPG
______________________________
[ 187- 191] RGLPG
[ 253- 257] RGLPG
[ 304- 308] RGLPG
______________________________
[ 191- 197] GPPGAPG
[1043-1049] GPPGAPG
______________________________
[ 196- 200] PGPQG
[ 247- 251] PGPQG
[ 949- 953] PGPQG
______________________________
[ 203- 207] GPPGE
[ 482- 486] GPPGE
[ 650- 654] GPPGE
[1007-1011] GPPGE
______________________________
[ 208- 227] PGEPGASGPMGPRGPPGPPG
[ 409- 428] PGFPGARGPSGPQGPGGPPG
with superset:
[ 190- 195] PGPPGA
[ 208- 213] PGEPGA
[ 343- 348] PGFPGA
[ 409- 414] PGFPGA
[ 568- 573] PGPPGA
[ 601- 606] PGPPGA
[1045-1050] PGAPGA
and:
[ 190- 198] PGPPGAPGP
[ 208- 216] PGEPGASGP
[ 409- 417] PGFPGARGP
[ 601- 609] PGPPGAVGP
______________________________
[ 191- 192]-( 4)-[ 197- 198]-( 4)-[ 203- 204]-( 4)-[ 209- 213]
[ 416- 417]-( 4)-[ 422- 423]-( 4)-[ 428- 429]-( 4)-[ 434- 438]
[ 209- 213] GEPGA
[ 434- 438] GEPGA
______________________________
[ 205- 213]-( 3)-[ 217- 224]
[ 289- 297]-( 3)-[ 301- 308]
[ 205- 213] PGEPGEPGA
[ 289- 297] PGSPGENGA
[ 217- 224] MGPRGPPG
[ 301- 308] MGPRGLPG
______________________________
[ 232- 236] DGEAG
[ 613- 617] DGEAG
______________________________
[ 233- 234]-( 4)-[ 239- 249]
[ 911- 912]-( 4)-[ 917- 927]
[ 239- 249] GRPGERGPPGP
[ 917- 927] GRPGEVGPPGP
with superset:
[ 239- 243] GRPGE
[ 527- 531] GRPGE
[ 917- 921] GRPGE
______________________________
[ 241- 248] PGERGPPG
[ 307- 314] PGERGRPG
[ 484- 491] PGERGGPG
with superset:
[ 241- 245] PGERG
[ 307- 311] PGERG
[ 484- 488] PGERG
[ 682- 686] PGERG
[ 730- 734] PGERG
______________________________
[ ]--------[ 244- 248]-( -5)-[ 244- 252]
[ 982- 999]-( -6)-[ 994- 998]--------[ ]
[1126-1143]-( -3)-[1141-1145]-( -5)-[1141-1149]
[ 982- 999] PGKQGPSGASGERGPPGP
[1126-1143] PGEQGPSGASGPAGPRGP
with superset:
[ 671- 675] GPSGA
[ 986- 990] GPSGA
[1130-1134] GPSGA
[ 244- 248] RGPPG
[ 994- 998] RGPPG
[1141-1145] RGPPG
[ 244- 252] RGPPGPQGA
[1141-1149] RGPPGSAGA
______________________________
[ 259- 266]-( -8)-[ 259- 269]
[ 532- 539]--------[ ]
[ 736- 743]-( -8)-[ 736- 746]
[ 259- 266] AGLPGMKG
[ 532- 539] AGLPGAKG
[ 736- 743] AGLPGPKG
[ 259- 269] AGLPGMKGHRG
[ 736- 746] AGLPGPKGDRG
______________________________
[ 265- 273] KGHRGFSGL
[1108-1116] KGHRGFSGL
______________________________
[ 281- 290]-( -8)-[ 283- 287]
[ ]--------[ 502- 506]
[ 515- 524]-( -8)-[ 517- 521]
[ ]--------[ 748- 752]
[ 281- 290] GPAGPKGEPG
[ 515- 524] GPAGPKGSPG
[ 283- 287] AGPKG
[ 502- 506] AGPKG
[ 517- 521] AGPKG
[ 748- 752] AGPKG
______________________________
[ 286- 290] KGEPG
[ 448- 452] KGEPG
[ 826- 830] KGEPG
______________________________
[ 289- 294] PGSPGE
[ 544- 548] PGSPG
[1123-1128] PGSPGE
______________________________
[ 284- 285]-( 4)-[ 290- 294]
[ 515- 516]-( 4)-[ 521- 525]
[1118-1119]-( 4)-[1124-1128]
[ 290- 294] GSPGE
[ 521- 525] GSPGE
[1124-1128] GSPGE
______________________________
[ 314- 323] GAPGPAGARG
[ 668- 677] GAPGPSGARG
with superset:
[ 194- 198] GAPGP
[ 314- 318] GAPGP
[ 668- 672] GAPGP
[1052-1056] GAPGP
______________________________
[ 320- 335] GARGNDGATGAAGPPG
[ 701- 716] GAPGNDGAKGDAGAPG
with superset:
[ 274- 281] DGAKGDAG
[ 325- 332] DGATGAAG
[ 706- 713] DGAKGDAG
______________________________
[ 337- 341]-( -5)-[ 337- 351]-( -15)-[ 337- 344]-( 10)-[ 355- 362]
[ 913- 917]--------[ ]--------[ 913- 920]--------[ ]
[1039-1043]-( -5)-[1039-1053]-( -15)-[1039-1046]--------[ ]
[1069-1073]--------[ ]--------[ ]--------[1087-1094]
[ 337- 341] TGPAG
[ 913- 917] TGPAG
[1039-1043] TGPAG
[1069-1073] TGPAG
[ 337- 351] TGPAGPPGFPGAVGA
[1039-1053] TGPAGPPGAPGAPGA
[ 337- 344] TGPAGPPG
[ 913- 920] TGPAGRPG
[1039-1046] TGPAGPPG
[ 355- 362] AGPQGPRG
[1087-1094] AGPQGPRG
with superset:
[ 356- 360] GPQGP
[ 419- 423] GPQGP
[1088-1092] GPQGP
______________________________
[ 343- 350] PGFPGAVG
[ 601- 608] PGPPGAVG
______________________________
[ ]--------[ 344- 348]
[ ]--------[ 410- 414]
[ 446- 452]-( 41)-[ 494- 498]
[ 824- 830]-( 44)-[ 875- 879]
[ 446- 452] GAKGEPG
[ 824- 830] GAKGEPG
with superset:
[ 350- 354] GAKGE
[ 446- 450] GAKGE
[ 824- 828] GAKGE
[ 344- 348] GFPGA
[ 410- 414] GFPGA
[ 494- 498] GFPGA
[ 875- 879] GFPGA
______________________________
[ 370- 384] RGEPGPPGPAGAAGP
[ 802- 816] RGEPGPPGPAGFAGP
with superset:
[ 370- 377] RGEPGPPG
[ 469- 476] RGARGEPG
[ 676- 683] RGERGFPG
[ 802- 809] RGEPGPPG
[ 967- 974] RGERGFPG
______________________________
[ 377- 378]-( 3)-[ 382- 389]
[ 839- 840]-( 3)-[ 844- 851]
[ 382- 389] AGPAGNPG
[ 844- 851] AGPAGPPG
with superset:
[ 280- 284] AGPAG
[ 382- 386] AGPAG
[ 625- 629] AGPAG
[ 844- 848] AGPAG
[1072-1076] AGPAG
______________________________
[ 388- 392]-( -11)-[ 382- 398]--------[ ]--------[ 433- 437]
[ 496- 500]--------[ ]--------[ 496- 503]--------[ ]
[ 817- 821]-( -11)-[ 811- 827]--------[ ]--------[ ]
[ 937- 941]--------[ ]--------[ 937- 944]-( 34)-[ 979- 983]
[ 388- 392] PGADG
[ 496- 500] PGADG
[ 817- 821] PGADG
[ 937- 941] PGADG
with superset:
[ 388- 396] PGADGQPGA
[ 817- 825] PGADGQPGA
[ 937- 945] PGADGPAGA
[ 382- 398] AGPAGNPGADGQPGAKG
[ 811- 827] AGFAGPPGADGQPGAKG
with superset:
[ 388- 396] PGADGQPGA
[ 817- 825] PGADGQPGA
[ 937- 945] PGADGPAGA
[ 496- 503] PGADGVAG
[ 937- 944] PGADGPAG
[ 433- 437] SGEPG
[ 979- 983] SGEPG
______________________________
[ 398- 408] GANGAPGIAGA
[ 698- 708] GANGAPGNDGA
with superset:
[ 295- 299] NGAPG
[ 400- 404] NGAPG
[ 700- 704] NGAPG
______________________________
[ 413- 423] GARGPSGPQGP
[1082-1092] GARGPAGPQGP
______________________________
[ 416- 423] GPSGPQGP
[ 785- 792] GPSGPAGP
with superset:
[ 182- 186] GPSGP
[ 416- 420] GPSGP
[ 785- 789] GPSGP
______________________________
[ 409- 416]-( 10)-[ 427- 431]-( -5)-[ 427- 437]
[ 568- 575]-( 7)-[ 583- 587]-( -5)-[ 583- 593]
[ ]--------[ 739- 743]--------[ ]
[ 409- 416] PGFPGARG
[ 568- 575] PGPPGARG
[ 427- 431] PGPKG
[ 583- 587] PGPKG
[ 739- 743] PGPKG
[ 427- 437] PGPKGNSGEPG
[ 583- 593] PGPKGAAGEPG
______________________________
[ 449- 453] GEPGP
[ 473- 477] GEPGP
[ 803- 807] GEPGP
______________________________
[ 446- 447]-( 3)-[ 451- 455]
[1049-1050]-( 3)-[1054-1058]
[ 451- 455] PGPVG
[1054-1058] PGPVG
______________________________
[ 473- 474]-( 4)-[ 479- 486]
[ 968- 969]-( 4)-[ 974- 981]
[ 479- 486] GLPGPPGE
[ 974- 981] GLPGPSGE
______________________________
[ 493- 512] RGFPGADGVAGPKGPAGERG
[ 679- 698] RGFPGERGVQGPPGPAGPRG
with superset:
[ 493- 497] RGFPG
[ 679- 683] RGFPG
[ 970- 974] RGFPG
______________________________
[ 503- 518]-( -11)-[ 508- 516]
[ ]--------[ 595- 603]
[ 623- 638]-( -11)-[ 628- 636]
[ 503- 518] GPKGPAGERGSPGPAG
[ 623- 638] GPAGPAGERGEQGPAG
[ 508- 516] AGERGSPGP
[ 595- 603] AGERGVPGP
[ 628- 636] AGERGEQGP
______________________________
[ 512- 516] GSPGP
[ 545- 549] GSPGP
______________________________
[ 514- 524] PGPAGPKGSPG
[ 928- 938] PGPAGEKGSPG
______________________________
[ 523- 536] PGEAGRPGEAGLPG
[ 646- 659] PGPAGPPGEAGKPG
with superset:
[ 526- 531] AGRPGE
[ 649- 654] AGPPGE
[ 655- 660] AGKPGE
[ 916- 921] AGRPGE
[1006-1011] AGPPGE
______________________________
[ 569- 576] GPPGARGQ
[ 815- 822] GPPGADGQ
with superset:
[ 569- 573] GPPGA
[ 602- 606] GPPGA
[ 815- 819] GPPGA
[ 869- 873] GPPGA
[1043-1047] GPPGA
______________________________
[ 584- 588] GPKGA
[ 749- 753] GPKGA
______________________________
[ 590- 594] GEPGK
[ 980- 984] GEPGK
______________________________
[ 601- 612]-( -9)-[ 604- 612]
[ ]--------[ 895- 903]
[1051-1062]-( -9)-[1054-1062]
[ 601- 612] PGPPGAVGPAGK
[1051-1062] PGAPGPVGPAGK
[ 604- 612] PGAVGPAGK
[ 895- 903] PGPPGPAGK
[1054-1062] PGPVGPAGK
______________________________
[ 614- 615]-( 3)-[ 619- 627]
[ 683- 684]-( 3)-[ 688- 696]
[ 619- 627] QGPPGPAGP
[ 688- 696] QGPPGPAGP
with superset:
[ 202- 206] QGPPG
[ 457- 461] QGPPG
[ 619- 623] QGPPG
[ 688- 692] QGPPG
[1117-1121] QGPPG
and:
[ 457- 463] QGPPGPA
[ 619- 625] QGPPGPA
[ 688- 694] QGPPGPA
______________________________
[ 619- 630] QGPPGPAGPAGE
[ 643- 654] QGLPGPAGPPGE
with superset:
[ 421- 426] QGPGGP
[ 457- 462] QGPPGP
[ 619- 624] QGPPGP
[ 643- 648] QGLPGP
[ 688- 693] QGPPGP
[1117-1122] QGPPGP
and:
[ 457- 464] QGPPGPAG
[ 619- 626] QGPPGPAG
[ 643- 650] QGLPGPAG
[ 688- 695] QGPPGPAG
______________________________
[ 623- 627]-( 4)-[ 632- 636]
[1118-1122]-( 4)-[1127-1131]
[ 623- 627] GPAGP
[1118-1122] GPPGP
[ 632- 636] GEQGP
[1127-1131] GEQGP
______________________________
[ 644- 648]-( -5)-[ 644- 653]
[ 737- 741]--------[ ]
[ 974- 978]--------[ ]
[1157-1161]-( -5)-[1157-1166]
[ 644- 648] GLPGP
[ 737- 741] GLPGP
[ 974- 978] GLPGP
[1157-1161] GLPGP
[ 644- 653] GLPGPAGPPG
[1157-1166] GLPGPIGPPG
______________________________
[ 650- 653]-( 4)-[ 658- 662]
[1118-1121]-( 4)-[1126-1130]
[ 650- 653] GPPG
[1118-1121] GPPG
[ 658- 662] PGEQG
[1126-1130] PGEQG
______________________________
[ 670- 674]--------[ ]
[ 886- 890]-( 44)-[ 935- 939]
[ 976- 980]-( 47)-[1028-1032]
[ 670- 674] PGPSG
[ 886- 890] PGPSG
[ 976- 980] PGPSG
[ 935- 939] GSPGA
[1028-1032] GSPGA
______________________________
[ 682- 686] PGERG
[ 730- 734] PGERG
______________________________
[ 703- 716] PGNDGAKGDAGAPG
[ 829- 842] PGDAGAKGDAGPPG
with superset:
[ 275- 279] GAKGD
[ 707- 711] GAKGD
[ 833- 837] GAKGD
[1031-1035] GAKGD
and:
[ 275- 281] GAKGDAG
[ 707- 713] GAKGDAG
[ 833- 839] GAKGDAG
______________________________
[ 710- 714] GDAGA
[ 830- 834] GDAGA
______________________________
[ 707- 708]-( 3)-[ 712- 722]
[ 938- 939]-( 3)-[ 943- 953]
[ 712- 722] AGAPGAPGSQG
[ 943- 953] AGAPGTPGPQG
with superset:
[ 406- 410] AGAPG
[ 712- 716] AGAPG
[ 775- 779] AGAPG
[ 943- 947] AGAPG
[1147-1151] AGAPG
and:
[ 406- 413] AGAPGFPG
[ 712- 719] AGAPGAPG
[ 943- 950] AGAPGTPG
______________________________
[ 739- 750] PGPKGDRGDAGP
[1030-1041] PGAKGDRGETGP
______________________________
[ 754- 758] DGSPG
[1027-1031] DGSPG
______________________________
[ 757- 774] PGKDGVRGLTGPIGPPGP
[1150-1167] PGKDGLNGLPGPIGPPGP
______________________________
[ 773- 779] GPAGAPG
[ 941- 947] GPAGAPG
______________________________
[ 775- 779] AGAPG
[ 943- 947] AGAPG
[1147-1151] AGAPG
______________________________
[ 767- 771]-( 4)-[ 776- 791]
[ 788- 792]-( 4)-[ 797- 812]
[ 767- 771] GPIGP
[ 788- 792] GPAGP
[ 776- 791] GAPGDKGESGPSGPAG
[ 797- 812] GAPGDRGEPGPPGPAG
______________________________
[ 770- 774]-( 4)-[ 779- 783]
[1085-1089]-( 4)-[1094-1098]
[ 770- 774] GPPGP
[1085-1089] GPAGP
[ 779- 783] GDKGE
[1094-1098] GDKGE
______________________________
[ 800- 819] GDRGEPGPPGPAGFAGPPGA
[1064-1083] GDRGETGPAGPAGPVGPVGA
______________________________
[ 836- 852] GDAGPPGPAGPAGPPGP
[1172-1188] GDAGPVGPPGPPGPPGP
______________________________
[ 850- 854] PGPIG
[1159-1163] PGPIG
______________________________
[ 859- 870] PGAKGARGSAGP
[1030-1041] PGAKGDRGETGP
______________________________
[ 884- 885]-( 3)-[ 889- 902]
[1130-1131]-( 3)-[1135-1148]
[ 889- 902] SGNAGPPGPPGPAG
[1135-1148] SGPAGPRGPPGSAG
with superset:
[ 214- 219] SGPMGP
[ 418- 423] SGPQGP
[ 787- 792] SGPAGP
[ 889- 894] SGNAGP
[1135-1140] SGPAGP
and:
[ 214- 224] SGPMGPRGPPG
[ 418- 428] SGPQGPGGPPG
[ 889- 899] SGNAGPPGPPG
[1135-1145] SGPAGPRGPPG
______________________________
[ 922- 930] VGPPGPPGP
[1177-1185] VGPPGPPGP
with superset:
[ 883- 888] VGPPGP
[ 922- 927] VGPPGP
[1177-1182] VGPPGP
______________________________
[1055-1059] GPVGP
[1076-1080] GPVGP
[1175-1179] GPVGP
______________________________
[1114-1125] SGLQGPPGPPGS
[1135-1146] SGPAGPRGPPGS
______________________________
Simple tandem repeat:
[ 523- 528] PGEAGR
[ 529- 534] PGEAGL
[ 535- 540] PGAKGL
Highly repetitive regions:
From 118 to 1192 with major motif GERGPPGPA.
From 124 to 1141 with major motif GPAGPP.
From 138 to 1192 with major motif PGPPGPP.
From 141 to 1192 with major motif PGPPGPA.
From 141 to 1191 with major motif PGPPGP.
From 142 to 1192 with major motif GPPGPP.
From 187 to 1072 with major motif RGEPGPP.
From 280 to 1180 with major motif AGPPGPP.
From 280 to 1182 with major motif AGPPGPRGP.
From 316 to 933 with major motif PGPAGP.
B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length: 9
Aligned matching blocks:
[ 747- 761] -ssp+ss-sops+-s
[1015-1028] -ssp_ss-sops+-s
______________________________
[ 776- 800] ssps-+s-ospospssposs+ssps
[1031-1055] ss+s-+s-ospssppsspsspssps
with superset:
[ 112- 122] sp+s-osp+sp
[ 275- 285] ss+s-sspssp
[ 509- 519] s-+sopspssp
[ 779- 789] s-+s-osposp
[ 800- 810] s-+s-psppsp
[1034-1044] s-+s-ospssp
[1064-1074] s-+s-ospssp
and:
[ 779- 791] s-+s-osposp_ss
[ 800- 812] s-+s-psppspss
[1034-1047] s-+s-ospssppss
[1064-1076] s-+s-ospsspss
and:
[ 779- 798] s-+s-ospos_pssposs+ss
[ 800- 819] s-+s-pspps_pssissppss
[1034-1053] s-+s-ospssppssps_spss
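
The 11-letter reduced alphabet given in the section B header can be sketched as a simple lookup table. This is a minimal illustration, not the analysis program's own code: the names `REDUCED_GROUPS`, `TO_REDUCED` and `reduce_sequence` are hypothetical, and mapping unknown letters to '?' is an assumption.

```python
# Reduced alphabet exactly as listed in the section B header:
# i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C
REDUCED_GROUPS = {
    "i": "LVIF", "+": "KR", "-": "ED", "s": "AG", "o": "ST", "n": "NQ",
    "a": "YW", "p": "P", "h": "H", "m": "M", "c": "C",
}

# Invert the grouping into a per-residue lookup (20 standard amino acids).
TO_REDUCED = {aa: sym for sym, aas in REDUCED_GROUPS.items() for aa in aas}


def reduce_sequence(seq):
    """Translate a protein string into the reduced alphabet.

    Letters outside the 20 standard residues become '?' (an assumption;
    the original program's handling of such letters is not shown here).
    """
    return "".join(TO_REDUCED.get(aa, "?") for aa in seq.upper())


# Residues 1034-1044 of the query, read off the Collagen domain 16 alignment.
print(reduce_sequence("GDRGETGPAGP"))
```

For this fragment the sketch prints `s-+s-ospssp`, the same string reported for [1034-1044] and [1064-1074] in the superset block above.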
--------------------------------------------------------------------------------
MULTIPLETS.
A. AMINO ACID ALPHABET.
1. Total number of amino acid multiplets: 83 (Expected range: 118--189) low
1 ........LL LLLAA..LL. ...EE..... .....PP... .......... ..........
61 .......... DD........ .......... CC........ ......TT.. ..........
121 .......PP. .......... ..PP.PP.PP .PP..GG... .......... ..GG......
181 .......... .PP....... ...PP..... .......... .PP.PP.... DD........
241 .....PP... .......... .......... .......... .......... ..........
301 .......... .......... .........A A.PP...... .PP....... ..........
361 .......... ....PP.... AA........ .......... .......... ..........
421 ...GGPP... .......... .......... ........PP ....EE.... ..........
481 ..PP...GG. .......... .......... .......... .......... ..........
541 .......... ....PP.... ......PP.P P......... .......AA. ..........
601 ..PP...... .......... PP........ .......... .......... PP........
661 .......... .......... .........P P......... .......... ..........
721 .......... ....AA.... .......... .......... .......... PP........
781 .......... .......... ......PP.. .....PP... .......... .........P
841 P.......PP .......... .........P P.......AA ....PP.... ...PP.PP..
901 ....GG.... .......... ...PP.PP.. .......... .......... .........V
961 V......... .......... .......... .....PP... .PP....PP. ........AA
1021 .......... .......... ...PP..... .......... .......... ..........
1081 .......... .......... .......... ........PP .PP....... ..........
1141 ..PP...... .......... ...PP..... ........PP .PP.PP.PP. PP........
1201 ...PP..... .GG.YY..DD ..VV...... ..TT.....Q Q......... ..........
1261 .......... .......... .......... .......... .......... ..........
1321 .......... ........GG .......... .......... .......... ........QQ
1381 ....KK.LLL .......... .......... .......... .......... TT.SS...II
1441 .......... .......... ....
2. Histogram of spacings between consecutive amino acid multiplets:
(1-5) 32 (6-10) 11 (11-20) 21 (>=21) 20
3. Clusters of amino acid multiplets (cmin = 10/30 or 13/45 or 16/60): none
4. Significant specific amino acid altplet counts:
Letters Observed (Critical number)
AG 113 (93)
at 83 (l= 2) 126 (l= 2) 194 (l= 2) 212 (l= 2) 235 (l= 2)
251 (l= 2) 259 (l= 2) 275 (l= 2) 280 (l= 2) 283 (l= 2)
296 (l= 2) 314 (l= 2) 319 (l= 3) 326 (l= 2) 329 (l= 2)
331 (l= 2) 340 (l= 2) 347 (l= 2) 350 (l= 2) 355 (l= 2)
379 (l= 3) 382 (l= 2) 385 (l= 2) 389 (l= 2) 395 (l= 2)
398 (l= 2) 401 (l= 2) 406 (l= 3) 413 (l= 2) 437 (l= 2)
446 (l= 2) 463 (l= 2) 470 (l= 2) 497 (l= 2) 502 (l= 2)
508 (l= 2) 517 (l= 2) 526 (l= 2) 532 (l= 2) 536 (l= 2)
559 (l= 2) 572 (l= 2) 577 (l= 2) 587 (l= 2) 589 (l= 2)
595 (l= 2) 605 (l= 2) 610 (l= 2) 616 (l= 3) 625 (l= 2)
628 (l= 2) 637 (l= 2) 649 (l= 2) 655 (l= 2) 668 (l= 2)
674 (l= 2) 694 (l= 2) 698 (l= 2) 701 (l= 2) 707 (l= 2)
712 (l= 3) 716 (l= 2) 722 (l= 2) 734 (l= 2) 736 (l= 2)
748 (l= 2) 752 (l= 2) 775 (l= 3) 790 (l= 2) 794 (l= 2)
797 (l= 2) 811 (l= 2) 814 (l= 2) 818 (l= 2) 824 (l= 2)
832 (l= 3) 838 (l= 2) 844 (l= 2) 847 (l= 2) 857 (l= 2)
860 (l= 2) 863 (l= 2) 868 (l= 2) 872 (l= 2) 878 (l= 2)
880 (l= 2) 892 (l= 2) 901 (l= 2) 916 (l= 2) 931 (l= 2)
938 (l= 2) 943 (l= 3) 955 (l= 2) 989 (l= 2) 1006 (l= 2)
1016 (l= 2) 1031 (l= 2) 1042 (l= 2) 1046 (l= 2) 1049 (l= 2)
1052 (l= 2) 1060 (l= 2) 1072 (l= 2) 1075 (l= 2) 1082 (l= 2)
1087 (l= 2) 1133 (l= 2) 1138 (l= 2) 1147 (l= 3) 1174 (l= 2)
1194 (l= 2) 1420 (l= 2) 1448 (l= 2)
GP 203 (156)
at 82 (l= 2) 112 (l= 2) 118 (l= 2) 121 (l= 2) 124 (l= 2)
127 (l= 2) 129 (l= 2) 135 (l= 2) 138 (l= 2) 141 (l= 3)
144 (l= 3) 147 (l= 3) 150 (l= 3) 153 (l= 2) 178 (l= 3)
182 (l= 2) 185 (l= 2) 190 (l= 3) 193 (l= 2) 196 (l= 3)
203 (l= 2) 205 (l= 2) 208 (l= 2) 211 (l= 2) 215 (l= 2)
218 (l= 2) 221 (l= 2) 223 (l= 3) 226 (l= 2) 238 (l= 2)
241 (l= 2) 245 (l= 2) 247 (l= 3) 256 (l= 2) 262 (l= 2)
281 (l= 2) 284 (l= 2) 289 (l= 2) 292 (l= 2) 298 (l= 2)
302 (l= 2) 307 (l= 2) 313 (l= 2) 316 (l= 3) 332 (l= 2)
334 (l= 3) 338 (l= 2) 341 (l= 2) 343 (l= 2) 346 (l= 2)
356 (l= 2) 359 (l= 2) 365 (l= 2) 373 (l= 3) 376 (l= 3)
383 (l= 2) 388 (l= 2) 394 (l= 2) 403 (l= 2) 409 (l= 2)
412 (l= 2) 416 (l= 2) 419 (l= 2) 422 (l= 3) 425 (l= 2)
427 (l= 3) 436 (l= 2) 439 (l= 2) 451 (l= 3) 458 (l= 2)
460 (l= 3) 475 (l= 3) 481 (l= 3) 484 (l= 2) 489 (l= 3)
496 (l= 2) 503 (l= 2) 506 (l= 2) 514 (l= 3) 518 (l= 2)
523 (l= 2) 529 (l= 2) 535 (l= 2) 544 (l= 2) 547 (l= 3)
554 (l= 2) 556 (l= 3) 565 (l= 3) 568 (l= 3) 571 (l= 2)
583 (l= 3) 592 (l= 2) 601 (l= 3) 604 (l= 2) 608 (l= 2)
620 (l= 2) 622 (l= 3) 626 (l= 2) 635 (l= 2) 640 (l= 2)
646 (l= 3) 650 (l= 2) 652 (l= 2) 658 (l= 2) 664 (l= 2)
670 (l= 3) 682 (l= 2) 689 (l= 2) 691 (l= 3) 695 (l= 2)
703 (l= 2) 715 (l= 2) 718 (l= 2) 724 (l= 2) 730 (l= 2)
739 (l= 3) 749 (l= 2) 757 (l= 2) 767 (l= 2) 770 (l= 2)
772 (l= 3) 778 (l= 2) 785 (l= 2) 788 (l= 2) 791 (l= 2)
799 (l= 2) 805 (l= 3) 808 (l= 3) 815 (l= 2) 817 (l= 2)
823 (l= 2) 829 (l= 2) 839 (l= 2) 841 (l= 3) 845 (l= 2)
848 (l= 2) 850 (l= 3) 859 (l= 2) 869 (l= 2) 871 (l= 2)
877 (l= 2) 884 (l= 2) 886 (l= 3) 893 (l= 2) 895 (l= 3)
898 (l= 3) 908 (l= 2) 914 (l= 2) 919 (l= 2) 923 (l= 2)
925 (l= 3) 928 (l= 3) 937 (l= 2) 941 (l= 2) 946 (l= 2)
949 (l= 3) 964 (l= 2) 973 (l= 2) 976 (l= 3) 982 (l= 2)
986 (l= 2) 995 (l= 2) 997 (l= 3) 1001 (l= 2) 1003 (l= 2)
1007 (l= 2) 1009 (l= 2) 1024 (l= 2) 1030 (l= 2) 1040 (l= 2)
1043 (l= 2) 1045 (l= 2) 1048 (l= 2) 1051 (l= 2) 1054 (l= 3)
1058 (l= 2) 1070 (l= 2) 1073 (l= 2) 1076 (l= 2) 1079 (l= 2)
1085 (l= 2) 1088 (l= 2) 1091 (l= 2) 1118 (l= 2) 1120 (l= 3)
1123 (l= 2) 1126 (l= 2) 1130 (l= 2) 1136 (l= 2) 1139 (l= 2)
1142 (l= 2) 1144 (l= 2) 1150 (l= 2) 1159 (l= 3) 1163 (l= 2)
1165 (l= 3) 1175 (l= 2) 1178 (l= 2) 1180 (l= 3) 1183 (l= 3)
1186 (l= 3) 1189 (l= 3) 1459 (l= 2)
5. Long amino acid multiplets (>= 5; Letter/Length/Position):
L/5/9
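
The "Letter/Length/Position" entry above can be reproduced with a short run-length scan. This sketch assumes that a multiplet is simply a run of identical adjacent residues and that positions are 1-based; the function name `long_multiplets` is hypothetical and this is not the original program's code.

```python
from itertools import groupby


def long_multiplets(seq, min_len=5):
    """Return (letter, length, 1-based start) for runs of identical residues."""
    hits = []
    pos = 1  # 1-based start of the current run
    for letter, run in groupby(seq):
        n = len(list(run))
        if n >= min_len:
            hits.append((letter, n, pos))
        pos += n
    return hits


# First 50 residues of the query, as shown at the top of the report.
N_TERMINUS = "MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDR"
print(long_multiplets(N_TERMINUS))
```

On these 50 residues the scan reports a single hit, `('L', 5, 9)`, which corresponds to L/5/9. With `min_len=2` it recovers the LL LLL, AA, LL, EE and PP marks in the first row of the multiplet map.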
B. CHARGE ALPHABET.
1. Total number of charge multiplets: 12 (Expected range: 8-- 37)
4 +plets (f+: 8.8%), 8 -plets (f-: 9.6%)
Total number of charge altplets: 32 (Critical number: 42)
2. Histogram of spacings between consecutive charge multiplets:
(1-5) 2 (6-10) 1 (11-20) 0 (>=21) 10
--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.
A. AMINO ACID ALPHABET (core: 4; !-core: 5)
Location Period Element Copies Core Errors
9- 13 1 L 5 5 ! 0
109- 159 3 G.. 17 17 ! 0
173-1192 3 G.. 338 280 ! 2
B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6)
and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 8)
Location Period Element Copies Core Errors
228- 245 3 *00 6 6 /0/2/0/
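
The period-3 "G.." element reported in part A can be checked by hand with a few lines of code. This is a rough sketch that only counts error-free leading copies; it does not model the "Core" or "Errors" columns, and the function name `leading_period3_copies` is hypothetical.

```python
def leading_period3_copies(segment, residue="G"):
    """Count consecutive triplets, from the start, whose first letter is `residue`."""
    copies = 0
    for i in range(0, len(segment) - 2, 3):
        if segment[i] != residue:
            break  # stop at the first triplet that breaks the pattern
        copies += 1
    return copies


# Residues 109-159 of the query (51 residues), read off the sequence above.
SEGMENT_109_159 = "GVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNF"
print(leading_period3_copies(SEGMENT_109_159))
```

For the 109-159 segment this yields 17, in agreement with the "Copies" value in the table.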
--------------------------------------------------------------------------------
SPACING ANALYSIS.
Location (Quartile) Spacing Rank P-value Interpretation
47- 165 (1.) Y( 118)Y 2 of 14 0.9994 small 2. maximal spacing
53-1269 (2.) W(1216)W 1 of 7 0.0002 large 1. maximal spacing
95-1259 (2.) C(1164)C 1 of 19 0.0000 large 1. maximal spacing
167-1215 (2.) Y(1048)Y 1 of 14 0.0000 large 1. maximal spacing
170- 228 (1.) K( 58)K 2 of 59 0.9982 small 2. maximal spacing
267-1110 (2.) H( 843)H 1 of 10 0.0043 large maximal spacing
286- 352 (1.) K( 66)K 1 of 59 0.9976 small 1. maximal spacing
310- 312 (1.) R( 2)R 72 of 72 0.0006 large minimal spacing
1168-1170 (4.) R( 2)R 70 of 72 0.0006 matching minimum
1205-1248 (4.) P( 43)P 2 of 279 0.0003 large 2. maximal spacing
1213-1250 (4.) G( 37)G 2 of 391 0.0000 large 2. maximal spacing
1225-1227 (4.) R( 2)R 71 of 72 0.0006 matching minimum
1299-1370 (4.) C( 71)C 2 of 19 1.0000 small 2. maximal spacing
1325-1422 (4.) W( 97)W 2 of 7 0.9996 small 2. maximal spacing
1342-1382 (4.) G( 40)G 1 of 391 0.0013 large 1. maximal spacing
1345-1438 (4.) P( 93)P 1 of 279 0.0000 large 1. maximal spacing
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/Pfam
Sequence file: tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
Collagen Collagen triple helix repeat (20 copies) 970.8 3.3e-288 18
COLFI Fibrillar collagen C-terminal domain 565.2 2e-220 1
vwc von Willebrand factor type C domain 89.7 5.8e-23 1
fibrinogen_C Fibrinogen beta and gamma chains, C-term -0.3 50 1
DUF41 Domain of unknown function DUF41 -71.4 30 1
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
vwc 1/1 40 95 .. 1 84 [] 89.7 5.8e-23
Collagen 1/18 107 165 .. 1 60 [] 26.8 0.00013
Collagen 2/18 177 235 .. 1 60 [] 51.4 2e-11
Collagen 3/18 236 295 .. 1 60 [] 77.7 2.4e-19
Collagen 4/18 296 355 .. 1 60 [] 66.9 4.3e-16
Collagen 5/18 356 415 .. 1 60 [] 63.6 4.2e-15
Collagen 6/18 416 475 .. 1 60 [] 63.1 5.9e-15
Collagen 7/18 476 535 .. 1 60 [] 65.9 8.5e-16
Collagen 8/18 536 595 .. 1 60 [] 66.6 5.3e-16
Collagen 9/18 596 655 .. 1 60 [] 64.1 3e-15
Collagen 10/18 656 715 .. 1 60 [] 62.6 8.4e-15
Collagen 11/18 716 775 .. 1 60 [] 72.2 1.1e-17
Collagen 12/18 779 838 .. 1 60 [] 70.3 3.9e-17
Collagen 13/18 839 898 .. 1 60 [] 62.4 9.4e-15
Collagen 14/18 899 958 .. 1 60 [] 61.2 2.3e-14
Collagen 15/18 959 1018 .. 1 60 [] 64.6 2.1e-15
Collagen 16/18 1020 1078 .. 1 60 [] 55.4 1.2e-12
Collagen 17/18 1079 1138 .. 1 60 [] 75.9 8.5e-19
Collagen 18/18 1139 1198 .. 1 60 [] 35.6 1.1e-06
fibrinogen_C 1/1 1271 1295 .. 18 43 .. -0.3 50
DUF41 1/1 4 1308 .. 1 247 [] -71.4 30
COLFI 1/1 1245 1463 .. 1 226 [] 565.2 2e-220
Alignments of top-scoring domains:
vwc: domain 1 of 1, from 40 to 95: score 89.7, E = 5.8e-23
*->CvqnGvvYengetWkpdsqPnGvdkCtyiCtCddiedavrlggkvlC
CvqnG +Y+++++Wkp++ C+ iC+Cd+ gkvlC
tem38_gi|1 40 CVQNGLRYHDRDVWKPEP-------CR-ICVCDN--------GKVLC 70
dkitCppelLpsldCpnprrvdalvippGECCpewvC<-*
d+++C+++ +Cp + + p+GECCp vC
tem38_gi|1 71 DDVICDET----KNCPGA------EVPEGECCP--VC 95
Collagen: domain 1 of 18, from 107 to 165: score 26.8, E = 0.00013
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G Gp+G++Gp+Gp+Gp+Gp+G G pG pG pGpPGppGppGp
tem38_gi|1 107 -TTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGP 152
pGppGapGapGpp<-*
pG G+ +
tem38_gi|1 153 PGLGGNFAPQLSY 165
Collagen: domain 2 of 18, from 177 to 235: score 51.4, E = 2e-11
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
pGp+Gp Gp+G pGppG+pGp+G++GppG pGepG+ Gp Gp Gp
tem38_gi|1 177 -VPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGP 222
pGppGapGapGpp<-*
pGppG+ G+ G++
tem38_gi|1 223 PGPPGKNGDDGEA 235
Collagen: domain 3 of 18, from 236 to 295: score 77.7, E = 2.4e-19
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG+pG++GppGp G++G pG aG pG++G++G++G +G++G +Gp
tem38_gi|1 236 GKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGP 282
pGppGapGapGpp<-*
+Gp+G+pG+pG++
tem38_gi|1 283 AGPKGEPGSPGEN 295
Collagen: domain 4 of 18, from 296 to 355: score 66.9, E = 4.3e-16
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG++Gp+G+pG++G+pG+pGpaGa+G+ G+ G++GpPGp Gp+Gp
tem38_gi|1 296 GAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGP 342
pGppGapGapGpp<-*
pG pGa Ga+G++
tem38_gi|1 343 PGFPGAVGAKGEA 355
Collagen: domain 5 of 18, from 356 to 415: score 63.6, E = 4.2e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp Gp+G+ Gp+G +G+pGppGpaGa+Gp+G+pG++G+PG++G++G+
tem38_gi|1 356 GPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGA 402
pGppGapGapGpp<-*
pG +GapG pG++
tem38_gi|1 403 PGIAGAPGFPGAR 415
Collagen: domain 6 of 18, from 416 to 475: score 63.1, E = 5.9e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp Gp+Gp GppGp+G++G+pG++G++G+ G++GepGp G +GppGp
tem38_gi|1 416 GPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGP 462
pGppGapGapGpp<-*
+G++G+ Ga G+p
tem38_gi|1 463 AGEEGKRGARGEP 475
Collagen: domain 7 of 18, from 476 to 535: score 65.9, E = 8.5e-16
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp G+pGppG++G pG +G+pG++G +Gp+Gp+Ge+G+PGp+Gp G+
tem38_gi|1 476 GPTGLPGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGS 522
pGppGapGapGpp<-*
pG++G+pG++G p
tem38_gi|1 523 PGEAGRPGEAGLP 535
Collagen: domain 8 of 18, from 536 to 595: score 66.6, E = 5.3e-16
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G++G+ G+pG pGp+G+ GppGpaG G pGppG+pG+ G++G++G+
tem38_gi|1 536 GAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGF 582
pGppGapGapGpp<-*
pGp+Ga+G+pG++
tem38_gi|1 583 PGPKGAAGEPGKA 595
Collagen: domain 9 of 18, from 596 to 655: score 64.1, E = 3e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G++G pGppG+ Gp+G+ G++G++G+pGp+Gp+Ge+G++Gp+G pG+
tem38_gi|1 596 GERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGF 642
pGppGapGapGpp<-*
+G pG++G+pG++
tem38_gi|1 643 QGLPGPAGPPGEA 655
Collagen: domain 10 of 18, from 656 to 715: score 62.6, E = 8.4e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG++G pG+ G+pGp+G+ G++G+pG++G +G+pGp Gp+G++G+
tem38_gi|1 656 GKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGA 702
pGppGapGapGpp<-*
pG++Ga+G++G+p
tem38_gi|1 703 PGNDGAKGDAGAP 715
Collagen: domain 11 of 18, from 716 to 775: score 72.2, E = 1.1e-17
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG++G+pG++G+pG++G++G +G++G++G++G++G++G+pG++G
tem38_gi|1 716 GAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGV 762
pGppGapGapGpp<-*
+G +G++G+pGp+
tem38_gi|1 763 RGLTGPIGPPGPA 775
Collagen: domain 12 of 18, from 779 to 838: score 70.3, E = 3.9e-17
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G +G+ Gp Gp+Gp+G++G+pG++G+pGppGp+G++GpPG++G+pG+
tem38_gi|1 779 GDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGA 825
pGppGapGapGpp<-*
+G+pG +Ga+G +
tem38_gi|1 826 KGEPGDAGAKGDA 838
Collagen: domain 13 of 18, from 839 to 898: score 62.4, E = 9.4e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
GppGp+Gp+GppGp G+ G+pG++Ga+G++GppG+ G+PG++G+ Gp
tem38_gi|1 839 GPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGP 885
pGppGapGapGpp<-*
pGp G++G+pGpp
tem38_gi|1 886 PGPSGNAGPPGPP 898
Collagen: domain 14 of 18, from 899 to 958: score 61.2, E = 2.3e-14
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp+G+ G++Gp+G++Gp+G pG+ G+pGppGp+Ge+G+PG++Gp+G+
tem38_gi|1 899 GPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGA 945
pGppGapGapGpp<-*
pG pG+ G +G++
tem38_gi|1 946 PGTPGPQGIAGQR 958
Collagen: domain 15 of 18, from 959 to 1018: score 64.6, E = 2.1e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G G+pG +G++G pG pGp G++G++Gp G++Ge+GpPGp GppG
tem38_gi|1 959 GVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGL 1005
pGppGapGapGpp<-*
+GppG++G +G+p
tem38_gi|1 1006 AGPPGESGREGAP 1018
Collagen: domain 16 of 18, from 1020 to 1078: score 55.4, E = 1.2e-12
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
+ G+pG+ G pG++G++G++GpaG+pG pG+pG+pGp Gp+G+ G
tem38_gi|1 1020 -AEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGD 1065
pGppGapGapGpp<-*
+G++G++G++Gp+
tem38_gi|1 1066 RGETGPAGPAGPV 1078
Collagen: domain 17 of 18, from 1079 to 1138: score 75.9, E = 8.5e-19
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp G +Gp+Gp+Gp+G++G++G++G +G +G++G++G +GppGppG+
tem38_gi|1 1079 GPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGS 1125
pGppGapGapGpp<-*
pG++G++Ga Gp+
tem38_gi|1 1126 PGEQGPSGASGPA 1138
Collagen: domain 18 of 18, from 1139 to 1198: score 35.6, E = 1.1e-06
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp+GppG++G+pG +G G pGp G+pGp+G G++Gp GppGppGp
tem38_gi|1 1139 GPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGP 1185
pGppGapGapGpp<-*
pGppG+p a
tem38_gi|1 1186 PGPPGPPSAGFDF 1198
fibrinogen_C: domain 1 of 1, from 1271 to 1295: score -0.3, E = 50
*->SPPGlYtIqPd.gakeqpllVYCDmet<-*
S G Y I P++g + +++V+C met
tem38_gi|1 1271 S--GEYWIDPNqGCNLDAIKVFCNMET 1295
DUF41: domain 1 of 1, from 4 to 1308: score -71.4, E = 30
*->lteeQLlstFsNvkhliGslevqnTnfkslsFLanLesIecg.....
+++ l+ + l T + + + + ++e+++++ +
tem38_gi|1 4 FVD---LRLL---------LLLAATALLTHG--QEEGQVEGQdedip 36
..................................................
+ + +++ + ++++ ++++ + ++++ ++ +++++ ++ +
tem38_gi|1 37 pitcvqnglryhdrdvwkpepcricvcdngkvlcddvicdetkncpgaev 86
..................................................
++++ + ++++++++++++++ +++++++++++++++ +++++++ ++
tem38_gi|1 87 pegeccpvcpdgsesptdqettgvegpkgdtgprgprgpagppgrdgipg 136
..................................................
+++ ++++++++++++++ +++ ++ + + +++++++ + +++ +++++
tem38_gi|1 137 qpglpgppgppgppgppglggnfapqlsygydekstggisvpgpmgpsgp 186
..................................................
++ +++++ +++++ +++++++++++ +++ +++++++++++++++++ +
tem38_gi|1 187 rglpgppgapgpqgfqgppgepgepgasgpmgprgppgppgkngddgeag 236
..................................................
+++++++++++++++ ++ +++ + ++ +++++ ++ ++ +++ ++ +++
tem38_gi|1 237 kpgrpgergppgpqgarglpgtaglpgmkghrgfsgldgakgdagpagpk 286
..................................................
++++++++++ +++ ++++ ++++++++ +++ + +++++ ++ +++++
tem38_gi|1 287 gepgspgengapgqmgprglpgergrpgapgpagargndgatgaagppgp 336
..................................................
+++ ++++ ++ + +++ +++++++++++++ +++++++++ + ++ +
tem38_gi|1 337 tgpagppgfpgavgakgeagpqgprgsegpqgvrgepgppgpagaagpag 386
..................................................
+++ +++++ ++ ++ ++ + ++ ++ ++++++++++++++++++++++
tem38_gi|1 387 npgadgqpgakgangapgiagapgfpgargpsgpqgpggppgpkgnsgep 436
..................................................
+ ++++++++ ++++++ + ++++++ +++++++ ++++++++ ++++++
tem38_gi|1 437 gapgskgdtgakgepgpvgvqgppgpageegkrgargepgptglpgppge 486
..................................................
++++++++ ++ ++ +++++ ++++++++ ++++++++ +++++ + ++
tem38_gi|1 487 rggpgsrgfpgadgvagpkgpagergspgpagpkgspgeagrpgeaglpg 536
..................................................
++ ++++++++++++++++++ +++++++++++++ +++ + + ++++
tem38_gi|1 537 akgltgspgspgpdgktgppgpagqdgrpgppgppgargqagvmgfpgpk 586
..................................................
+ +++++ ++++ +++++ ++ +++++ + ++++++ ++ ++++++++
tem38_gi|1 587 gaagepgkagergvpgppgavgpagkdgeagaqgppgpagpagergeqgp 636
..................................................
++++ ++ +++ +++++ +++++++ +++ + +++++ +++++ +++++
tem38_gi|1 637 agspgfqglpgpagppgeagkpgeqgvpgdlgapgpsgargergfpgerg 686
..................................................
++++++ ++++ ++ +++++ +++ + ++ +++++ ++ ++ +++++
tem38_gi|1 687 vqgppgpagprgangapgndgakgdagapgapgsqgapglqgmpgergaa 736
..................................................
+ +++++++++ ++++ ++++++++ ++ +++ +++++ + +++++++++
tem38_gi|1 737 glpgpkgdrgdagpkgadgspgkdgvrgltgpigppgpagapgdkgesgp 786
..................................................
+++ ++++ ++ ++++++++++++ + ++++ +++++ ++++++ + ++
tem38_gi|1 787 sgpagptgargapgdrgepgppgpagfagppgadgqpgakgepgdagakg 836
..................................................
+ +++++ ++ +++++ ++ + ++ ++ +++ ++++ ++ ++ ++ +++
tem38_gi|1 837 dagppgpagpagppgpignvgapgakgargsagppgatgfpgaagrvgpp 886
..................................................
+++++ ++++++++ ++++++++++++++ +++++ ++++++++ +++++
tem38_gi|1 887 gpsgnagppgppgpagkeggkgprgetgpagrpgevgppgppgpagekgs 936
..................................................
++ +++ + ++++++++ ++++ + ++++++++ ++ +++++++++++
tem38_gi|1 937 pgadgpagapgtpgpqgiagqrgvvglpgqrgergfpglpgpsgepgkqg 986
..................................................
+++ +++++++++ ++++ ++++++++++ + +++++++++++ ++++
tem38_gi|1 987 psgasgergppgpmgppglagppgesgregapaaegspgrdgspgakgdr 1036
..................................................
+++++ ++++ ++ ++ +++ ++ +++++++++++ ++ ++ ++ + +++
tem38_gi|1 1037 getgpagppgapgapgapgpvgpagksgdrgetgpagpagpvgpvgargp 1086
..................................................
+++++++++++++++++++ +++++ ++ +++++++++++++++++ ++
tem38_gi|1 1087 agpqgprgdkgetgeqgdrgikghrgfsglqgppgppgspgeqgpsgasg 1136
..................................................
+ ++++++++ + +++++ ++ +++ +++++++++++ ++ +++++++++
tem38_gi|1 1137 pagprgppgsagapgkdglnglpgpigppgprgrtgdagpvgppgppgpp 1186
..................................irk.rnkdrvrkildn
+++++++ + + + +++++++ +++++ + ++ r +d + +
tem38_gi|1 1187 gppgppsagfdfsflpqppqekahdggryyraddANVvRDRDLEVDTT-- 1234
ihdnpfswidnqnmlelgllnlTnmtrlgLpilsnldlnkLnlpnlknis
lk++s
tem38_gi|1 1235 ---------------------------------------------LKSLS 1239
npnstgekiivnfenlhpdFClTteEllnfflnsnvsienleakyCepks
++ +en +++ E+ +++a C +
tem38_gi|1 1240 QQ----------IEN------IRSPEGS----------RKNPARTCRDL- 1262
rifflikktdngivyklCnfkslsssvnLdngCtiIfGdLvIgpgdEeyV
k+C++ s G ++I+p+
tem38_gi|1 1263 ---------------KMCHSDWKS-------------GEYWIDPNQG--- 1281
skLknveviFGsLiIqNTnLtnidFLenLkyIasLedsvs<-*
+L+ +v + n ++ ++ + sv+
tem38_gi|1 1282 CNLDAIKV-------F-CNMETGE-----TCVYPTQPSVA 1308
COLFI: domain 1 of 1, from 1245 to 1463: score 565.2, E = 2e-220
*->lksPeGksrknPARtCkDLfLchpefksGeYWiDPNqGCikDAikVf
++sPeG srknPARtC+DL++ch+++ksGeYWiDPNqGC++DAikVf
tem38_gi|1 1245 IRSPEG-SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVF 1290
CnkrfetGvgeTCisptpksvpkRiksWykgks.kdkKhvWFgetmegGf
Cn +etG eTC++pt+ sv++ k+Wy +k++kdk+hvWFge+m++Gf
tem38_gi|1 1291 CN--METG--ETCVYPTQPSVAQ--KNWYISKNpKDKRHVWFGESMTDGF 1334
kfsYiddelnpeisnvQlTFLRLLSteAsQNiTYhCKNSvAYmDeatGNl
+f+Y++++++p+++++QlTFLRL+SteAsQNiTYhCKNSvAYmD++tGNl
tem38_gi|1 1335 QFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNL 1384
kkAlilmgSnDvElsadgnskFtYtvlGeDGCssrtgewgKTViEyeTkK
kkAl+l+gSn++E++a+gns+FtY+v+ +DGC+s+tg+wgKTViEy+T+K
tem38_gi|1 1385 KKALLLKGSNEIEIRAEGNSRFTYSVT-VDGCTSHTGAWGKTVIEYKTTK 1433
ttRLPIvDiApsDiGgedQeFGveiGPVCF<-*
+RLPI+D+Ap+D+G +dQeFG+++GPVCF
tem38_gi|1 1434 SSRLPIIDVAPLDVGAPDQEFGFDVGPVCF 1463
//
Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/PfamFrag
Sequence file: tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
Collagen Collagen triple helix repeat (20 copies) 946.7 5.9e-281 18
COLFI Fibrillar collagen C-terminal domain 565.2 2e-220 1
fibrinogen_C Fibrinogen beta and gamma chains, C-term -0.3 50 1
CBIA Cobyrinic acid a,c-diamide synthase -0.7 93 1
LBP_BPI_CETP LBP / BPI / CETP family -0.7 57 1
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
LBP_BPI_CETP 1/1 7 29 .. 1 23 [. -0.7 57
Collagen 1/18 109 158 .. 1 50 [. 27.3 5e-06
CBIA 1/1 174 189 .. 1 16 [. -0.7 93
Collagen 2/18 177 235 .. 1 60 [] 50.4 2.3e-12
Collagen 3/18 236 295 .. 1 60 [] 75.7 2.5e-19
Collagen 4/18 296 355 .. 1 60 [] 64.9 2.4e-16
Collagen 5/18 356 415 .. 1 60 [] 61.6 1.9e-15
Collagen 6/18 416 475 .. 1 60 [] 61.1 2.6e-15
Collagen 7/18 476 535 .. 1 60 [] 63.9 4.4e-16
Collagen 8/18 536 595 .. 1 60 [] 64.6 2.9e-16
Collagen 9/18 596 655 .. 1 60 [] 62.1 1.4e-15
Collagen 10/18 656 715 .. 1 60 [] 60.6 3.6e-15
Collagen 11/18 716 775 .. 1 60 [] 70.2 8.4e-18
Collagen 12/18 779 838 .. 1 60 [] 68.4 2.7e-17
Collagen 13/18 839 898 .. 1 60 [] 60.5 4e-15
Collagen 14/18 899 958 .. 1 60 [] 59.2 8.8e-15
Collagen 15/18 959 1018 .. 1 60 [] 62.7 9.9e-16
Collagen 16/18 1020 1078 .. 1 60 [] 54.4 1.8e-13
Collagen 17/18 1079 1138 .. 1 60 [] 73.9 8.1e-19
Collagen 18/18 1139 1192 .. 1 54 [. 40.6 1.2e-09
fibrinogen_C 1/1 1271 1295 .. 18 43 .. -0.3 50
COLFI 1/1 1245 1463 .. 1 226 [] 565.2 2e-220
Alignments of top-scoring domains:
LBP_BPI_CETP: domain 1 of 1, from 7 to 29: score -0.7, E = 57
*->alllllvlislavalrtnPgivv<-*
++llll+++ ++++++ +g v+
tem38_gi|1 7 LRLLLLLAATALLTHGQEEGQVE 29
Collagen: domain 1 of 18, from 109 to 158: score 27.3, E = 5e-06
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G Gp G Gp+Gp+Gp+GppG +G pG pG pG+pGpPGppGppG
tem38_gi|1 109 GVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGL 155
pGp<-*
G+
tem38_gi|1 156 GGN 158
CBIA: domain 1 of 1, from 174 to 189: score -0.7, E = 93
*->almiaGtsSgaGKttl<-*
++ ++G++ +G+++l
tem38_gi|1 174 GISVPGPMGPSGPRGL 189
Collagen: domain 2 of 18, from 177 to 235: score 50.4, E = 2.3e-12
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
pGp+Gp Gp+G pGppG+pGp+G++GppG pGepG+ Gp Gp Gp
tem38_gi|1 177 -VPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGP 222
pGppGapGapGpp<-*
pGppG+ G+ G++
tem38_gi|1 223 PGPPGKNGDDGEA 235
Collagen: domain 3 of 18, from 236 to 295: score 75.7, E = 2.5e-19
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG+pG++GppGp G++G pG aG pG++G++G++G +G++G +Gp
tem38_gi|1 236 GKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGP 282
pGppGapGapGpp<-*
+Gp+G+pG+pG++
tem38_gi|1 283 AGPKGEPGSPGEN 295
Collagen: domain 4 of 18, from 296 to 355: score 64.9, E = 2.4e-16
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG++Gp+G+pG++G+pG+pGpaGa+G+ G+ G++GpPGp Gp+Gp
tem38_gi|1 296 GAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGP 342
pGppGapGapGpp<-*
pG pGa Ga+G++
tem38_gi|1 343 PGFPGAVGAKGEA 355
Collagen: domain 5 of 18, from 356 to 415: score 61.6, E = 1.9e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp Gp+G+ Gp+G +G+pGppGpaGa+Gp+G+pG++G+PG++G++G+
tem38_gi|1 356 GPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGA 402
pGppGapGapGpp<-*
pG +GapG pG++
tem38_gi|1 403 PGIAGAPGFPGAR 415
Collagen: domain 6 of 18, from 416 to 475: score 61.1, E = 2.6e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp Gp+Gp GppGp+G++G+pG++G++G+ G++GepGp G +GppGp
tem38_gi|1 416 GPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGP 462
pGppGapGapGpp<-*
+G++G+ Ga G+p
tem38_gi|1 463 AGEEGKRGARGEP 475
Collagen: domain 7 of 18, from 476 to 535: score 63.9, E = 4.4e-16
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp G+pGppG++G pG +G+pG++G +Gp+Gp+Ge+G+PGp+Gp G+
tem38_gi|1 476 GPTGLPGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGS 522
pGppGapGapGpp<-*
pG++G+pG++G p
tem38_gi|1 523 PGEAGRPGEAGLP 535
Collagen: domain 8 of 18, from 536 to 595: score 64.6, E = 2.9e-16
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G++G+ G+pG pGp+G+ GppGpaG G pGppG+pG+ G++G++G+
tem38_gi|1 536 GAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGF 582
pGppGapGapGpp<-*
pGp+Ga+G+pG++
tem38_gi|1 583 PGPKGAAGEPGKA 595
Collagen: domain 9 of 18, from 596 to 655: score 62.1, E = 1.4e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G++G pGppG+ Gp+G+ G++G++G+pGp+Gp+Ge+G++Gp+G pG+
tem38_gi|1 596 GERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGF 642
pGppGapGapGpp<-*
+G pG++G+pG++
tem38_gi|1 643 QGLPGPAGPPGEA 655
Collagen: domain 10 of 18, from 656 to 715: score 60.6, E = 3.6e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG++G pG+ G+pGp+G+ G++G+pG++G +G+pGp Gp+G++G+
tem38_gi|1 656 GKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGA 702
pGppGapGapGpp<-*
pG++Ga+G++G+p
tem38_gi|1 703 PGNDGAKGDAGAP 715
Collagen: domain 11 of 18, from 716 to 775: score 70.2, E = 8.4e-18
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G+pG++G+pG++G+pG++G++G +G++G++G++G++G++G+pG++G
tem38_gi|1 716 GAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGV 762
pGppGapGapGpp<-*
+G +G++G+pGp+
tem38_gi|1 763 RGLTGPIGPPGPA 775
Collagen: domain 12 of 18, from 779 to 838: score 68.4, E = 2.7e-17
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G +G+ Gp Gp+Gp+G++G+pG++G+pGppGp+G++GpPG++G+pG+
tem38_gi|1 779 GDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGA 825
pGppGapGapGpp<-*
+G+pG +Ga+G +
tem38_gi|1 826 KGEPGDAGAKGDA 838
Collagen: domain 13 of 18, from 839 to 898: score 60.5, E = 4e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
GppGp+Gp+GppGp G+ G+pG++Ga+G++GppG+ G+PG++G+ Gp
tem38_gi|1 839 GPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGP 885
pGppGapGapGpp<-*
pGp G++G+pGpp
tem38_gi|1 886 PGPSGNAGPPGPP 898
Collagen: domain 14 of 18, from 899 to 958: score 59.2, E = 8.8e-15
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp+G+ G++Gp+G++Gp+G pG+ G+pGppGp+Ge+G+PG++Gp+G+
tem38_gi|1 899 GPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGA 945
pGppGapGapGpp<-*
pG pG+ G +G++
tem38_gi|1 946 PGTPGPQGIAGQR 958
Collagen: domain 15 of 18, from 959 to 1018: score 62.7, E = 9.9e-16
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
G G+pG +G++G pG pGp G++G++Gp G++Ge+GpPGp GppG
tem38_gi|1 959 GVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGL 1005
pGppGapGapGpp<-*
+GppG++G +G+p
tem38_gi|1 1006 AGPPGESGREGAP 1018
Collagen: domain 16 of 18, from 1020 to 1078: score 54.4, E = 1.8e-13
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
+ G+pG+ G pG++G++G++GpaG+pG pG+pG+pGp Gp+G+ G
tem38_gi|1 1020 -AEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGD 1065
pGppGapGapGpp<-*
+G++G++G++Gp+
tem38_gi|1 1066 RGETGPAGPAGPV 1078
Collagen: domain 17 of 18, from 1079 to 1138: score 73.9, E = 8.1e-19
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp G +Gp+Gp+Gp+G++G++G++G +G +G++G++G +GppGppG+
tem38_gi|1 1079 GPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGS 1125
pGppGapGapGpp<-*
pG++G++Ga Gp+
tem38_gi|1 1126 PGEQGPSGASGPA 1138
Collagen: domain 18 of 18, from 1139 to 1192: score 40.6, E = 1.2e-09
*->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
Gp+GppG++G+pG +G G pGp G+pGp+G G++Gp GppGppGp
tem38_gi|1 1139 GPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGP 1185
pGppGap<-*
pGppG+p
tem38_gi|1 1186 PGPPGPP 1192
fibrinogen_C: domain 1 of 1, from 1271 to 1295: score -0.3, E = 50
*->SPPGlYtIqPd.gakeqpllVYCDmet<-*
S G Y I P++g + +++V+C met
tem38_gi|1 1271 S--GEYWIDPNqGCNLDAIKVFCNMET 1295
COLFI: domain 1 of 1, from 1245 to 1463: score 565.2, E = 2e-220
*->lksPeGksrknPARtCkDLfLchpefksGeYWiDPNqGCikDAikVf
++sPeG srknPARtC+DL++ch+++ksGeYWiDPNqGC++DAikVf
tem38_gi|1 1245 IRSPEG-SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVF 1290
CnkrfetGvgeTCisptpksvpkRiksWykgks.kdkKhvWFgetmegGf
Cn +etG eTC++pt+ sv++ k+Wy +k++kdk+hvWFge+m++Gf
tem38_gi|1 1291 CN--METG--ETCVYPTQPSVAQ--KNWYISKNpKDKRHVWFGESMTDGF 1334
kfsYiddelnpeisnvQlTFLRLLSteAsQNiTYhCKNSvAYmDeatGNl
+f+Y++++++p+++++QlTFLRL+SteAsQNiTYhCKNSvAYmD++tGNl
tem38_gi|1 1335 QFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNL 1384
kkAlilmgSnDvElsadgnskFtYtvlGeDGCssrtgewgKTViEyeTkK
kkAl+l+gSn++E++a+gns+FtY+v+ +DGC+s+tg+wgKTViEy+T+K
tem38_gi|1 1385 KKALLLKGSNEIEIRAEGNSRFTYSVT-VDGCTSHTGAWGKTVIEYKTTK 1433
ttRLPIvDiApsDiGgedQeFGveiGPVCF<-*
+RLPI+D+Ap+D+G +dQeFG+++GPVCF
tem38_gi|1 1434 SSRLPIIDVAPLDVGAPDQEFGFDVGPVCF 1463
//
Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]
Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Prosite
---------------------------------------------------------
| ppsearch (c) 1994 EMBL Data Library |
| based on MacPattern (c) 1990-1994 R. Fuchs |
---------------------------------------------------------
PROSITE pattern search started: Tue Oct 31 18:41:24 2000
Sequence file: tem38
----------------------------------------
Sequence tem38_gi|1418928|emb|CAA98968.1| (1464 residues):
Matching pattern PS00001 ASN_GLYCOSYLATION:
1365: NITY
Total matches: 1
Matching pattern PS00005 PKC_PHOSPHO_SITE:
1012: SGR
1234: TLK
1251: SRK
1258: TCR
1431: TTK
1434: SSR
Total matches: 6
Matching pattern PS00006 CK2_PHOSPHO_SITE:
3: SFVD
101: SPTD
103: TDQE
108: TGVE
271: SGLD
291: SPGE
441: SKGD
522: SPGE
1012: SGRE
1125: SPGE
1258: TCRD
1329: SMTD
1425: TVIE
Total matches: 13
Matching pattern PS00007 TYR_PHOSPHO_SITE:
1208: KAHDGGRY
Total matches: 1
Matching pattern PS00008 MYRISTYL:
22: GQEEGQ
26: GQVEGQ
154: GLGGNF
254: GLPGTA
272: GLDGAK
320: GARGND
323: GNDGAT
326: GATGAA
347: GAVGAK
386: GNPGAD
392: GQPGAK
395: GAKGAN
437: GAPGSK
488: GGPGSR
533: GLPGAK
701: GAPGND
704: GNDGAK
716: GAPGSQ
821: GQPGAK
857: GAPGAK
860: GAKGAR
863: GARGSA
935: GSPGAD
1016: GAPAAE
1028: GSPGAK
1339: GGQGSD
1342: GSDPAD
Total matches: 27
Matching pattern PS00009 AMIDATION:
466: EGKR
Total matches: 1
Matching pattern PS00016 RGD:
745: RGD
1093: RGD
Total matches: 2
Matching pattern PS01208 VWFC:
58: CRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVC
Total matches: 1
Total no of hits in this sequence: 52
========================================
1314 pattern(s) searched in 1 sequence(s), 1464 residues.
Total no of hits in all sequences: 52.
Search time: 00:00 min
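Note: the single ASN_GLYCOSYLATION hit reported above can be re-checked by hand. The sketch below is not part of the ppsearch run. It assumes the PS00001 pattern N-{P}-[ST]-{P} written as a regular expression, and scans only residues 1335-1384 of the query as copied from the COLFI alignment. The names FRAGMENT, FRAGMENT_START and find_sites are illustrative.

```python
import re

# Residues 1335-1384 of the query, copied from the COLFI alignment in this report.
FRAGMENT_START = 1335
FRAGMENT = "QFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNL"

# PS00001 (ASN_GLYCOSYLATION): N-{P}-[ST]-{P}, expressed as a regex.
# The zero-width lookahead lets overlapping sites be reported too.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sites(fragment, start, pattern):
    """Return (1-based sequence position, matched text) for each hit."""
    return [(start + m.start(), m.group(1)) for m in pattern.finditer(fragment)]


sites = find_sites(FRAGMENT, FRAGMENT_START, N_GLYC)
for pos, text in sites:
    print(f"{pos}: {text}")
```

Positions are reported 1-based, in the same "position: match" form that ppsearch uses, so the result can be compared directly with the PS00001 entry above.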
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P
argv[2]=-m /data/patterns/own/motif.fa
argv[4]=-seq tem38
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 1464 units
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with HMM-search search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm.lib
Sequence file: tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]
Alignments of top-scoring domains:
[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]
Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
L. Aravind's signalling DB
IMPALA version 1.1 [20-December-1999]
Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.
Query= tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen
[Homo sapiens]
(1464 letters)
Searching..................................done
Results from profile search
Score E
Sequences producing significant alignments: (bits) Value
14-3-3 14-3-3 protein alpha Helical domain 27 0.13
FYVE Zinc Finger domain involved in PtdIns(3)P binding 27 0.18
UBA Ubiquitin pathway associated domain 27 0.23
MATH The Meprin associated TRAF homology domain 26 0.50
RASGAP RAS-type GTPase GTP hydrolysis activating protein 25 0.61
MIZFIN MIZ type Cysteine zinc DNA binding domain 25 0.82
RASGEF RAS-type GTPase GDP exchange factor 24 1.2
SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chrom... 23 2.1
BRIGHT BRIGHT domain (Alpha helical DNA binding domain) 23 2.4
DHHC Novel zinc finger domain with DHHC signature 22 4.3
PHD PHD zinc finger(A cysteine rich DNA binding domain) 22 4.9
INSL Insulinase like Metallo protease domain 21 8.9
>14-3-3 14-3-3 protein alpha Helical domain
Length = 270
Score = 27.3 bits (60), Expect = 0.13
Identities = 5/27 (18%), Positives = 5/27 (18%)
Query: 820 DGQPGAKGEPGDAGAKGDAGPPGPAGP 846
G P G A P
Sbjct: 240 SAAAAGGNTEGAQENAPSNAPEGEAEP 266
>FYVE Zinc Finger domain involved in PtdIns(3)P binding
Length = 99
Score = 27.0 bits (59), Expect = 0.18
Identities = 14/41 (34%), Positives = 19/41 (46%), Gaps = 10/41 (24%)
Query: 59 RICVCDN-GKVLCDDVICDETKNCPGAEVPE---GECCPVC 95
R+ D GK++C D+ NC E PE +CC C
Sbjct: 2 RLFSADEHGKLMCWDM------NCKRVETPEWKTSDCCQKC 36
>UBA Ubiquitin pathway associated domain
Length = 255
Score = 26.6 bits (58), Expect = 0.23
Identities = 25/82 (30%), Positives = 30/82 (36%), Gaps = 5/82 (6%)
Query: 813 FAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPG 872
G P QP EP A P A A P ++ A A+G S+G G
Sbjct: 57 LMGIPENLRQP----EPQQQTAAAAEQPSTAATTAEQPAED-DLFAQAAQGGNASSGALG 111
Query: 873 ATGFPGAAGRVGPPGPSGNAGP 894
TG A + GPPG G
Sbjct: 112 TTGGATDAAQGGPPGSIGLTVE 133
Score = 22.7 bits (48), Expect = 3.5
Identities = 22/85 (25%), Positives = 31/85 (35%), Gaps = 8/85 (9%)
Query: 972 FPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPAAEGSPGRDGSPG 1031
G+P +P Q + A+ E+ P A E E A+ + G + S G
Sbjct: 57 LMGIPENLRQPEPQQQTAAAAEQ--------PSTAATTAEQPAEDDLFAQAAQGGNASSG 108
Query: 1032 AKGDRGETGPAGPPGAPGAPGAPGP 1056
A G G A G PG+ G
Sbjct: 109 ALGTTGGATDAAQGGPPGSIGLTVE 133
Score = 22.3 bits (47), Expect = 4.2
Identities = 22/74 (29%), Positives = 26/74 (34%), Gaps = 3/74 (4%)
Query: 1116 LQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGL--NGLPGPIGPPGPRGRTGD 1173
L G P P Q + A+ P +A P +D L G G G TG
Sbjct: 57 LMGIPENLRQPEPQQQTAAAAEQ-PSTAATTAEQPAEDDLFAQAAQGGNASSGALGTTGG 115
Query: 1174 AGPVGPPGPPGPPG 1187
A GPPG G
Sbjct: 116 ATDAAQGGPPGSIG 129
>MATH The Meprin associated TRAF homology domain
Length = 209
Score = 25.6 bits (56), Expect = 0.50
Identities = 7/18 (38%), Positives = 8/18 (43%)
Query: 925 PGPPGPAGEKGSPGADGP 942
P PP PA P A+
Sbjct: 5 PSPPPPAEMSSGPVAESW 22
Score = 21.8 bits (46), Expect = 6.8
Identities = 7/16 (43%), Positives = 9/16 (55%)
Query: 805 PGPPGPAGFAGPPGAD 820
P PP PA + P A+
Sbjct: 5 PSPPPPAEMSSGPVAE 20
Score = 21.4 bits (45), Expect = 9.7
Identities = 5/14 (35%), Positives = 6/14 (42%)
Query: 177 VPGPMGPSGPRGLP 190
VP P P+ P
Sbjct: 4 VPSPPPPAEMSSGP 17
>RASGAP RAS-type GTPase GTP hydrolysis activating protein
Length = 292
Score = 25.1 bits (54), Expect = 0.61
Identities = 16/61 (26%), Positives = 29/61 (47%), Gaps = 11/61 (18%)
Query: 1220 DANVVRDRDLEVDTTLKSLSQQIENI-----RSPEGSRKNPARTCRDLKMCHSDWKSGEY 1274
D + ++DR VDT L +L +E + +S + K + DL+ C +GE+
Sbjct: 137 DPSKIKDRS-AVDTNLHNLQDYVERVFEAITKSADRCPKVLCQIFHDLREC-----AGEH 190
Query: 1275 W 1275
+
Sbjct: 191 F 191
>MIZFIN MIZ type Cysteine zinc DNA binding domain
Length = 172
Score = 24.6 bits (53), Expect = 0.82
Identities = 18/90 (20%), Positives = 30/90 (33%), Gaps = 17/90 (18%)
Query: 58 CRICVCDNGKVLCDDVICD--------ETKNCPGAEV-PEGECCPVCP--DGSESPTDQE 106
C +C + K + +I D + + + +G CP+ P + + T Q
Sbjct: 50 CPVC---DKKAAYESLILDGLFMEILNDCSDVDEIKFQEDGSWCPMRPKKEAMKV-TSQP 105
Query: 107 TTGVEGPKGDTGP--RGPRGPAGPPGRDGI 134
T VE + P A D I
Sbjct: 106 CTKVESSSVFSKPCSVTVASDASKKKIDVI 135
>RASGEF RAS-type GTPase GDP exchange factor
Length = 196
Score = 24.4 bits (53), Expect = 1.2
Identities = 20/105 (19%), Positives = 31/105 (29%), Gaps = 19/105 (18%)
Query: 1344 DPADVAIQLTFLRLMSTEASQNITY-HCKNSVAYMDQQTGNLKKALLLKGSNEIEIRAEG 1402
D VA Q+T L+ E I + + M + + L L NE G
Sbjct: 5 DSLSVAQQMT---LIEKEILGEIDWKDLLDLK--MKHEGPQVISWLQLLVRNE---TLSG 56
Query: 1403 NSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPI----IDVA 1443
+ T W + I + + + I VA
Sbjct: 57 IDLAISR------FNLTVDWIISEILLTKSSKMKRNVIQRFIHVA 95
>SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chromatin associated domain)
Length = 219
Score = 23.4 bits (50), Expect = 2.1
Identities = 9/60 (15%), Positives = 16/60 (26%), Gaps = 9/60 (15%)
Query: 30 GQDEDIP-PITCVQNGLRYHDRD-----VWKPEPCRICVCDNGKVLCDDVICDETKNCPG 83
G D IP P+ V+ L++ + +C + G
Sbjct: 17 GIDSAIPYPVRRVEQLLQFSFLPELQFQNAAVKQRIQRLCYREEKRLA---VSSLAKWLG 73
>BRIGHT BRIGHT domain (Alpha helical DNA binding domain)
Length = 172
Score = 23.4 bits (50), Expect = 2.4
Identities = 7/29 (24%), Positives = 7/29 (24%)
Query: 413 GARGPSGPQGPGGPPGPKGNSGEPGAPGS 441
G R G P P PG
Sbjct: 132 GRRSSYGQYEAMHNQMPMTPISRPSLPGG 160
Score = 22.3 bits (47), Expect = 4.4
Identities = 6/28 (21%), Positives = 7/28 (24%)
Query: 881 GRVGPPGPSGNAGPPGPPGPAGKEGGKG 908
GR G P P + G
Sbjct: 132 GRRSSYGQYEAMHNQMPMTPISRPSLPG 159
>DHHC Novel zinc finger domain with DHHC signature
Length = 217
Score = 22.4 bits (47), Expect = 4.3
Identities = 9/34 (26%), Positives = 11/34 (31%), Gaps = 2/34 (5%)
Query: 52 VWKPEPCRIC-VCDNGKVLCDDVICDETKNCPGA 84
V + C C+ V D C NC G
Sbjct: 141 VDVSARSKHCSACNK-CVCGFDHHCKWLNNCVGE 173
>PHD PHD zinc finger(A cysteine rich DNA binding domain)
Length = 54
Score = 22.3 bits (47), Expect = 4.9
Identities = 12/53 (22%), Positives = 16/53 (29%), Gaps = 17/53 (32%)
Query: 58 CRICVCDNGK-----VLCDDVICDET--KNCPG-------AEVPEGE-CCPVC 95
C +C V CD C+ + C + P GE C C
Sbjct: 3 CSVCQRLQSPPKNRIVFCDG--CNTPFHQLCHEPYISDELLDSPNGEWFCDDC 53
>INSL Insulinase like Metallo protease domain
Length = 433
Score = 21.4 bits (45), Expect = 8.9
Identities = 5/47 (10%), Positives = 13/47 (27%), Gaps = 1/47 (2%)
Query: 1214 RYYRADDANVVRDRDLEVDTTLKSLSQQIENIR-SPEGSRKNPARTC 1259
+Y+ + VV + + + + P + P
Sbjct: 196 SFYQPRNMAVVIVGKVNPKEVEEEVMKTFGKEEGRPVPKVQIPTEPE 242
Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 12
Number of calls to ALIGN: 17
Length of query: 1464
Total length of test sequences: 20182
Effective length of test sequences: 16637.0
Effective search space size: 23806017.2
Initial X dropoff for ALIGN: 25.0 bits
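Note: the Expect values in the table above can be roughly related to the bit scores and the effective search space. The sketch below is not part of the IMPALA output. It assumes the usual BLAST-style relation E ≈ N · 2^(-S), with N the effective search space and S the bit score. IMPALA may apply further corrections, so only approximate agreement should be expected. The function name approx_evalue is illustrative.

```python
# Effective search space size reported in the statistics block above.
SEARCH_SPACE = 23806017.2


def approx_evalue(bit_score, search_space):
    """Approximate E-value from a bit score: E ~ N * 2**(-S)."""
    return search_space * 2.0 ** (-bit_score)


# (profile name, bit score from the alignment header, Expect as reported)
hits = [
    ("14-3-3", 27.3, 0.13),
    ("FYVE", 27.0, 0.18),
    ("UBA", 26.6, 0.23),
]

estimates = {name: approx_evalue(bits, SEARCH_SPACE) for name, bits, _ in hits}
for name, bits, reported in hits:
    print(f"{name}: bits={bits} reported={reported} approx={estimates[name]:.2f}")
```

The estimates land close to the reported values but do not reproduce them exactly, which is consistent with the bit scores being rounded to one decimal place in the output.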
Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]
Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.
Query= tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen
[Homo sapiens]
(1464 letters)
Searching.................................................done
Results from profile search
Score E
Sequences producing significant alignments: (bits) Value
gi|230410 [1..153] beta-Trefoil 32 0.069
gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyl... 28 1.1
gi|1170529 [121..268] beta-Trefoil 28 1.4
gi|544107 [14..282] Protein kinases (PK), catalytic core 27 1.5
gi|1825699 [8..257] Ribonuclease H-like motif 26 4.1
gi|223347 [1..236] Prealbumin-like 26 5.4
gi|442904 [1..106] Ferredoxin-like 25 9.0
>gi|230410 [1..153] beta-Trefoil
Length = 153
Score = 31.9 bits (72), Expect = 0.069
Identities = 11/38 (28%), Positives = 18/38 (46%), Gaps = 6/38 (15%)
Query: 1306 SVAQKNWYISKNPKDKRHVWFG-----ESMTDGFQFEY 1338
S NWYIS + + V+ G + +TD F ++
Sbjct: 114 SAQFPNWYISTSQAENMPVFLGGTKGGQDITD-FTMQF 150
>gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyltransferases
Length = 402
Score = 27.9 bits (61), Expect = 1.1
Identities = 3/64 (4%), Positives = 12/64 (18%), Gaps = 2/64 (3%)
Query: 15 ATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEP--CRICVCDNGKVLCDD 72
A + G + D + + + + + + +
Sbjct: 34 LRAFREAHGTGYRFVGVEIDPHALDLPPWAEGVVADFLLWEPGEAFDLILGNPPYGIVGE 93
Query: 73 VICD 76
Sbjct: 94 ASKY 97
>gi|1170529 [121..268] beta-Trefoil
Length = 148
Score = 27.6 bits (61), Expect = 1.4
Identities = 13/62 (20%), Positives = 20/62 (31%), Gaps = 9/62 (14%)
Query: 1277 DPNQGCNLDAIK--VFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFG-ESMTDG 1333
P + + T SVA N +I+ + + G S+TD
Sbjct: 90 IPKTTTGGETNSLSSWETRGTK-----NYFISVAHPNLFIATKHDNWVCLAKGLPSITD- 143
Query: 1334 FQ 1335
FQ
Sbjct: 144 FQ 145
>gi|544107 [14..282] Protein kinases (PK), catalytic core
Length = 269
Score = 27.4 bits (59), Expect = 1.5
Identities = 5/61 (8%), Positives = 9/61 (14%), Gaps = 6/61 (9%)
Query: 1259 CRDLKMCHS------DWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNW 1312
L+ D +D G + +
Sbjct: 101 SSALEYLEKHGILHRDIHPNNILLDSMNGPAYLSDFSIAWSKQHPGEEVQELIPQIGTGH 160
Query: 1313 Y 1313
Y
Sbjct: 161 Y 161
>gi|1825699 [8..257] Ribonuclease H-like motif
Length = 250
Score = 26.3 bits (57), Expect = 4.1
Identities = 6/47 (12%), Positives = 10/47 (20%), Gaps = 4/47 (8%)
Query: 1376 YMDQQTGNLKKALLLKGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAW 1422
Q +KK + G S+ + T
Sbjct: 52 LFLQFLRVIKK--AYETLPPNAHVDVGLCTQRNSIVLWN--KRTLKE 94
>gi|223347 [1..236] Prealbumin-like
Length = 236
Score = 25.7 bits (56), Expect = 5.4
Identities = 6/38 (15%), Positives = 8/38 (20%)
Query: 357 PQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQP 394
PQ + GP G G+
Sbjct: 40 PQSISETTGPNFSHLGFGAHDHDLLLNFNNGGLPIGER 77
>gi|442904 [1..106] Ferredoxin-like
Length = 106
Score = 24.9 bits (53), Expect = 9.0
Identities = 10/67 (14%), Positives = 16/67 (22%), Gaps = 11/67 (16%)
Query: 39 TCVQNGLRYHDRDVWKPEP----CRICV--CDNG-KVLCDDVICDETKNCPGAEVPEGEC 91
C + + C +C C D+V D + E
Sbjct: 19 VCPVDCFYEGPNFLVIHPDECIDCALCEPECPAQAIFSEDEVPEDMQEFIQLN----AEL 74
Query: 92 CPVCPDG 98
V P+
Sbjct: 75 AEVWPNI 81
Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 7
Number of calls to ALIGN: 7
Length of query: 1464
Total length of test sequences: 256703
Effective length of test sequences: 210706.0
Effective search space size: 300338576.9
Initial X dropoff for ALIGN: 25.0 bits