analysis of sequence from tem38
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRICVCDNGKVLC
DDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGL
PGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGE
PGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDA
GPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG
AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGP
GSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAG
QDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGE
RGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN
GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIG
PPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPR
GETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSG
EPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPAAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA
PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPP
GPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPG
PPSAGFDFSFLPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESM
TDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLKGSNEIEIRA
EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

secondary structure prediction with PREDATOR

> tem38_gi|1418928|emb|CAA98968.1|
              .         .         .         .         .
1    MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDR   50
     ___HHHHHHHHHHHHHHHHH______________________________

              .         .         .         .         .
51   DVWKPEPCRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSE  100
     ________EEEEE_____EEE_EEE_________________________

              .         .         .         .         .
101  SPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPP  150
     __________________________________________________

              .         .         .         .         .
151  GPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQG  200
     __________________________________________________

              .         .         .         .         .
201  FQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQ  250
     __________________________________________________

              .         .         .         .         .
251  GARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ  300
     __________________________________________________

              .         .         .         .         .
301  MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG  350
     __________________________________________________

              .         .         .         .         .
351  AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGAN  400
     __________________________________________________

              .         .         .         .         .
401  GAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGE  450
     __________________________________________________

              .         .         .         .         .
451  PGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADG  500
     __________________________________________________

              .         .         .         .         .
501  VAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPD  550
     __________________________________________________

              .         .         .         .         .
551  GKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV  600
     __________________________________________________

              .         .         .         .         .
601  PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAG  650
     __________________________________________________

              .         .         .         .         .
651  PPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN  700
     __________________________________________________

              .         .         .         .         .
701  GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGP  750
     __________________________________________________

              .         .         .         .         .
751  KGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPG  800
     __________________________________________________

              .         .         .         .         .
801  DRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPP  850
     __________________________________________________

              .         .         .         .         .
851  GPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP  900
     __________________________________________________

              .         .         .         .         .
901  AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPG  950
     __________________________________________________

              .         .         .         .         .
951  PQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPM 1000
     ________EEEE______________________________________

              .         .         .         .         .
1001 GPPGLAGPPGESGREGAPAAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA 1050
     __________________________________________________

              .         .         .         .         .
1051 PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETG 1100
     __________________________________________________

              .         .         .         .         .
1101 EQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAP 1150
     __________________________________________________

              .         .         .         .         .
1151 GKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF 1200
     __________________________________________________

              .         .         .         .         .
1201 LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEG 1250
     ______________EEE_____EEE____HHHHHHHHHHHHHHHH_____

              .         .         .         .         .
1251 SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCV 1300
     _______HHHHHHH_________EEE_________EEEEEEE_____EEE

              .         .         .         .         .
1301 YPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVAI 1350
     E_______EEEEEE_______EEEEE________EEE_________HHHH

              .         .         .         .         .
1351 QLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLKGSNEIEIRA 1400
     HHHHHHHHHHHHHHEEEEEE_____________HHHHHHH____EEEEEE

              .         .         .         .         .
1401 EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPIIDVAPLDVGAP 1450
     _____EEEEEEE____________EEEEEE_______EEEEEE_______

              .    
1451 DQEFGFDVGPVCFL                                     1464
     ______________


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~


method         :         1
alpha-contents :       0.0 %
beta-contents  :       0.0 %
coil-contents  :     100.0 %
class          : irregular


method         :         2
alpha-contents :       0.0 %
beta-contents  :       0.0 %
coil-contents  :     100.0 %
class          : irregular


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

GPI: learning from metazoa
-16.14  -1.94  -1.37  -2.18   0.00   0.00   0.00   0.00  -0.48  -2.13  -1.80 -12.00 -12.00   0.00   0.00   0.00  -50.05
-18.84  -0.22  -0.33   0.00   0.00   0.00   0.00   0.00  -0.85  -2.10  -1.80 -12.00 -12.00   0.00   0.00   0.00  -48.15
ID: tem38_gi|1418928|emb|CAA98968.1|	AC: xxx Len: 1400 1:I  1373 Sc:  -48.15 Pv: 2.046476e-01 NO_GPI_SITE
GPI: learning from protozoa
-26.23  -2.20  -1.13  -0.72  -4.00   0.00   0.00   0.00  -0.08  -2.00  -7.07 -12.00 -12.00   0.00   0.00   0.00  -67.42
-24.64  -1.30  -1.78  -0.22  -4.00   0.00   0.00   0.00  -0.04  -2.20  -7.07 -12.00 -12.00   0.00   0.00   0.00  -65.26
ID: tem38_gi|1418928|emb|CAA98968.1|	AC: xxx Len: 1400 1:I  1371 Sc:  -65.26 Pv: 2.831094e-01 NO_GPI_SITE

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

# SignalP euk predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
tem38_gi|14  0.931  23 Y  0.884  23 Y  0.990  10 Y  0.921 Y
# SignalP gram- predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
tem38_gi|14  0.574 589 Y  0.485  23 Y  0.995   9 Y  0.789 Y
# SignalP gram+ predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
tem38_gi|14  0.683 382 Y  0.334 1366 N  0.998  10 Y  0.083 N

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

low complexity regions: SEG 12 2.2 2.5
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]

                                  1-6    MFSFVD
                lrlllllaatallt    7-20   
                                 21-21   H
                    gqeegqvegq   22-31   
                                 32-111  DEDIPPITCVQNGLRYHDRDVWKPEPCRIC
                                         VCDNGKVLCDDVICDETKNCPGAEVPEGEC
                                         CPVCPDGSESPTDQETTGVE
gpkgdtgprgprgpagppgrdgipgqpglp  112-157  
              gppgppgppgppglgg
                                158-177  NFAPQLSYGYDEKSTGGISV
pgpmgpsgprglpgppgapgpqgfqgppge  178-230  
       pgepgasgpmgprgppgppgkng
                                231-232  DD
     geagkpgrpgergppgpqgarglpg  233-257  
                                258-271  TAGLPGMKGHRGFS
     gldgakgdagpagpkgepgspgeng  272-296  
                                297-301  APGQM
gprglpgergrpgapgpagargndgatgaa  302-353  
        gppgptgpagppgfpgavgakg
                                354-364  EAGPQGPRGSE
gpqgvrgepgppgpagaagpagnpgadgqp  365-437  
gakgangapgiagapgfpgargpsgpqgpg
                 gppgpkgnsgepg
                                438-447  APGSKGDTGA
kgepgpvgvqgppgpageegkrgargepgp  448-497  
          tglpgppgerggpgsrgfpg
                                498-511  ADGVAGPKGPAGER
              gspgpagpkgspgeag  512-527  
                                528-537  RPGEAGLPGA
kgltgspgspgpdgktgppgpagqdgrpgp  538-578  
                   pgppgargqag
                                579-582  VMGF
               pgpkgaagepgkage  583-597  
                                598-598  R
                 gvpgppgavgpag  599-611  
                                612-613  KD
     geagaqgppgpagpagergeqgpag  614-638  
                                639-639  S
       pgfqglpgpagppgeagkpgeqg  640-662  
                                663-685  VPGDLGAPGPSGARGERGFPGER
gvqgppgpagprgangapgndgakgdagap  686-725  
                    gapgsqgapg
                                726-766  LQGMPGERGAAGLPGPKGDRGDAGPKGADG
                                         SPGKDGVRGLT
                 gpigppgpagapg  767-779  
                                780-781  DK
gesgpsgpagptgargapgdrgepgppgpa  782-861  
gfagppgadgqpgakgepgdagakgdagpp
          gpagpagppgpignvgapga
                                862-865  KGAR
gsagppgatgfpgaagrvgppgpsgnagpp  866-956  
gppgpagkeggkgprgetgpagrpgevgpp
gppgpagekgspgadgpagapgtpgpqgia
                             g
                                957-991  QRGVVGLPGQRGERGFPGLPGPSGEPGKQG
                                         PSGAS
           gergppgpmgppglagppg  992-1010 
                               1011-1039 ESGREGAPAAEGSPGRDGSPGAKGDRGET
        gpagppgapgapgapgpvgpag 1040-1061 
                               1062-1069 KSGDRGET
     gpagpagpvgpvgargpagpqgprg 1070-1094 
                               1095-1117 DKGETGEQGDRGIKGHRGFSGLQ
gppgppgspgeqgpsgasgpagprgppgsa 1118-1151 
                          gapg
                               1152-1152 K
dglnglpgpigppgprgrtgdagpvgppgp 1153-1192 
                    pgppgppgpp
                               1193-1216 SAGFDFSFLPQPPQEKAHDGGRYY
                  raddanvvrdrd 1217-1228 
                               1229-1464 LEVDTTLKSLSQQIENIRSPEGSRKNPART
                                         CRDLKMCHSDWKSGEYWIDPNQGCNLDAIK
                                         VFCNMETGETCVYPTQPSVAQKNWYISKNP
                                         KDKRHVWFGESMTDGFQFEYGGQGSDPADV
                                         AIQLTFLRLMSTEASQNITYHCKNSVAYMD
                                         QQTGNLKKALLLKGSNEIEIRAEGNSRFTY
                                         SVTVDGCTSHTGAWGKTVIEYKTTKSSRLP
                                         IIDVAPLDVGAPDQEFGFDVGPVCFL

low complexity regions: SEG 25 3.0 3.3
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]

                                  1-6    MFSFVD
                lrlllllaatallt    7-20   
                                 21-81   HGQEEGQVEGQDEDIPPITCVQNGLRYHDR
                                         DVWKPEPCRICVCDNGKVLCDDVICDETKN
                                         C
pgaevpegeccpvcpdgsesptdqettgve   82-166  
gpkgdtgprgprgpagppgrdgipgqpglp
     gppgppgppgppglggnfapqlsyg
                                167-172  YDEKST
ggisvpgpmgpsgprglpgppgapgpqgfq  173-1195 
gppgepgepgasgpmgprgppgppgkngdd
geagkpgrpgergppgpqgarglpgtaglp
gmkghrgfsgldgakgdagpagpkgepgsp
gengapgqmgprglpgergrpgapgpagar
gndgatgaagppgptgpagppgfpgavgak
geagpqgprgsegpqgvrgepgppgpagaa
gpagnpgadgqpgakgangapgiagapgfp
gargpsgpqgpggppgpkgnsgepgapgsk
gdtgakgepgpvgvqgppgpageegkrgar
gepgptglpgppgerggpgsrgfpgadgva
gpkgpagergspgpagpkgspgeagrpgea
glpgakgltgspgspgpdgktgppgpagqd
grpgppgppgargqagvmgfpgpkgaagep
gkagergvpgppgavgpagkdgeagaqgpp
gpagpagergeqgpagspgfqglpgpagpp
geagkpgeqgvpgdlgapgpsgargergfp
gergvqgppgpagprgangapgndgakgda
gapgapgsqgapglqgmpgergaaglpgpk
gdrgdagpkgadgspgkdgvrgltgpigpp
gpagapgdkgesgpsgpagptgargapgdr
gepgppgpagfagppgadgqpgakgepgda
gakgdagppgpagpagppgpignvgapgak
gargsagppgatgfpgaagrvgppgpsgna
gppgppgpagkeggkgprgetgpagrpgev
gppgppgpagekgspgadgpagapgtpgpq
giagqrgvvglpgqrgergfpglpgpsgep
gkqgpsgasgergppgpmgppglagppges
gregapaaegspgrdgspgakgdrgetgpa
gppgapgapgapgpvgpagksgdrgetgpa
gpagpvgpvgargpagpqgprgdkgetgeq
gdrgikghrgfsglqgppgppgspgeqgps
gasgpagprgppgsagapgkdglnglpgpi
gppgprgrtgdagpvgppgppgppgppgpp
                           sag
                               1196-1464 FDFSFLPQPPQEKAHDGGRYYRADDANVVR
                                         DRDLEVDTTLKSLSQQIENIRSPEGSRKNP
                                         ARTCRDLKMCHSDWKSGEYWIDPNQGCNLD
                                         AIKVFCNMETGETCVYPTQPSVAQKNWYIS
                                         KNPKDKRHVWFGESMTDGFQFEYGGQGSDP
                                         ADVAIQLTFLRLMSTEASQNITYHCKNSVA
                                         YMDQQTGNLKKALLLKGSNEIEIRAEGNSR
                                         FTYSVTVDGCTSHTGAWGKTVIEYKTTKSS
                                         RLPIIDVAPLDVGAPDQEFGFDVGPVCFL

low complexity regions: SEG 45 3.4 3.75
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]

                                  1-95   MFSFVDLRLLLLLAATALLTHGQEEGQVEG
                                         QDEDIPPITCVQNGLRYHDRDVWKPEPCRI
                                         CVCDNGKVLCDDVICDETKNCPGAEVPEGE
                                         CCPVC
pdgsesptdqettgvegpkgdtgprgprgp   96-1195 
agppgrdgipgqpglpgppgppgppgppgl
ggnfapqlsygydekstggisvpgpmgpsg
prglpgppgapgpqgfqgppgepgepgasg
pmgprgppgppgkngddgeagkpgrpgerg
ppgpqgarglpgtaglpgmkghrgfsgldg
akgdagpagpkgepgspgengapgqmgprg
lpgergrpgapgpagargndgatgaagppg
ptgpagppgfpgavgakgeagpqgprgseg
pqgvrgepgppgpagaagpagnpgadgqpg
akgangapgiagapgfpgargpsgpqgpgg
ppgpkgnsgepgapgskgdtgakgepgpvg
vqgppgpageegkrgargepgptglpgppg
erggpgsrgfpgadgvagpkgpagergspg
pagpkgspgeagrpgeaglpgakgltgspg
spgpdgktgppgpagqdgrpgppgppgarg
qagvmgfpgpkgaagepgkagergvpgppg
avgpagkdgeagaqgppgpagpagergeqg
pagspgfqglpgpagppgeagkpgeqgvpg
dlgapgpsgargergfpgergvqgppgpag
prgangapgndgakgdagapgapgsqgapg
lqgmpgergaaglpgpkgdrgdagpkgadg
spgkdgvrgltgpigppgpagapgdkgesg
psgpagptgargapgdrgepgppgpagfag
ppgadgqpgakgepgdagakgdagppgpag
pagppgpignvgapgakgargsagppgatg
fpgaagrvgppgpsgnagppgppgpagkeg
gkgprgetgpagrpgevgppgppgpagekg
spgadgpagapgtpgpqgiagqrgvvglpg
qrgergfpglpgpsgepgkqgpsgasgerg
ppgpmgppglagppgesgregapaaegspg
rdgspgakgdrgetgpagppgapgapgapg
pvgpagksgdrgetgpagpagpvgpvgarg
pagpqgprgdkgetgeqgdrgikghrgfsg
lqgppgppgspgeqgpsgasgpagprgppg
sagapgkdglnglpgpigppgprgrtgdag
          pvgppgppgppgppgppsag
                               1196-1464 FDFSFLPQPPQEKAHDGGRYYRADDANVVR
                                         DRDLEVDTTLKSLSQQIENIRSPEGSRKNP
                                         ARTCRDLKMCHSDWKSGEYWIDPNQGCNLD
                                         AIKVFCNMETGETCVYPTQPSVAQKNWYIS
                                         KNPKDKRHVWFGESMTDGFQFEYGGQGSDP
                                         ADVAIQLTFLRLMSTEASQNITYHCKNSVA
                                         YMDQQTGNLKKALLLKGSNEIEIRAEGNSR
                                         FTYSVTVDGCTSHTGAWGKTVIEYKTTKSS
                                         RLPIIDVAPLDVGAPDQEFGFDVGPVCFL


low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
MFSFVDLRlllllaatallTHgqeegqvegqdeDIPPITCVQNGLRYHDRDVWKPEPCRI
CVCDNGKVLCDDVICDETKNCPGAEVPEGECcpvcpdgsesptdqettgvegpkgdtgpr
gprgpagppgrdgipgqpglpgppgppgppgppgLGGNFAPQLSYGYDEKSTGGISVPgp
mgpsgprglpgppgapgpqgfqgppgepgepgasgpmgprgppgppgkngddgeagkpgr
pgergppgpqgarglpgtaglpgmkghrgfsgldgakgdagpagpkgepgspgengapgq
mgprglpgergrpgapgpagargndgatgaagppgptgpagppgfpgavgakgeagpqgp
rgsegpqgvrgepgppgpagaagpagnpgadgqpgakgangapgiagapgfpgargpsgp
qgpggppgpkgnsgepgapgskgdtgakgepgpvgvqgppgpageegkrgargepgptgl
pgppgerggpgsrgfpgadgvagpkgpagergspgpagpkgspgeagrpgeaglpgakgl
tgspgspgpdgktgppgpagqdgrpgppgppgargqagvmgfpgpkgaagepgkagergv
pgppgavgpagkdgeagaqgppgpagpagergeqgpagspgfqglpgpagppgeagkpge
qgvpgdlgapgpsgargergfpgergvqgppgpagprgangapgndgakgdagapgapgs
qgapglqgmpgergaaglpgpkgdrgdagpkgadgspgkdgvrgltgpigppgpagapgd
kgesgpsgpagptgargapgdrgepgppgpagfagppgadgqpgakgepgdagakgdagp
pgpagpagppgpignvgapgakgargsagppgatgfpgaagrvgppgpsgnagppgppgp
agkeggkgprgetgpagrpgevgppgppgpagekgspgadgpagapgtpgpqgiagqrgv
vglpgqrgergfpglpgpsgepgkqgpsgasgergppgpmgppglagppgesgregapaa
egspgrdgspgakgdrgetgpagppgapgapgapgpvgpagksgdrgetgpagpagpvgp
vgargpagpqgprgdkgetgeqgdrgikghrgfsglqgppgppgspgeqgpsgasgpagp
rgppgsagapgkdglnglpgpigppgprgrtgdagpvgppgppgppgppgppsagfdfsf
LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKD
KRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQ
TGNLKKALLLKGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPII
DVAPLDVGAPDQEFGFDVGPVCFL
    1 -    8 MFSFVDLR
    9 -   19   ll lllaatall
   20 -   21 T H
   22 -   33   gqeegqveg qde
   34 -   91 DIPPITC VQNGLRYHDR DVWKPEPCRI CVCDNGKVLC DDVICDETKN CPGAEVPEGE C
   92 -  154   cpvcpdgse sptdqettgv egpkgdtgpr gprgpagppg rdgipgqpgl pgppgppgpp g
               ppg
  155 -  178 LGGNFA PQLSYGYDEK STGGISVP
  179 - 1200   gp mgpsgprglp gppgapgpqg fqgppgepge pgasgpmgpr gppgppgkng ddgeagkp
               gr pgergppgpq garglpgtag lpgmkghrgf sgldgakgda gpagpkgepg spgengap
               gq mgprglpger grpgapgpag argndgatga agppgptgpa gppgfpgavg akgeagpq
               gp rgsegpqgvr gepgppgpag aagpagnpga dgqpgakgan gapgiagapg fpgargps
               gp qgpggppgpk gnsgepgapg skgdtgakge pgpvgvqgpp gpageegkrg argepgpt
               gl pgppgerggp gsrgfpgadg vagpkgpage rgspgpagpk gspgeagrpg eaglpgak
               gl tgspgspgpd gktgppgpag qdgrpgppgp pgargqagvm gfpgpkgaag epgkager
               gv pgppgavgpa gkdgeagaqg ppgpagpage rgeqgpagsp gfqglpgpag ppgeagkp
               ge qgvpgdlgap gpsgargerg fpgergvqgp pgpagprgan gapgndgakg dagapgap
               gs qgapglqgmp gergaaglpg pkgdrgdagp kgadgspgkd gvrgltgpig ppgpagap
               gd kgesgpsgpa gptgargapg drgepgppgp agfagppgad gqpgakgepg dagakgda
               gp pgpagpagpp gpignvgapg akgargsagp pgatgfpgaa grvgppgpsg nagppgpp
               gp agkeggkgpr getgpagrpg evgppgppgp agekgspgad gpagapgtpg pqgiagqr
               gv vglpgqrger gfpglpgpsg epgkqgpsga sgergppgpm gppglagppg esgregap
               aa egspgrdgsp gakgdrgetg pagppgapga pgapgpvgpa gksgdrgetg pagpagpv
               gp vgargpagpq gprgdkgetg eqgdrgikgh rgfsglqgpp gppgspgeqg psgasgpa
               gp rgppgsagap gkdglnglpg pigppgprgr tgdagpvgpp gppgppgppg ppsagfdf
               sf 
 1201 - 1464 LPQPPQEKAH DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR 
             DLKMCHSDWK SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPSVAQK NWYISKNPKD 
             KRHVWFGESM TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ 
             TGNLKKALLL KGSNEIEIRA EGNSRFTYSV TVDGCTSHTG AWGKTVIEYK TTKSSRLPII 
             DVAPLDVGAP DQEFGFDVGP VCFL

low complexity regions: DUST
>tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen [Homo sapiens]
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI
CVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPR
GPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGP
MGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGR
PGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ
MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGP
RGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGL
PGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGL
TGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV
PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGE
QGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPGAPGS
QGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGD
KGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP
AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGV
VGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPAA
EGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPAGPVGP
VGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGP
RGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF
LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKD
KRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQ
TGNLKKALLLKGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPII
DVAPLDVGAPDQEFGFDVGPVCFL

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

coiled coil prediction for tem38_gi|1418928|emb|CAA98968.1|
sequence: 1400 amino acids, 0 residue(s) in coiled coil state

    .    |     .    |     .    |     .    |     .    |     .   60
MFSFVDLRLL LLLAATALLT HGQEEGQVEG QDEDIPPITC VQNGLRYHDR DVWKPEPCRI
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  120
CVCDNGKVLC DDVICDETKN CPGAEVPEGE CCPVCPDGSE SPTDQETTGV EGPKGDTGPR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  180
GPRGPAGPPG RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA PQLSYGYDEK STGGISVPGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  240
MGPSGPRGLP GPPGAPGPQG FQGPPGEPGE PGASGPMGPR GPPGPPGKNG DDGEAGKPGR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  300
PGERGPPGPQ GARGLPGTAG LPGMKGHRGF SGLDGAKGDA GPAGPKGEPG SPGENGAPGQ
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  360
MGPRGLPGER GRPGAPGPAG ARGNDGATGA AGPPGPTGPA GPPGFPGAVG AKGEAGPQGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  420
RGSEGPQGVR GEPGPPGPAG AAGPAGNPGA DGQPGAKGAN GAPGIAGAPG FPGARGPSGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  480
QGPGGPPGPK GNSGEPGAPG SKGDTGAKGE PGPVGVQGPP GPAGEEGKRG ARGEPGPTGL
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  540
PGPPGERGGP GSRGFPGADG VAGPKGPAGE RGSPGPAGPK GSPGEAGRPG EAGLPGAKGL
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  600
TGSPGSPGPD GKTGPPGPAG QDGRPGPPGP PGARGQAGVM GFPGPKGAAG EPGKAGERGV
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  660
PGPPGAVGPA GKDGEAGAQG PPGPAGPAGE RGEQGPAGSP GFQGLPGPAG PPGEAGKPGE
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  720
QGVPGDLGAP GPSGARGERG FPGERGVQGP PGPAGPRGAN GAPGNDGAKG DAGAPGAPGS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  780
QGAPGLQGMP GERGAAGLPG PKGDRGDAGP KGADGSPGKD GVRGLTGPIG PPGPAGAPGD
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  840
KGESGPSGPA GPTGARGAPG DRGEPGPPGP AGFAGPPGAD GQPGAKGEPG DAGAKGDAGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  900
PGPAGPAGPP GPIGNVGAPG AKGARGSAGP PGATGFPGAA GRVGPPGPSG NAGPPGPPGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  960
AGKEGGKGPR GETGPAGRPG EVGPPGPPGP AGEKGSPGAD GPAGAPGTPG PQGIAGQRGV
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     . 1020
VGLPGQRGER GFPGLPGPSG EPGKQGPSGA SGERGPPGPM GPPGLAGPPG ESGREGAPAA
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     . 1080
EGSPGRDGSP GAKGDRGETG PAGPPGAPGA PGAPGPVGPA GKSGDRGETG PAGPAGPVGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     . 1140
VGARGPAGPQ GPRGDKGETG EQGDRGIKGH RGFSGLQGPP GPPGSPGEQG PSGASGPAGP
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     . 1200
RGPPGSAGAP GKDGLNGLPG PIGPPGPRGR TGDAGPVGPP GPPGPPGPPG PPSAGFDFSF
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     . 1260
LPQPPQEKAH DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~4 4467777777 7777777~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     . 1320
DLKMCHSDWK SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPSVAQK NWYISKNPKD
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     . 1380
KRHVWFGESM TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK  -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    | 
TGNLKKALLL KGSNEIEIRA 
~~~~~~~~~~ ~~~~~~~~~~ 
---------- ---------- 
~~~~~~~~~~ ~~~~~~~~~~ 
~~~~~~~~~~ ~~~~~~~~~~ 
~~~~~~~~~~ ~~~~~~~~~~ 
~~~~~~~~~~ ~~~~~~~~~~ 



~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

prediction of transmembrane regions with toppred2

     ***********************************
     *TOPPREDM with eukaryotic function*
     ***********************************

tem38.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: tem38.___inter___

 (1 sequences)
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDR
DVWKPEPCRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSE
SPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPP
GPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQG
FQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQ
GARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ
MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG
AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGAN
GAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGE
PGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADG
VAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPD
GKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV
PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAG
PPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN
GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGP
KGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPG
DRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPP
GPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP
AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPG
PQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPM
GPPGLAGPPGESGREGAPAAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA
PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETG
EQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAP
GKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF
LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEG
SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCV
YPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVAI
QLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLKGSNEIEIRA
EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPIIDVAPLDVGAP
DQEFGFDVGPVCFL


(p)rokaryotic or (e)ukaryotic: e


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 8 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1     2    22   0.700 Putative
     2   331   351   0.844 Putative
     3  1041  1061   0.758 Putative
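
The "Loop length" rows in the structure reports that follow can be reproduced from the candidate table above. The sketch below is only an illustration, not TopPred's own code: it assumes every subset of the three putative helices is treated as one structure (which agrees with "Total of 8 structures are to be tested") and that loop lengths are plain residue counts between helix boundaries. The names loop_lengths and all_structures are hypothetical.

```python
from itertools import combinations

SEQ_LEN = 1464  # number of residues in tem38
# Candidate helices as (index, begin, end), copied from the table above.
CANDIDATES = [(1, 2, 22), (2, 331, 351), (3, 1041, 1061)]


def loop_lengths(helices, seq_len=SEQ_LEN):
    """Return the lengths of the loops before, between and after the helices."""
    loops = []
    prev_end = 0
    for _, begin, end in helices:
        loops.append(begin - prev_end - 1)  # residues between previous end and this begin
        prev_end = end
    loops.append(seq_len - prev_end)  # C-terminal tail
    return loops


def all_structures(candidates=CANDIDATES):
    """Enumerate every subset of the putative helices (2**n structures)."""
    structures = []
    for k in range(len(candidates), -1, -1):
        structures.extend(combinations(candidates, k))
    return structures


if __name__ == "__main__":
    for helices in all_structures():
        print([h[0] for h in helices], loop_lengths(helices))
```

With all three helices included this gives loops of 1, 308, 689 and 403 residues, and with none it gives a single loop of 1464, matching the reports. The order in which the sketch lists the structures is arbitrary and need not match TopPred's.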

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     2     3
 Loop length     1   308   689   403
 K+R profile  1.00           +      
                       +           +      
CYT-EXT prof     -        0.61      
                    0.33        0.81      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.06
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 5399089840598723119226988666434663743532463987515388551976903392781192631681024.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:  -0.54
-> Orientation: N-in

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       1     3
 Loop length     1  1018   403
 K+R profile  1.00           +      
                       +      
CYT-EXT prof     -        0.81      
                    0.56      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.10
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 5399089840598723119226988666434663743532463987515388551976903392781192631681024.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:   0.26
-> Orientation: N-out

----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
     Segment       1     2
 Loop length     1   308  1113
 K+R profile  1.00           +      
                       +      
CYT-EXT prof     -        0.67      
                    0.33      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.15
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 339144483842856283565095402520707072.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:   0.34
-> Orientation: N-out

----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
     Segment       1
 Loop length     1  1442
 K+R profile  1.00      
                       +      
CYT-EXT prof     -      
                    0.61      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 1.00
Tm probability: 0.25
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 0.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:  -0.61
-> Orientation: N-in

----------------------------------------------------------------------
Structure 5

Transmembrane segments included in this structure:
     Segment       2     3
 Loop length   330   689   403
 K+R profile     +           +      
                       +      
CYT-EXT prof  0.31        0.81      
                    0.61      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.24
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 0.00
 (NEG-POS)/(NEG+POS): 0.1818
                 NEG: 39.0000
                 POS: 27.0000
-> Orientation: undecided

CYT-EXT difference:   0.51
-> Orientation: N-out

----------------------------------------------------------------------
Structure 6

Transmembrane segments included in this structure:
     Segment       3
 Loop length  1040   403
 K+R profile     +      
                       +      
CYT-EXT prof  0.55      
                    0.81      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.40
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 0.00
 (NEG-POS)/(NEG+POS): 0.0492
                 NEG: 96.0000
                 POS: 87.0000
-> Orientation: undecided

CYT-EXT difference:  -0.26
-> Orientation: N-in

----------------------------------------------------------------------
Structure 7

Transmembrane segments included in this structure:
     Segment       2
 Loop length   330  1113
 K+R profile     +      
                       +      
CYT-EXT prof  0.31      
                    0.67      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.61
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 0.00
 (NEG-POS)/(NEG+POS): 0.1818
                 NEG: 39.0000
                 POS: 27.0000
-> Orientation: undecided

CYT-EXT difference:  -0.37
-> Orientation: N-in

----------------------------------------------------------------------
Structure 8

Transmembrane segments included in this structure:
     Segment  
 Loop length  1464
 K+R profile     +      
                  
CYT-EXT prof  0.61      
                  
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 1.00
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): -4.00
 (NEG-POS)/(NEG+POS): 0.0444
                 NEG: 141.0000
                 POS: 129.0000
-> Orientation: N-out

CYT-EXT difference:   0.61
-> Orientation: N-out

----------------------------------------------------------------------

"tem38" 1464 
 2 22 #f 0.7
 331 351 #f 0.84375
 1041 1061 #f 0.758333
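
The (NEG-POS)/(NEG+POS) lines in the reports above follow directly from the NEG and POS counts printed below them, e.g. (39-27)/(39+27) = 0.1818. Where both counts are 0.0000 the printed ratio is an enormous number, which looks like the result of an unguarded division by zero; that reading is an inference from the output, not something confirmed from the TopPred source. A minimal sketch of a guarded version (charge_ratio is a hypothetical name):

```python
def charge_ratio(neg, pos):
    """Return (NEG-POS)/(NEG+POS), or None when no charged residues were counted."""
    total = neg + pos
    if total == 0:
        # Undefined ratio: report "no value" instead of dividing by zero.
        return None
    return (neg - pos) / total


if __name__ == "__main__":
    for neg, pos in [(39, 27), (96, 87), (141, 129), (0, 0)]:
        print(neg, pos, charge_ratio(neg, pos))
```

The ratios for structures whose NEG and POS are both zero should therefore be read as undefined rather than as meaningful values.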



     ************************************
     *TOPPREDM with prokaryotic function*
     ************************************

tem38.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: tem38.___inter___

 (1 sequences)


(p)rokaryotic or (e)ukaryotic: p


Charge-pair energy: 0

Length of full window (odd number!): 21

Length of core window (odd number!): 11

Number of residues to add to each end of helix: 1

Critical length: 60

Upper cutoff for candidates: 1

Lower cutoff for candidates: 0.6
Total of 8 structures are to be tested


Candidate membrane-spanning segments:

 Helix Begin   End   Score Certainty
     1     2    22   0.700 Putative
     2   331   351   0.844 Putative
     3  1041  1061   0.758 Putative

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
     Segment       1     2     3
 Loop length     1   308   689   403
 K+R profile  0.00           +      
                       +           +      
CYT-EXT prof     -        0.61      
                    0.33        0.81      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.06
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 5399089840598723119226988666434663743532463987515388551976903392781192631681024.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:  -0.54
-> Orientation: N-in

----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
     Segment       2     3
 Loop length   330   689   403
 K+R profile     +           +      
                       +      
CYT-EXT prof  0.31        0.81      
                    0.61      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.24
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 0.00
 (NEG-POS)/(NEG+POS): 0.1818
                 NEG: 39.0000
                 POS: 27.0000
-> Orientation: undecided

CYT-EXT difference:   0.51
-> Orientation: N-out

----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
     Segment       1     3
 Loop length     1  1018   403
 K+R profile  0.00           +      
                       +      
CYT-EXT prof     -        0.81      
                    0.56      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.10
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 5399089840598723119226988666434663743532463987515388551976903392781192631681024.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:   0.26
-> Orientation: N-out

----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
     Segment       3
 Loop length  1040   403
 K+R profile     +      
                       +      
CYT-EXT prof  0.55      
                    0.81      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.40
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 0.00
 (NEG-POS)/(NEG+POS): 0.0492
                 NEG: 96.0000
                 POS: 87.0000
-> Orientation: undecided

CYT-EXT difference:  -0.26
-> Orientation: N-in

----------------------------------------------------------------------
Structure 5

Transmembrane segments included in this structure:
     Segment       1     2
 Loop length     1   308  1113
 K+R profile  0.00           +      
                       +      
CYT-EXT prof     -        0.67      
                    0.33      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.15
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 339144483842856283565095402520707072.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:   0.34
-> Orientation: N-out

----------------------------------------------------------------------
Structure 6

Transmembrane segments included in this structure:
     Segment       2
 Loop length   330  1113
 K+R profile     +      
                       +      
CYT-EXT prof  0.31      
                    0.67      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.61
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 0.00
 (NEG-POS)/(NEG+POS): 0.1818
                 NEG: 39.0000
                 POS: 27.0000
-> Orientation: undecided

CYT-EXT difference:  -0.37
-> Orientation: N-in

----------------------------------------------------------------------
Structure 7

Transmembrane segments included in this structure:
     Segment       1
 Loop length     1  1442
 K+R profile  0.00      
                       +      
CYT-EXT prof     -      
                    0.61      
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 0.25
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 6.00
 (NEG-POS)/(NEG+POS): 0.0000
                 NEG: 0.0000
                 POS: 0.0000
-> Orientation: N-in

CYT-EXT difference:  -0.61
-> Orientation: N-in

----------------------------------------------------------------------
Structure 8

Transmembrane segments included in this structure:
     Segment  
 Loop length  1464
 K+R profile     +      
                  
CYT-EXT prof  0.61      
                  
For CYT-EXT profile neg. values indicate cytoplasmic preference.


K+R difference: 0.00
Tm probability: 1.00
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): -4.00
 (NEG-POS)/(NEG+POS): 0.0444
                 NEG: 141.0000
                 POS: 129.0000
-> Orientation: N-out

CYT-EXT difference:   0.61
-> Orientation: N-out

----------------------------------------------------------------------

"tem38" 1464 
 2 22 #f 0.7
 331 351 #f 0.84375
 1041 1061 #f 0.758333



~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

NOW EXECUTING:   /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem38.___saps___
SAPS.  Version of April 11, 1996.
Date run: Tue Oct 31 18:34:55 2000

File: /people/maria/tem38.___saps___
ID   tem38_gi|1418928|emb|CAA98968.1|
DE   prepro-alpha1(I) collagen [Homo sapiens]

number of residues: 1464;   molecular weight: 138.9 kdal
 
         1  MFSFVDLRLL LLLAATALLT HGQEEGQVEG QDEDIPPITC VQNGLRYHDR DVWKPEPCRI 
        61  CVCDNGKVLC DDVICDETKN CPGAEVPEGE CCPVCPDGSE SPTDQETTGV EGPKGDTGPR 
       121  GPRGPAGPPG RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA PQLSYGYDEK STGGISVPGP 
       181  MGPSGPRGLP GPPGAPGPQG FQGPPGEPGE PGASGPMGPR GPPGPPGKNG DDGEAGKPGR 
       241  PGERGPPGPQ GARGLPGTAG LPGMKGHRGF SGLDGAKGDA GPAGPKGEPG SPGENGAPGQ 
       301  MGPRGLPGER GRPGAPGPAG ARGNDGATGA AGPPGPTGPA GPPGFPGAVG AKGEAGPQGP 
       361  RGSEGPQGVR GEPGPPGPAG AAGPAGNPGA DGQPGAKGAN GAPGIAGAPG FPGARGPSGP 
       421  QGPGGPPGPK GNSGEPGAPG SKGDTGAKGE PGPVGVQGPP GPAGEEGKRG ARGEPGPTGL 
       481  PGPPGERGGP GSRGFPGADG VAGPKGPAGE RGSPGPAGPK GSPGEAGRPG EAGLPGAKGL 
       541  TGSPGSPGPD GKTGPPGPAG QDGRPGPPGP PGARGQAGVM GFPGPKGAAG EPGKAGERGV 
       601  PGPPGAVGPA GKDGEAGAQG PPGPAGPAGE RGEQGPAGSP GFQGLPGPAG PPGEAGKPGE 
       661  QGVPGDLGAP GPSGARGERG FPGERGVQGP PGPAGPRGAN GAPGNDGAKG DAGAPGAPGS 
       721  QGAPGLQGMP GERGAAGLPG PKGDRGDAGP KGADGSPGKD GVRGLTGPIG PPGPAGAPGD 
       781  KGESGPSGPA GPTGARGAPG DRGEPGPPGP AGFAGPPGAD GQPGAKGEPG DAGAKGDAGP 
       841  PGPAGPAGPP GPIGNVGAPG AKGARGSAGP PGATGFPGAA GRVGPPGPSG NAGPPGPPGP 
       901  AGKEGGKGPR GETGPAGRPG EVGPPGPPGP AGEKGSPGAD GPAGAPGTPG PQGIAGQRGV 
       961  VGLPGQRGER GFPGLPGPSG EPGKQGPSGA SGERGPPGPM GPPGLAGPPG ESGREGAPAA 
      1021  EGSPGRDGSP GAKGDRGETG PAGPPGAPGA PGAPGPVGPA GKSGDRGETG PAGPAGPVGP 
      1081  VGARGPAGPQ GPRGDKGETG EQGDRGIKGH RGFSGLQGPP GPPGSPGEQG PSGASGPAGP 
      1141  RGPPGSAGAP GKDGLNGLPG PIGPPGPRGR TGDAGPVGPP GPPGPPGPPG PPSAGFDFSF 
      1201  LPQPPQEKAH DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR 
      1261  DLKMCHSDWK SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPSVAQK NWYISKNPKD 
      1321  KRHVWFGESM TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ 
      1381  TGNLKKALLL KGSNEIEIRA EGNSRFTYSV TVDGCTSHTG AWGKTVIEYK TTKSSRLPII 
      1441  DVAPLDVGAP DQEFGFDVGP VCFL

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  :141( 9.6%); C  : 18( 1.2%); D  : 66( 4.5%); E  : 75( 5.1%); F  : 27( 1.8%)
G++:390(26.6%); H  :  9( 0.6%); I- : 24( 1.6%); K  : 58( 4.0%); L--: 48( 3.3%)
M  : 13( 0.9%); N  : 28( 1.9%); P++:278(19.0%); Q  : 48( 3.3%); R  : 71( 4.8%)
S  : 61( 4.2%); T- : 43( 2.9%); V- : 47( 3.2%); W  :  6( 0.4%); Y- : 13( 0.9%)

KR      :  129 (  8.8%);   ED      :  141 (  9.6%);   AGP   ++:  809 ( 55.3%);
KRED    :  270 ( 18.4%);   KR-ED   :  -12 ( -0.8%);   FIKMNY- :  163 ( 11.1%);
LVIFM --:  159 ( 10.9%);   ST    - :  104 (  7.1%).
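
The group totals above (KR, ED, AGP, ...) are sums of the per-residue counts in the composition table. The sketch below recomputes them as a consistency check; the counts are copied from the table, while group_total and percent are hypothetical helper names and the one-decimal rounding is an assumption about how SAPS prints percentages.

```python
# Per-residue counts copied from the compositional analysis above.
COUNTS = {
    "A": 141, "C": 18, "D": 66, "E": 75, "F": 27,
    "G": 390, "H": 9, "I": 24, "K": 58, "L": 48,
    "M": 13, "N": 28, "P": 278, "Q": 48, "R": 71,
    "S": 61, "T": 43, "V": 47, "W": 6, "Y": 13,
}
TOTAL = sum(COUNTS.values())  # should equal the reported sequence length


def group_total(residues, counts=COUNTS):
    """Sum the counts of all residues named in the string `residues`."""
    return sum(counts[r] for r in residues)


def percent(n, total=TOTAL):
    """Share of the sequence, in percent, rounded to one decimal."""
    return round(100.0 * n / total, 1)


if __name__ == "__main__":
    for group in ["KR", "ED", "AGP", "FIKMNY", "LVIFM", "ST"]:
        n = group_total(group)
        print(f"{group:7s}: {n:4d} ({percent(n):5.1f}%)")
    print("KR-ED  :", group_total("KR") - group_total("ED"))
```

The counts add up to 1464 residues, and Gly, Ala and Pro alone account for 809 of them, which is what the "AGP ++" flag reflects.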

--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS
 
         1  00000-0+00 0000000000 000--000-0 0---000000 00000+00-+ -00+0-00+0 
        61  000-00+000 --000--0+0 0000-00-0- 000000-00- 000-0-0000 -00+0-000+ 
       121  00+0000000 +-00000000 0000000000 0000000000 0000000--+ 0000000000 
       181  000000+000 0000000000 000000-00- 000000000+ 0000000+00 --0-00+00+ 
       241  00-+000000 00+0000000 0000+00+00 000-00+0-0 00000+0-00 000-000000 
       301  000+0000-+ 0+00000000 0+00-00000 0000000000 0000000000 0+0-000000 
       361  +00-00000+ 0-00000000 0000000000 -00000+000 0000000000 0000+00000 
       421  000000000+ 0000-00000 0+0-000+0- 0000000000 0000--0++0 0+0-000000 
       481  00000-+000 00+00000-0 0000+0000- +00000000+ 0000-00+00 -000000+00 
       541  000000000- 0+00000000 0-0+000000 000+000000 00000+0000 -00+00-+00 
       601  0000000000 0+-0-00000 000000000- +0-0000000 0000000000 000-00+00- 
       661  00000-0000 00000+0-+0 000-+00000 000000+000 00000-00+0 -000000000 
       721  0000000000 0-+0000000 0+0-+0-000 +00-0000+- 00+0000000 000000000- 
       781  +0-0000000 00000+0000 -+0-000000 000000000- 00000+0-00 -000+0-000 
       841  0000000000 0000000000 0+00+00000 0000000000 0+00000000 0000000000 
       901  00+-00+00+ 0-00000+00 -000000000 00-+00000- 0000000000 0000000+00 
       961  000000+0-+ 0000000000 -00+000000 00-+000000 0000000000 -00+-00000 
      1021  -0000+-000 00+0-+0-00 0000000000 0000000000 0+00-+0-00 0000000000 
      1081  000+000000 00+0-+0-00 -00-+00+00 +000000000 0000000-00 0000000000 
      1141  +000000000 0+-0000000 0000000+0+ 00-0000000 0000000000 000000-000 
      1201  000000-+00 -00+00+0-- 0000+-+-0- 0-000+0000 00-00+00-0 0++000+00+ 
      1261  -0+0000-0+ 00-000-000 0000-00+00 000-00-000 000000000+ 00000+00+- 
      1321  ++00000-00 0-0000-000 000-00-000 00000+0000 -000000000 +000000-00 
      1381  0000++0000 +000-0-0+0 -000+00000 00-0000000 000+000-0+ 00+00+0000 
      1441  -0000-0000 -0-000-000 0000
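
The charge map above can be reproduced by coding K and R as "+", D and E as "-", and every other residue (including H) as "0". This mapping is inferred from comparing the map with the sequence listing, not taken from the SAPS documentation. A short sketch using the first 60 residues (charge_string and grouped are hypothetical names):

```python
POSITIVE = set("KR")
NEGATIVE = set("DE")

# First 60 residues of tem38, copied from the sequence listing above.
FRAGMENT = "MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI"


def charge_string(seq):
    """Map each residue to '+', '-' or '0' according to its charge class."""
    return "".join(
        "+" if aa in POSITIVE else "-" if aa in NEGATIVE else "0" for aa in seq
    )


def grouped(text, block=10):
    """Split a string into space-separated blocks, as in the SAPS layout."""
    return " ".join(text[i:i + block] for i in range(0, len(text), block))


if __name__ == "__main__":
    print(grouped(charge_string(FRAGMENT)))
```

For this fragment the result matches the first row of the map.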

A. CHARGE CLUSTERS.


Positive charge clusters (cmin =  9/30 or 12/45 or 15/60):  none


Negative charge clusters (cmin = 10/30 or 13/45 or 16/60):  none


Mixed charge clusters (cmin = 15/30 or 20/45 or 24/60):  none


B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.


C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     5 |   5 |   7 |  50 |  10 |  10 |  13 |  12 |  12 |  16 |   6 |   7 | 
lmin1     6 |   6 |   9 |  60 |  12 |  12 |  16 |  15 |  15 |  20 |   7 |   9 | 
lmin2     7 |   8 |  10 |  67 |  13 |  14 |  18 |  17 |  17 |  22 |   8 |  10 | 
 (Significance level: 0.010000; Minimal displayed length:  6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  +  runs >=   3:   0
  -  runs >=   3:   1, at   32;
  *  runs >=   5:   0
  0  runs >=  33:   1, at  133;

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.
There are no high scoring hydrophobic segments.
There are no high scoring transmembrane segments.


2. SPACINGS OF C.


H2N-39-C-17-C-2-C-1-C-6-C-4-C-5-C-9-C-C-2-C-1163-C-5-C-16-C-8-C-7-C-70-C-44-C-46-C-2-COOH
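
The spacing line above lists, between consecutive cysteines, the number of intervening residues; adjacent cysteines appear as "C-C" with no number. A minimal sketch of that encoding follows, assuming zero-length gaps are always omitted (how SAPS prints a cysteine at the very first or last position is not visible in this output). spacing_string is a hypothetical name.

```python
# First 60 residues of tem38, copied from the sequence listing above.
FRAGMENT = "MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI"


def spacing_string(seq, residue="C"):
    """Encode the gaps between occurrences of `residue` in SAPS-like notation."""
    parts = ["H2N"]
    gap = 0
    for aa in seq:
        if aa == residue:
            if gap:
                parts.append(str(gap))  # residues since the previous hit
            parts.append(residue)
            gap = 0
        else:
            gap += 1
    if gap:
        parts.append(str(gap))  # C-terminal tail
    parts.append("COOH")
    return "-".join(parts)


if __name__ == "__main__":
    print(spacing_string(FRAGMENT))
```

On the 60-residue fragment the string begins H2N-39-C-17-C, as in the full line; the trailing 2 counts only the residues left in the fragment.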


2*. SPACINGS OF C and H. (additional deluxe function for ALEX)


H2N-20-H-18-C-7-H-9-C-2-C-1-C-6-C-4-C-5-C-9-C-C-2-C-171-H-842-H-99-H-48-C-5-C-H-15-C-8-C-7-C-23-H-45-H-C-44-C-2-H-43-C-2-COOH

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length:  5

Aligned matching blocks:


[ 112- 116]   GPKGD
[ 740- 744]   GPKGD

______________________________

[ 114- 118]   KGDTG
[ 442- 446]   KGDTG

______________________________

[         ]--------[         ]--------[ 118- 122]-(  -5)-[ 118- 125]-(  -8)-
[         ]--------[         ]--------[ 121- 125]-(  -5)-[ 121- 128]--------
[ 206- 207]-(   4)-[ 212- 224]-(  -7)-[ 218- 222]-(  -5)-[ 218- 225]-( -11)-
[1127-1128]-(   4)-[1133-1145]-(  -7)-[1139-1143]--------[         ]--------


[ 118- 130]
[         ]
[ 215- 227]
[         ]


[ 212- 224]   GASGPMGPRGPPG
[1133-1145]   GASGPAGPRGPPG

[ 118- 122]   GPRGP
[ 121- 125]   GPRGP
[ 218- 222]   GPRGP
[1139-1143]   GPRGP

[ 118- 125]   GPRGPRGP
[ 121- 128]   GPRGPAGP
[ 218- 225]   GPRGPPGP

[ 118- 130]   GPRGPRGPAGPPG
[ 215- 227]   GPMGPRGPPGPPG

______________________________

[ 123- 130]   RGPAGPPG
[ 220- 227]   RGPPGPPG
[ 415- 424]   RGPSGPQGPG
[ 865- 872]   RGSAGP__PG

with superset:
  [ 123- 128]   RGPAGP
  [ 220- 225]   RGPPGP
  [ 244- 249]   RGPPGP
  [ 415- 420]   RGPSGP
  [ 745- 750]   RGDAGP
  [ 865- 870]   RGSAGP
  [ 994- 999]   RGPPGP
  [1084-1089]   RGPAGP

______________________________

[ 126- 130]   AGPPG
[ 331- 336]   AGPPGP
[ 340- 344]   AGPPG
[ 649- 654]   AGPPGE
[ 814- 819]   AGPPGA
[ 838- 843]   AGPPGP
[ 847- 852]   AGPPGP
[ 868- 873]   AGPPGA
[ 892- 897]   AGPPGP
[1006-1011]   AGPPGE
[1042-1047]   AGPPGA

______________________________

[ 129- 136]   PGRDGIPG
[1024-1031]   PGRDGSPG

______________________________

[ 139- 149]   GLPGPPGPPGP
[ 188- 198]   GLPGPPGAPGP

with superset:
  [ 139- 143]   GLPGP
  [ 188- 192]   GLPGP
  [ 479- 483]   GLPGP
  [ 644- 648]   GLPGP
  [ 737- 741]   GLPGP
  [ 974- 978]   GLPGP
  [1157-1161]   GLPGP

and:
  [ 139- 145]   GLPGPPG
  [ 188- 194]   GLPGPPG
  [ 479- 485]   GLPGPPG

______________________________

[ 145- 155]   GPPGPPGPPGL
[ 995-1005]   GPPGPMGPPGL

______________________________

[ 179- 186]   GPMGPSGP
[ 215- 222]   GPMGPRGP

with superset:
  [ 179- 183]   GPMGP
  [ 215- 219]   GPMGP
  [ 998-1002]   GPMGP

______________________________

[ 185- 194]   GPRGLPGPPG
[ 218- 227]   GPRGPPGPPG
[ 476- 485]   GPTGLPGPPG

with superset:
  [ 185- 191]   GPRGLPG
  [ 218- 224]   GPRGPPG
  [ 251- 257]   GARGLPG
  [ 302- 308]   GPRGLPG
  [ 476- 482]   GPTGLPG
  [1139-1145]   GPRGPPG

______________________________

[ 187- 191]   RGLPG
[ 253- 257]   RGLPG
[ 304- 308]   RGLPG

______________________________

[ 191- 197]   GPPGAPG
[1043-1049]   GPPGAPG

______________________________

[ 196- 200]   PGPQG
[ 247- 251]   PGPQG
[ 949- 953]   PGPQG

______________________________

[ 203- 207]   GPPGE
[ 482- 486]   GPPGE
[ 650- 654]   GPPGE
[1007-1011]   GPPGE

______________________________

[ 208- 227]   PGEPGASGPMGPRGPPGPPG
[ 409- 428]   PGFPGARGPSGPQGPGGPPG

with superset:
  [ 190- 195]   PGPPGA
  [ 208- 213]   PGEPGA
  [ 343- 348]   PGFPGA
  [ 409- 414]   PGFPGA
  [ 568- 573]   PGPPGA
  [ 601- 606]   PGPPGA
  [1045-1050]   PGAPGA

and:
  [ 190- 198]   PGPPGAPGP
  [ 208- 216]   PGEPGASGP
  [ 409- 417]   PGFPGARGP
  [ 601- 609]   PGPPGAVGP

______________________________

[ 191- 192]-(   4)-[ 197- 198]-(   4)-[ 203- 204]-(   4)-[ 209- 213]
[ 416- 417]-(   4)-[ 422- 423]-(   4)-[ 428- 429]-(   4)-[ 434- 438]

[ 209- 213]   GEPGA
[ 434- 438]   GEPGA

______________________________

[ 205- 213]-(   3)-[ 217- 224]
[ 289- 297]-(   3)-[ 301- 308]

[ 205- 213]   PGEPGEPGA
[ 289- 297]   PGSPGENGA

[ 217- 224]   MGPRGPPG
[ 301- 308]   MGPRGLPG

______________________________

[ 232- 236]   DGEAG
[ 613- 617]   DGEAG

______________________________

[ 233- 234]-(   4)-[ 239- 249]
[ 911- 912]-(   4)-[ 917- 927]

[ 239- 249]   GRPGERGPPGP
[ 917- 927]   GRPGEVGPPGP

with superset:
  [ 239- 243]   GRPGE
  [ 527- 531]   GRPGE
  [ 917- 921]   GRPGE

______________________________

[ 241- 248]   PGERGPPG
[ 307- 314]   PGERGRPG
[ 484- 491]   PGERGGPG

with superset:
  [ 241- 245]   PGERG
  [ 307- 311]   PGERG
  [ 484- 488]   PGERG
  [ 682- 686]   PGERG
  [ 730- 734]   PGERG

______________________________

[         ]--------[ 244- 248]-(  -5)-[ 244- 252]
[ 982- 999]-(  -6)-[ 994- 998]--------[         ]
[1126-1143]-(  -3)-[1141-1145]-(  -5)-[1141-1149]

[ 982- 999]   PGKQGPSGASGERGPPGP
[1126-1143]   PGEQGPSGASGPAGPRGP

with superset:
  [ 671- 675]   GPSGA
  [ 986- 990]   GPSGA
  [1130-1134]   GPSGA

[ 244- 248]   RGPPG
[ 994- 998]   RGPPG
[1141-1145]   RGPPG

[ 244- 252]   RGPPGPQGA
[1141-1149]   RGPPGSAGA

______________________________

[ 259- 266]-(  -8)-[ 259- 269]
[ 532- 539]--------[         ]
[ 736- 743]-(  -8)-[ 736- 746]

[ 259- 266]   AGLPGMKG
[ 532- 539]   AGLPGAKG
[ 736- 743]   AGLPGPKG

[ 259- 269]   AGLPGMKGHRG
[ 736- 746]   AGLPGPKGDRG

______________________________

[ 265- 273]   KGHRGFSGL
[1108-1116]   KGHRGFSGL

______________________________

[ 281- 290]-(  -8)-[ 283- 287]
[         ]--------[ 502- 506]
[ 515- 524]-(  -8)-[ 517- 521]
[         ]--------[ 748- 752]

[ 281- 290]   GPAGPKGEPG
[ 515- 524]   GPAGPKGSPG

[ 283- 287]   AGPKG
[ 502- 506]   AGPKG
[ 517- 521]   AGPKG
[ 748- 752]   AGPKG

______________________________

[ 286- 290]   KGEPG
[ 448- 452]   KGEPG
[ 826- 830]   KGEPG

______________________________

[ 289- 294]   PGSPGE
[ 544- 548]   PGSPG
[1123-1128]   PGSPGE

______________________________

[ 284- 285]-(   4)-[ 290- 294]
[ 515- 516]-(   4)-[ 521- 525]
[1118-1119]-(   4)-[1124-1128]

[ 290- 294]   GSPGE
[ 521- 525]   GSPGE
[1124-1128]   GSPGE

______________________________

[ 314- 323]   GAPGPAGARG
[ 668- 677]   GAPGPSGARG

with superset:
  [ 194- 198]   GAPGP
  [ 314- 318]   GAPGP
  [ 668- 672]   GAPGP
  [1052-1056]   GAPGP

______________________________

[ 320- 335]   GARGNDGATGAAGPPG
[ 701- 716]   GAPGNDGAKGDAGAPG

with superset:
  [ 274- 281]   DGAKGDAG
  [ 325- 332]   DGATGAAG
  [ 706- 713]   DGAKGDAG

______________________________

[ 337- 341]-(  -5)-[ 337- 351]-( -15)-[ 337- 344]-(  10)-[ 355- 362]
[ 913- 917]--------[         ]--------[ 913- 920]--------[         ]
[1039-1043]-(  -5)-[1039-1053]-( -15)-[1039-1046]--------[         ]
[1069-1073]--------[         ]--------[         ]--------[1087-1094]

[ 337- 341]   TGPAG
[ 913- 917]   TGPAG
[1039-1043]   TGPAG
[1069-1073]   TGPAG

[ 337- 351]   TGPAGPPGFPGAVGA
[1039-1053]   TGPAGPPGAPGAPGA

[ 337- 344]   TGPAGPPG
[ 913- 920]   TGPAGRPG
[1039-1046]   TGPAGPPG

[ 355- 362]   AGPQGPRG
[1087-1094]   AGPQGPRG

with superset:
  [ 356- 360]   GPQGP
  [ 419- 423]   GPQGP
  [1088-1092]   GPQGP

______________________________

[ 343- 350]   PGFPGAVG
[ 601- 608]   PGPPGAVG

______________________________

[         ]--------[ 344- 348]
[         ]--------[ 410- 414]
[ 446- 452]-(  41)-[ 494- 498]
[ 824- 830]-(  44)-[ 875- 879]

[ 446- 452]   GAKGEPG
[ 824- 830]   GAKGEPG

with superset:
  [ 350- 354]   GAKGE
  [ 446- 450]   GAKGE
  [ 824- 828]   GAKGE

[ 344- 348]   GFPGA
[ 410- 414]   GFPGA
[ 494- 498]   GFPGA
[ 875- 879]   GFPGA

______________________________

[ 370- 384]   RGEPGPPGPAGAAGP
[ 802- 816]   RGEPGPPGPAGFAGP

with superset:
  [ 370- 377]   RGEPGPPG
  [ 469- 476]   RGARGEPG
  [ 676- 683]   RGERGFPG
  [ 802- 809]   RGEPGPPG
  [ 967- 974]   RGERGFPG

______________________________

[ 377- 378]-(   3)-[ 382- 389]
[ 839- 840]-(   3)-[ 844- 851]

[ 382- 389]   AGPAGNPG
[ 844- 851]   AGPAGPPG

with superset:
  [ 280- 284]   AGPAG
  [ 382- 386]   AGPAG
  [ 625- 629]   AGPAG
  [ 844- 848]   AGPAG
  [1072-1076]   AGPAG

______________________________

[ 388- 392]-( -11)-[ 382- 398]--------[         ]--------[ 433- 437]
[ 496- 500]--------[         ]--------[ 496- 503]--------[         ]
[ 817- 821]-( -11)-[ 811- 827]--------[         ]--------[         ]
[ 937- 941]--------[         ]--------[ 937- 944]-(  34)-[ 979- 983]

[ 388- 392]   PGADG
[ 496- 500]   PGADG
[ 817- 821]   PGADG
[ 937- 941]   PGADG

with superset:
  [ 388- 396]   PGADGQPGA
  [ 817- 825]   PGADGQPGA
  [ 937- 945]   PGADGPAGA

[ 382- 398]   AGPAGNPGADGQPGAKG
[ 811- 827]   AGFAGPPGADGQPGAKG

with superset:
  [ 388- 396]   PGADGQPGA
  [ 817- 825]   PGADGQPGA
  [ 937- 945]   PGADGPAGA

[ 496- 503]   PGADGVAG
[ 937- 944]   PGADGPAG

[ 433- 437]   SGEPG
[ 979- 983]   SGEPG

______________________________

[ 398- 408]   GANGAPGIAGA
[ 698- 708]   GANGAPGNDGA

with superset:
  [ 295- 299]   NGAPG
  [ 400- 404]   NGAPG
  [ 700- 704]   NGAPG

______________________________

[ 413- 423]   GARGPSGPQGP
[1082-1092]   GARGPAGPQGP

______________________________

[ 416- 423]   GPSGPQGP
[ 785- 792]   GPSGPAGP

with superset:
  [ 182- 186]   GPSGP
  [ 416- 420]   GPSGP
  [ 785- 789]   GPSGP

______________________________

[ 409- 416]-(  10)-[ 427- 431]-(  -5)-[ 427- 437]
[ 568- 575]-(   7)-[ 583- 587]-(  -5)-[ 583- 593]
[         ]--------[ 739- 743]--------[         ]

[ 409- 416]   PGFPGARG
[ 568- 575]   PGPPGARG

[ 427- 431]   PGPKG
[ 583- 587]   PGPKG
[ 739- 743]   PGPKG

[ 427- 437]   PGPKGNSGEPG
[ 583- 593]   PGPKGAAGEPG

______________________________

[ 449- 453]   GEPGP
[ 473- 477]   GEPGP
[ 803- 807]   GEPGP

______________________________

[ 446- 447]-(   3)-[ 451- 455]
[1049-1050]-(   3)-[1054-1058]

[ 451- 455]   PGPVG
[1054-1058]   PGPVG

______________________________

[ 473- 474]-(   4)-[ 479- 486]
[ 968- 969]-(   4)-[ 974- 981]

[ 479- 486]   GLPGPPGE
[ 974- 981]   GLPGPSGE

______________________________

[ 493- 512]   RGFPGADGVAGPKGPAGERG
[ 679- 698]   RGFPGERGVQGPPGPAGPRG

with superset:
  [ 493- 497]   RGFPG
  [ 679- 683]   RGFPG
  [ 970- 974]   RGFPG

______________________________

[ 503- 518]-( -11)-[ 508- 516]
[         ]--------[ 595- 603]
[ 623- 638]-( -11)-[ 628- 636]

[ 503- 518]   GPKGPAGERGSPGPAG
[ 623- 638]   GPAGPAGERGEQGPAG

[ 508- 516]   AGERGSPGP
[ 595- 603]   AGERGVPGP
[ 628- 636]   AGERGEQGP

______________________________

[ 512- 516]   GSPGP
[ 545- 549]   GSPGP

______________________________

[ 514- 524]   PGPAGPKGSPG
[ 928- 938]   PGPAGEKGSPG

______________________________

[ 523- 536]   PGEAGRPGEAGLPG
[ 646- 659]   PGPAGPPGEAGKPG

with superset:
  [ 526- 531]   AGRPGE
  [ 649- 654]   AGPPGE
  [ 655- 660]   AGKPGE
  [ 916- 921]   AGRPGE
  [1006-1011]   AGPPGE

______________________________

[ 569- 576]   GPPGARGQ
[ 815- 822]   GPPGADGQ

with superset:
  [ 569- 573]   GPPGA
  [ 602- 606]   GPPGA
  [ 815- 819]   GPPGA
  [ 869- 873]   GPPGA
  [1043-1047]   GPPGA

______________________________

[ 584- 588]   GPKGA
[ 749- 753]   GPKGA

______________________________

[ 590- 594]   GEPGK
[ 980- 984]   GEPGK

______________________________

[ 601- 612]-(  -9)-[ 604- 612]
[         ]--------[ 895- 903]
[1051-1062]-(  -9)-[1054-1062]

[ 601- 612]   PGPPGAVGPAGK
[1051-1062]   PGAPGPVGPAGK

[ 604- 612]   PGAVGPAGK
[ 895- 903]   PGPPGPAGK
[1054-1062]   PGPVGPAGK

______________________________

[ 614- 615]-(   3)-[ 619- 627]
[ 683- 684]-(   3)-[ 688- 696]

[ 619- 627]   QGPPGPAGP
[ 688- 696]   QGPPGPAGP

with superset:
  [ 202- 206]   QGPPG
  [ 457- 461]   QGPPG
  [ 619- 623]   QGPPG
  [ 688- 692]   QGPPG
  [1117-1121]   QGPPG

and:
  [ 457- 463]   QGPPGPA
  [ 619- 625]   QGPPGPA
  [ 688- 694]   QGPPGPA

______________________________

[ 619- 630]   QGPPGPAGPAGE
[ 643- 654]   QGLPGPAGPPGE

with superset:
  [ 421- 426]   QGPGGP
  [ 457- 462]   QGPPGP
  [ 619- 624]   QGPPGP
  [ 643- 648]   QGLPGP
  [ 688- 693]   QGPPGP
  [1117-1122]   QGPPGP

and:
  [ 457- 464]   QGPPGPAG
  [ 619- 626]   QGPPGPAG
  [ 643- 650]   QGLPGPAG
  [ 688- 695]   QGPPGPAG

______________________________

[ 623- 627]-(   4)-[ 632- 636]
[1118-1122]-(   4)-[1127-1131]

[ 623- 627]   GPAGP
[1118-1122]   GPPGP

[ 632- 636]   GEQGP
[1127-1131]   GEQGP

______________________________

[ 644- 648]-(  -5)-[ 644- 653]
[ 737- 741]--------[         ]
[ 974- 978]--------[         ]
[1157-1161]-(  -5)-[1157-1166]

[ 644- 648]   GLPGP
[ 737- 741]   GLPGP
[ 974- 978]   GLPGP
[1157-1161]   GLPGP

[ 644- 653]   GLPGPAGPPG
[1157-1166]   GLPGPIGPPG

______________________________

[ 650- 653]-(   4)-[ 658- 662]
[1118-1121]-(   4)-[1126-1130]

[ 650- 653]   GPPG
[1118-1121]   GPPG

[ 658- 662]   PGEQG
[1126-1130]   PGEQG

______________________________

[ 670- 674]--------[         ]
[ 886- 890]-(  44)-[ 935- 939]
[ 976- 980]-(  47)-[1028-1032]

[ 670- 674]   PGPSG
[ 886- 890]   PGPSG
[ 976- 980]   PGPSG

[ 935- 939]   GSPGA
[1028-1032]   GSPGA

______________________________

[ 682- 686]   PGERG
[ 730- 734]   PGERG

______________________________

[ 703- 716]   PGNDGAKGDAGAPG
[ 829- 842]   PGDAGAKGDAGPPG

with superset:
  [ 275- 279]   GAKGD
  [ 707- 711]   GAKGD
  [ 833- 837]   GAKGD
  [1031-1035]   GAKGD

and:
  [ 275- 281]   GAKGDAG
  [ 707- 713]   GAKGDAG
  [ 833- 839]   GAKGDAG

______________________________

[ 710- 714]   GDAGA
[ 830- 834]   GDAGA

______________________________

[ 707- 708]-(   3)-[ 712- 722]
[ 938- 939]-(   3)-[ 943- 953]

[ 712- 722]   AGAPGAPGSQG
[ 943- 953]   AGAPGTPGPQG

with superset:
  [ 406- 410]   AGAPG
  [ 712- 716]   AGAPG
  [ 775- 779]   AGAPG
  [ 943- 947]   AGAPG
  [1147-1151]   AGAPG

and:
  [ 406- 413]   AGAPGFPG
  [ 712- 719]   AGAPGAPG
  [ 943- 950]   AGAPGTPG

______________________________

[ 739- 750]   PGPKGDRGDAGP
[1030-1041]   PGAKGDRGETGP

______________________________

[ 754- 758]   DGSPG
[1027-1031]   DGSPG

______________________________

[ 757- 774]   PGKDGVRGLTGPIGPPGP
[1150-1167]   PGKDGLNGLPGPIGPPGP

______________________________

[ 773- 779]   GPAGAPG
[ 941- 947]   GPAGAPG

______________________________

[ 775- 779]   AGAPG
[ 943- 947]   AGAPG
[1147-1151]   AGAPG

______________________________

[ 767- 771]-(   4)-[ 776- 791]
[ 788- 792]-(   4)-[ 797- 812]

[ 767- 771]   GPIGP
[ 788- 792]   GPAGP

[ 776- 791]   GAPGDKGESGPSGPAG
[ 797- 812]   GAPGDRGEPGPPGPAG

______________________________

[ 770- 774]-(   4)-[ 779- 783]
[1085-1089]-(   4)-[1094-1098]

[ 770- 774]   GPPGP
[1085-1089]   GPAGP

[ 779- 783]   GDKGE
[1094-1098]   GDKGE

______________________________

[ 800- 819]   GDRGEPGPPGPAGFAGPPGA
[1064-1083]   GDRGETGPAGPAGPVGPVGA

______________________________

[ 836- 852]   GDAGPPGPAGPAGPPGP
[1172-1188]   GDAGPVGPPGPPGPPGP

______________________________

[ 850- 854]   PGPIG
[1159-1163]   PGPIG

______________________________

[ 859- 870]   PGAKGARGSAGP
[1030-1041]   PGAKGDRGETGP

______________________________

[ 884- 885]-(   3)-[ 889- 902]
[1130-1131]-(   3)-[1135-1148]

[ 889- 902]   SGNAGPPGPPGPAG
[1135-1148]   SGPAGPRGPPGSAG

with superset:
  [ 214- 219]   SGPMGP
  [ 418- 423]   SGPQGP
  [ 787- 792]   SGPAGP
  [ 889- 894]   SGNAGP
  [1135-1140]   SGPAGP

and:
  [ 214- 224]   SGPMGPRGPPG
  [ 418- 428]   SGPQGPGGPPG
  [ 889- 899]   SGNAGPPGPPG
  [1135-1145]   SGPAGPRGPPG

______________________________

[ 922- 930]   VGPPGPPGP
[1177-1185]   VGPPGPPGP

with superset:
  [ 883- 888]   VGPPGP
  [ 922- 927]   VGPPGP
  [1177-1182]   VGPPGP

______________________________

[1055-1059]   GPVGP
[1076-1080]   GPVGP
[1175-1179]   GPVGP

______________________________

[1114-1125]   SGLQGPPGPPGS
[1135-1146]   SGPAGPRGPPGS


______________________________
Simple tandem repeat: 

[ 523- 528]    PGEAGR
[ 529- 534]    PGEAGL
[ 535- 540]    PGAKGL


Highly repetitive regions:

From  118 to 1192 with major motif GERGPPGPA.
From  124 to 1141 with major motif GPAGPP.
From  138 to 1192 with major motif PGPPGPP.
From  141 to 1192 with major motif PGPPGPA.
From  141 to 1191 with major motif PGPPGP.
From  142 to 1192 with major motif GPPGPP.
From  187 to 1072 with major motif RGEPGPP.
From  280 to 1180 with major motif AGPPGPP.
From  280 to 1182 with major motif AGPPGPRGP.
From  316 to  933 with major motif PGPAGP.

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
   (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length:  9

Aligned matching blocks:


[ 747- 761]   -ssp+ss-sops+-s
[1015-1028]   -ssp_ss-sops+-s

______________________________

[ 776- 800]   ssps-+s-ospospssposs+ssps
[1031-1055]   ss+s-+s-ospssppsspsspssps

with superset:
  [ 112- 122]   sp+s-osp+sp
  [ 275- 285]   ss+s-sspssp
  [ 509- 519]   s-+sopspssp
  [ 779- 789]   s-+s-osposp
  [ 800- 810]   s-+s-psppsp
  [1034-1044]   s-+s-ospssp
  [1064-1074]   s-+s-ospssp

and:
  [ 779- 791]   s-+s-osposp_ss
  [ 800- 812]   s-+s-psppspss
  [1034-1047]   s-+s-ospssppss
  [1064-1076]   s-+s-ospsspss

and:
  [ 779- 798]   s-+s-ospos_pssposs+ss
  [ 800- 819]   s-+s-pspps_pssissppss
  [1034-1053]   s-+s-ospssppssps_spss


--------------------------------------------------------------------------------

MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets:  83  (Expected range: 118--189) low

       1  ........LL LLLAA..LL. ...EE..... .....PP... .......... .......... 
      61  .......... DD........ .......... CC........ ......TT.. .......... 
     121  .......PP. .......... ..PP.PP.PP .PP..GG... .......... ..GG...... 
     181  .......... .PP....... ...PP..... .......... .PP.PP.... DD........ 
     241  .....PP... .......... .......... .......... .......... .......... 
     301  .......... .......... .........A A.PP...... .PP....... .......... 
     361  .......... ....PP.... AA........ .......... .......... .......... 
     421  ...GGPP... .......... .......... ........PP ....EE.... .......... 
     481  ..PP...GG. .......... .......... .......... .......... .......... 
     541  .......... ....PP.... ......PP.P P......... .......AA. .......... 
     601  ..PP...... .......... PP........ .......... .......... PP........ 
     661  .......... .......... .........P P......... .......... .......... 
     721  .......... ....AA.... .......... .......... .......... PP........ 
     781  .......... .......... ......PP.. .....PP... .......... .........P 
     841  P.......PP .......... .........P P.......AA ....PP.... ...PP.PP.. 
     901  ....GG.... .......... ...PP.PP.. .......... .......... .........V 
     961  V......... .......... .......... .....PP... .PP....PP. ........AA 
    1021  .......... .......... ...PP..... .......... .......... .......... 
    1081  .......... .......... .......... ........PP .PP....... .......... 
    1141  ..PP...... .......... ...PP..... ........PP .PP.PP.PP. PP........ 
    1201  ...PP..... .GG.YY..DD ..VV...... ..TT.....Q Q......... .......... 
    1261  .......... .......... .......... .......... .......... .......... 
    1321  .......... ........GG .......... .......... .......... ........QQ 
    1381  ....KK.LLL .......... .......... .......... .......... TT.SS...II 
    1441  .......... .......... ....

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 32   (6-10) 11   (11-20) 21   (>=21) 20

3. Clusters of amino acid multiplets (cmin = 10/30 or 13/45 or 16/60):  none

4. Significant specific amino acid altplet counts:
Letters		Observed (Critical number)
AG		113 (93)
 at   83 (l= 2)   126 (l= 2)   194 (l= 2)   212 (l= 2)   235 (l= 2)  
     251 (l= 2)   259 (l= 2)   275 (l= 2)   280 (l= 2)   283 (l= 2)  
     296 (l= 2)   314 (l= 2)   319 (l= 3)   326 (l= 2)   329 (l= 2)  
     331 (l= 2)   340 (l= 2)   347 (l= 2)   350 (l= 2)   355 (l= 2)  
     379 (l= 3)   382 (l= 2)   385 (l= 2)   389 (l= 2)   395 (l= 2)  
     398 (l= 2)   401 (l= 2)   406 (l= 3)   413 (l= 2)   437 (l= 2)  
     446 (l= 2)   463 (l= 2)   470 (l= 2)   497 (l= 2)   502 (l= 2)  
     508 (l= 2)   517 (l= 2)   526 (l= 2)   532 (l= 2)   536 (l= 2)  
     559 (l= 2)   572 (l= 2)   577 (l= 2)   587 (l= 2)   589 (l= 2)  
     595 (l= 2)   605 (l= 2)   610 (l= 2)   616 (l= 3)   625 (l= 2)  
     628 (l= 2)   637 (l= 2)   649 (l= 2)   655 (l= 2)   668 (l= 2)  
     674 (l= 2)   694 (l= 2)   698 (l= 2)   701 (l= 2)   707 (l= 2)  
     712 (l= 3)   716 (l= 2)   722 (l= 2)   734 (l= 2)   736 (l= 2)  
     748 (l= 2)   752 (l= 2)   775 (l= 3)   790 (l= 2)   794 (l= 2)  
     797 (l= 2)   811 (l= 2)   814 (l= 2)   818 (l= 2)   824 (l= 2)  
     832 (l= 3)   838 (l= 2)   844 (l= 2)   847 (l= 2)   857 (l= 2)  
     860 (l= 2)   863 (l= 2)   868 (l= 2)   872 (l= 2)   878 (l= 2)  
     880 (l= 2)   892 (l= 2)   901 (l= 2)   916 (l= 2)   931 (l= 2)  
     938 (l= 2)   943 (l= 3)   955 (l= 2)   989 (l= 2)  1006 (l= 2)  
    1016 (l= 2)  1031 (l= 2)  1042 (l= 2)  1046 (l= 2)  1049 (l= 2)  
    1052 (l= 2)  1060 (l= 2)  1072 (l= 2)  1075 (l= 2)  1082 (l= 2)  
    1087 (l= 2)  1133 (l= 2)  1138 (l= 2)  1147 (l= 3)  1174 (l= 2)  
    1194 (l= 2)  1420 (l= 2)  1448 (l= 2)
GP		203 (156)
 at   82 (l= 2)   112 (l= 2)   118 (l= 2)   121 (l= 2)   124 (l= 2)  
     127 (l= 2)   129 (l= 2)   135 (l= 2)   138 (l= 2)   141 (l= 3)  
     144 (l= 3)   147 (l= 3)   150 (l= 3)   153 (l= 2)   178 (l= 3)  
     182 (l= 2)   185 (l= 2)   190 (l= 3)   193 (l= 2)   196 (l= 3)  
     203 (l= 2)   205 (l= 2)   208 (l= 2)   211 (l= 2)   215 (l= 2)  
     218 (l= 2)   221 (l= 2)   223 (l= 3)   226 (l= 2)   238 (l= 2)  
     241 (l= 2)   245 (l= 2)   247 (l= 3)   256 (l= 2)   262 (l= 2)  
     281 (l= 2)   284 (l= 2)   289 (l= 2)   292 (l= 2)   298 (l= 2)  
     302 (l= 2)   307 (l= 2)   313 (l= 2)   316 (l= 3)   332 (l= 2)  
     334 (l= 3)   338 (l= 2)   341 (l= 2)   343 (l= 2)   346 (l= 2)  
     356 (l= 2)   359 (l= 2)   365 (l= 2)   373 (l= 3)   376 (l= 3)  
     383 (l= 2)   388 (l= 2)   394 (l= 2)   403 (l= 2)   409 (l= 2)  
     412 (l= 2)   416 (l= 2)   419 (l= 2)   422 (l= 3)   425 (l= 2)  
     427 (l= 3)   436 (l= 2)   439 (l= 2)   451 (l= 3)   458 (l= 2)  
     460 (l= 3)   475 (l= 3)   481 (l= 3)   484 (l= 2)   489 (l= 3)  
     496 (l= 2)   503 (l= 2)   506 (l= 2)   514 (l= 3)   518 (l= 2)  
     523 (l= 2)   529 (l= 2)   535 (l= 2)   544 (l= 2)   547 (l= 3)  
     554 (l= 2)   556 (l= 3)   565 (l= 3)   568 (l= 3)   571 (l= 2)  
     583 (l= 3)   592 (l= 2)   601 (l= 3)   604 (l= 2)   608 (l= 2)  
     620 (l= 2)   622 (l= 3)   626 (l= 2)   635 (l= 2)   640 (l= 2)  
     646 (l= 3)   650 (l= 2)   652 (l= 2)   658 (l= 2)   664 (l= 2)  
     670 (l= 3)   682 (l= 2)   689 (l= 2)   691 (l= 3)   695 (l= 2)  
     703 (l= 2)   715 (l= 2)   718 (l= 2)   724 (l= 2)   730 (l= 2)  
     739 (l= 3)   749 (l= 2)   757 (l= 2)   767 (l= 2)   770 (l= 2)  
     772 (l= 3)   778 (l= 2)   785 (l= 2)   788 (l= 2)   791 (l= 2)  
     799 (l= 2)   805 (l= 3)   808 (l= 3)   815 (l= 2)   817 (l= 2)  
     823 (l= 2)   829 (l= 2)   839 (l= 2)   841 (l= 3)   845 (l= 2)  
     848 (l= 2)   850 (l= 3)   859 (l= 2)   869 (l= 2)   871 (l= 2)  
     877 (l= 2)   884 (l= 2)   886 (l= 3)   893 (l= 2)   895 (l= 3)  
     898 (l= 3)   908 (l= 2)   914 (l= 2)   919 (l= 2)   923 (l= 2)  
     925 (l= 3)   928 (l= 3)   937 (l= 2)   941 (l= 2)   946 (l= 2)  
     949 (l= 3)   964 (l= 2)   973 (l= 2)   976 (l= 3)   982 (l= 2)  
     986 (l= 2)   995 (l= 2)   997 (l= 3)  1001 (l= 2)  1003 (l= 2)  
    1007 (l= 2)  1009 (l= 2)  1024 (l= 2)  1030 (l= 2)  1040 (l= 2)  
    1043 (l= 2)  1045 (l= 2)  1048 (l= 2)  1051 (l= 2)  1054 (l= 3)  
    1058 (l= 2)  1070 (l= 2)  1073 (l= 2)  1076 (l= 2)  1079 (l= 2)  
    1085 (l= 2)  1088 (l= 2)  1091 (l= 2)  1118 (l= 2)  1120 (l= 3)  
    1123 (l= 2)  1126 (l= 2)  1130 (l= 2)  1136 (l= 2)  1139 (l= 2)  
    1142 (l= 2)  1144 (l= 2)  1150 (l= 2)  1159 (l= 3)  1163 (l= 2)  
    1165 (l= 3)  1175 (l= 2)  1178 (l= 2)  1180 (l= 3)  1183 (l= 3)  
    1186 (l= 3)  1189 (l= 3)  1459 (l= 2)

5. Long amino acid multiplets (>= 5; Letter/Length/Position):
    L/5/9
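
Note on multiplets: a minimal sketch of how runs of identical residues can be located, assuming a multiplet is any run of two or more identical letters. The exact counting rules behind the total above are not stated in this report, so the sketch is not expected to reproduce that total; `multiplets` is a hypothetical helper name. The input is residues 1-20 of tem38.

```python
from itertools import groupby


def multiplets(seq, min_len=2):
    """Return (letter, length, 1-based start) for each run of identical residues."""
    runs = []
    pos = 1
    for letter, group in groupby(seq):
        n = len(list(group))
        if n >= min_len:
            runs.append((letter, n, pos))
        pos += n
    return runs


head = "MFSFVDLRLLLLLAATALLT"  # residues 1-20 of tem38
found = multiplets(head)
# Same Letter/Length/Position layout as item 5 (long multiplets, >= 5).
long_ones = [m for m in found if m[1] >= 5]
print(found)
print(long_ones)
```

On this 20-residue prefix the runs fall at the same positions as the letters shown in the first row of the multiplet map.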


B. CHARGE ALPHABET.

1. Total number of charge multiplets:  12  (Expected range:   8-- 37)
   4 +plets (f+: 8.8%), 8 -plets (f-: 9.6%)
   Total number of charge altplets: 32 (Critical number: 42)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 2   (6-10) 1   (11-20) 0   (>=21) 10

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core:  4; !-core: 5)

Location	Period	Element		Copies	Core	Errors
   9-  13	 1	L         	 5	 5 !	 0
 109- 159	 3	G..       	17	17 !	 0
 173-1192	 3	G..       	338	280 !	 2
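
Note on the period-3 element: a rough sketch of how an error-free run of `G..` can be measured. It only handles zero errors, whereas the table above also reports runs with errors, so it is an illustration and not the program's algorithm; `longest_periodic_run` is a hypothetical name. The segment is residues 107-165 of tem38.

```python
def longest_periodic_run(seq, residue="G", period=3, offset=1):
    """Longest error-free run of `residue` recurring every `period` positions.

    Returns (start, end, copies); `offset` is the sequence position of seq[0].
    """
    best = (0, 0, 0)
    for frame in range(period):
        copies, start = 0, 0
        for i in range(frame, len(seq), period):
            if seq[i] == residue:
                if copies == 0:
                    start = i
                copies += 1
                if copies > best[2]:
                    # The last copy spans a full period, clipped to the sequence end.
                    end = min(start + period * copies, len(seq)) - 1
                    best = (start + offset, end + offset, copies)
            else:
                copies = 0
    return best


segment = "TTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSY"
run = longest_periodic_run(segment, offset=107)
print(run)
```

For this segment the result can be checked against the second row of the table (109-159, 17 copies, 0 errors).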


B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core:  5; !-core: 6)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core:  6; !-core: 8)

Location	Period	Element		Copies	Core	Errors
 228- 245	 3	*00       	 6	 6  	/0/2/0/


--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location (Quartile) Spacing     Rank       P-value   Interpretation

  47- 165  (1.)     Y( 118)Y     2 of  14   0.9994   small  2. maximal spacing
  53-1269  (2.)     W(1216)W     1 of   7   0.0002   large  1. maximal spacing
  95-1259  (2.)     C(1164)C     1 of  19   0.0000   large  1. maximal spacing
 167-1215  (2.)     Y(1048)Y     1 of  14   0.0000   large  1. maximal spacing
 170- 228  (1.)     K(  58)K     2 of  59   0.9982   small  2. maximal spacing
 267-1110  (2.)     H( 843)H     1 of  10   0.0043   large maximal spacing
 286- 352  (1.)     K(  66)K     1 of  59   0.9976   small  1. maximal spacing
 310- 312  (1.)     R(   2)R    72 of  72   0.0006   large minimal spacing
1168-1170  (4.)     R(   2)R    70 of  72   0.0006     matching minimum
1205-1248  (4.)     P(  43)P     2 of 279   0.0003   large  2. maximal spacing
1213-1250  (4.)     G(  37)G     2 of 391   0.0000   large  2. maximal spacing
1225-1227  (4.)     R(   2)R    71 of  72   0.0006     matching minimum
1299-1370  (4.)     C(  71)C     2 of  19   1.0000   small  2. maximal spacing
1325-1422  (4.)     W(  97)W     2 of   7   0.9996   small  2. maximal spacing
1342-1382  (4.)     G(  40)G     1 of 391   0.0013   large  1. maximal spacing
1345-1438  (4.)     P(  93)P     1 of 279   0.0000   large  1. maximal spacing



~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/Pfam
Sequence file:            tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  tem38_gi|1418928|emb|CAA98968.1|  prepro-alpha1(I) collagen [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model        Description                                Score    E-value  N 
--------     -----------                                -----    ------- ---
Collagen     Collagen triple helix repeat (20 copies)   970.8   3.3e-288  18
COLFI        Fibrillar collagen C-terminal domain       565.2     2e-220   1
vwc          von Willebrand factor type C domain         89.7    5.8e-23   1
fibrinogen_C Fibrinogen beta and gamma chains, C-term    -0.3         50   1
DUF41        Domain of unknown function DUF41           -71.4         30   1

Parsed for domains:
Model        Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------     ------- ----- -----    ----- -----      -----  -------
vwc            1/1      40    95 ..     1    84 []    89.7  5.8e-23
Collagen       1/18    107   165 ..     1    60 []    26.8  0.00013
Collagen       2/18    177   235 ..     1    60 []    51.4    2e-11
Collagen       3/18    236   295 ..     1    60 []    77.7  2.4e-19
Collagen       4/18    296   355 ..     1    60 []    66.9  4.3e-16
Collagen       5/18    356   415 ..     1    60 []    63.6  4.2e-15
Collagen       6/18    416   475 ..     1    60 []    63.1  5.9e-15
Collagen       7/18    476   535 ..     1    60 []    65.9  8.5e-16
Collagen       8/18    536   595 ..     1    60 []    66.6  5.3e-16
Collagen       9/18    596   655 ..     1    60 []    64.1    3e-15
Collagen      10/18    656   715 ..     1    60 []    62.6  8.4e-15
Collagen      11/18    716   775 ..     1    60 []    72.2  1.1e-17
Collagen      12/18    779   838 ..     1    60 []    70.3  3.9e-17
Collagen      13/18    839   898 ..     1    60 []    62.4  9.4e-15
Collagen      14/18    899   958 ..     1    60 []    61.2  2.3e-14
Collagen      15/18    959  1018 ..     1    60 []    64.6  2.1e-15
Collagen      16/18   1020  1078 ..     1    60 []    55.4  1.2e-12
Collagen      17/18   1079  1138 ..     1    60 []    75.9  8.5e-19
Collagen      18/18   1139  1198 ..     1    60 []    35.6  1.1e-06
fibrinogen_C   1/1    1271  1295 ..    18    43 ..    -0.3       50
DUF41          1/1       4  1308 ..     1   247 []   -71.4       30
COLFI          1/1    1245  1463 ..     1   226 []   565.2   2e-220
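
Note on the domain table: the "Parsed for domains" rows are whitespace-separated and can be read with a small sketch like the one below. The field names in `Domain` and the function `parse_domain_line` are assumptions made for this example, not part of HMMER; the 1e-3 cutoff is likewise an arbitrary illustration.

```python
from collections import namedtuple

Domain = namedtuple(
    "Domain",
    "model index total seq_from seq_to hmm_from hmm_to score evalue",
)


def parse_domain_line(line):
    """Parse one 'Parsed for domains' row; return None for headers and rules."""
    f = line.split()
    if len(f) != 10 or "/" not in f[1]:
        return None
    idx, total = f[1].split("/")
    return Domain(f[0], int(idx), int(total), int(f[2]), int(f[3]),
                  int(f[5]), int(f[6]), float(f[8]), float(f[9]))


# Four rows copied from the table above.
sample = """\
vwc            1/1      40    95 ..     1    84 []    89.7  5.8e-23
Collagen       1/18    107   165 ..     1    60 []    26.8  0.00013
Collagen      18/18   1139  1198 ..     1    60 []    35.6  1.1e-06
COLFI          1/1    1245  1463 ..     1   226 []   565.2   2e-220
"""
domains = [d for d in map(parse_domain_line, sample.splitlines()) if d]
significant = [d for d in domains if d.evalue < 1e-3]
for d in significant:
    print(d.model, d.seq_from, d.seq_to, d.evalue)
```

With the same cutoff applied to the full table, the fibrinogen_C and DUF41 rows (E = 50 and E = 30) would be dropped.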

Alignments of top-scoring domains:
vwc: domain 1 of 1, from 40 to 95: score 89.7, E = 5.8e-23
                   *->CvqnGvvYengetWkpdsqPnGvdkCtyiCtCddiedavrlggkvlC
                      CvqnG +Y+++++Wkp++       C+ iC+Cd+        gkvlC
  tem38_gi|1    40    CVQNGLRYHDRDVWKPEP-------CR-ICVCDN--------GKVLC 70   

                   dkitCppelLpsldCpnprrvdalvippGECCpewvC<-*
                   d+++C+++     +Cp +      + p+GECCp  vC   
  tem38_gi|1    71 DDVICDET----KNCPGA------EVPEGECCP--VC    95   

Collagen: domain 1 of 18, from 107 to 165: score 26.8, E = 0.00013
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                         G  Gp+G++Gp+Gp+Gp+Gp+G  G pG pG pGpPGppGppGp
  tem38_gi|1   107    -TTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGP 152  

                   pGppGapGapGpp<-*
                   pG  G+  +       
  tem38_gi|1   153 PGLGGNFAPQLSY    165  

Collagen: domain 2 of 18, from 177 to 235: score 51.4, E = 2e-11
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                        pGp+Gp Gp+G pGppG+pGp+G++GppG pGepG+ Gp Gp Gp
  tem38_gi|1   177    -VPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGP 222  

                   pGppGapGapGpp<-*
                   pGppG+ G+ G++   
  tem38_gi|1   223 PGPPGKNGDDGEA    235  

Collagen: domain 3 of 18, from 236 to 295: score 77.7, E = 2.4e-19
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG+pG++GppGp G++G pG aG pG++G++G++G +G++G +Gp
  tem38_gi|1   236    GKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGP 282  

                   pGppGapGapGpp<-*
                   +Gp+G+pG+pG++   
  tem38_gi|1   283 AGPKGEPGSPGEN    295  

Collagen: domain 4 of 18, from 296 to 355: score 66.9, E = 4.3e-16
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG++Gp+G+pG++G+pG+pGpaGa+G+ G+ G++GpPGp Gp+Gp
  tem38_gi|1   296    GAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGP 342  

                   pGppGapGapGpp<-*
                   pG pGa Ga+G++   
  tem38_gi|1   343 PGFPGAVGAKGEA    355  

Collagen: domain 5 of 18, from 356 to 415: score 63.6, E = 4.2e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp Gp+G+ Gp+G +G+pGppGpaGa+Gp+G+pG++G+PG++G++G+
  tem38_gi|1   356    GPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGA 402  

                   pGppGapGapGpp<-*
                   pG +GapG pG++   
  tem38_gi|1   403 PGIAGAPGFPGAR    415  

Collagen: domain 6 of 18, from 416 to 475: score 63.1, E = 5.9e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp Gp+Gp GppGp+G++G+pG++G++G+ G++GepGp G +GppGp
  tem38_gi|1   416    GPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGP 462  

                   pGppGapGapGpp<-*
                   +G++G+ Ga G+p   
  tem38_gi|1   463 AGEEGKRGARGEP    475  

Collagen: domain 7 of 18, from 476 to 535: score 65.9, E = 8.5e-16
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp G+pGppG++G pG +G+pG++G +Gp+Gp+Ge+G+PGp+Gp G+
  tem38_gi|1   476    GPTGLPGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGS 522  

                   pGppGapGapGpp<-*
                   pG++G+pG++G p   
  tem38_gi|1   523 PGEAGRPGEAGLP    535  

Collagen: domain 8 of 18, from 536 to 595: score 66.6, E = 5.3e-16
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G++G+ G+pG pGp+G+ GppGpaG  G pGppG+pG+ G++G++G+
  tem38_gi|1   536    GAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGF 582  

                   pGppGapGapGpp<-*
                   pGp+Ga+G+pG++   
  tem38_gi|1   583 PGPKGAAGEPGKA    595  

Collagen: domain 9 of 18, from 596 to 655: score 64.1, E = 3e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G++G pGppG+ Gp+G+ G++G++G+pGp+Gp+Ge+G++Gp+G pG+
  tem38_gi|1   596    GERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGF 642  

                   pGppGapGapGpp<-*
                   +G pG++G+pG++   
  tem38_gi|1   643 QGLPGPAGPPGEA    655  

Collagen: domain 10 of 18, from 656 to 715: score 62.6, E = 8.4e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG++G pG+ G+pGp+G+ G++G+pG++G +G+pGp Gp+G++G+
  tem38_gi|1   656    GKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGA 702  

                   pGppGapGapGpp<-*
                   pG++Ga+G++G+p   
  tem38_gi|1   703 PGNDGAKGDAGAP    715  

Collagen: domain 11 of 18, from 716 to 775: score 72.2, E = 1.1e-17
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG++G+pG++G+pG++G++G +G++G++G++G++G++G+pG++G 
  tem38_gi|1   716    GAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGV 762  

                   pGppGapGapGpp<-*
                   +G +G++G+pGp+   
  tem38_gi|1   763 RGLTGPIGPPGPA    775  

Collagen: domain 12 of 18, from 779 to 838: score 70.3, E = 3.9e-17
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G +G+ Gp Gp+Gp+G++G+pG++G+pGppGp+G++GpPG++G+pG+
  tem38_gi|1   779    GDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGA 825  

                   pGppGapGapGpp<-*
                   +G+pG +Ga+G +   
  tem38_gi|1   826 KGEPGDAGAKGDA    838  

Collagen: domain 13 of 18, from 839 to 898: score 62.4, E = 9.4e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      GppGp+Gp+GppGp G+ G+pG++Ga+G++GppG+ G+PG++G+ Gp
  tem38_gi|1   839    GPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGP 885  

                   pGppGapGapGpp<-*
                   pGp G++G+pGpp   
  tem38_gi|1   886 PGPSGNAGPPGPP    898  

Collagen: domain 14 of 18, from 899 to 958: score 61.2, E = 2.3e-14
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp+G+ G++Gp+G++Gp+G pG+ G+pGppGp+Ge+G+PG++Gp+G+
  tem38_gi|1   899    GPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGA 945  

                   pGppGapGapGpp<-*
                   pG pG+ G +G++   
  tem38_gi|1   946 PGTPGPQGIAGQR    958  

Collagen: domain 15 of 18, from 959 to 1018: score 64.6, E = 2.1e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G  G+pG +G++G pG pGp G++G++Gp G++Ge+GpPGp GppG 
  tem38_gi|1   959    GVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGL 1005 

                   pGppGapGapGpp<-*
                   +GppG++G +G+p   
  tem38_gi|1  1006 AGPPGESGREGAP    1018 

Collagen: domain 16 of 18, from 1020 to 1078: score 55.4, E = 1.2e-12
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                       + G+pG+ G pG++G++G++GpaG+pG pG+pG+pGp Gp+G+ G 
  tem38_gi|1  1020    -AEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGD 1065 

                   pGppGapGapGpp<-*
                   +G++G++G++Gp+   
  tem38_gi|1  1066 RGETGPAGPAGPV    1078 

Collagen: domain 17 of 18, from 1079 to 1138: score 75.9, E = 8.5e-19
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp G +Gp+Gp+Gp+G++G++G++G +G +G++G++G +GppGppG+
  tem38_gi|1  1079    GPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGS 1125 

                   pGppGapGapGpp<-*
                   pG++G++Ga Gp+   
  tem38_gi|1  1126 PGEQGPSGASGPA    1138 

Collagen: domain 18 of 18, from 1139 to 1198: score 35.6, E = 1.1e-06
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp+GppG++G+pG +G  G pGp G+pGp+G  G++Gp GppGppGp
  tem38_gi|1  1139    GPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGP 1185 

                   pGppGapGapGpp<-*
                   pGppG+p a       
  tem38_gi|1  1186 PGPPGPPSAGFDF    1198 

fibrinogen_C: domain 1 of 1, from 1271 to 1295: score -0.3, E = 50
                   *->SPPGlYtIqPd.gakeqpllVYCDmet<-*
                      S  G Y I P++g +  +++V+C met   
  tem38_gi|1  1271    S--GEYWIDPNqGCNLDAIKVFCNMET    1295 

DUF41: domain 1 of 1, from 4 to 1308: score -71.4, E = 30
                   *->lteeQLlstFsNvkhliGslevqnTnfkslsFLanLesIecg.....
                      +++   l+ +         l    T + + +  +   ++e+++++ +
  tem38_gi|1     4    FVD---LRLL---------LLLAATALLTHG--QEEGQVEGQdedip 36   

                   ..................................................
                   + +  +++ + ++++  ++++ +    ++++   ++   +++++ ++ + 
  tem38_gi|1    37 pitcvqnglryhdrdvwkpepcricvcdngkvlcddvicdetkncpgaev 86   

                   ..................................................
                   ++++  +  ++++++++++++++ +++++++++++++++ +++++++ ++
  tem38_gi|1    87 pegeccpvcpdgsesptdqettgvegpkgdtgprgprgpagppgrdgipg 136  

                   ..................................................
                   +++ ++++++++++++++ +++  ++ + + +++++++ + +++ +++++
  tem38_gi|1   137 qpglpgppgppgppgppglggnfapqlsygydekstggisvpgpmgpsgp 186  

                   ..................................................
                   ++ +++++ +++++ +++++++++++ +++ +++++++++++++++++ +
  tem38_gi|1   187 rglpgppgapgpqgfqgppgepgepgasgpmgprgppgppgkngddgeag 236  

                   ..................................................
                   +++++++++++++++ ++ +++ + ++ +++++ ++ ++ +++ ++ +++
  tem38_gi|1   237 kpgrpgergppgpqgarglpgtaglpgmkghrgfsgldgakgdagpagpk 286  

                   ..................................................
                   ++++++++++ +++ ++++ ++++++++ +++ + +++++ ++  +++++
  tem38_gi|1   287 gepgspgengapgqmgprglpgergrpgapgpagargndgatgaagppgp 336  

                   ..................................................
                   +++ ++++ ++  + +++ +++++++++++++ +++++++++ +  ++ +
  tem38_gi|1   337 tgpagppgfpgavgakgeagpqgprgsegpqgvrgepgppgpagaagpag 386  

                   ..................................................
                   +++ +++++ ++ ++ ++  + ++ ++ ++++++++++++++++++++++
  tem38_gi|1   387 npgadgqpgakgangapgiagapgfpgargpsgpqgpggppgpkgnsgep 436  

                   ..................................................
                   + ++++++++ ++++++ + ++++++ +++++++ ++++++++ ++++++
  tem38_gi|1   437 gapgskgdtgakgepgpvgvqgppgpageegkrgargepgptglpgppge 486  

                   ..................................................
                   ++++++++ ++ ++  +++++ ++++++++ ++++++++ +++++ + ++
  tem38_gi|1   487 rggpgsrgfpgadgvagpkgpagergspgpagpkgspgeagrpgeaglpg 536  

                   ..................................................
                    ++ ++++++++++++++++++ +++++++++++++ +++ +  + ++++
  tem38_gi|1   537 akgltgspgspgpdgktgppgpagqdgrpgppgppgargqagvmgfpgpk 586  

                   ..................................................
                   +  +++++ ++++ +++++  ++ +++++ + ++++++ ++ ++++++++
  tem38_gi|1   587 gaagepgkagergvpgppgavgpagkdgeagaqgppgpagpagergeqgp 636  

                   ..................................................
                    ++++ ++ +++ +++++ +++++++ +++ + +++++ +++++ +++++
  tem38_gi|1   637 agspgfqglpgpagppgeagkpgeqgvpgdlgapgpsgargergfpgerg 686  

                   ..................................................
                    ++++++ ++++ ++ +++++ +++ + ++ +++++ ++ ++ +++++  
  tem38_gi|1   687 vqgppgpagprgangapgndgakgdagapgapgsqgapglqgmpgergaa 736  

                   ..................................................
                   + +++++++++ ++++ ++++++++ ++ +++ +++++ + +++++++++
  tem38_gi|1   737 glpgpkgdrgdagpkgadgspgkdgvrgltgpigppgpagapgdkgesgp 786  

                   ..................................................
                   +++ ++++ ++ ++++++++++++ +  ++++ +++++ ++++++ + ++
  tem38_gi|1   787 sgpagptgargapgdrgepgppgpagfagppgadgqpgakgepgdagakg 836  

                   ..................................................
                   + +++++ ++ +++++ ++ + ++ ++ +++ ++++ ++ ++  ++ +++
  tem38_gi|1   837 dagppgpagpagppgpignvgapgakgargsagppgatgfpgaagrvgpp 886  

                   ..................................................
                   +++++ ++++++++ ++++++++++++++ +++++ ++++++++ +++++
  tem38_gi|1   887 gpsgnagppgppgpagkeggkgprgetgpagrpgevgppgppgpagekgs 936  

                   ..................................................
                   ++ +++ + ++++++++  ++++  + ++++++++ ++ +++++++++++
  tem38_gi|1   937 pgadgpagapgtpgpqgiagqrgvvglpgqrgergfpglpgpsgepgkqg 986  

                   ..................................................
                   +++ +++++++++ ++++  ++++++++++ +  +++++++++++ ++++
  tem38_gi|1   987 psgasgergppgpmgppglagppgesgregapaaegspgrdgspgakgdr 1036 

                   ..................................................
                   +++++ ++++ ++ ++ +++ ++ +++++++++++ ++ ++ ++ + +++
  tem38_gi|1  1037 getgpagppgapgapgapgpvgpagksgdrgetgpagpagpvgpvgargp 1086 

                   ..................................................
                    +++++++++++++++++++ +++++ ++ +++++++++++++++++ ++
  tem38_gi|1  1087 agpqgprgdkgetgeqgdrgikghrgfsglqgppgppgspgeqgpsgasg 1136 

                   ..................................................
                   + ++++++++ + +++++ ++ +++ +++++++++++ ++ +++++++++
  tem38_gi|1  1137 pagprgppgsagapgkdglnglpgpigppgprgrtgdagpvgppgppgpp 1186 

                   ..................................irk.rnkdrvrkildn
                   +++++++ + + +  +++++++ +++++  + ++    r +d  +  +  
  tem38_gi|1  1187 gppgppsagfdfsflpqppqekahdggryyraddANVvRDRDLEVDTT-- 1234 

                   ihdnpfswidnqnmlelgllnlTnmtrlgLpilsnldlnkLnlpnlknis
                                                                lk++s
  tem38_gi|1  1235 ---------------------------------------------LKSLS 1239 

                   npnstgekiivnfenlhpdFClTteEllnfflnsnvsienleakyCepks
                   ++          +en      +++ E+            +++a  C +  
  tem38_gi|1  1240 QQ----------IEN------IRSPEGS----------RKNPARTCRDL- 1262 

                   rifflikktdngivyklCnfkslsssvnLdngCtiIfGdLvIgpgdEeyV
                                  k+C++   s             G ++I+p+     
  tem38_gi|1  1263 ---------------KMCHSDWKS-------------GEYWIDPNQG--- 1281 

                   skLknveviFGsLiIqNTnLtnidFLenLkyIasLedsvs<-*
                    +L+  +v       +  n ++         ++  + sv+   
  tem38_gi|1  1282 CNLDAIKV-------F-CNMETGE-----TCVYPTQPSVA    1308 

COLFI: domain 1 of 1, from 1245 to 1463: score 565.2, E = 2e-220
                   *->lksPeGksrknPARtCkDLfLchpefksGeYWiDPNqGCikDAikVf
                      ++sPeG srknPARtC+DL++ch+++ksGeYWiDPNqGC++DAikVf
  tem38_gi|1  1245    IRSPEG-SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVF 1290 

                   CnkrfetGvgeTCisptpksvpkRiksWykgks.kdkKhvWFgetmegGf
                   Cn  +etG  eTC++pt+ sv++  k+Wy +k++kdk+hvWFge+m++Gf
  tem38_gi|1  1291 CN--METG--ETCVYPTQPSVAQ--KNWYISKNpKDKRHVWFGESMTDGF 1334 

                   kfsYiddelnpeisnvQlTFLRLLSteAsQNiTYhCKNSvAYmDeatGNl
                   +f+Y++++++p+++++QlTFLRL+SteAsQNiTYhCKNSvAYmD++tGNl
  tem38_gi|1  1335 QFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNL 1384 

                   kkAlilmgSnDvElsadgnskFtYtvlGeDGCssrtgewgKTViEyeTkK
                   kkAl+l+gSn++E++a+gns+FtY+v+ +DGC+s+tg+wgKTViEy+T+K
  tem38_gi|1  1385 KKALLLKGSNEIEIRAEGNSRFTYSVT-VDGCTSHTGAWGKTVIEYKTTK 1433 

                   ttRLPIvDiApsDiGgedQeFGveiGPVCF<-*
                    +RLPI+D+Ap+D+G +dQeFG+++GPVCF   
  tem38_gi|1  1434 SSRLPIIDVAPLDVGAPDQEFGFDVGPVCF    1463 

//

Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/pfam/PfamFrag
Sequence file:            tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  tem38_gi|1418928|emb|CAA98968.1|  prepro-alpha1(I) collagen [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model        Description                                Score    E-value  N 
--------     -----------                                -----    ------- ---
Collagen     Collagen triple helix repeat (20 copies)   946.7   5.9e-281  18
COLFI        Fibrillar collagen C-terminal domain       565.2     2e-220   1
fibrinogen_C Fibrinogen beta and gamma chains, C-term    -0.3         50   1
CBIA         Cobyrinic acid a,c-diamide synthase         -0.7         93   1
LBP_BPI_CETP LBP / BPI / CETP family                     -0.7         57   1

Parsed for domains:
Model        Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------     ------- ----- -----    ----- -----      -----  -------
LBP_BPI_CETP   1/1       7    29 ..     1    23 [.    -0.7       57
Collagen       1/18    109   158 ..     1    50 [.    27.3    5e-06
CBIA           1/1     174   189 ..     1    16 [.    -0.7       93
Collagen       2/18    177   235 ..     1    60 []    50.4  2.3e-12
Collagen       3/18    236   295 ..     1    60 []    75.7  2.5e-19
Collagen       4/18    296   355 ..     1    60 []    64.9  2.4e-16
Collagen       5/18    356   415 ..     1    60 []    61.6  1.9e-15
Collagen       6/18    416   475 ..     1    60 []    61.1  2.6e-15
Collagen       7/18    476   535 ..     1    60 []    63.9  4.4e-16
Collagen       8/18    536   595 ..     1    60 []    64.6  2.9e-16
Collagen       9/18    596   655 ..     1    60 []    62.1  1.4e-15
Collagen      10/18    656   715 ..     1    60 []    60.6  3.6e-15
Collagen      11/18    716   775 ..     1    60 []    70.2  8.4e-18
Collagen      12/18    779   838 ..     1    60 []    68.4  2.7e-17
Collagen      13/18    839   898 ..     1    60 []    60.5    4e-15
Collagen      14/18    899   958 ..     1    60 []    59.2  8.8e-15
Collagen      15/18    959  1018 ..     1    60 []    62.7  9.9e-16
Collagen      16/18   1020  1078 ..     1    60 []    54.4  1.8e-13
Collagen      17/18   1079  1138 ..     1    60 []    73.9  8.1e-19
Collagen      18/18   1139  1192 ..     1    54 [.    40.6  1.2e-09
fibrinogen_C   1/1    1271  1295 ..    18    43 ..    -0.3       50
COLFI          1/1    1245  1463 ..     1   226 []   565.2   2e-220
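
Note: the rows of the "Parsed for domains" table above can be read programmatically. The following is a minimal sketch, assuming the whitespace-separated ten-column layout shown in this report; the function name parse_domain_line and the 1e-3 E-value cutoff are illustrative choices, not part of hmmpfam.

```python
def parse_domain_line(line):
    """Parse one row of an hmmpfam 'Parsed for domains' table into a dict."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError("unexpected row layout: %r" % line)
    (model, domain, seq_f, seq_t, _seq_flag,
     hmm_f, hmm_t, _hmm_flag, score, evalue) = fields
    index, total = domain.split("/")  # e.g. "1/18" -> domain 1 of 18
    return {
        "model": model,
        "index": int(index),
        "total": int(total),
        "seq_from": int(seq_f),
        "seq_to": int(seq_t),
        "hmm_from": int(hmm_f),
        "hmm_to": int(hmm_t),
        "score": float(score),
        "evalue": float(evalue),
    }


# Three rows copied from the table above.
rows = [
    "LBP_BPI_CETP   1/1       7    29 ..     1    23 [.    -0.7       57",
    "Collagen       1/18    109   158 ..     1    50 [.    27.3    5e-06",
    "COLFI          1/1    1245  1463 ..     1   226 []   565.2   2e-220",
]
domains = [parse_domain_line(r) for r in rows]

# Hypothetical cutoff: keep only domains with E-value below 1e-3.
significant = [d["model"] for d in domains if d["evalue"] < 1e-3]
print(significant)
```

With such a cutoff the LBP_BPI_CETP, CBIA and fibrinogen_C rows (E-values of 50 or more) would be dropped, leaving only the Collagen and COLFI domains.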

Alignments of top-scoring domains:
LBP_BPI_CETP: domain 1 of 1, from 7 to 29: score -0.7, E = 57
                   *->alllllvlislavalrtnPgivv<-*
                      ++llll+++  ++++++ +g v+   
  tem38_gi|1     7    LRLLLLLAATALLTHGQEEGQVE    29   

Collagen: domain 1 of 18, from 109 to 158: score 27.3, E = 5e-06
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G  Gp G  Gp+Gp+Gp+GppG +G pG pG pG+pGpPGppGppG 
  tem38_gi|1   109    GVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGL 155  

                   pGp<-*
                    G+   
  tem38_gi|1   156 GGN    158  

CBIA: domain 1 of 1, from 174 to 189: score -0.7, E = 93
                   *->almiaGtsSgaGKttl<-*
                      ++ ++G++  +G+++l   
  tem38_gi|1   174    GISVPGPMGPSGPRGL    189  

Collagen: domain 2 of 18, from 177 to 235: score 50.4, E = 2.3e-12
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                        pGp+Gp Gp+G pGppG+pGp+G++GppG pGepG+ Gp Gp Gp
  tem38_gi|1   177    -VPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGP 222  

                   pGppGapGapGpp<-*
                   pGppG+ G+ G++   
  tem38_gi|1   223 PGPPGKNGDDGEA    235  

Collagen: domain 3 of 18, from 236 to 295: score 75.7, E = 2.5e-19
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG+pG++GppGp G++G pG aG pG++G++G++G +G++G +Gp
  tem38_gi|1   236    GKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGP 282  

                   pGppGapGapGpp<-*
                   +Gp+G+pG+pG++   
  tem38_gi|1   283 AGPKGEPGSPGEN    295  

Collagen: domain 4 of 18, from 296 to 355: score 64.9, E = 2.4e-16
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG++Gp+G+pG++G+pG+pGpaGa+G+ G+ G++GpPGp Gp+Gp
  tem38_gi|1   296    GAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGP 342  

                   pGppGapGapGpp<-*
                   pG pGa Ga+G++   
  tem38_gi|1   343 PGFPGAVGAKGEA    355  

Collagen: domain 5 of 18, from 356 to 415: score 61.6, E = 1.9e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp Gp+G+ Gp+G +G+pGppGpaGa+Gp+G+pG++G+PG++G++G+
  tem38_gi|1   356    GPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGA 402  

                   pGppGapGapGpp<-*
                   pG +GapG pG++   
  tem38_gi|1   403 PGIAGAPGFPGAR    415  

Collagen: domain 6 of 18, from 416 to 475: score 61.1, E = 2.6e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp Gp+Gp GppGp+G++G+pG++G++G+ G++GepGp G +GppGp
  tem38_gi|1   416    GPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGP 462  

                   pGppGapGapGpp<-*
                   +G++G+ Ga G+p   
  tem38_gi|1   463 AGEEGKRGARGEP    475  

Collagen: domain 7 of 18, from 476 to 535: score 63.9, E = 4.4e-16
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp G+pGppG++G pG +G+pG++G +Gp+Gp+Ge+G+PGp+Gp G+
  tem38_gi|1   476    GPTGLPGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGS 522  

                   pGppGapGapGpp<-*
                   pG++G+pG++G p   
  tem38_gi|1   523 PGEAGRPGEAGLP    535  

Collagen: domain 8 of 18, from 536 to 595: score 64.6, E = 2.9e-16
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G++G+ G+pG pGp+G+ GppGpaG  G pGppG+pG+ G++G++G+
  tem38_gi|1   536    GAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGF 582  

                   pGppGapGapGpp<-*
                   pGp+Ga+G+pG++   
  tem38_gi|1   583 PGPKGAAGEPGKA    595  

Collagen: domain 9 of 18, from 596 to 655: score 62.1, E = 1.4e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G++G pGppG+ Gp+G+ G++G++G+pGp+Gp+Ge+G++Gp+G pG+
  tem38_gi|1   596    GERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGF 642  

                   pGppGapGapGpp<-*
                   +G pG++G+pG++   
  tem38_gi|1   643 QGLPGPAGPPGEA    655  

Collagen: domain 10 of 18, from 656 to 715: score 60.6, E = 3.6e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG++G pG+ G+pGp+G+ G++G+pG++G +G+pGp Gp+G++G+
  tem38_gi|1   656    GKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGA 702  

                   pGppGapGapGpp<-*
                   pG++Ga+G++G+p   
  tem38_gi|1   703 PGNDGAKGDAGAP    715  

Collagen: domain 11 of 18, from 716 to 775: score 70.2, E = 8.4e-18
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G+pG++G+pG++G+pG++G++G +G++G++G++G++G++G+pG++G 
  tem38_gi|1   716    GAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGV 762  

                   pGppGapGapGpp<-*
                   +G +G++G+pGp+   
  tem38_gi|1   763 RGLTGPIGPPGPA    775  

Collagen: domain 12 of 18, from 779 to 838: score 68.4, E = 2.7e-17
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G +G+ Gp Gp+Gp+G++G+pG++G+pGppGp+G++GpPG++G+pG+
  tem38_gi|1   779    GDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGA 825  

                   pGppGapGapGpp<-*
                   +G+pG +Ga+G +   
  tem38_gi|1   826 KGEPGDAGAKGDA    838  

Collagen: domain 13 of 18, from 839 to 898: score 60.5, E = 4e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      GppGp+Gp+GppGp G+ G+pG++Ga+G++GppG+ G+PG++G+ Gp
  tem38_gi|1   839    GPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGP 885  

                   pGppGapGapGpp<-*
                   pGp G++G+pGpp   
  tem38_gi|1   886 PGPSGNAGPPGPP    898  

Collagen: domain 14 of 18, from 899 to 958: score 59.2, E = 8.8e-15
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp+G+ G++Gp+G++Gp+G pG+ G+pGppGp+Ge+G+PG++Gp+G+
  tem38_gi|1   899    GPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGA 945  

                   pGppGapGapGpp<-*
                   pG pG+ G +G++   
  tem38_gi|1   946 PGTPGPQGIAGQR    958  

Collagen: domain 15 of 18, from 959 to 1018: score 62.7, E = 9.9e-16
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      G  G+pG +G++G pG pGp G++G++Gp G++Ge+GpPGp GppG 
  tem38_gi|1   959    GVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGL 1005 

                   pGppGapGapGpp<-*
                   +GppG++G +G+p   
  tem38_gi|1  1006 AGPPGESGREGAP    1018 

Collagen: domain 16 of 18, from 1020 to 1078: score 54.4, E = 1.8e-13
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                       + G+pG+ G pG++G++G++GpaG+pG pG+pG+pGp Gp+G+ G 
  tem38_gi|1  1020    -AEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGD 1065 

                   pGppGapGapGpp<-*
                   +G++G++G++Gp+   
  tem38_gi|1  1066 RGETGPAGPAGPV    1078 

Collagen: domain 17 of 18, from 1079 to 1138: score 73.9, E = 8.1e-19
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp G +Gp+Gp+Gp+G++G++G++G +G +G++G++G +GppGppG+
  tem38_gi|1  1079    GPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGS 1125 

                   pGppGapGapGpp<-*
                   pG++G++Ga Gp+   
  tem38_gi|1  1126 PGEQGPSGASGPA    1138 

Collagen: domain 18 of 18, from 1139 to 1192: score 40.6, E = 1.2e-09
                   *->GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGp
                      Gp+GppG++G+pG +G  G pGp G+pGp+G  G++Gp GppGppGp
  tem38_gi|1  1139    GPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGP 1185 

                   pGppGap<-*
                   pGppG+p   
  tem38_gi|1  1186 PGPPGPP    1192 

fibrinogen_C: domain 1 of 1, from 1271 to 1295: score -0.3, E = 50
                   *->SPPGlYtIqPd.gakeqpllVYCDmet<-*
                      S  G Y I P++g +  +++V+C met   
  tem38_gi|1  1271    S--GEYWIDPNqGCNLDAIKVFCNMET    1295 

COLFI: domain 1 of 1, from 1245 to 1463: score 565.2, E = 2e-220
                   *->lksPeGksrknPARtCkDLfLchpefksGeYWiDPNqGCikDAikVf
                      ++sPeG srknPARtC+DL++ch+++ksGeYWiDPNqGC++DAikVf
  tem38_gi|1  1245    IRSPEG-SRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVF 1290 

                   CnkrfetGvgeTCisptpksvpkRiksWykgks.kdkKhvWFgetmegGf
                   Cn  +etG  eTC++pt+ sv++  k+Wy +k++kdk+hvWFge+m++Gf
  tem38_gi|1  1291 CN--METG--ETCVYPTQPSVAQ--KNWYISKNpKDKRHVWFGESMTDGF 1334 

                   kfsYiddelnpeisnvQlTFLRLLSteAsQNiTYhCKNSvAYmDeatGNl
                   +f+Y++++++p+++++QlTFLRL+SteAsQNiTYhCKNSvAYmD++tGNl
  tem38_gi|1  1335 QFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNL 1384 

                   kkAlilmgSnDvElsadgnskFtYtvlGeDGCssrtgewgKTViEyeTkK
                   kkAl+l+gSn++E++a+gns+FtY+v+ +DGC+s+tg+wgKTViEy+T+K
  tem38_gi|1  1385 KKALLLKGSNEIEIRAEGNSRFTYSVT-VDGCTSHTGAWGKTVIEYKTTK 1433 

                   ttRLPIvDiApsDiGgedQeFGveiGPVCF<-*
                    +RLPI+D+Ap+D+G +dQeFG+++GPVCF   
  tem38_gi|1  1434 SSRLPIIDVAPLDVGAPDQEFGFDVGPVCF    1463 

//

Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file:            tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  tem38_gi|1418928|emb|CAA98968.1|  prepro-alpha1(I) collagen [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Prosite
---------------------------------------------------------
|          ppsearch (c) 1994 EMBL Data Library          |
|       based on MacPattern (c) 1990-1994 R. Fuchs      |
---------------------------------------------------------

PROSITE pattern search started: Tue Oct 31 18:41:24 2000

Sequence file: tem38

----------------------------------------
Sequence tem38_gi|1418928|emb|CAA98968.1| (1464 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
 1365: NITY
Total matches: 1

Matching pattern PS00005 PKC_PHOSPHO_SITE:
 1012: SGR
 1234: TLK
 1251: SRK
 1258: TCR
 1431: TTK
 1434: SSR
Total matches: 6

Matching pattern PS00006 CK2_PHOSPHO_SITE:
    3: SFVD
  101: SPTD
  103: TDQE
  108: TGVE
  271: SGLD
  291: SPGE
  441: SKGD
  522: SPGE
 1012: SGRE
 1125: SPGE
 1258: TCRD
 1329: SMTD
 1425: TVIE
Total matches: 13

Matching pattern PS00007 TYR_PHOSPHO_SITE:
 1208: KAHDGGRY
Total matches: 1

Matching pattern PS00008 MYRISTYL:
   22: GQEEGQ
   26: GQVEGQ
  154: GLGGNF
  254: GLPGTA
  272: GLDGAK
  320: GARGND
  323: GNDGAT
  326: GATGAA
  347: GAVGAK
  386: GNPGAD
  392: GQPGAK
  395: GAKGAN
  437: GAPGSK
  488: GGPGSR
  533: GLPGAK
  701: GAPGND
  704: GNDGAK
  716: GAPGSQ
  821: GQPGAK
  857: GAPGAK
  860: GAKGAR
  863: GARGSA
  935: GSPGAD
 1016: GAPAAE
 1028: GSPGAK
 1339: GGQGSD
 1342: GSDPAD
Total matches: 27

Matching pattern PS00009 AMIDATION:
  466: EGKR
Total matches: 1

Matching pattern PS00016 RGD:
  745: RGD
 1093: RGD
Total matches: 2

Matching pattern PS01208 VWFC:
   58: CRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVC
Total matches: 1

Total no of hits in this sequence: 52

========================================

1314 pattern(s) searched in 1 sequence(s), 1464 residues.
Total no of hits in all sequences: 52.
Search time: 00:00 min
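
Note: the short PROSITE hits listed above can be reproduced by hand. The following is a minimal sketch, not the ppsearch implementation, that translates a PROSITE-style pattern into a Python regular expression and reports 1-based match positions. The helper names prosite_to_regex and scan are hypothetical, and only the basic syntax elements (x, [..], {..}, repeat counts) are handled. The patterns used are the published definitions of PS00001 (N-{P}-[ST]-{P}) and PS00005 ([ST]-x-[RK]).

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern such as 'N-{P}-[ST]-{P}' into a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split off an optional repeat count such as x(2) or x(2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        if repeat:
            core += "{" + repeat + "}"
        parts.append(core)
    return "".join(parts)


def scan(pattern, sequence):
    """Return (1-based position, matched text) for all, possibly overlapping, hits."""
    # A lookahead lets overlapping matches be reported, as in the lists above.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Short fragments copied from the query sequence (residues 1360-1373 and 1430-1438).
glyco_fragment = "TEASQNITYHCKNS"
pkc_fragment = "KTTKSSRLP"

print(scan("N-{P}-[ST]-{P}", glyco_fragment))
print(scan("[ST]-x-[RK]", pkc_fragment))
```

Positions returned by scan are relative to the fragment; adding the fragment offset (1359 and 1429 here) gives the coordinates reported above (1365 for NITY, 1431 and 1434 for TTK and SSR).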

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with motif search against own library
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P 
argv[2]=-m  /data/patterns/own/motif.fa
argv[4]=-seq  tem38

     ***** bioMotif : Version V41a DB, 1999 Nov 11 *****
          SeqTyp=2 : PROTEIN  search; 


>APC D-Box is the MOTIF name

>STATISTICS Total   : 0 solutions in 0 sequences, 0 units;  out of 1 sequences, 1464 units


~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with HMM-search search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm.lib
Sequence file:            tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  tem38_gi|1418928|emb|CAA98968.1|  prepro-alpha1(I) collagen [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm-f.lib
Sequence file:            tem38
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query:  tem38_gi|1418928|emb|CAA98968.1|  prepro-alpha1(I) collagen [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N 
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), 
"IMPALA: Matching a Protein Sequence Against a Collection of 
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen
[Homo sapiens]
         (1464 letters)

Searching..................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

14-3-3 14-3-3 protein alpha Helical domain                         27  0.13
FYVE Zinc Finger domain involved in PtdIns(3)P binding             27  0.18
UBA Ubiquitin pathway associated domain                            27  0.23
MATH The Meprin associated TRAF homology domain                    26  0.50
RASGAP RAS-type GTPase GTP hydrolysis activating protein           25  0.61
MIZFIN  MIZ type Cysteine zinc DNA binding domain                  25  0.82
RASGEF RAS-type GTPase GDP exchange factor                         24  1.2
SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chrom...    23  2.1
BRIGHT BRIGHT domain (Alpha helical DNA binding domain)            23  2.4
DHHC Novel zinc finger domain with DHHC signature                  22  4.3
PHD PHD zinc finger(A cysteine rich DNA binding domain)            22  4.9
INSL Insulinase like Metallo protease domain                       21  8.9

>14-3-3 14-3-3 protein alpha Helical domain 
          Length = 270

 Score = 27.3 bits (60), Expect = 0.13
 Identities = 5/27 (18%), Positives = 5/27 (18%)

Query: 820 DGQPGAKGEPGDAGAKGDAGPPGPAGP 846
                     G         P G A P
Sbjct: 240 SAAAAGGNTEGAQENAPSNAPEGEAEP 266


>FYVE Zinc Finger domain involved in PtdIns(3)P binding 
          Length = 99

 Score = 27.0 bits (59), Expect = 0.18
 Identities = 14/41 (34%), Positives = 19/41 (46%), Gaps = 10/41 (24%)

Query: 59 RICVCDN-GKVLCDDVICDETKNCPGAEVPE---GECCPVC 95
          R+   D  GK++C D+      NC   E PE    +CC  C
Sbjct: 2  RLFSADEHGKLMCWDM------NCKRVETPEWKTSDCCQKC 36


>UBA Ubiquitin pathway associated domain 
          Length = 255

 Score = 26.6 bits (58), Expect = 0.23
 Identities = 25/82 (30%), Positives = 30/82 (36%), Gaps = 5/82 (6%)

Query: 813 FAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPG 872
             G P    QP    EP    A     P   A  A  P    ++ A  A+G   S+G  G
Sbjct: 57  LMGIPENLRQP----EPQQQTAAAAEQPSTAATTAEQPAED-DLFAQAAQGGNASSGALG 111

Query: 873 ATGFPGAAGRVGPPGPSGNAGP 894
            TG    A + GPPG  G    
Sbjct: 112 TTGGATDAAQGGPPGSIGLTVE 133


 Score = 22.7 bits (48), Expect = 3.5
 Identities = 22/85 (25%), Positives = 31/85 (35%), Gaps = 8/85 (9%)

Query: 972  FPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPAAEGSPGRDGSPG 1031
              G+P    +P  Q  + A+ E+        P  A    E   E    A+ + G + S G
Sbjct: 57   LMGIPENLRQPEPQQQTAAAAEQ--------PSTAATTAEQPAEDDLFAQAAQGGNASSG 108

Query: 1032 AKGDRGETGPAGPPGAPGAPGAPGP 1056
            A G  G    A   G PG+ G    
Sbjct: 109  ALGTTGGATDAAQGGPPGSIGLTVE 133


 Score = 22.3 bits (47), Expect = 4.2
 Identities = 22/74 (29%), Positives = 26/74 (34%), Gaps = 3/74 (4%)

Query: 1116 LQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGL--NGLPGPIGPPGPRGRTGD 1173
            L G P     P  Q  + A+    P     +A  P +D L      G     G  G TG 
Sbjct: 57   LMGIPENLRQPEPQQQTAAAAEQ-PSTAATTAEQPAEDDLFAQAAQGGNASSGALGTTGG 115

Query: 1174 AGPVGPPGPPGPPG 1187
            A      GPPG  G
Sbjct: 116  ATDAAQGGPPGSIG 129


>MATH The Meprin associated TRAF homology domain 
          Length = 209

 Score = 25.6 bits (56), Expect = 0.50
 Identities = 7/18 (38%), Positives = 8/18 (43%)

Query: 925 PGPPGPAGEKGSPGADGP 942
           P PP PA     P A+  
Sbjct: 5   PSPPPPAEMSSGPVAESW 22


 Score = 21.8 bits (46), Expect = 6.8
 Identities = 7/16 (43%), Positives = 9/16 (55%)

Query: 805 PGPPGPAGFAGPPGAD 820
           P PP PA  +  P A+
Sbjct: 5   PSPPPPAEMSSGPVAE 20


 Score = 21.4 bits (45), Expect = 9.7
 Identities = 5/14 (35%), Positives = 6/14 (42%)

Query: 177 VPGPMGPSGPRGLP 190
           VP P  P+     P
Sbjct: 4   VPSPPPPAEMSSGP 17


>RASGAP RAS-type GTPase GTP hydrolysis activating protein  
          Length = 292

 Score = 25.1 bits (54), Expect = 0.61
 Identities = 16/61 (26%), Positives = 29/61 (47%), Gaps = 11/61 (18%)

Query: 1220 DANVVRDRDLEVDTTLKSLSQQIENI-----RSPEGSRKNPARTCRDLKMCHSDWKSGEY 1274
            D + ++DR   VDT L +L   +E +     +S +   K   +   DL+ C     +GE+
Sbjct: 137  DPSKIKDRS-AVDTNLHNLQDYVERVFEAITKSADRCPKVLCQIFHDLREC-----AGEH 190

Query: 1275 W 1275
            +
Sbjct: 191  F 191


>MIZFIN  MIZ type Cysteine zinc DNA binding domain 
          Length = 172

 Score = 24.6 bits (53), Expect = 0.82
 Identities = 18/90 (20%), Positives = 30/90 (33%), Gaps = 17/90 (18%)

Query: 58  CRICVCDNGKVLCDDVICD--------ETKNCPGAEV-PEGECCPVCP--DGSESPTDQE 106
           C +C   + K   + +I D        +  +    +   +G  CP+ P  +  +  T Q 
Sbjct: 50  CPVC---DKKAAYESLILDGLFMEILNDCSDVDEIKFQEDGSWCPMRPKKEAMKV-TSQP 105

Query: 107 TTGVEGPKGDTGP--RGPRGPAGPPGRDGI 134
            T VE     + P        A     D I
Sbjct: 106 CTKVESSSVFSKPCSVTVASDASKKKIDVI 135


>RASGEF RAS-type GTPase GDP exchange factor 
          Length = 196

 Score = 24.4 bits (53), Expect = 1.2
 Identities = 20/105 (19%), Positives = 31/105 (29%), Gaps = 19/105 (18%)

Query: 1344 DPADVAIQLTFLRLMSTEASQNITY-HCKNSVAYMDQQTGNLKKALLLKGSNEIEIRAEG 1402
            D   VA Q+T   L+  E    I +    +    M  +   +   L L   NE      G
Sbjct: 5    DSLSVAQQMT---LIEKEILGEIDWKDLLDLK--MKHEGPQVISWLQLLVRNE---TLSG 56

Query: 1403 NSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKSSRLPI----IDVA 1443
                          + T  W  + I    +   +  +    I VA
Sbjct: 57   IDLAISR------FNLTVDWIISEILLTKSSKMKRNVIQRFIHVA 95


>SET Su(var)3-9, Enhancer of Zeste, trithorax domain (A chromatin associated domain) 
          Length = 219

 Score = 23.4 bits (50), Expect = 2.1
 Identities = 9/60 (15%), Positives = 16/60 (26%), Gaps = 9/60 (15%)

Query: 30 GQDEDIP-PITCVQNGLRYHDRD-----VWKPEPCRICVCDNGKVLCDDVICDETKNCPG 83
          G D  IP P+  V+  L++             +     +C   +               G
Sbjct: 17 GIDSAIPYPVRRVEQLLQFSFLPELQFQNAAVKQRIQRLCYREEKRLA---VSSLAKWLG 73


>BRIGHT BRIGHT domain (Alpha helical DNA binding domain) 
          Length = 172

 Score = 23.4 bits (50), Expect = 2.4
 Identities = 7/29 (24%), Positives = 7/29 (24%)

Query: 413 GARGPSGPQGPGGPPGPKGNSGEPGAPGS 441
           G R   G         P      P  PG 
Sbjct: 132 GRRSSYGQYEAMHNQMPMTPISRPSLPGG 160


 Score = 22.3 bits (47), Expect = 4.4
 Identities = 6/28 (21%), Positives = 7/28 (24%)

Query: 881 GRVGPPGPSGNAGPPGPPGPAGKEGGKG 908
           GR    G         P  P  +    G
Sbjct: 132 GRRSSYGQYEAMHNQMPMTPISRPSLPG 159


>DHHC Novel zinc finger domain with DHHC signature 
          Length = 217

 Score = 22.4 bits (47), Expect = 4.3
 Identities = 9/34 (26%), Positives = 11/34 (31%), Gaps = 2/34 (5%)

Query: 52  VWKPEPCRIC-VCDNGKVLCDDVICDETKNCPGA 84
           V      + C  C+   V   D  C    NC G 
Sbjct: 141 VDVSARSKHCSACNK-CVCGFDHHCKWLNNCVGE 173


>PHD PHD zinc finger(A cysteine rich DNA binding domain) 
          Length = 54

 Score = 22.3 bits (47), Expect = 4.9
 Identities = 12/53 (22%), Positives = 16/53 (29%), Gaps = 17/53 (32%)

Query: 58 CRICVCDNGK-----VLCDDVICDET--KNCPG-------AEVPEGE-CCPVC 95
          C +C           V CD   C+    + C          + P GE  C  C
Sbjct: 3  CSVCQRLQSPPKNRIVFCDG--CNTPFHQLCHEPYISDELLDSPNGEWFCDDC 53


>INSL Insulinase like Metallo protease domain 
          Length = 433

 Score = 21.4 bits (45), Expect = 8.9
 Identities = 5/47 (10%), Positives = 13/47 (27%), Gaps = 1/47 (2%)

Query: 1214 RYYRADDANVVRDRDLEVDTTLKSLSQQIENIR-SPEGSRKNPARTC 1259
             +Y+  +  VV    +      + + +        P    + P    
Sbjct: 196  SFYQPRNMAVVIVGKVNPKEVEEEVMKTFGKEEGRPVPKVQIPTEPE 242


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 12 
Number of calls to ALIGN: 17 
Length of query: 1464 
Total length of test sequences: 20182  
Effective length of test sequences: 16637.0
Effective search space size: 23806017.2
Initial X dropoff for ALIGN: 25.0 bits

Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]


Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, 
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), 
"IMPALA: Matching a Protein Sequence Against a Collection of 
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= tem38_gi|1418928|emb|CAA98968.1| prepro-alpha1(I) collagen
[Homo sapiens]
         (1464 letters)

Searching.................................................done
Results from profile search


                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|230410 [1..153] beta-Trefoil                                    32  0.069
gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyl...    28  1.1
gi|1170529 [121..268] beta-Trefoil                                 28  1.4
gi|544107 [14..282] Protein kinases (PK), catalytic core           27  1.5
gi|1825699 [8..257] Ribonuclease H-like motif                      26  4.1
gi|223347 [1..236] Prealbumin-like                                 26  5.4
gi|442904 [1..106] Ferredoxin-like                                 25  9.0

>gi|230410 [1..153] beta-Trefoil 
          Length = 153

 Score = 31.9 bits (72), Expect = 0.069
 Identities = 11/38 (28%), Positives = 18/38 (46%), Gaps = 6/38 (15%)

Query: 1306 SVAQKNWYISKNPKDKRHVWFG-----ESMTDGFQFEY 1338
            S    NWYIS +  +   V+ G     + +TD F  ++
Sbjct: 114  SAQFPNWYISTSQAENMPVFLGGTKGGQDITD-FTMQF 150


>gi|155099 [19..420] S-adenosyl-L-methionine-dependent methyltransferases 
          Length = 402

 Score = 27.9 bits (61), Expect = 1.1
 Identities = 3/64 (4%), Positives = 12/64 (18%), Gaps = 2/64 (3%)

Query: 15 ATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEP--CRICVCDNGKVLCDD 72
            A         +  G + D   +             +         + + +    +  +
Sbjct: 34 LRAFREAHGTGYRFVGVEIDPHALDLPPWAEGVVADFLLWEPGEAFDLILGNPPYGIVGE 93

Query: 73 VICD 76
              
Sbjct: 94 ASKY 97


>gi|1170529 [121..268] beta-Trefoil 
          Length = 148

 Score = 27.6 bits (61), Expect = 1.4
 Identities = 13/62 (20%), Positives = 20/62 (31%), Gaps = 9/62 (14%)

Query: 1277 DPNQGCNLDAIK--VFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFG-ESMTDG 1333
             P      +      +    T          SVA  N +I+    +   +  G  S+TD 
Sbjct: 90   IPKTTTGGETNSLSSWETRGTK-----NYFISVAHPNLFIATKHDNWVCLAKGLPSITD- 143

Query: 1334 FQ 1335
            FQ
Sbjct: 144  FQ 145


>gi|544107 [14..282] Protein kinases (PK), catalytic core 
          Length = 269

 Score = 27.4 bits (59), Expect = 1.5
 Identities = 5/61 (8%), Positives = 9/61 (14%), Gaps = 6/61 (9%)

Query: 1259 CRDLKMCHS------DWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNW 1312
               L+          D       +D   G    +       +                  
Sbjct: 101  SSALEYLEKHGILHRDIHPNNILLDSMNGPAYLSDFSIAWSKQHPGEEVQELIPQIGTGH 160

Query: 1313 Y 1313
            Y
Sbjct: 161  Y 161


>gi|1825699 [8..257] Ribonuclease H-like motif 
          Length = 250

 Score = 26.3 bits (57), Expect = 4.1
 Identities = 6/47 (12%), Positives = 10/47 (20%), Gaps = 4/47 (8%)

Query: 1376 YMDQQTGNLKKALLLKGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAW 1422
               Q    +KK    +          G      S+ +      T   
Sbjct: 52   LFLQFLRVIKK--AYETLPPNAHVDVGLCTQRNSIVLWN--KRTLKE 94


>gi|223347 [1..236] Prealbumin-like 
          Length = 236

 Score = 25.7 bits (56), Expect = 5.4
 Identities = 6/38 (15%), Positives = 8/38 (20%)

Query: 357 PQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQP 394
           PQ    + GP       G                 G+ 
Sbjct: 40  PQSISETTGPNFSHLGFGAHDHDLLLNFNNGGLPIGER 77


>gi|442904 [1..106] Ferredoxin-like 
          Length = 106

 Score = 24.9 bits (53), Expect = 9.0
 Identities = 10/67 (14%), Positives = 16/67 (22%), Gaps = 11/67 (16%)

Query: 39 TCVQNGLRYHDRDVWKPEP----CRICV--CDNG-KVLCDDVICDETKNCPGAEVPEGEC 91
           C  +        +         C +C   C        D+V  D  +          E 
Sbjct: 19 VCPVDCFYEGPNFLVIHPDECIDCALCEPECPAQAIFSEDEVPEDMQEFIQLN----AEL 74

Query: 92 CPVCPDG 98
            V P+ 
Sbjct: 75 AEVWPNI 81


Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 7 
Number of calls to ALIGN: 7 
Length of query: 1464 
Total length of test sequences: 256703  
Effective length of test sequences: 210706.0
Effective search space size: 300338576.9
Initial X dropoff for ALIGN: 25.0 bits
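
Note on the statistics above: the bit scores, Expect values, and effective
search space sizes in these reports can be cross-checked with the standard
Karlin-Altschul relation E ~ N * 2^(-S), where S is the bit score and N is
the effective search space size. The sketch below is only a sanity check and
is not IMPALA's own code; the function name expected_hits is made up for
illustration. Small differences from the printed Expect values are expected,
because the report rounds the bit scores and the program may apply
corrections that this relation does not include.

```python
def expected_hits(bit_score: float, search_space: float) -> float:
    """Approximate E-value from a bit score via E ~ N * 2**(-S).

    Hypothetical helper for cross-checking the report; it is not part of
    IMPALA or BLAST.
    """
    return search_space * 2.0 ** (-bit_score)


# Values copied from the two reports above: (label, bit score, search space, reported E)
hits = [
    ("RASGAP",                  25.1, 23806017.2,  0.61),
    ("gi|230410 beta-Trefoil",  31.9, 300338576.9, 0.069),
    ("gi|442904 Ferredoxin",    24.9, 300338576.9, 9.0),
]

for label, bits, space, reported in hits:
    estimate = expected_hits(bits, space)
    print(f"{label:26s} bits={bits:5.1f}  estimated E={estimate:8.3g}  reported E={reported}")
```

The estimates come out at the same order of magnitude as the printed Expect
values, which confirms that every hit listed here falls in the range usually
attributed to chance (E well above 0.01).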