analysis of sequence from tem37
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
>tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKSVPMVPPGIKY
LYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLE
DLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVSLLTL
YLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYY
LEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
sec.str. with predator
> tem37_gi|1708878|sp|P51884|LUM_HUMAN
. . . . .
1 MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSA 50
___HHHHHHHHH______________________________________
. . . . .
51 MYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHN 100
_HHHHHHH_________EEEEEE_______________HHHHHHHHHH__
. . . . .
101 VLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKIT 150
________HHHHHHHHHHHHH______EEE_____HHHHHHHHH_HHHHH
. . . . .
151 KLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLP 200
_______HHHHHHHHHHH___HHHHHHHHHH_________HHHHHHH___
. . . . .
201 SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNS 250
____HHHHHHH___________HHHHHHHHHHHHHHH_____________
. . . . .
251 FNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP 300
__EEEE___________________HHHHHHHHH______HHHHHH____
. . .
301 LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN 338
_________________________HHHHHHHH_____
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method : 1
alpha-contents : 37.8 %
beta-contents : 20.3 %
coil-contents : 41.9 %
class : mixed
method : 2
alpha-contents : 33.4 %
beta-contents : 6.6 %
coil-contents : 60.0 %
class : alpha
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-26.13 0.00 0.00 0.00 -4.00 -4.00 -12.00 0.00 0.00 -3.79 -2.19 0.00 -12.00 -8.00 0.00 0.00 -72.12
-4.97 0.00 0.00 0.00 0.00 0.00 -24.00 0.00 0.00 -3.75 -1.81 0.00 -12.00 -8.00 0.00 0.00 -54.53
ID: tem37_gi|1708878|sp|P51884|LUM_HUMAN AC: xxx Len: 280 1:I 243 Sc: -54.53 Pv: 2.839984e-01 NO_GPI_SITE
GPI: learning from protozoa
-27.61 0.00 0.00 0.00 -4.00 0.00 -24.00 0.00 0.00 -3.36 -7.11 0.00 -12.00 -8.00 0.00 0.00 -86.09
-18.57 0.00 0.00 -0.40 0.00 0.00 -4.00 -0.60 0.00 -4.71 -10.81 -12.00 -12.00 -8.00 -12.00 0.00 -83.08
ID: tem37_gi|1708878|sp|P51884|LUM_HUMAN AC: xxx Len: 280 1:I 248 Sc: -83.08 Pv: 5.654476e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem37_gi|17 1.000 19 Y 0.916 19 Y 0.946 13 Y 0.883 Y
# SignalP gram- predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem37_gi|17 0.529 39 Y 0.307 20 N 0.989 3 Y 0.746 Y
# SignalP gram+ predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem37_gi|17 0.364 179 N 0.239 20 N 0.945 4 N 0.802 Y
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
low complexity regions: SEG 12 2.2 2.5
>tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
1-271 MSLSAFTLFLALIGGTSGQYYDYDFPPSIY
GQSSPNCAPECNCPESYPSAMYCDELKLKS
VPMVPPGIKYLYLRNNQIDHIDEKAFENVT
DLQWLILDHNVLENSKIKGRVFSKLKQLKK
LHINHNNLTESVGPLPKSLEDLQLTHNKIT
KLGSFEGLVNLTFIHLQHNRLKEDAVSAAF
KGLKSLEYLDLSFNQIARLPSGLPVSLLTL
YLDNNKISNIPDEYFKRFNALQYLRLSHNE
LADSGIPGNSFNVSSLVELDLSYNKLKNIP
T
vnenlenyylevnqle 272-287
288-338 KFDIKSFCKILGPLSYSKIKHLRLDGNRIS
ETSLPPDMYECLRVANEVTLN
low complexity regions: SEG 25 3.0 3.3
>tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
1-338 MSLSAFTLFLALIGGTSGQYYDYDFPPSIY
GQSSPNCAPECNCPESYPSAMYCDELKLKS
VPMVPPGIKYLYLRNNQIDHIDEKAFENVT
DLQWLILDHNVLENSKIKGRVFSKLKQLKK
LHINHNNLTESVGPLPKSLEDLQLTHNKIT
KLGSFEGLVNLTFIHLQHNRLKEDAVSAAF
KGLKSLEYLDLSFNQIARLPSGLPVSLLTL
YLDNNKISNIPDEYFKRFNALQYLRLSHNE
LADSGIPGNSFNVSSLVELDLSYNKLKNIP
TVNENLENYYLEVNQLEKFDIKSFCKILGP
LSYSKIKHLRLDGNRISETSLPPDMYECLR
VANEVTLN
low complexity regions: SEG 45 3.4 3.75
>tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
1-338 MSLSAFTLFLALIGGTSGQYYDYDFPPSIY
GQSSPNCAPECNCPESYPSAMYCDELKLKS
VPMVPPGIKYLYLRNNQIDHIDEKAFENVT
DLQWLILDHNVLENSKIKGRVFSKLKQLKK
LHINHNNLTESVGPLPKSLEDLQLTHNKIT
KLGSFEGLVNLTFIHLQHNRLKEDAVSAAF
KGLKSLEYLDLSFNQIARLPSGLPVSLLTL
YLDNNKISNIPDEYFKRFNALQYLRLSHNE
LADSGIPGNSFNVSSLVELDLSYNKLKNIP
TVNENLENYYLEVNQLEKFDIKSFCKILGP
LSYSKIKHLRLDGNRISETSLPPDMYECLR
VANEVTLN
low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKS
VPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKK
LHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAF
KGLKSLEYLDLSFNQIARLPSGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNE
LADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP
LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN
1 - 338 MSLSAFTLFL ALIGGTSGQY YDYDFPPSIY GQSSPNCAPE CNCPESYPSA MYCDELKLKS
VPMVPPGIKY LYLRNNQIDH IDEKAFENVT DLQWLILDHN VLENSKIKGR VFSKLKQLKK
LHINHNNLTE SVGPLPKSLE DLQLTHNKIT KLGSFEGLVN LTFIHLQHNR LKEDAVSAAF
KGLKSLEYLD LSFNQIARLP SGLPVSLLTL YLDNNKISNI PDEYFKRFNA LQYLRLSHNE
LADSGIPGNS FNVSSLVELD LSYNKLKNIP TVNENLENYY LEVNQLEKFD IKSFCKILGP
LSYSKIKHLR LDGNRISETS LPPDMYECLR VANEVTLN
low complexity regions: DUST
>tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKS
VPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKK
LHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAF
KGLKSLEYLDLSFNQIARLPSGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNE
LADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP
LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
coiled coil prediction for tem37_gi|1708878|sp|P51884|LUM_HUMAN
sequence: 280 amino acids, 0 residue(s) in coiled coil state
. | . | . | . | . | . 60
MSLSAFTLFL ALIGGTSGQY YDYDFPPSIY GQSSPNCAPE CNCPESYPSA MYCDELKLKS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 120
VPMVPPGIKY LYLRNNQIDH IDEKAFENVT DLQWLILDHN VLENSKIKGR VFSKLKQLKK
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~1124444 * 14 M'95 -w local
. | . | . | . | . | . 180
LHINHNNLTE SVGPLPKSLE DLQLTHNKIT KLGSFEGLVN LTFIHLQHNR LKEDAVSAAF
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
4444444444 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . | . | . 240
KGLKSLEYLD LSFNQIARLP SGLPVSLLTL YLDNNKISNI PDEYFKRFNA LQYLRLSHNE
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local
. | . | . | . |
LADSGIPGNS FNVSSLVELD LSYNKLKNIP TVNENLENYY
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
---------- ---------- ---------- ----------
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
prediction of transmembrane regions with toppred2
***********************************
*TOPPREDM with eukaryotic function*
***********************************
tem37.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: tem37.___inter___
(1 sequences)
MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSA
MYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHN
VLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKIT
KLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLP
SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNS
FNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP
LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN
(p)rokaryotic or (e)ukaryotic: e
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 1 structures are to be tested
Candidate membrane-spanning segments:
Helix Begin End Score Certainity
1 1 21 1.745 Certain
----------------------------------------------------------------------
Structure 1
Transmembrane segments included in this structure:
Segment 1
Loop length 0 317
K+R profile 1.00
+
CYT-EXT prof -
0.51
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 1.00
Tm probability: 1.00
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): 0.0521
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: -0.51
-> Orientation: N-in
----------------------------------------------------------------------
"tem37" 338
1 21 #t 1.74479
************************************
*TOPPREDM with prokaryotic function*
************************************
tem37.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: tem37.___inter___
(1 sequences)
MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSA
MYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHN
VLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKIT
KLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLP
SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNS
FNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP
LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN
(p)rokaryotic or (e)ukaryotic: p
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 1 structures are to be tested
Candidate membrane-spanning segments:
Helix Begin End Score Certainity
1 1 21 1.745 Certain
----------------------------------------------------------------------
Structure 1
Transmembrane segments included in this structure:
Segment 1
Loop length 0 317
K+R profile 1.00
+
CYT-EXT prof -
0.51
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 1.00
Tm probability: 1.00
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): 0.0521
NEG: 0.0000
POS: 0.0000
-> Orientation: N-in
CYT-EXT difference: -0.51
-> Orientation: N-in
----------------------------------------------------------------------
"tem37" 338
1 21 #t 1.74479
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
NOW EXECUTING: /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem37.___saps___
SAPS. Version of April 11, 1996.
Date run: Tue Oct 31 18:26:47 2000
File: /people/maria/tem37.___saps___
ID tem37_gi|1708878|sp|P51884|LUM_HUMAN
DE LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
number of residues: 338; molecular weight: 38.4 kdal
1 MSLSAFTLFL ALIGGTSGQY YDYDFPPSIY GQSSPNCAPE CNCPESYPSA MYCDELKLKS
61 VPMVPPGIKY LYLRNNQIDH IDEKAFENVT DLQWLILDHN VLENSKIKGR VFSKLKQLKK
121 LHINHNNLTE SVGPLPKSLE DLQLTHNKIT KLGSFEGLVN LTFIHLQHNR LKEDAVSAAF
181 KGLKSLEYLD LSFNQIARLP SGLPVSLLTL YLDNNKISNI PDEYFKRFNA LQYLRLSHNE
241 LADSGIPGNS FNVSSLVELD LSYNKLKNIP TVNENLENYY LEVNQLEKFD IKSFCKILGP
301 LSYSKIKHLR LDGNRISETS LPPDMYECLR VANEVTLN
--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)
A- : 12( 3.6%); C : 6( 1.8%); D : 17( 5.0%); E : 21( 6.2%); F : 14( 4.1%)
G : 15( 4.4%); H : 9( 2.7%); I : 19( 5.6%); K : 25( 7.4%); L+ : 53(15.7%)
M : 4( 1.2%); N+ : 30( 8.9%); P : 19( 5.6%); Q : 10( 3.0%); R : 9( 2.7%)
S : 31( 9.2%); T : 11( 3.3%); V : 15( 4.4%); W : 1( 0.3%); Y : 17( 5.0%)
KR : 34 ( 10.1%); ED : 38 ( 11.2%); AGP : 46 ( 13.6%);
KRED : 72 ( 21.3%); KR-ED : -4 ( -1.2%); FIKMNY : 109 ( 32.2%);
LVIFM : 105 ( 31.1%); ST : 42 ( 12.4%).
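Note: the counts and percentages in the compositional table above can be re-derived directly from the sequence. The following is a minimal sketch, not the SAPS implementation; the names `SEQ` and `composition` are illustrative only, and the significance flags (`+`/`-` relative to swp23s) are not reproduced.

```python
from collections import Counter

# Lumican precursor sequence exactly as listed at the top of this report.
SEQ = (
    "MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKSVPMVPPGIKY"
    "LYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLE"
    "DLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVSLLTL"
    "YLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYY"
    "LEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN"
)


def composition(seq):
    """Return {residue: (count, percent)} for every residue type present."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: (c, 100.0 * c / n) for aa, c in sorted(counts.items())}


comp = composition(SEQ)
# Grouped totals corresponding to the KR and ED rows of the table.
kr = comp["K"][0] + comp["R"][0]
ed = comp["E"][0] + comp["D"][0]

for aa, (count, pct) in comp.items():
    print(f"{aa} : {count:3d} ({pct:4.1f}%)")
print(f"KR : {kr}  ED : {ed}  KR-ED : {kr - ed}")
```
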
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS
1 0000000000 0000000000 0-0-000000 000000000- 0000-00000 000--0+0+0
61 00000000+0 000+0000-0 0--+00-000 -000000-00 00-00+0+0+ 000+0+00++
121 000000000- 000000+00- -000000+00 +0000-0000 000000000+ 0+--000000
181 +00+00-00- 0000000+00 0000000000 00-00+0000 0--00++000 0000+0000-
241 00-0000000 0000000-0- 0000+0+000 000-00-000 0-0000-+0- 0+000+0000
301 0000+0+00+ 0-00+00-00 000-00-00+ 000-0000
A. CHARGE CLUSTERS.
Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none
Negative charge clusters (cmin = 10/30 or 13/45 or 16/60): none
Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60): none
B. HIGH SCORING (UN)CHARGED SEGMENTS.
There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.
C. CHARGE RUNS AND PATTERNS.
pattern (+)| (-)| (*)| (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0 4 | 5 | 7 | 37 | 9 | 9 | 12 | 11 | 11 | 14 | 7 | 8 |
lmin1 6 | 6 | 8 | 45 | 11 | 11 | 15 | 13 | 14 | 18 | 8 | 10 |
lmin2 7 | 7 | 10 | 50 | 12 | 13 | 16 | 15 | 15 | 20 | 9 | 12 |
(Significance level: 0.010000; Minimal displayed length: 6)
(*00) 14(0,0,0); at 179- 192: AFKGLKSLEYLDLS
(3. quartile) 00+00+00-00-00
Run count statistics:
+ runs >= 3: 0
- runs >= 3: 0
* runs >= 4: 0
0 runs >= 25: 0
--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES
1. HIGH SCORING SEGMENTS.
There are no high scoring hydrophobic segments.
____________________________________
High scoring transmembrane segments:
5.00 (LVIF) 2.00 (AGM) 0.00 (BZX) -1.00 (YCW) -2.00 (ST)
-6.00 (P) -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)
Expected score/letter: -3.896
M_0.01= 57.11; M_0.05= 46.67; M_0.30= 34.25
1) From 1 to 15: length= 15, score=39.00
1 MSLSAFTLFL ALIGG
L: 4(26.7%); A: 2(13.3%); G: 2(13.3%); S: 2(13.3%);
F: 2(13.3%);
2. SPACINGS OF C.
H2N-36-C-3-C-1-C-9-C-241-C-32-C-10-COOH
2*. SPACINGS OF C and H. (additional deluxe function for ALEX)
H2N-36-C-3-C-1-C-9-C-26-H-18-H-22-H-2-H-20-H-18-H-2-H-69-H-56-C-12-H-19-C-10-COOH
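Note: the cysteine spacing string in item 2 can be reproduced by listing the 1-based positions of C and counting the residues between consecutive occurrences. This is a sketch under that reading of the SAPS notation; the function names `residue_spacings` and `format_spacings` are hypothetical and not part of SAPS.

```python
# Lumican precursor sequence exactly as listed at the top of this report.
SEQ = (
    "MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKSVPMVPPGIKY"
    "LYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLE"
    "DLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVSLLTL"
    "YLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYY"
    "LEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN"
)


def residue_spacings(seq, residue="C"):
    """Return (positions, gaps): 1-based positions of `residue` and the
    number of other residues before, between, and after them."""
    positions = [i for i, aa in enumerate(seq, start=1) if aa == residue]
    gaps = []
    prev = 0
    for p in positions:
        gaps.append(p - prev - 1)  # residues strictly between prev and p
        prev = p
    gaps.append(len(seq) - prev)   # residues after the last occurrence
    return positions, gaps


def format_spacings(gaps, residue="C"):
    """Render the gaps in the H2N-...-COOH style used above."""
    parts = ["H2N"]
    for g in gaps[:-1]:
        parts += [str(g), residue]
    parts += [str(gaps[-1]), "COOH"]
    return "-".join(parts)


positions, gaps = residue_spacings(SEQ, "C")
spacing_string = format_spacings(gaps, "C")
print(positions)
print(spacing_string)
```

The largest gap (241 residues between C53 and C295) is the one flagged as significant in the SPACING ANALYSIS section of the SAPS output.
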
--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.
A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 4
Aligned matching blocks:
[ 105- 108] SKIK
[ 304- 307] SKIK
______________________________
[ 137- 142] KSLE__DL
[ 184- 191] KSLEYLDL
______________________________
[ 185- 192] SLEYLDLS
[ 255- 262] SLVELDLS
B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length: 8
--------------------------------------------------------------------------------
MULTIPLETS.
A. AMINO ACID ALPHABET.
1. Total number of amino acid multiplets: 14 (Expected range: 8-- 36)
2. Histogram of spacings between consecutive amino acid multiplets:
(1-5) 2 (6-10) 4 (11-20) 2 (>=21) 7
3. Clusters of amino acid multiplets (cmin = 9/30 or 11/45 or 13/60): none
B. CHARGE ALPHABET.
1. Total number of charge multiplets: 7 (Expected range: 0-- 15)
2 +plets (f+: 10.1%), 5 -plets (f-: 11.2%)
Total number of charge altplets: 3 (Critical number: 18)
2. Histogram of spacings between consecutive charge multiplets:
(1-5) 1 (6-10) 0 (11-20) 1 (>=21) 6
--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.
A. AMINO ACID ALPHABET (core: 4; !-core: 5)
Location Period Element Copies Core Errors
121- 148 7 L...... 4 4 0
183- 214 8 L....... 4 4 0
219- 258 10 N......... 4 4 0
256- 295 10 L......... 4 4 0
B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6)
and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9)
Location Period Element Copies Core Errors
149- 166 3 i0. 6 6 /0/2/./
231- 335 5 i.0.. 18 8 /3/./7/././
--------------------------------------------------------------------------------
SPACING ANALYSIS.
Location (Quartile) Spacing Rank P-value Interpretation
0- 37 (1.) C( 37)C 2 of 7 0.9909 small 2. maximal spacing
19- 32 (1.) Q( 13)Q 11 of 11 0.0041 large minimal spacing
53- 295 (3.) C( 242)C 1 of 7 0.0033 large 1. maximal spacing
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/Pfam
Sequence file: tem37
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
LRR Leucine Rich Repeat 128.1 1.7e-34 10
LRRNT Leucine rich repeat N-terminal domain 42.4 9.9e-09 1
crp Bacterial regulatory proteins, crp famil 2.5 40 1
E1_N E1 Protein, N terminal domain -0.5 87 1
lyase_1 Lyase -1.1 78 1
DUF41 Domain of unknown function DUF41 -73.7 46 1
PI3Ka Phosphoinositide 3-kinase family, access -106.9 94 1
DNA_ligase_N NAD-dependent DNA ligase -256.9 70 1
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
LRRNT 1/1 36 66 .. 1 31 [] 42.4 9.9e-09
LRR 1/10 67 90 .. 1 23 [] 17.6 0.3
LRR 2/10 91 116 .. 1 23 [] 15.0 1.8
LRR 3/10 117 136 .. 1 23 [] 7.0 96
lyase_1 1/1 138 156 .. 459 485 .] -1.1 78
LRR 4/10 138 159 .. 1 23 [] 18.3 0.19
LRR 5/10 160 184 .. 1 23 [] 15.0 1.8
DUF41 1/1 81 204 .. 1 247 [] -73.7 46
LRR 6/10 185 204 .. 1 23 [] 16.2 0.78
LRR 7/10 206 229 .. 1 23 [] 19.3 0.089
crp 1/1 234 243 .. 1 11 [. 2.5 40
DNA_ligase_N 1/1 83 247 .. 1 327 [] -256.9 70
E1_N 1/1 235 253 .. 142 161 .] -0.5 87
LRR 8/10 230 254 .. 1 23 [] 11.3 22
PI3Ka 1/1 99 274 .. 1 215 [] -106.9 94
LRR 9/10 255 277 .. 1 23 [] 15.4 1.3
LRR 10/10 305 330 .. 1 23 [] 15.8 1
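Note: rows of the "Parsed for domains" table can be read into records for filtering. The sketch below assumes the whitespace-separated layout shown above (model, domain index, seq-f, seq-t, "..", hmm-f, hmm-t, bracket flags, score, E-value). The class `DomainHit`, the function `parse_domain_row`, and the E-value cutoff of 1.0 are illustrative assumptions, not HMMER settings.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    model: str
    index: int      # n in "n/N"
    total: int      # N in "n/N"
    seq_from: int
    seq_to: int
    score: float
    evalue: float


def parse_domain_row(line):
    """Parse one whitespace-separated row of the 'Parsed for domains' table."""
    f = line.split()
    idx, total = f[1].split("/")
    return DomainHit(
        model=f[0],
        index=int(idx),
        total=int(total),
        seq_from=int(f[2]),
        seq_to=int(f[3]),
        score=float(f[-2]),
        evalue=float(f[-1]),
    )


# A few rows copied from the table above.
ROWS = [
    "LRRNT 1/1 36 66 .. 1 31 [] 42.4 9.9e-09",
    "LRR 1/10 67 90 .. 1 23 [] 17.6 0.3",
    "LRR 3/10 117 136 .. 1 23 [] 7.0 96",
    "LRR 7/10 206 229 .. 1 23 [] 19.3 0.089",
]

hits = [parse_domain_row(r) for r in ROWS]
# Arbitrary illustrative cutoff: keep per-domain hits with E-value below 1.
kept = [h for h in hits if h.evalue < 1.0]
for h in kept:
    print(h.model, f"{h.index}/{h.total}", h.seq_from, h.seq_to, h.evalue)
```

Individual LRR repeats score weakly on their own; the family-level score in the first table (128.1, E = 1.7e-34) sums over all ten domains, so a per-domain cutoff like the one above should not be used to judge the family assignment.
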
Alignments of top-scoring domains:
LRRNT: domain 1 of 1, from 36 to 66: score 42.4, E = 9.9e-09
*->aCpreCtCsp.fglvVdCsgrgLtlevPrdlP<-*
C++eC+C+++++++++C++++L+ +vP++ P
tem37_gi|1 36 NCAPECNCPEsYPSAMYCDELKLK-SVPMVPP 66
LRR: domain 1 of 10, from 67 to 90: score 17.6, E = 0.3
*->nLeeLdLsnN.LtslppglfsnLp<-*
+++L+L+nN++ ++++++f+n +
tem37_gi|1 67 GIKYLYLRNNqIDHIDEKAFENVT 90
LRR: domain 2 of 10, from 91 to 116: score 15.0, E = 1.8
*->nLeeLdLsnN.Lt..slppglfsnLp<-*
L++L L++N L+++++ +fs+L+
tem37_gi|1 91 DLQWLILDHNvLEnsKIKGRVFSKLK 116
LRR: domain 3 of 10, from 117 to 136: score 7.0, E = 96
*->nLeeLdLsnN.LtslppglfsnLp<-*
+L++L++++N+Lt ++ Lp
tem37_gi|1 117 QLKKLHINHNnLT--ES--VGPLP 136
lyase_1: domain 1 of 1, from 138 to 156: score -1.1, E = 78
*->alelgqlteeefdsivsPvfefarSve<-*
+le++qlt+++++++ s e
tem37_gi|1 138 SLEDLQLTHNKITKLGSF--------E 156
LRR: domain 4 of 10, from 138 to 159: score 18.3, E = 0.19
*->nLeeLdLsnN.LtslppglfsnLp<-*
+Le L L +N++t+l + f++L
tem37_gi|1 138 SLEDLQLTHNkITKLGS--FEGLV 159
LRR: domain 5 of 10, from 160 to 184: score 15.0, E = 1.8
*->nLeeLdLsnN.Lt.slppglfsnLp<-*
nL+ ++L++N+L+++ + +f++L+
tem37_gi|1 160 NLTFIHLQHNrLKeDAVSAAFKGLK 184
DUF41: domain 1 of 1, from 81 to 204: score -73.7, E = 46
*->lteeQLlstFsNvkhliGslevqnTnfkslsFLanLesIecgirkrn
++e +F Nv+ l+ l + +++ + I+++
tem37_gi|1 81 IDE----KAFENVTDLQ-WLILDHNVLENSK-------IKGR----V 111
kdrvrkildnihdnpfswidnqnmlelgllnlTnmtrlgLpilsnldlnk
+ + ++ l++ h n+
tem37_gi|1 112 FSKLKQ-LKKLHINH----------------------------------- 125
LnlpnlknisnpnstgekiivnfenlhpdFClTteEllnfflnsnvsien
+nl+ ++ p + ++++d +lT+ + ++
tem37_gi|1 126 ---NNLTESVGPLP----------KSLEDLQLTHNKITKLGSF------- 155
leakyCepksrifflikktdngivyklCnfkslsssvnLdngCtiIfGdL
++ ++ f+ l+ ++ d
tem37_gi|1 156 -------------------EGL---VNLTFIHLQH--------NRLKEDA 175
vIgpgdEeyVskLknveviFGsLiIqNTnLtnidFLenLkyIasLedsvs
v + + Lk++e+ +d ++ +Ia+L + ++
tem37_gi|1 176 VSAAFKG-----LKSLEY--------------LD--LSFNQIARLPSGLP 204
<-*
tem37_gi|1 - -
LRR: domain 6 of 10, from 185 to 204: score 16.2, E = 0.78
*->nLeeLdLsnN.LtslppglfsnLp<-*
+Le+LdLs N++ lp+ +Lp
tem37_gi|1 185 SLEYLDLSFNqIARLPS----GLP 204
LRR: domain 7 of 10, from 206 to 229: score 19.3, E = 0.089
*->nLeeLdLsnN.LtslppglfsnLp<-*
+L +L+L+nN+++++p++ f+ +
tem37_gi|1 206 SLLTLYLDNNkISNIPDEYFKRFN 229
crp: domain 1 of 1, from 234 to 243: score 2.5, E = 40
*->lpmsLRqeIAd<-*
l++s ++e+Ad
tem37_gi|1 234 LRLS-HNELAD 243
DNA_ligase_N: domain 1 of 1, from 83 to 247: score -256.9, E = 70
*->eeaqqeieeLrelirkydyeYYvlDaPlVpDaeYDrLyrrLkaLEek
e+a + +++L+ li ++ vl ++ + ++ +L ++Lk+L
tem37_gi|1 83 EKAFENVTDLQWLILDHN----VLENSKIKGRVFSKL-KQLKKLHIN 124
fPELiTpDSPTQrVGGapllgdFkkvrHpaPMLSLDNAFsedeLrafieR
+ +L T+ VG + +++ +++L+ +
tem37_gi|1 125 HNNL------TESVG-PLPKS-------------------LEDLQLTHNK 148
CCmirrrlgnsekvayvVEPKIDGlAvsLtYedGvLvrAaTRGDGttGED
i++ lg+ e Gl v+Lt
tem37_gi|1 149 ---ITK-LGSFE-----------GL-VNLT-------------------- 162
VTqNVkTIraIPlklpgdnivrppPerlEvRGEVfmpkedFeaLNeeree
+++ +++L
tem37_gi|1 163 -------------FIHLQ----------------------HNRLK----- 172
egekpFANPRNAAAGSLRQLDPkiTAkRkLrffvYglglveglelgpdTq
+ + A +k gl +e l+l
tem37_gi|1 173 --------------------EDAVSAAFK------GLKSLEYLDLS---- 192
seaLkqLkkl..GFplVnphtrlck.....gideVldyyaewekkRdsLp
q+++l++G+p V+ +t ++++ ++i + +y++ + + L
tem37_gi|1 193 ---FNQIARLpsGLP-VSLLTLYLDnnkisNIPD--EYFKRFNA----LQ 232
yeIDGVVvKvnelplQreLGfTskaPRWAiAYKFpAe<-*
y +++++ l + s++P
tem37_gi|1 233 Y------LRLSHNELAD-----SGIP----------- 247
E1_N: domain 1 of 1, from 235 to 253: score -0.5, E = 87
*->RLFeelPEvpDSGy.GntevE<-*
RL++ E+ DSG++Gn+ ++
tem37_gi|1 235 RLSHN--ELADSGIpGNSFNV 253
LRR: domain 8 of 10, from 230 to 254: score 11.3, E = 22
*->nLeeLdLsnN.Lt..slppglfsnLp<-*
L++L+Ls+N+L ++++p ++f+ +
tem37_gi|1 230 ALQYLRLSHNeLAdsGIPGNSFN-VS 254
PI3Ka: domain 1 of 1, from 99 to 274: score -106.9, E = 94
*->dkdlkpnlsskerkrleaIlayD....PlsaLtaeekdLiWkfRhyy
+++l+ s+ + ++++++ ++ + + + Lt+ + L +++++
tem37_gi|1 99 HNVLEN--SKIKGRVFSKLKQLKklhiNHNNLTESVGPLPKSLEDLQ 143
ltsnPkALtLmCVGSPKl.....LlSVkWsdlsevaealsLldkWvWqap
lt n+ +t Kl++ ++L V+ + ++ l + a+
tem37_gi|1 144 LTHNK--IT-------KLgsfegL--VNLTFIHLQHNRLKEDAVS---AA 179
idpvdALELLdpkFadnheeVReYAVkcLesYasDdELlfYLLQLVQALK
+ ++LE+Ld +F + +L s +++ L +YL
tem37_gi|1 180 FKGLKSLEYLDLSFNQ---------IARLPSGLPVSLLTLYL-------- 212
YEnldepfhdSpLsrFLlkR..AlkNrsrlGHfFfWyLksEiYKDdldhd
+ + ++S +kR +Al+ rl+H + +++
tem37_gi|1 213 -D----NNKISNIPDEYFKRfnALQYL-RLSH----NELADS------GI 246
eevkserFgvllEsylrectgtsledlnk<-*
+ s ++l E+ l++ +++ +n+
tem37_gi|1 247 PGN-SFNVSSLVELDLSYNKLKNIPTVNE 274
LRR: domain 9 of 10, from 255 to 277: score 15.4, E = 1.3
*->nLeeLdLsnN.LtslppglfsnLp<-*
+L eLdLs+N+L+++p + +nL
tem37_gi|1 255 SLVELDLSYNkLKNIPT-VNENLE 277
LRR: domain 10 of 10, from 305 to 330: score 15.8, E = 1
*->nLeeLdLsnN.Lt..slppglfsnLp<-*
++++L+L++N+++++slpp+ + L+
tem37_gi|1 305 KIKHLRLDGNrISetSLPPDMYECLR 330
//
Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/PfamFrag
Sequence file: tem37
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
LRR Leucine Rich Repeat 110.9 2.1e-29 11
LRRNT Leucine rich repeat N-terminal domain 40.5 1.7e-09 1
crp Bacterial regulatory proteins, crp family 2.5 40 1
K-box K-box region 1.4 44 1
SNAP-25 SNAP-25 family 0.2 45 1
E1_N E1 Protein, N terminal domain -0.5 87 1
lyase_1 Lyase -1.1 78 1
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
LRRNT 1/1 36 66 .. 1 31 [] 40.5 1.7e-09
LRR 1/11 67 90 .. 1 23 [] 15.6 0.017
SNAP-25 1/1 75 91 .. 199 215 .] 0.2 45
LRR 2/11 91 116 .. 1 23 [] 13.0 0.091
LRR 3/11 117 130 .. 1 13 [. 6.4 7
lyase_1 1/1 138 156 .. 459 485 .] -1.1 78
LRR 4/11 138 159 .. 1 23 [] 16.4 0.01
LRR 5/11 160 184 .. 1 23 [] 13.1 0.09
LRR 6/11 185 204 .. 1 23 [] 14.5 0.036
LRR 7/11 206 229 .. 1 23 [] 17.4 0.0054
crp 1/1 234 243 .. 1 11 [. 2.5 40
LRR 8/11 230 252 .. 1 20 [. 11.0 0.35
E1_N 1/1 235 253 .. 142 161 .] -0.5 87
LRR 9/11 255 270 .. 1 15 [. 12.6 0.12
K-box 1/1 261 288 .. 1 29 [. 1.4 44
LRR 10/11 275 298 .. 1 23 [] 1.4 1.7e+02
LRR 11/11 305 330 .. 1 23 [] 13.9 0.053
Alignments of top-scoring domains:
LRRNT: domain 1 of 1, from 36 to 66: score 40.5, E = 1.7e-09
*->aCpreCtCsp.fglvVdCsgrgLtlevPrdlP<-*
C++eC+C+++++++++C++++L+ +vP++ P
tem37_gi|1 36 NCAPECNCPEsYPSAMYCDELKLK-SVPMVPP 66
LRR: domain 1 of 11, from 67 to 90: score 15.6, E = 0.017
*->nLeeLdLsnN.LtslppglfsnLp<-*
+++L+L+nN++ ++++++f+n +
tem37_gi|1 67 GIKYLYLRNNqIDHIDEKAFENVT 90
SNAP-25: domain 1 of 1, from 75 to 91: score 0.2, E = 45
*->nrqidRIeeKadsndar<-*
n qid I eKa +n +
tem37_gi|1 75 NNQIDHIDEKAFENVTD 91
LRR: domain 2 of 11, from 91 to 116: score 13.0, E = 0.091
*->nLeeLdLsnN.Lt..slppglfsnLp<-*
L++L L++N L+++++ +fs+L+
tem37_gi|1 91 DLQWLILDHNvLEnsKIKGRVFSKLK 116
LRR: domain 3 of 11, from 117 to 130: score 6.4, E = 7
*->nLeeLdLsnN.Lts<-*
+L++L++++N+Lt+
tem37_gi|1 117 QLKKLHINHNnLTE 130
lyase_1: domain 1 of 1, from 138 to 156: score -1.1, E = 78
*->alelgqlteeefdsivsPvfefarSve<-*
+le++qlt+++++++ s e
tem37_gi|1 138 SLEDLQLTHNKITKLGSF--------E 156
LRR: domain 4 of 11, from 138 to 159: score 16.4, E = 0.01
*->nLeeLdLsnN.LtslppglfsnLp<-*
+Le L L +N++t+l + f++L
tem37_gi|1 138 SLEDLQLTHNkITKLGS--FEGLV 159
LRR: domain 5 of 11, from 160 to 184: score 13.1, E = 0.09
*->nLeeLdLsnN.Lt.slppglfsnLp<-*
nL+ ++L++N+L+++ + +f++L+
tem37_gi|1 160 NLTFIHLQHNrLKeDAVSAAFKGLK 184
LRR: domain 6 of 11, from 185 to 204: score 14.5, E = 0.036
*->nLeeLdLsnN.LtslppglfsnLp<-*
+Le+LdLs N++ lp+ +Lp
tem37_gi|1 185 SLEYLDLSFNqIARLPS----GLP 204
LRR: domain 7 of 11, from 206 to 229: score 17.4, E = 0.0054
*->nLeeLdLsnN.LtslppglfsnLp<-*
+L +L+L+nN+++++p++ f+ +
tem37_gi|1 206 SLLTLYLDNNkISNIPDEYFKRFN 229
crp: domain 1 of 1, from 234 to 243: score 2.5, E = 40
*->lpmsLRqeIAd<-*
l++s ++e+Ad
tem37_gi|1 234 LRLS-HNELAD 243
LRR: domain 8 of 11, from 230 to 252: score 11.0, E = 0.35
*->nLeeLdLsnN.Lt..slppglfs<-*
L++L+Ls+N+L ++++p ++f+
tem37_gi|1 230 ALQYLRLSHNeLAdsGIPGNSFN 252
E1_N: domain 1 of 1, from 235 to 253: score -0.5, E = 87
*->RLFeelPEvpDSGy.GntevE<-*
RL++ E+ DSG++Gn+ ++
tem37_gi|1 235 RLSHN--ELADSGIpGNSFNV 253
LRR: domain 9 of 11, from 255 to 270: score 12.6, E = 0.12
*->nLeeLdLsnN.Ltslp<-*
+L eLdLs+N+L+++p
tem37_gi|1 255 SLVELDLSYNkLKNIP 270
K-box: domain 1 of 1, from 261 to 288: score 1.4, E = 44
*->dsyqkssgnsslwesnyqnwqqEaaKLka<-*
sy+k+ +++ ++n +n++ E++ L +
tem37_gi|1 261 LSYNKLKNIP-TVNENLENYYLEVNQLEK 288
LRR: domain 10 of 11, from 275 to 298: score 1.4, E = 1.7e+02
*->nLeeLdLsnN.LtslppglfsnLp<-*
nLe +L+ N+L++++ ++f++
tem37_gi|1 275 NLENYYLEVNqLEKFDIKSFCKIL 298
LRR: domain 11 of 11, from 305 to 330: score 13.9, E = 0.053
*->nLeeLdLsnN.Lt..slppglfsnLp<-*
++++L+L++N+++++slpp+ + L+
tem37_gi|1 305 KIKHLRLDGNrISetSLPPDMYECLR 330
//
Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: tem37
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
LRR-ma 73.0 6.2e-18 10
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
LRR-ma 1/10 65 78 .. 1 14 [] 10.4 31
LRR-ma 2/10 89 102 .. 1 14 [] 4.1 4.1e+02
LRR-ma 3/10 115 128 .. 1 14 [] 12.4 11
LRR-ma 4/10 136 149 .. 1 14 [] 9.2 52
LRR-ma 5/10 158 171 .. 1 14 [] 11.5 20
LRR-ma 6/10 183 196 .. 1 14 [] 16.6 0.61
LRR-ma 7/10 204 217 .. 1 14 [] 8.0 85
LRR-ma 8/10 228 241 .. 1 14 [] 11.2 23
LRR-ma 9/10 253 266 .. 1 14 [] 12.9 7.8
LRR-ma 10/10 303 316 .. 1 14 [] 4.4 3.6e+02
Alignments of top-scoring domains:
LRR-ma: domain 1 of 10, from 65 to 78: score 10.4, E = 31
*->lpsLeeLdLsnNrl<-*
+p +++L+L+nN++
tem37_gi|1 65 PPGIKYLYLRNNQI 78
LRR-ma: domain 2 of 10, from 89 to 102: score 4.1, E = 4.1e+02
*->lpsLeeLdLsnNrl<-*
+ L+ L L +N l
tem37_gi|1 89 VTDLQWLILDHNVL 102
LRR-ma: domain 3 of 10, from 115 to 128: score 12.4, E = 11
*->lpsLeeLdLsnNrl<-*
l++L++L++++N+l
tem37_gi|1 115 LKQLKKLHINHNNL 128
LRR-ma: domain 4 of 10, from 136 to 149: score 9.2, E = 52
*->lpsLeeLdLsnNrl<-*
++sLe L L +N++
tem37_gi|1 136 PKSLEDLQLTHNKI 149
LRR-ma: domain 5 of 10, from 158 to 171: score 11.5, E = 20
*->lpsLeeLdLsnNrl<-*
l +L+ ++L++Nrl
tem37_gi|1 158 LVNLTFIHLQHNRL 171
LRR-ma: domain 6 of 10, from 183 to 196: score 16.6, E = 0.61
*->lpsLeeLdLsnNrl<-*
l+sLe+LdLs N++
tem37_gi|1 183 LKSLEYLDLSFNQI 196
LRR-ma: domain 7 of 10, from 204 to 217: score 8.0, E = 85
*->lpsLeeLdLsnNrl<-*
+ sL +L+L nN++
tem37_gi|1 204 PVSLLTLYLDNNKI 217
LRR-ma: domain 8 of 10, from 228 to 241: score 11.2, E = 23
*->lpsLeeLdLsnNrl<-*
+ L++L+Ls+N+l
tem37_gi|1 228 FNALQYLRLSHNEL 241
LRR-ma: domain 9 of 10, from 253 to 266: score 12.9, E = 7.8
*->lpsLeeLdLsnNrl<-*
+sL eLdLs+N+l
tem37_gi|1 253 VSSLVELDLSYNKL 266
LRR-ma: domain 10 of 10, from 303 to 316: score 4.4, E = 3.6e+02
*->lpsLeeLdLsnNrl<-*
+++++L+L +Nr+
tem37_gi|1 303 YSKIKHLRLDGNRI 316
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Prosite
---------------------------------------------------------
| ppsearch (c) 1994 EMBL Data Library |
| based on MacPattern (c) 1990-1994 R. Fuchs |
---------------------------------------------------------
PROSITE pattern search started: Tue Oct 31 18:28:58 2000
Sequence file: tem37
----------------------------------------
Sequence tem37_gi|1708878|sp|P51884|LUM_HUMAN (338 residues):
Matching pattern PS00001 ASN_GLYCOSYLATION:
88: NVTD
127: NLTE
160: NLTF
252: NVSS
Total matches: 4
Matching pattern PS00006 CK2_PHOSPHO_SITE:
138: SLED
237: SHNE
255: SLVE
271: TVNE
Total matches: 4
Matching pattern PS00008 MYRISTYL:
14: GGTSGQ
153: GSFEGL
202: GLPVSL
245: GIPGNS
248: GNSFNV
313: GNRISE
Total matches: 6
Matching pattern PS00029 LEUCINE_ZIPPER:
121: LHINHNNLTESVGPLPKSLEDL
Total matches: 1
Total no of hits in this sequence: 15
========================================
1314 pattern(s) searched in 1 sequence(s), 338 residues.
Total no of hits in all sequences: 15.
Search time: 00:00 min
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P
argv[2]=-m /data/patterns/own/motif.fa
argv[4]=-seq tem37
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 338 units
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with HMM search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm.lib
Sequence file: tem37
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]
Alignments of top-scoring domains:
[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: tem37
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]
Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
L. Aravind's signalling DB
IMPALA version 1.1 [20-December-1999]
Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.
Query= tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM)
(KERATAN SULFATE PROTEOGLYCAN)
(338 letters)
Searching..................................done
Results from profile search
Score E
Sequences producing significant alignments: (bits) Value
LRR Leucine rich repeats 112 5e-28
PDE cyclic NMP phosphodiesterase domain 25 0.15
RASGAP RAS-type GTPase GTP hydrolysis activating protein 22 1.6
UB Ubiquitin domain 21 2.3
HISDAC Histone deacetylase domain 21 2.7
RASGEF RAS-type GTPase GDP exchange factor 21 3.2
BZIP Basic Zipper domain (A DNA binding domain) 21 3.4
MBL Metallo-betalactamase domain 20 3.7
SH3 Src Homology domain 3 20 4.1
UBHYD Ubiquitin C-terminal hydrolase domain 20 6.0
HECT A ubiquitin conjugating enzyme domain 19 7.7
SH2 Src Homology domain 2 19 8.3
KR Kringle domain (Adhesion module) 19 9.4
>LRR Leucine rich repeats
Length = 339
Score = 112 bits (279), Expect = 5e-28
Identities = 71/270 (26%), Positives = 116/270 (42%), Gaps = 14/270 (5%)
Query: 66 PGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINH 125
G L L N I+ I N+ DL +L L +N L+ + R S L+ LK +
Sbjct: 56 KGSIVLNLSYNNIETIPNSVCANLIDLLFLDLSNNKLDMLPPQIRRLSMLQSLKLSNNPL 115
Query: 126 NNLTESVGPLPKSLEDLQLTHNKITK---LGSFEGLVNLTFIHLQHNRLKEDAVSAAFKG 182
N+ P SL L +++ T + + + NL + N L V A
Sbjct: 116 NHFQLKQLPSMTSLSVLHMSNTNRTLDNIPPTLDDMHNLRDVDFSENNLPI--VPEALFK 173
Query: 183 LKSLEYLDLSFNQIARLP--SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNE 240
L++L L+LS N+I +L G +L TL + +N+++ +PD K L L ++N+
Sbjct: 174 LRNLRKLNLSGNKIEKLNMTEGEWENLETLNMSHNQLTVLPDCVVK-LTRLTKLYAANNQ 232
Query: 241 LADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP 300
L GIP + L L LSYNKL+ +P L+ +L+ + + + +
Sbjct: 233 LTFEGIPSGIGKLIQLTVLHLSYNKLELVPEGISRCVK--LQKLKLDHNRLITLPEGIHL 290
Query: 301 LSYSKIKHLRLDGNRISETSLPPDMYECLR 330
L +K L L N +PP + +
Sbjct: 291 LPD--LKVLDLHENEN--LVMPPKPNDARK 316
Score = 42.9 bits (100), Expect = 5e-07
Identities = 15/52 (28%), Positives = 24/52 (45%)
Query: 153 GSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLP 204
G L L + ++ N LK + +K L +DLS NQ+ +P+ L
Sbjct: 2 GELSDLPRLRSVIVRDNNLKTAGIPTDIFRMKDLTIIDLSRNQLREVPTNLE 53
Score = 37.8 bits (87), Expect = 2e-05
Identities = 16/46 (34%), Positives = 26/46 (55%)
Query: 231 LQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENL 276
L+ + + N L +GIP + F + L +DLS N+L+ +PT E
Sbjct: 10 LRSVIVRDNNLKTAGIPTDIFRMKDLTIIDLSRNQLREVPTNLEYA 55
Score = 28.9 bits (64), Expect = 0.011
Identities = 14/58 (24%), Positives = 29/58 (49%), Gaps = 5/58 (8%)
Query: 85 AFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDL 142
++ L+ +I+ N L+ + I +F ++K L + ++ N L E +P +LE
Sbjct: 3 ELSDLPRLRSVIVRDNNLKTAGIPTDIF-RMKDLTIIDLSRNQLRE----VPTNLEYA 55
Score = 20.3 bits (42), Expect = 3.7
Identities = 4/18 (22%), Positives = 8/18 (44%)
Query: 206 SLLTLYLDNNKISNIPDE 223
L + L N++ +P
Sbjct: 34 DLTIIDLSRNQLREVPTN 51
>PDE cyclic NMP phosphodiesterase domain
Length = 350
Score = 25.1 bits (54), Expect = 0.15
Identities = 35/124 (28%), Positives = 49/124 (39%), Gaps = 27/124 (21%)
Query: 205 VSLLTLYLDNNKISNIPD---------EYFKRFNALQYLR-------LSHNELADSGIPG 248
+++ LY NN+ N + FNA QYL L D PG
Sbjct: 54 MTVNALYRKNNRYHNFTHAFDVTQTVYTFLTSFNAAQYLTHLDIFALLISCMCHDLNHPG 113
Query: 249 --NSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKI 306
N+F V++ EL L YN + LEN++ + K S C IL L+ +
Sbjct: 114 FNNTFQVNAQTELSLEYNDI-------SVLENHHAMLT--FKILRNSECNILEGLNEDQY 164
Query: 307 KHLR 310
K LR
Sbjct: 165 KELR 168
>RASGAP RAS-type GTPase GTP hydrolysis activating protein
Length = 292
Score = 21.6 bits (45), Expect = 1.6
Identities = 15/73 (20%), Positives = 34/73 (46%), Gaps = 3/73 (4%)
Query: 77 QIDHI-DEKAFENVTDLQWLILDHNVLENS--KIKGRVFSKLKQLKKLHINHNNLTESVG 133
DH+ ++++ +L +D + S I G + S ++ + + TE +
Sbjct: 14 TADHVFPLATYDDLMNLLLESVDQRPITVSAVSILGELVSGKTEVAQPLVRLFTHTERIA 73
Query: 134 PLPKSLEDLQLTH 146
P+ K+L D +++H
Sbjct: 74 PIIKALADHEISH 86
Score = 20.9 bits (43), Expect = 2.9
Identities = 20/70 (28%), Positives = 32/70 (45%), Gaps = 8/70 (11%)
Query: 209 TLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKN 268
T++ N +S + DE R + L YL H L P S V+ ++ +K+K+
Sbjct: 92 TIFRGNTLVSKMMDEAM-RLSGLHYL---HQTLR----PVLSQIVAEKKPCEIDPSKIKD 143
Query: 269 IPTVNENLEN 278
V+ NL N
Sbjct: 144 RSAVDTNLHN 153
>UB Ubiquitin domain
Length = 128
Score = 20.9 bits (44), Expect = 2.3
Identities = 14/79 (17%), Positives = 28/79 (34%), Gaps = 5/79 (6%)
Query: 74 RNNQIDHIDEKAFENV---TDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTE 130
++ ID++ K + D Q LI LE+ + +++ LH+
Sbjct: 19 SSDTIDNVKSKIQDKEGIPPDQQRLIFAGKQLEDGRTLS--DYNIQKESTLHLVLRLRGG 76
Query: 131 SVGPLPKSLEDLQLTHNKI 149
+ P K+L +
Sbjct: 77 IIEPSLKALASKYNCDKSV 95
>HISDAC Histone deacetylase domain
Length = 433
Score = 21.0 bits (44), Expect = 2.7
Identities = 12/99 (12%), Positives = 31/99 (31%), Gaps = 7/99 (7%)
Query: 243 DSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKF---DIKSFCKILG 299
+G P + L ++Y K + ++ +F + F +
Sbjct: 35 GAGHPMKPHRIRMAHSLIMNYGLYKKMEIYRAKPAT----KQEMCQFHTDEYIDFLSRVT 90
Query: 300 PLSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN 338
P + K + N + + +YE ++ ++
Sbjct: 91 PDNLEMFKRESVKFNVGDDCPVFDGLYEYCSISGGGSME 129
>RASGEF RAS-type GTPase GDP exchange factor
Length = 196
Score = 20.5 bits (43), Expect = 3.2
Identities = 14/77 (18%), Positives = 29/77 (37%), Gaps = 7/77 (9%)
Query: 103 ENSKIKGRVFSK----LKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITK--LGSFE 156
++SK+K V + + N N L E + L S+ I L ++E
Sbjct: 79 KSSKMKRNVIQRFIHVADHCRTFQ-NFNTLMEIILALSSSVVKFTDAWRLIEPGDLLTWE 137
Query: 157 GLVNLTFIHLQHNRLKE 173
L + + ++ ++
Sbjct: 138 ELKKIPSLDRNYSTIRN 154
>BZIP Basic Zipper domain (A DNA binding domain)
Length = 89
Score = 20.6 bits (42), Expect = 3.4
Identities = 11/51 (21%), Positives = 25/51 (48%)
Query: 49 SAMYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDH 99
+A C + KL+ + + +K L +N+++ E V L+ +++H
Sbjct: 23 AASKCRKRKLERIARLEEKVKTLKAQNSELASTANMLREQVAQLKQKVMNH 73
>MBL Metallo-betalactamase domain
Length = 256
Score = 20.5 bits (42), Expect = 3.7
Identities = 9/106 (8%), Positives = 26/106 (24%), Gaps = 12/106 (11%)
Query: 65 PPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKG--RVFSKLKQLKKLH 122
I +Y+ + DH+ + + + + + + + L +
Sbjct: 54 HRDITDIYVSHLHSDHVG-----GLEYVGFSTMFDPNCGKPNLYLSQDIAADLWERSLAG 108
Query: 123 INHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQH 168
+ + + L + + I L H
Sbjct: 109 GMEAIEGG-MTEVDSYFQIHALGPGETFTWENVN----FQLIKLNH 149
>SH3 Src Homology domain 3
Length = 90
Score = 20.1 bits (42), Expect = 4.1
Identities = 7/16 (43%), Positives = 9/16 (55%)
Query: 57 KLKSVPMVPPGIKYLY 72
+K +P PP K LY
Sbjct: 70 IIKPLPQPPPQCKALY 85
>UBHYD Ubiquitin C-terminal hydrolase domain
Length = 884
Score = 19.6 bits (40), Expect = 6.0
Identities = 10/70 (14%), Positives = 23/70 (32%), Gaps = 5/70 (7%)
Query: 266 LKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKI-----KHLRLDGNRISETS 320
L + + + E + K + ++ L +S K + G ++
Sbjct: 14 LFFTNQLRKAVYMMPTEGDDSSKSVPLALQRVFYELQHSDKPVGTKKLTKSFGWETLDSF 73
Query: 321 LPPDMYECLR 330
+ D+ E R
Sbjct: 74 MQHDVQELCR 83
>HECT A ubiquitin conjugating enzyme domain
Length = 255
Score = 19.3 bits (40), Expect = 7.7
Identities = 17/121 (14%), Positives = 35/121 (28%), Gaps = 15/121 (12%)
Query: 209 TLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIP--------GNSFNVSSLVELD 260
+ + + FK L + + + D + + L E+D
Sbjct: 65 RFLFNPSACLDEHLMQFKFLGILMGVAIRTKKPLDLHLAPLVWKQLCCVPLTLEDLEEVD 124
Query: 261 LS-YNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISET 319
L L +I L + + ++ +G + K+ + GN I T
Sbjct: 125 LLYVQTLNSI------LHIEDSGITEESFHEMIPLDSFVGQSADGKMVPIIPGGNSIPLT 178
Query: 320 S 320
Sbjct: 179 F 179
>SH2 Src Homology domain 2
Length = 119
Score = 19.0 bits (39), Expect = 8.3
Identities = 9/61 (14%), Positives = 20/61 (32%), Gaps = 7/61 (11%)
Query: 60 SVPMVPPGIKYLYLRNNQIDH---IDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLK 116
PM IK+ +R+ F ++D+ + H ++ + R+
Sbjct: 50 YDPMHGDVIKHYKIRSLDNGGYYISPRITFPCISDM----IKHYQKQSDGLCRRLEKACI 105
Query: 117 Q 117
Sbjct: 106 S 106
>KR Kringle domain (Adhesion module)
Length = 86
Score = 19.1 bits (39), Expect = 9.4
Identities = 3/30 (10%), Positives = 8/30 (26%)
Query: 75 NNQIDHIDEKAFENVTDLQWLILDHNVLEN 104
N+ + + + + H N
Sbjct: 28 NSDLLYQELHVDSVGAAALLGLGPHAYCRN 57
Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 13
Number of calls to ALIGN: 18
Length of query: 338
Total length of test sequences: 20182
Effective length of test sequences: 16536.0
Effective search space size: 5016881.8
Initial X dropoff for ALIGN: 25.0 bits
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]
Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.
Query= tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM)
(KERATAN SULFATE PROTEOGLYCAN)
(338 letters)
Searching.................................................done
Results from profile search
Score E
Sequences producing significant alignments: (bits) Value
gi|1082610 [223..548] Leucine-rich repeats 103 6e-24
gi|2623618 [35..332] Leucine-rich repeats 91 2e-20
gi|1418519 [191..423] Leucine-rich repeats 88 2e-19
gi|1881738 [246..599] Leucine-rich repeats 80 5e-17
gi|132575 [1..456] Leucine-rich repeats 70 6e-14
gi|730152 [47..205] Cupredoxins 25 2.2
gi|687687 [74..361] Heme-linked catalases 25 2.2
gi|1173145 [248..328] the C-terminal domain of RNA polymeras... 25 2.9
gi|115682 [1..213] CoA-dependent acetyltransferases 23 5.6
gi|1788027 [59..564] Heme-linked catalases 23 7.2
gi|1742164 [73..325] Periplasmic binding protein-like I 23 8.8
>gi|1082610 [223..548] Leucine-rich repeats
Length = 326
Score = 103 bits (254), Expect = 6e-24
Identities = 40/304 (13%), Positives = 79/304 (25%), Gaps = 41/304 (13%)
Query: 57 KLKSVPMVPPGIKYLYLRNNQIDHIDEKAFEN--------VTDLQWLILDHNVLENSKIK 108
L + ++ L L N + + ++ L+ L L ++
Sbjct: 9 TLCHLLSSWVSLESLTLSYNGLGSNIFRLLDSLRALSGQAGCRLRALHLSDLFSPLPILE 68
Query: 109 G--RVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHL 166
+ L L+ L I ++ ++ P ++ L L
Sbjct: 69 LTRAIVRALPLLRVLSIRVDHPSQRDNPGVPGNAGPPSHIIGDEEI-PENCLEQLEMGFP 127
Query: 167 QHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVSLLTLYLDNNKISNIPDEYFK 226
+ + + + K SL+ L L + +
Sbjct: 128 RGAQPAPL-LCSVLKASGSLQQLSLDSATF---------------ASPQDFGLVLQTLKE 171
Query: 227 RFNALQYLRLSHNELAD-SGIPGNSFNVSSLVELDLSYNKLKNIPTV------------N 273
AL+ L LAD +L E+ S+ +L N
Sbjct: 172 YNLALKRLSFHDMNLADCQSEVLFLLQNLTLQEITFSFCRLFEKRPAQFLPEMVAAMKGN 231
Query: 274 ENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLRVAN 333
L+ L N+L + + + S S + L + N I + + L
Sbjct: 232 STLKGLRLPGNRLGNAGLLALADVFSEDSSSSLCQLDISSNCIKPDG-LLEFAKRLERWG 290
Query: 334 EVTL 337
Sbjct: 291 RGAF 294
>gi|2623618 [35..332] Leucine-rich repeats
Length = 298
Score = 91.1 bits (223), Expect = 2e-20
Identities = 39/287 (13%), Positives = 84/287 (28%), Gaps = 39/287 (13%)
Query: 66 PGIKYLYLRNNQIDHIDE----KAFENVTDLQWLILDHNVLEN---------SKIKGRVF 112
GI+ L L N I KA E+ Q + +
Sbjct: 14 EGIQSLKLNGNTIGVEAAQALAKALESKPQFQRARWSDMFTGRLRSEIPPALMSLGAGIM 73
Query: 113 SKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRLK 172
+ L ++ ++ N + L + ++ + GL + +
Sbjct: 74 TAGAHLVEIDLSDNAFGPDGVKAVRELLESSSCYSLREMRFNNNGLGIGGKLMAEALITC 133
Query: 173 EDAVSAAFKGLKSLEYLDLSFNQIARLPS-------GLPVSLLTLYLDNNKISNIPD--- 222
S +L+ N++ + + +L + L N I+
Sbjct: 134 H-EKSTKAGKPLALKVFIAGRNRLENPGATVLAKAFKIIGTLEEIALPQNGINYEGITAL 192
Query: 223 -EYFKRFNALQYLRLSHNELADSG---IPGNSFNVSSLVELDLSYNKLKNIPTV------ 272
E + + L+ L L+ N G + N+S L ++ +++
Sbjct: 193 AEAVEYSHNLKILNLNDNTFTARGAKPMAKAIKNLSKLEVINFGDCLVRSEGADAIANSL 252
Query: 273 ---NENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRI 316
+L+ L +++K + + + L L+GN I
Sbjct: 253 REGVPSLKELNLAFGEIKKEAAVRVAESMDTK--PHLTLLDLNGNNI 297
>gi|1418519 [191..423] Leucine-rich repeats
Length = 233
Score = 88.2 bits (216), Expect = 2e-19
Identities = 33/248 (13%), Positives = 74/248 (29%), Gaps = 46/248 (18%)
Query: 112 FSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRL 171
L+ L++ + NL + + + K+ + +LT +HL++ ++
Sbjct: 2 IRHAVSLQMLNLRYTNLND-------------RSIPALCKMARAQPSASLTCLHLENTQM 48
Query: 172 K---EDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPV-------SLLTLYLDNNKISNIP 221
+ A K L L L N + SL L L NN I +
Sbjct: 49 SGKNLLVLICALKNNTGLRELYLGDNGLQPTDGSHIYQLITSNSSLQLLDLRNNSIGDSG 108
Query: 222 DE---------YFKRFNALQYLRLSHNELADSG---IPGNSFNVSSLVELDLSYNKLKNI 269
++L + L +N + + + + + L++ N L
Sbjct: 109 VRHICDGLRHREAVEKSSLSAMVLWNNNVTGASMDSLAEALIENTKIETLNIGNNNLGVE 168
Query: 270 PTV--------NENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSL 321
N +L L+ + + + + + + N I+
Sbjct: 169 GIARLKPALASNSHLHRLGLQNTGINCEGAIILAECIADN--IALLRVDIRDNPIALA-G 225
Query: 322 PPDMYECL 329
++ +
Sbjct: 226 LLALHSAM 233
Score = 24.7 bits (53), Expect = 2.4
Identities = 9/47 (19%), Positives = 16/47 (33%), Gaps = 4/47 (8%)
Query: 66 PGIKYLYLRNNQIDHIDEK----AFENVTDLQWLILDHNVLENSKIK 108
+ + L NN + A T ++ L + +N L I
Sbjct: 125 SSLSAMVLWNNNVTGASMDSLAEALIENTKIETLNIGNNNLGVEGIA 171
Score = 22.8 bits (48), Expect = 8.1
Identities = 11/39 (28%), Positives = 17/39 (43%), Gaps = 4/39 (10%)
Query: 66 PGIKYLYLRNNQI----DHIDEKAFENVTDLQWLILDHN 100
+ L+L N Q+ + A +N T L+ L L N
Sbjct: 36 ASLTCLHLENTQMSGKNLLVLICALKNNTGLRELYLGDN 74
>gi|1881738 [246..599] Leucine-rich repeats
Length = 354
Score = 80.2 bits (195), Expect = 5e-17
Identities = 49/328 (14%), Positives = 89/328 (26%), Gaps = 56/328 (17%)
Query: 49 SAMYCDELKLKSVPMVPPGIKYLYLRNNQIDH-----IDEKAFENVTDLQWLILDHNVLE 103
+A +EL + + NN I + L + L
Sbjct: 21 TASAFEELGQAIAKNRNSALTSIDWSNNLIKDAGVAALAAAVASMGHGLTSISLKGGDAT 80
Query: 104 N------SKIKGRVFSKLKQLKKLHINHNNLTESVGPL-------PKSLEDLQLTHNKIT 150
+ + L L++ N L P +L+ L ++
Sbjct: 81 KKGTVALCTAFKKNVEMSRTLTVLNLAGNRLDSDGTSALAAFVSGPNALQTLNISGTAAN 140
Query: 151 KLGS----FEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVS 206
G L ++ HN++ K L+S L + +P +
Sbjct: 141 LEMLLPAVMRGCTELEKFNISHNKVTAKTGPELKKFLQSCGRLSELHMRDTAVPVQVVRD 200
Query: 207 LLTL----------------YLDNNKISNIPDEYFKRFNALQYLRLSHNELADSG---IP 247
++ +N+ ++ L L+ N+ D G I
Sbjct: 201 VIKAIIGNNFITDFQLDLAANKLGVLGANMLAGLAAEITTIKSLDLTDNDFGDEGMSIIA 260
Query: 248 GNSFNVSSLVELDLSYNKLKNIPTV----NENLENYYLEVNQLEKFDIK------SFCKI 297
+ SSL EL L N +N +NL L K D+
Sbjct: 261 DGLCHNSSLRELHLGDNWTRNKTKARSQAVDNLIELISSECPLHKLDLSCKVADNQIKTD 320
Query: 298 LGPL-----SYSKIKHLRLDGNRISETS 320
+ P + +K L + GN + +
Sbjct: 321 ILPFIYSLATNDTLKELDISGNAMGDKV 348
>gi|132575 [1..456] Leucine-rich repeats
Length = 456
Score = 69.8 bits (168), Expect = 6e-14
Identities = 54/312 (17%), Positives = 92/312 (29%), Gaps = 49/312 (15%)
Query: 71 LYLRNNQI-DHIDEKAFENVTDLQWLILDHNVLENSKIKG--RVFSKLKQLKKLHINHNN 127
L ++ Q+ D + + Q + LD L + K L +L + N
Sbjct: 3 LDIQCEQLSDARWTELLPLIQQYQVVRLDDCGLTEVRCKDIRSAIQANPALTELSLRTNE 62
Query: 128 LT--------ESVGPLPKSLEDLQLTHNKITKLG------SFEGLVNLTFIHLQHNRLK- 172
L + + ++ L L + +T+ G L L +HL N L
Sbjct: 63 LGDAGVGLVLQGLQNPTCKIQKLSLQNCSLTEAGCGVLPDVLRSLSTLRELHLNDNPLGD 122
Query: 173 ---EDAVSAAFKGLKSLEYLDLSFNQIARLPS-------GLPVSLLTLYLDNNKISNIPD 222
+ LE L L + + + L L NN
Sbjct: 123 EGLKLLCEGLRDPQCRLEKLQLEYCNLTATSCEPLASVLRVKPDFKELVLSNNDFHEAGI 182
Query: 223 E-----YFKRFNALQYLRLSHNELADSG---IPGNSFNVSSLVELDLSYNKLKNIPTVN- 273
L+ L+L + + + + + +SL ELDL NKL N
Sbjct: 183 HTLCQGLKDSACQLESLKLENCGITSANCKDLCDVVASKASLQELDLGSNKLGNTGIAAL 242
Query: 274 --------ENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDM 325
L +L + K C++L +K L L GN + + +
Sbjct: 243 CSGLLLPSCRLRTLWLWDCDVTAEGCKDLCRVLRAK--QSLKELSLAGNELKD--EGAQL 298
Query: 326 YECLRVANEVTL 337
+ L
Sbjct: 299 LCESLLEPGCQL 310
Score = 50.3 bits (118), Expect = 5e-08
Identities = 23/146 (15%), Positives = 41/146 (27%), Gaps = 20/146 (13%)
Query: 93 QWLILDHNVLENSKIK--GRVFSKLKQLKKLHINHNNLTESVGP--------LPKSLEDL 142
+ L + L + V +K L +L ++ N L +S L L
Sbjct: 311 ESLWVKTCSLTAASCPHFCSVLTKNSSLFELQMSSNPLGDSGVVELCKALGYPDTVLRVL 370
Query: 143 QLTHNKITKLG------SFEGLVNLTFIHLQHNRLK----EDAVSAAFKGLKSLEYLDLS 192
L +T G +L + L +N + + + + L+ L L
Sbjct: 371 WLGDCDVTDSGCSSLATVLLANRSLRELDLSNNCMGDNGVLQLLESLKQPSCILQQLVLY 430
Query: 193 FNQIARLPSGLPVSLLTLYLDNNKIS 218
+L IS
Sbjct: 431 DIYWTDEVEDQLRALEEERPSLRIIS 456
Score = 41.4 bits (95), Expect = 2e-05
Identities = 21/120 (17%), Positives = 36/120 (29%), Gaps = 15/120 (12%)
Query: 223 EYFKRFNALQYLRLSHNELADSG----IPGNSFNVSSLVELDLSYNKLKNIPTV------ 272
+ ++L L++S N L DSG + + L L L + +
Sbjct: 330 SVLTKNSSLFELQMSSNPLGDSGVVELCKALGYPDTVLRVLWLGDCDVTDSGCSSLATVL 389
Query: 273 --NENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLR 330
N +L L N + + + L ++ L L T D L
Sbjct: 390 LANRSLRELDLSNNCMGDNGVLQLLESL-KQPSCILQQLVLYDIYW--TDEVEDQLRALE 446
>gi|730152 [47..205] Cupredoxins
Length = 159
Score = 24.9 bits (54), Expect = 2.2
Identities = 9/36 (25%), Positives = 11/36 (30%)
Query: 17 SGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMY 52
G F + G +CAP P S M
Sbjct: 109 PGDTAVLRFKATKAGVFVYHCAPAGMVPWHVTSGMN 144
>gi|687687 [74..361] Heme-linked catalases
Length = 288
Score = 24.8 bits (54), Expect = 2.2
Identities = 1/17 (5%), Positives = 3/17 (16%)
Query: 271 TVNENLENYYLEVNQLE 287
T+ +
Sbjct: 240 TLTAASPQPGAACEGIN 256
>gi|1173145 [248..328] the C-terminal domain of RNA polymerase alpha subunit
Length = 81
Score = 24.5 bits (53), Expect = 2.9
Identities = 11/49 (22%), Positives = 22/49 (44%), Gaps = 5/49 (10%)
Query: 258 ELDLS---YNKLK--NIPTVNENLENYYLEVNQLEKFDIKSFCKILGPL 301
+L+L+ N LK I + + ++ +E+ + KS +I L
Sbjct: 11 DLELTVRSANCLKAETIHYIGDLVQRTEVELLKTPNLGKKSLTEIKDVL 59
>gi|115682 [1..213] CoA-dependent acetyltransferases
Length = 213
Score = 23.3 bits (50), Expect = 5.6
Identities = 22/109 (20%), Positives = 39/109 (35%), Gaps = 15/109 (13%)
Query: 213 DNNKISNIPDEYFKRFNALQYLRLS--------HNELADSGIPGNSFNVSSLVELDLSYN 264
+ S + YF + + +P N N+SSL +
Sbjct: 93 ETETFSALSCRYFPDLSEFMAGYNAVTAEYQHDTRLFPQGNLPENHLNISSLPWVSFDGF 152
Query: 265 KLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDG 313
L NI ++Y+ V + KF + ++L P+S ++ H DG
Sbjct: 153 NL-NI----TGNDDYFAPVFTMAKF-QQEGDRVLLPVSV-QVHHAVCDG 194
>gi|1788027 [59..564] Heme-linked catalases
Length = 506
Score = 22.9 bits (49), Expect = 7.2
Identities = 5/17 (29%), Positives = 9/17 (52%)
Query: 271 TVNENLENYYLEVNQLE 287
+N N +N++ E Q
Sbjct: 316 VLNRNPDNFFAENEQAA 332
>gi|1742164 [73..325] Periplasmic binding protein-like I
Length = 253
Score = 22.6 bits (48), Expect = 8.8
Identities = 13/105 (12%), Positives = 33/105 (31%), Gaps = 15/105 (14%)
Query: 215 NKISNIPDEYFKRFNALQYLRLSHN-----ELADSGIPGNSFNVSSLVELDLSYNKLKNI 269
+ K + L + +L IP +V + + ++
Sbjct: 37 DLQKCESKIKQKMIKGIIMLSSPADESFFAQLDKYDIP--------VVVIGKVEGQYAHV 88
Query: 270 PTVN-ENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDG 313
+V+ +N + + L + ++ + PL R++G
Sbjct: 89 YSVDTDNFGDSIALTDALIESGHQNIACLHAPLDVHVSV-DRVNG 132
Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 11
Number of calls to ALIGN: 15
Length of query: 338
Total length of test sequences: 256703
Effective length of test sequences: 207231.0
Effective search space size: 61545361.0
Initial X dropoff for ALIGN: 25.0 bits