analysis of sequence from tem37 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ >tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKSVPMVPPGIKY LYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLE DLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVSLLTL YLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYY LEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ sec.str. with predator > tem37_gi|1708878|sp|P51884|LUM_HUMAN . . . . . 1 MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSA 50 ___HHHHHHHHH______________________________________ . . . . . 51 MYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHN 100 _HHHHHHH_________EEEEEE_______________HHHHHHHHHH__ . . . . . 101 VLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKIT 150 ________HHHHHHHHHHHHH______EEE_____HHHHHHHHH_HHHHH . . . . . 151 KLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLP 200 _______HHHHHHHHHHH___HHHHHHHHHH_________HHHHHHH___ . . . . . 201 SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNS 250 ____HHHHHHH___________HHHHHHHHHHHHHHH_____________ . . . . . 251 FNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP 300 __EEEE___________________HHHHHHHHH______HHHHHH____ . . . 
301 LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN 338 _________________________HHHHHHHH_____ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ method : 1 alpha-contents : 37.8 % beta-contents : 20.3 % coil-contents : 41.9 % class : mixed method : 2 alpha-contents : 33.4 % beta-contents : 6.6 % coil-contents : 60.0 % class : alpha ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ GPI: learning from metazoa -26.13 0.00 0.00 0.00 -4.00 -4.00 -12.00 0.00 0.00 -3.79 -2.19 0.00 -12.00 -8.00 0.00 0.00 -72.12 -4.97 0.00 0.00 0.00 0.00 0.00 -24.00 0.00 0.00 -3.75 -1.81 0.00 -12.00 -8.00 0.00 0.00 -54.53 ID: tem37_gi|1708878|sp|P51884|LUM_HUMAN AC: xxx Len: 280 1:I 243 Sc: -54.53 Pv: 2.839984e-01 NO_GPI_SITE GPI: learning from protozoa -27.61 0.00 0.00 0.00 -4.00 0.00 -24.00 0.00 0.00 -3.36 -7.11 0.00 -12.00 -8.00 0.00 0.00 -86.09 -18.57 0.00 0.00 -0.40 0.00 0.00 -4.00 -0.60 0.00 -4.71 -10.81 -12.00 -12.00 -8.00 -12.00 0.00 -83.08 ID: tem37_gi|1708878|sp|P51884|LUM_HUMAN AC: xxx Len: 280 1:I 248 Sc: -83.08 Pv: 5.654476e-01 NO_GPI_SITE ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ # SignalP euk predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? tem37_gi|17 1.000 19 Y 0.916 19 Y 0.946 13 Y 0.883 Y # SignalP gram- predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? tem37_gi|17 0.529 39 Y 0.307 20 N 0.989 3 Y 0.746 Y # SignalP gram+ predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? 
tem37_gi|17 0.364 179 N 0.239 20 N 0.945 4 N 0.802 Y ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ low complexity regions: SEG 12 2.2 2.5 >tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) 1-271 MSLSAFTLFLALIGGTSGQYYDYDFPPSIY GQSSPNCAPECNCPESYPSAMYCDELKLKS VPMVPPGIKYLYLRNNQIDHIDEKAFENVT DLQWLILDHNVLENSKIKGRVFSKLKQLKK LHINHNNLTESVGPLPKSLEDLQLTHNKIT KLGSFEGLVNLTFIHLQHNRLKEDAVSAAF KGLKSLEYLDLSFNQIARLPSGLPVSLLTL YLDNNKISNIPDEYFKRFNALQYLRLSHNE LADSGIPGNSFNVSSLVELDLSYNKLKNIP T vnenlenyylevnqle 272-287 288-338 KFDIKSFCKILGPLSYSKIKHLRLDGNRIS ETSLPPDMYECLRVANEVTLN low complexity regions: SEG 25 3.0 3.3 >tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) 1-338 MSLSAFTLFLALIGGTSGQYYDYDFPPSIY GQSSPNCAPECNCPESYPSAMYCDELKLKS VPMVPPGIKYLYLRNNQIDHIDEKAFENVT DLQWLILDHNVLENSKIKGRVFSKLKQLKK LHINHNNLTESVGPLPKSLEDLQLTHNKIT KLGSFEGLVNLTFIHLQHNRLKEDAVSAAF KGLKSLEYLDLSFNQIARLPSGLPVSLLTL YLDNNKISNIPDEYFKRFNALQYLRLSHNE LADSGIPGNSFNVSSLVELDLSYNKLKNIP TVNENLENYYLEVNQLEKFDIKSFCKILGP LSYSKIKHLRLDGNRISETSLPPDMYECLR VANEVTLN low complexity regions: SEG 45 3.4 3.75 >tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) 1-338 MSLSAFTLFLALIGGTSGQYYDYDFPPSIY GQSSPNCAPECNCPESYPSAMYCDELKLKS VPMVPPGIKYLYLRNNQIDHIDEKAFENVT DLQWLILDHNVLENSKIKGRVFSKLKQLKK LHINHNNLTESVGPLPKSLEDLQLTHNKIT KLGSFEGLVNLTFIHLQHNRLKEDAVSAAF KGLKSLEYLDLSFNQIARLPSGLPVSLLTL YLDNNKISNIPDEYFKRFNALQYLRLSHNE LADSGIPGNSFNVSSLVELDLSYNKLKNIP TVNENLENYYLEVNQLEKFDIKSFCKILGP LSYSKIKHLRLDGNRISETSLPPDMYECLR VANEVTLN low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKS VPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKK 
LHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAF KGLKSLEYLDLSFNQIARLPSGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNE LADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN 1 - 338 MSLSAFTLFL ALIGGTSGQY YDYDFPPSIY GQSSPNCAPE CNCPESYPSA MYCDELKLKS VPMVPPGIKY LYLRNNQIDH IDEKAFENVT DLQWLILDHN VLENSKIKGR VFSKLKQLKK LHINHNNLTE SVGPLPKSLE DLQLTHNKIT KLGSFEGLVN LTFIHLQHNR LKEDAVSAAF KGLKSLEYLD LSFNQIARLP SGLPVSLLTL YLDNNKISNI PDEYFKRFNA LQYLRLSHNE LADSGIPGNS FNVSSLVELD LSYNKLKNIP TVNENLENYY LEVNQLEKFD IKSFCKILGP LSYSKIKHLR LDGNRISETS LPPDMYECLR VANEVTLN low complexity regions: DUST >tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMYCDELKLKS VPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKK LHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRLKEDAVSAAF KGLKSLEYLDLSFNQIARLPSGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNE LADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for tem37_gi|1708878|sp|P51884|LUM_HUMAN sequence: 280 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 60 MSLSAFTLFL ALIGGTSGQY YDYDFPPSIY GQSSPNCAPE CNCPESYPSA MYCDELKLKS ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
120 VPMVPPGIKY LYLRNNQIDH IDEKAFENVT DLQWLILDHN VLENSKIKGR VFSKLKQLKK ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~1124444 * 14 M'95 -w local . | . | . | . | . | . 180 LHINHNNLTE SVGPLPKSLE DLQLTHNKIT KLGSFEGLVN LTFIHLQHNR LKEDAVSAAF ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. 4444444444 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 240 KGLKSLEYLD LSFNQIARLP SGLPVSLLTL YLDNNKISNI PDEYFKRFNA LQYLRLSHNE ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . 
| LADSGIPGNS FNVSSLVELD LSYNKLKNIP TVNENLENYY ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ---------- ---------- ---------- ---------- ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** tem37.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem37.___inter___ (1 sequences) MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSA MYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHN VLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKIT KLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLP SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNS FNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 1 21 1.745 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 Loop length 0 317 K+R profile 1.00 + CYT-EXT prof - 0.51 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 1.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): 0.0521 NEG: 0.0000 POS: 0.0000 -> Orientation: N-in CYT-EXT difference: -0.51 -> Orientation: N-in ---------------------------------------------------------------------- "tem37" 338 1 21 #t 1.74479 ************************************ *TOPPREDM with prokaryotic function* ************************************ tem37.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem37.___inter___ (1 sequences) MSLSAFTLFLALIGGTSGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSA MYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHN VLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKIT KLGSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLP SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNS FNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP LSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN (p)rokaryotic or (e)ukaryotic: p Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 1 21 1.745 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 Loop length 0 317 K+R profile 1.00 + CYT-EXT prof - 0.51 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 1.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): 0.0521 NEG: 0.0000 POS: 0.0000 -> Orientation: N-in CYT-EXT difference: -0.51 -> Orientation: N-in ---------------------------------------------------------------------- "tem37" 338 1 21 #t 1.74479 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ NOW EXECUTING: /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem37.___saps___ SAPS. Version of April 11, 1996. Date run: Tue Oct 31 18:26:47 2000 File: /people/maria/tem37.___saps___ ID tem37_gi|1708878|sp|P51884|LUM_HUMAN DE LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) number of residues: 338; molecular weight: 38.4 kdal 1 MSLSAFTLFL ALIGGTSGQY YDYDFPPSIY GQSSPNCAPE CNCPESYPSA MYCDELKLKS 61 VPMVPPGIKY LYLRNNQIDH IDEKAFENVT DLQWLILDHN VLENSKIKGR VFSKLKQLKK 121 LHINHNNLTE SVGPLPKSLE DLQLTHNKIT KLGSFEGLVN LTFIHLQHNR LKEDAVSAAF 181 KGLKSLEYLD LSFNQIARLP SGLPVSLLTL YLDNNKISNI PDEYFKRFNA LQYLRLSHNE 241 LADSGIPGNS FNVSSLVELD LSYNKLKNIP TVNENLENYY LEVNQLEKFD IKSFCKILGP 301 LSYSKIKHLR LDGNRISETS LPPDMYECLR VANEVTLN -------------------------------------------------------------------------------- COMPOSITIONAL ANALYSIS (extremes relative to: swp23s) A- : 12( 3.6%); C : 6( 1.8%); D : 17( 5.0%); E : 21( 6.2%); F : 14( 4.1%) G : 15( 4.4%); H : 9( 2.7%); I : 19( 5.6%); K : 25( 7.4%); L+ : 53(15.7%) M : 4( 1.2%); N+ : 30( 8.9%); P : 19( 5.6%); Q : 10( 3.0%); R : 9( 2.7%) S : 31( 9.2%); T : 11( 3.3%); V : 15( 4.4%); W : 1( 0.3%); Y : 17( 5.0%) KR : 34 ( 10.1%); ED : 38 ( 11.2%); AGP : 46 ( 13.6%); KRED : 72 ( 21.3%); KR-ED : -4 ( -1.2%); FIKMNY : 109 ( 32.2%); LVIFM : 105 ( 31.1%); ST : 42 ( 12.4%). 
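The per-residue counts and grouped totals in the compositional analysis above are plain tallies over the sequence. The following is a minimal sketch of that arithmetic, not SAPS's own code: the function names `composition` and `group_counts` are hypothetical, and the groups are copied from the summary labels (KR, ED, AGP, LVIFM, ST). It is run on the first 15 residues of the listed sequence only.

```python
from collections import Counter

# Residue groups named as in the SAPS summary line (assumed to be simple unions).
GROUPS = {"KR": "KR", "ED": "ED", "AGP": "AGP", "LVIFM": "LVIFM", "ST": "ST"}


def _clean(seq):
    """Strip whitespace (the listing is printed in blocks of 10) and upper-case."""
    return "".join(seq.split()).upper()


def composition(seq):
    """Return {residue: (count, percent of sequence length)}."""
    seq = _clean(seq)
    n = len(seq)
    counts = Counter(seq)
    return {aa: (c, round(100.0 * c / n, 1)) for aa, c in sorted(counts.items())}


def group_counts(seq):
    """Return totals for each residue group, plus the net charge count KR-ED."""
    seq = _clean(seq)
    totals = {name: sum(seq.count(aa) for aa in members)
              for name, members in GROUPS.items()}
    totals["KR-ED"] = totals["KR"] - totals["ED"]
    return totals


# First 15 residues of the sequence as printed in the SAPS listing.
fragment = "MSLSAFTLFL ALIGG"
comp = composition(fragment)
groups = group_counts(fragment)
print(comp["L"])   # count and percentage of leucine in the fragment
print(groups)
```

Passing the full 338-residue sequence to the same functions should give tallies comparable to the table above, assuming SAPS computes percentages against the full length.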
-------------------------------------------------------------------------------- CHARGE DISTRIBUTIONAL ANALYSIS 1 0000000000 0000000000 0-0-000000 000000000- 0000-00000 000--0+0+0 61 00000000+0 000+0000-0 0--+00-000 -000000-00 00-00+0+0+ 000+0+00++ 121 000000000- 000000+00- -000000+00 +0000-0000 000000000+ 0+--000000 181 +00+00-00- 0000000+00 0000000000 00-00+0000 0--00++000 0000+0000- 241 00-0000000 0000000-0- 0000+0+000 000-00-000 0-0000-+0- 0+000+0000 301 0000+0+00+ 0-00+00-00 000-00-00+ 000-0000 A. CHARGE CLUSTERS. Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none Negative charge clusters (cmin = 10/30 or 13/45 or 16/60): none Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60): none B. HIGH SCORING (UN)CHARGED SEGMENTS. There are no high scoring positive charge segments. There are no high scoring negative charge segments. There are no high scoring mixed charge segments. There are no high scoring uncharged segments. C. CHARGE RUNS AND PATTERNS. pattern (+)| (-)| (*)| (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)| lmin0 4 | 5 | 7 | 37 | 9 | 9 | 12 | 11 | 11 | 14 | 7 | 8 | lmin1 6 | 6 | 8 | 45 | 11 | 11 | 15 | 13 | 14 | 18 | 8 | 10 | lmin2 7 | 7 | 10 | 50 | 12 | 13 | 16 | 15 | 15 | 20 | 9 | 12 | (Significance level: 0.010000; Minimal displayed length: 6) (*00) 14(0,0,0); at 179- 192: AFKGLKSLEYLDLS (3. quartile) 00+00+00-00-00 Run count statistics: + runs >= 3: 0 - runs >= 3: 0 * runs >= 4: 0 0 runs >= 25: 0 -------------------------------------------------------------------------------- DISTRIBUTION OF OTHER AMINO ACID TYPES 1. HIGH SCORING SEGMENTS. There are no high scoring hydrophobic segments. 
____________________________________ High scoring transmembrane segments: 5.00 (LVIF) 2.00 (AGM) 0.00 (BZX) -1.00 (YCW) -2.00 (ST) -6.00 (P) -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED) Expected score/letter: -3.896 M_0.01= 57.11; M_0.05= 46.67; M_0.30= 34.25 1) From 1 to 15: length= 15, score=39.00 1 MSLSAFTLFL ALIGG L: 4(26.7%); A: 2(13.3%); G: 2(13.3%); S: 2(13.3%); F: 2(13.3%); 2. SPACINGS OF C. H2N-36-C-3-C-1-C-9-C-241-C-32-C-10-COOH 2*. SPACINGS OF C and H. (additional deluxe function for ALEX) H2N-36-C-3-C-1-C-9-C-26-H-18-H-22-H-2-H-20-H-18-H-2-H-69-H-56-C-12-H-19-C-10-COOH -------------------------------------------------------------------------------- REPETITIVE STRUCTURES. A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet. Repeat core block length: 4 Aligned matching blocks: [ 105- 108] SKIK [ 304- 307] SKIK ______________________________ [ 137- 142] KSLE__DL [ 184- 191] KSLEYLDL ______________________________ [ 185- 192] SLEYLDLS [ 255- 262] SLVELDLS B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet. (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C) Repeat core block length: 8 -------------------------------------------------------------------------------- MULTIPLETS. A. AMINO ACID ALPHABET. 1. Total number of amino acid multiplets: 14 (Expected range: 8-- 36) 2. Histogram of spacings between consecutive amino acid multiplets: (1-5) 2 (6-10) 4 (11-20) 2 (>=21) 7 3. Clusters of amino acid multiplets (cmin = 9/30 or 11/45 or 13/60): none B. CHARGE ALPHABET. 1. Total number of charge multiplets: 7 (Expected range: 0-- 15) 2 +plets (f+: 10.1%), 5 -plets (f-: 11.2%) Total number of charge altplets: 3 (Critical number: 18) 2. Histogram of spacings between consecutive charge multiplets: (1-5) 1 (6-10) 0 (11-20) 1 (>=21) 6 -------------------------------------------------------------------------------- PERIODICITY ANALYSIS. A. 
AMINO ACID ALPHABET (core: 4; !-core: 5) Location Period Element Copies Core Errors 121- 148 7 L...... 4 4 0 183- 214 8 L....... 4 4 0 219- 258 10 N......... 4 4 0 256- 295 10 L......... 4 4 0 B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6) and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9) Location Period Element Copies Core Errors 149- 166 3 i0. 6 6 /0/2/./ 231- 335 5 i.0.. 18 8 /3/./7/././ -------------------------------------------------------------------------------- SPACING ANALYSIS. Location (Quartile) Spacing Rank P-value Interpretation 0- 37 (1.) C( 37)C 2 of 7 0.9909 small 2. maximal spacing 19- 32 (1.) Q( 13)Q 11 of 11 0.0041 large minimal spacing 53- 295 (3.) C( 242)C 1 of 7 0.0033 large 1. maximal spacing ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Pfam (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/Pfam Sequence file: tem37 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- LRR Leucine Rich Repeat 128.1 1.7e-34 10 LRRNT Leucine rich repeat N-terminal domain 42.4 9.9e-09 1 crp Bacterial regulatory proteins, crp famil 2.5 40 1 E1_N E1 Protein, N terminal domain -0.5 87 1 lyase_1 Lyase -1.1 78 1 DUF41 Domain of unknown function DUF41 -73.7 46 1 PI3Ka Phosphoinositide 3-kinase family, access -106.9 94 1 DNA_ligase_N NAD-dependent DNA ligase -256.9 70 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- LRRNT 1/1 36 66 .. 1 31 [] 42.4 9.9e-09 LRR 1/10 67 90 .. 1 23 [] 17.6 0.3 LRR 2/10 91 116 .. 1 23 [] 15.0 1.8 LRR 3/10 117 136 .. 1 23 [] 7.0 96 lyase_1 1/1 138 156 .. 459 485 .] -1.1 78 LRR 4/10 138 159 .. 1 23 [] 18.3 0.19 LRR 5/10 160 184 .. 1 23 [] 15.0 1.8 DUF41 1/1 81 204 .. 1 247 [] -73.7 46 LRR 6/10 185 204 .. 1 23 [] 16.2 0.78 LRR 7/10 206 229 .. 1 23 [] 19.3 0.089 crp 1/1 234 243 .. 1 11 [. 2.5 40 DNA_ligase_N 1/1 83 247 .. 1 327 [] -256.9 70 E1_N 1/1 235 253 .. 142 161 .] -0.5 87 LRR 8/10 230 254 .. 1 23 [] 11.3 22 PI3Ka 1/1 99 274 .. 1 215 [] -106.9 94 LRR 9/10 255 277 .. 1 23 [] 15.4 1.3 LRR 10/10 305 330 .. 
1 23 [] 15.8 1 Alignments of top-scoring domains: LRRNT: domain 1 of 1, from 36 to 66: score 42.4, E = 9.9e-09 *->aCpreCtCsp.fglvVdCsgrgLtlevPrdlP<-* C++eC+C+++++++++C++++L+ +vP++ P tem37_gi|1 36 NCAPECNCPEsYPSAMYCDELKLK-SVPMVPP 66 LRR: domain 1 of 10, from 67 to 90: score 17.6, E = 0.3 *->nLeeLdLsnN.LtslppglfsnLp<-* +++L+L+nN++ ++++++f+n + tem37_gi|1 67 GIKYLYLRNNqIDHIDEKAFENVT 90 LRR: domain 2 of 10, from 91 to 116: score 15.0, E = 1.8 *->nLeeLdLsnN.Lt..slppglfsnLp<-* L++L L++N L+++++ +fs+L+ tem37_gi|1 91 DLQWLILDHNvLEnsKIKGRVFSKLK 116 LRR: domain 3 of 10, from 117 to 136: score 7.0, E = 96 *->nLeeLdLsnN.LtslppglfsnLp<-* +L++L++++N+Lt ++ Lp tem37_gi|1 117 QLKKLHINHNnLT--ES--VGPLP 136 lyase_1: domain 1 of 1, from 138 to 156: score -1.1, E = 78 *->alelgqlteeefdsivsPvfefarSve<-* +le++qlt+++++++ s e tem37_gi|1 138 SLEDLQLTHNKITKLGSF--------E 156 LRR: domain 4 of 10, from 138 to 159: score 18.3, E = 0.19 *->nLeeLdLsnN.LtslppglfsnLp<-* +Le L L +N++t+l + f++L tem37_gi|1 138 SLEDLQLTHNkITKLGS--FEGLV 159 LRR: domain 5 of 10, from 160 to 184: score 15.0, E = 1.8 *->nLeeLdLsnN.Lt.slppglfsnLp<-* nL+ ++L++N+L+++ + +f++L+ tem37_gi|1 160 NLTFIHLQHNrLKeDAVSAAFKGLK 184 DUF41: domain 1 of 1, from 81 to 204: score -73.7, E = 46 *->lteeQLlstFsNvkhliGslevqnTnfkslsFLanLesIecgirkrn ++e +F Nv+ l+ l + +++ + I+++ tem37_gi|1 81 IDE----KAFENVTDLQ-WLILDHNVLENSK-------IKGR----V 111 kdrvrkildnihdnpfswidnqnmlelgllnlTnmtrlgLpilsnldlnk + + ++ l++ h n+ tem37_gi|1 112 FSKLKQ-LKKLHINH----------------------------------- 125 LnlpnlknisnpnstgekiivnfenlhpdFClTteEllnfflnsnvsien +nl+ ++ p + ++++d +lT+ + ++ tem37_gi|1 126 ---NNLTESVGPLP----------KSLEDLQLTHNKITKLGSF------- 155 leakyCepksrifflikktdngivyklCnfkslsssvnLdngCtiIfGdL ++ ++ f+ l+ ++ d tem37_gi|1 156 -------------------EGL---VNLTFIHLQH--------NRLKEDA 175 vIgpgdEeyVskLknveviFGsLiIqNTnLtnidFLenLkyIasLedsvs v + + Lk++e+ +d ++ +Ia+L + ++ tem37_gi|1 176 VSAAFKG-----LKSLEY--------------LD--LSFNQIARLPSGLP 204 <-* tem37_gi|1 - - LRR: domain 6 of 10, from 
185 to 204: score 16.2, E = 0.78 *->nLeeLdLsnN.LtslppglfsnLp<-* +Le+LdLs N++ lp+ +Lp tem37_gi|1 185 SLEYLDLSFNqIARLPS----GLP 204 LRR: domain 7 of 10, from 206 to 229: score 19.3, E = 0.089 *->nLeeLdLsnN.LtslppglfsnLp<-* +L +L+L+nN+++++p++ f+ + tem37_gi|1 206 SLLTLYLDNNkISNIPDEYFKRFN 229 crp: domain 1 of 1, from 234 to 243: score 2.5, E = 40 *->lpmsLRqeIAd<-* l++s ++e+Ad tem37_gi|1 234 LRLS-HNELAD 243 DNA_ligase_N: domain 1 of 1, from 83 to 247: score -256.9, E = 70 *->eeaqqeieeLrelirkydyeYYvlDaPlVpDaeYDrLyrrLkaLEek e+a + +++L+ li ++ vl ++ + ++ +L ++Lk+L tem37_gi|1 83 EKAFENVTDLQWLILDHN----VLENSKIKGRVFSKL-KQLKKLHIN 124 fPELiTpDSPTQrVGGapllgdFkkvrHpaPMLSLDNAFsedeLrafieR + +L T+ VG + +++ +++L+ + tem37_gi|1 125 HNNL------TESVG-PLPKS-------------------LEDLQLTHNK 148 CCmirrrlgnsekvayvVEPKIDGlAvsLtYedGvLvrAaTRGDGttGED i++ lg+ e Gl v+Lt tem37_gi|1 149 ---ITK-LGSFE-----------GL-VNLT-------------------- 162 VTqNVkTIraIPlklpgdnivrppPerlEvRGEVfmpkedFeaLNeeree +++ +++L tem37_gi|1 163 -------------FIHLQ----------------------HNRLK----- 172 egekpFANPRNAAAGSLRQLDPkiTAkRkLrffvYglglveglelgpdTq + + A +k gl +e l+l tem37_gi|1 173 --------------------EDAVSAAFK------GLKSLEYLDLS---- 192 seaLkqLkkl..GFplVnphtrlck.....gideVldyyaewekkRdsLp q+++l++G+p V+ +t ++++ ++i + +y++ + + L tem37_gi|1 193 ---FNQIARLpsGLP-VSLLTLYLDnnkisNIPD--EYFKRFNA----LQ 232 yeIDGVVvKvnelplQreLGfTskaPRWAiAYKFpAe<-* y +++++ l + s++P tem37_gi|1 233 Y------LRLSHNELAD-----SGIP----------- 247 E1_N: domain 1 of 1, from 235 to 253: score -0.5, E = 87 *->RLFeelPEvpDSGy.GntevE<-* RL++ E+ DSG++Gn+ ++ tem37_gi|1 235 RLSHN--ELADSGIpGNSFNV 253 LRR: domain 8 of 10, from 230 to 254: score 11.3, E = 22 *->nLeeLdLsnN.Lt..slppglfsnLp<-* L++L+Ls+N+L ++++p ++f+ + tem37_gi|1 230 ALQYLRLSHNeLAdsGIPGNSFN-VS 254 PI3Ka: domain 1 of 1, from 99 to 274: score -106.9, E = 94 *->dkdlkpnlsskerkrleaIlayD....PlsaLtaeekdLiWkfRhyy +++l+ s+ + ++++++ ++ + + + Lt+ + L +++++ tem37_gi|1 99 HNVLEN--SKIKGRVFSKLKQLKklhiNHNNLTESVGPLPKSLEDLQ 143 
ltsnPkALtLmCVGSPKl.....LlSVkWsdlsevaealsLldkWvWqap lt n+ +t Kl++ ++L V+ + ++ l + a+ tem37_gi|1 144 LTHNK--IT-------KLgsfegL--VNLTFIHLQHNRLKEDAVS---AA 179 idpvdALELLdpkFadnheeVReYAVkcLesYasDdELlfYLLQLVQALK + ++LE+Ld +F + +L s +++ L +YL tem37_gi|1 180 FKGLKSLEYLDLSFNQ---------IARLPSGLPVSLLTLYL-------- 212 YEnldepfhdSpLsrFLlkR..AlkNrsrlGHfFfWyLksEiYKDdldhd + + ++S +kR +Al+ rl+H + +++ tem37_gi|1 213 -D----NNKISNIPDEYFKRfnALQYL-RLSH----NELADS------GI 246 eevkserFgvllEsylrectgtsledlnk<-* + s ++l E+ l++ +++ +n+ tem37_gi|1 247 PGN-SFNVSSLVELDLSYNKLKNIPTVNE 274 LRR: domain 9 of 10, from 255 to 277: score 15.4, E = 1.3 *->nLeeLdLsnN.LtslppglfsnLp<-* +L eLdLs+N+L+++p + +nL tem37_gi|1 255 SLVELDLSYNkLKNIPT-VNENLE 277 LRR: domain 10 of 10, from 305 to 330: score 15.8, E = 1 *->nLeeLdLsnN.Lt..slppglfsnLp<-* ++++L+L++N+++++slpp+ + L+ tem37_gi|1 305 KIKHLRLDGNrISetSLPPDMYECLR 330 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: tem37 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- LRR Leucine Rich Repeat 110.9 2.1e-29 11 LRRNT Leucine rich repeat N-terminal domain 40.5 1.7e-09 1 crp Bacterial regulatory proteins, crp family 2.5 40 1 K-box K-box region 1.4 44 1 SNAP-25 SNAP-25 family 0.2 45 1 E1_N E1 Protein, N terminal domain -0.5 87 1 lyase_1 Lyase -1.1 78 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- LRRNT 1/1 36 66 .. 1 31 [] 40.5 1.7e-09 LRR 1/11 67 90 .. 1 23 [] 15.6 0.017 SNAP-25 1/1 75 91 .. 199 215 .] 0.2 45 LRR 2/11 91 116 .. 1 23 [] 13.0 0.091 LRR 3/11 117 130 .. 1 13 [. 6.4 7 lyase_1 1/1 138 156 .. 459 485 .] -1.1 78 LRR 4/11 138 159 .. 1 23 [] 16.4 0.01 LRR 5/11 160 184 .. 1 23 [] 13.1 0.09 LRR 6/11 185 204 .. 1 23 [] 14.5 0.036 LRR 7/11 206 229 .. 1 23 [] 17.4 0.0054 crp 1/1 234 243 .. 1 11 [. 2.5 40 LRR 8/11 230 252 .. 1 20 [. 11.0 0.35 E1_N 1/1 235 253 .. 142 161 .] -0.5 87 LRR 9/11 255 270 .. 1 15 [. 12.6 0.12 K-box 1/1 261 288 .. 1 29 [. 1.4 44 LRR 10/11 275 298 .. 1 23 [] 1.4 1.7e+02 LRR 11/11 305 330 .. 
1 23 [] 13.9 0.053 Alignments of top-scoring domains: LRRNT: domain 1 of 1, from 36 to 66: score 40.5, E = 1.7e-09 *->aCpreCtCsp.fglvVdCsgrgLtlevPrdlP<-* C++eC+C+++++++++C++++L+ +vP++ P tem37_gi|1 36 NCAPECNCPEsYPSAMYCDELKLK-SVPMVPP 66 LRR: domain 1 of 11, from 67 to 90: score 15.6, E = 0.017 *->nLeeLdLsnN.LtslppglfsnLp<-* +++L+L+nN++ ++++++f+n + tem37_gi|1 67 GIKYLYLRNNqIDHIDEKAFENVT 90 SNAP-25: domain 1 of 1, from 75 to 91: score 0.2, E = 45 *->nrqidRIeeKadsndar<-* n qid I eKa +n + tem37_gi|1 75 NNQIDHIDEKAFENVTD 91 LRR: domain 2 of 11, from 91 to 116: score 13.0, E = 0.091 *->nLeeLdLsnN.Lt..slppglfsnLp<-* L++L L++N L+++++ +fs+L+ tem37_gi|1 91 DLQWLILDHNvLEnsKIKGRVFSKLK 116 LRR: domain 3 of 11, from 117 to 130: score 6.4, E = 7 *->nLeeLdLsnN.Lts<-* +L++L++++N+Lt+ tem37_gi|1 117 QLKKLHINHNnLTE 130 lyase_1: domain 1 of 1, from 138 to 156: score -1.1, E = 78 *->alelgqlteeefdsivsPvfefarSve<-* +le++qlt+++++++ s e tem37_gi|1 138 SLEDLQLTHNKITKLGSF--------E 156 LRR: domain 4 of 11, from 138 to 159: score 16.4, E = 0.01 *->nLeeLdLsnN.LtslppglfsnLp<-* +Le L L +N++t+l + f++L tem37_gi|1 138 SLEDLQLTHNkITKLGS--FEGLV 159 LRR: domain 5 of 11, from 160 to 184: score 13.1, E = 0.09 *->nLeeLdLsnN.Lt.slppglfsnLp<-* nL+ ++L++N+L+++ + +f++L+ tem37_gi|1 160 NLTFIHLQHNrLKeDAVSAAFKGLK 184 LRR: domain 6 of 11, from 185 to 204: score 14.5, E = 0.036 *->nLeeLdLsnN.LtslppglfsnLp<-* +Le+LdLs N++ lp+ +Lp tem37_gi|1 185 SLEYLDLSFNqIARLPS----GLP 204 LRR: domain 7 of 11, from 206 to 229: score 17.4, E = 0.0054 *->nLeeLdLsnN.LtslppglfsnLp<-* +L +L+L+nN+++++p++ f+ + tem37_gi|1 206 SLLTLYLDNNkISNIPDEYFKRFN 229 crp: domain 1 of 1, from 234 to 243: score 2.5, E = 40 *->lpmsLRqeIAd<-* l++s ++e+Ad tem37_gi|1 234 LRLS-HNELAD 243 LRR: domain 8 of 11, from 230 to 252: score 11.0, E = 0.35 *->nLeeLdLsnN.Lt..slppglfs<-* L++L+Ls+N+L ++++p ++f+ tem37_gi|1 230 ALQYLRLSHNeLAdsGIPGNSFN 252 E1_N: domain 1 of 1, from 235 to 253: score -0.5, E = 87 *->RLFeelPEvpDSGy.GntevE<-* RL++ E+ DSG++Gn+ ++ tem37_gi|1 235 
RLSHN--ELADSGIpGNSFNV 253 LRR: domain 9 of 11, from 255 to 270: score 12.6, E = 0.12 *->nLeeLdLsnN.Ltslp<-* +L eLdLs+N+L+++p tem37_gi|1 255 SLVELDLSYNkLKNIP 270 K-box: domain 1 of 1, from 261 to 288: score 1.4, E = 44 *->dsyqkssgnsslwesnyqnwqqEaaKLka<-* sy+k+ +++ ++n +n++ E++ L + tem37_gi|1 261 LSYNKLKNIP-TVNENLENYYLEVNQLEK 288 LRR: domain 10 of 11, from 275 to 298: score 1.4, E = 1.7e+02 *->nLeeLdLsnN.LtslppglfsnLp<-* nLe +L+ N+L++++ ++f++ tem37_gi|1 275 NLENYYLEVNqLEKFDIKSFCKIL 298 LRR: domain 11 of 11, from 305 to 330: score 13.9, E = 0.053 *->nLeeLdLsnN.Lt..slppglfsnLp<-* ++++L+L++N+++++slpp+ + L+ tem37_gi|1 305 KIKHLRLDGNrISetSLPPDMYECLR 330 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib Sequence file: tem37 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- LRR-ma 73.0 6.2e-18 10 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- LRR-ma 1/10 65 78 .. 1 14 [] 10.4 31 LRR-ma 2/10 89 102 .. 1 14 [] 4.1 4.1e+02 LRR-ma 3/10 115 128 .. 1 14 [] 12.4 11 LRR-ma 4/10 136 149 .. 1 14 [] 9.2 52 LRR-ma 5/10 158 171 .. 1 14 [] 11.5 20 LRR-ma 6/10 183 196 .. 1 14 [] 16.6 0.61 LRR-ma 7/10 204 217 .. 1 14 [] 8.0 85 LRR-ma 8/10 228 241 .. 1 14 [] 11.2 23 LRR-ma 9/10 253 266 .. 1 14 [] 12.9 7.8 LRR-ma 10/10 303 316 .. 
1 14 [] 4.4 3.6e+02 Alignments of top-scoring domains: LRR-ma: domain 1 of 10, from 65 to 78: score 10.4, E = 31 *->lpsLeeLdLsnNrl<-* +p +++L+L+nN++ tem37_gi|1 65 PPGIKYLYLRNNQI 78 LRR-ma: domain 2 of 10, from 89 to 102: score 4.1, E = 4.1e+02 *->lpsLeeLdLsnNrl<-* + L+ L L +N l tem37_gi|1 89 VTDLQWLILDHNVL 102 LRR-ma: domain 3 of 10, from 115 to 128: score 12.4, E = 11 *->lpsLeeLdLsnNrl<-* l++L++L++++N+l tem37_gi|1 115 LKQLKKLHINHNNL 128 LRR-ma: domain 4 of 10, from 136 to 149: score 9.2, E = 52 *->lpsLeeLdLsnNrl<-* ++sLe L L +N++ tem37_gi|1 136 PKSLEDLQLTHNKI 149 LRR-ma: domain 5 of 10, from 158 to 171: score 11.5, E = 20 *->lpsLeeLdLsnNrl<-* l +L+ ++L++Nrl tem37_gi|1 158 LVNLTFIHLQHNRL 171 LRR-ma: domain 6 of 10, from 183 to 196: score 16.6, E = 0.61 *->lpsLeeLdLsnNrl<-* l+sLe+LdLs N++ tem37_gi|1 183 LKSLEYLDLSFNQI 196 LRR-ma: domain 7 of 10, from 204 to 217: score 8.0, E = 85 *->lpsLeeLdLsnNrl<-* + sL +L+L nN++ tem37_gi|1 204 PVSLLTLYLDNNKI 217 LRR-ma: domain 8 of 10, from 228 to 241: score 11.2, E = 23 *->lpsLeeLdLsnNrl<-* + L++L+Ls+N+l tem37_gi|1 228 FNALQYLRLSHNEL 241 LRR-ma: domain 9 of 10, from 253 to 266: score 12.9, E = 7.8 *->lpsLeeLdLsnNrl<-* +sL eLdLs+N+l tem37_gi|1 253 VSSLVELDLSYNKL 266 LRR-ma: domain 10 of 10, from 303 to 316: score 4.4, E = 3.6e+02 *->lpsLeeLdLsnNrl<-* +++++L+L +Nr+ tem37_gi|1 303 YSKIKHLRLDGNRI 316 // ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Prosite --------------------------------------------------------- | ppsearch (c) 1994 EMBL Data Library | | based on MacPattern (c) 1990-1994 R. 
Fuchs |
---------------------------------------------------------

PROSITE pattern search started: Tue Oct 31 18:28:58 2000
Sequence file: tem37
----------------------------------------
Sequence tem37_gi|1708878|sp|P51884|LUM_HUMAN (338 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
   88: NVTD
  127: NLTE
  160: NLTF
  252: NVSS
Total matches: 4

Matching pattern PS00006 CK2_PHOSPHO_SITE:
  138: SLED
  237: SHNE
  255: SLVE
  271: TVNE
Total matches: 4

Matching pattern PS00008 MYRISTYL:
   14: GGTSGQ
  153: GSFEGL
  202: GLPVSL
  245: GIPGNS
  248: GNSFNV
  313: GNRISE
Total matches: 6

Matching pattern PS00029 LEUCINE_ZIPPER:
  121: LHINHNNLTESVGPLPKSLEDL
Total matches: 1

Total no of hits in this sequence: 15
========================================
1314 pattern(s) searched in 1 sequence(s), 338 residues.
Total no of hits in all sequences: 15.
Search time: 00:00 min

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library

***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P
argv[2]=-m /data/patterns/own/motif.fa
argv[4]=-seq tem37
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 338 units

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with HMM-search search against own library

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm.lib
Sequence file: tem37
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: tem37
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN)

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
L. Aravind's signalling DB

IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F.
Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) (338 letters) Searching..................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value LRR Leucine rich repeats 112 5e-28 PDE cyclic NMP phosphodiesterase domain 25 0.15 RASGAP RAS-type GTPase GTP hydrolysis activating protein 22 1.6 UB Ubiquitin domain 21 2.3 HISDAC Histone deacetylase domain 21 2.7 RASGEF RAS-type GTPase GDP exchange factor 21 3.2 BZIP Basic Zipper domain (A DNA binding domain) 21 3.4 MBL Metallo-betalactamase domain 20 3.7 SH3 Src Homology domain 3 20 4.1 UBHYD Ubiquitin C-terminal hydrolase domain 20 6.0 HECT A ubiquitin conjugating enzyme domain 19 7.7 SH2 Src Homology domain 2 19 8.3 KR Kringle domain (Adhesion module) 19 9.4 >LRR Leucine rich repeats Length = 339 Score = 112 bits (279), Expect = 5e-28 Identities = 71/270 (26%), Positives = 116/270 (42%), Gaps = 14/270 (5%) Query: 66 PGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINH 125 G L L N I+ I N+ DL +L L +N L+ + R S L+ LK + Sbjct: 56 KGSIVLNLSYNNIETIPNSVCANLIDLLFLDLSNNKLDMLPPQIRRLSMLQSLKLSNNPL 115 Query: 126 NNLTESVGPLPKSLEDLQLTHNKITK---LGSFEGLVNLTFIHLQHNRLKEDAVSAAFKG 182 N+ P SL L +++ T + + + NL + N L V A Sbjct: 116 NHFQLKQLPSMTSLSVLHMSNTNRTLDNIPPTLDDMHNLRDVDFSENNLPI--VPEALFK 173 Query: 183 LKSLEYLDLSFNQIARLP--SGLPVSLLTLYLDNNKISNIPDEYFKRFNALQYLRLSHNE 240 L++L L+LS N+I +L G +L TL + +N+++ +PD K L L ++N+ Sbjct: 174 LRNLRKLNLSGNKIEKLNMTEGEWENLETLNMSHNQLTVLPDCVVK-LTRLTKLYAANNQ 232 Query: 241 LADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGP 300 L GIP + L L LSYNKL+ +P L+ +L+ + + + + Sbjct: 233 LTFEGIPSGIGKLIQLTVLHLSYNKLELVPEGISRCVK--LQKLKLDHNRLITLPEGIHL 290 Query: 301 LSYSKIKHLRLDGNRISETSLPPDMYECLR 330 L +K L L N +PP + + Sbjct: 291 
LPD--LKVLDLHENEN--LVMPPKPNDARK 316 Score = 42.9 bits (100), Expect = 5e-07 Identities = 15/52 (28%), Positives = 24/52 (45%) Query: 153 GSFEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLP 204 G L L + ++ N LK + +K L +DLS NQ+ +P+ L Sbjct: 2 GELSDLPRLRSVIVRDNNLKTAGIPTDIFRMKDLTIIDLSRNQLREVPTNLE 53 Score = 37.8 bits (87), Expect = 2e-05 Identities = 16/46 (34%), Positives = 26/46 (55%) Query: 231 LQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENL 276 L+ + + N L +GIP + F + L +DLS N+L+ +PT E Sbjct: 10 LRSVIVRDNNLKTAGIPTDIFRMKDLTIIDLSRNQLREVPTNLEYA 55 Score = 28.9 bits (64), Expect = 0.011 Identities = 14/58 (24%), Positives = 29/58 (49%), Gaps = 5/58 (8%) Query: 85 AFENVTDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTESVGPLPKSLEDL 142 ++ L+ +I+ N L+ + I +F ++K L + ++ N L E +P +LE Sbjct: 3 ELSDLPRLRSVIVRDNNLKTAGIPTDIF-RMKDLTIIDLSRNQLRE----VPTNLEYA 55 Score = 20.3 bits (42), Expect = 3.7 Identities = 4/18 (22%), Positives = 8/18 (44%) Query: 206 SLLTLYLDNNKISNIPDE 223 L + L N++ +P Sbjct: 34 DLTIIDLSRNQLREVPTN 51 >PDE cyclic NMP phosphodiesterase domain Length = 350 Score = 25.1 bits (54), Expect = 0.15 Identities = 35/124 (28%), Positives = 49/124 (39%), Gaps = 27/124 (21%) Query: 205 VSLLTLYLDNNKISNIPD---------EYFKRFNALQYLR-------LSHNELADSGIPG 248 +++ LY NN+ N + FNA QYL L D PG Sbjct: 54 MTVNALYRKNNRYHNFTHAFDVTQTVYTFLTSFNAAQYLTHLDIFALLISCMCHDLNHPG 113 Query: 249 --NSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKI 306 N+F V++ EL L YN + LEN++ + K S C IL L+ + Sbjct: 114 FNNTFQVNAQTELSLEYNDI-------SVLENHHAMLT--FKILRNSECNILEGLNEDQY 164 Query: 307 KHLR 310 K LR Sbjct: 165 KELR 168 >RASGAP RAS-type GTPase GTP hydrolysis activating protein Length = 292 Score = 21.6 bits (45), Expect = 1.6 Identities = 15/73 (20%), Positives = 34/73 (46%), Gaps = 3/73 (4%) Query: 77 QIDHI-DEKAFENVTDLQWLILDHNVLENS--KIKGRVFSKLKQLKKLHINHNNLTESVG 133 DH+ ++++ +L +D + S I G + S ++ + + TE + Sbjct: 14 TADHVFPLATYDDLMNLLLESVDQRPITVSAVSILGELVSGKTEVAQPLVRLFTHTERIA 73 Query: 134 PLPKSLEDLQLTH 146 P+ K+L D +++H 
Sbjct: 74 PIIKALADHEISH 86 Score = 20.9 bits (43), Expect = 2.9 Identities = 20/70 (28%), Positives = 32/70 (45%), Gaps = 8/70 (11%) Query: 209 TLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIPGNSFNVSSLVELDLSYNKLKN 268 T++ N +S + DE R + L YL H L P S V+ ++ +K+K+ Sbjct: 92 TIFRGNTLVSKMMDEAM-RLSGLHYL---HQTLR----PVLSQIVAEKKPCEIDPSKIKD 143 Query: 269 IPTVNENLEN 278 V+ NL N Sbjct: 144 RSAVDTNLHN 153 >UB Ubiquitin domain Length = 128 Score = 20.9 bits (44), Expect = 2.3 Identities = 14/79 (17%), Positives = 28/79 (34%), Gaps = 5/79 (6%) Query: 74 RNNQIDHIDEKAFENV---TDLQWLILDHNVLENSKIKGRVFSKLKQLKKLHINHNNLTE 130 ++ ID++ K + D Q LI LE+ + +++ LH+ Sbjct: 19 SSDTIDNVKSKIQDKEGIPPDQQRLIFAGKQLEDGRTLS--DYNIQKESTLHLVLRLRGG 76 Query: 131 SVGPLPKSLEDLQLTHNKI 149 + P K+L + Sbjct: 77 IIEPSLKALASKYNCDKSV 95 >HISDAC Histone deacetylase domain Length = 433 Score = 21.0 bits (44), Expect = 2.7 Identities = 12/99 (12%), Positives = 31/99 (31%), Gaps = 7/99 (7%) Query: 243 DSGIPGNSFNVSSLVELDLSYNKLKNIPTVNENLENYYLEVNQLEKF---DIKSFCKILG 299 +G P + L ++Y K + ++ +F + F + Sbjct: 35 GAGHPMKPHRIRMAHSLIMNYGLYKKMEIYRAKPAT----KQEMCQFHTDEYIDFLSRVT 90 Query: 300 PLSYSKIKHLRLDGNRISETSLPPDMYECLRVANEVTLN 338 P + K + N + + +YE ++ ++ Sbjct: 91 PDNLEMFKRESVKFNVGDDCPVFDGLYEYCSISGGGSME 129 >RASGEF RAS-type GTPase GDP exchange factor Length = 196 Score = 20.5 bits (43), Expect = 3.2 Identities = 14/77 (18%), Positives = 29/77 (37%), Gaps = 7/77 (9%) Query: 103 ENSKIKGRVFSK----LKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITK--LGSFE 156 ++SK+K V + + N N L E + L S+ I L ++E Sbjct: 79 KSSKMKRNVIQRFIHVADHCRTFQ-NFNTLMEIILALSSSVVKFTDAWRLIEPGDLLTWE 137 Query: 157 GLVNLTFIHLQHNRLKE 173 L + + ++ ++ Sbjct: 138 ELKKIPSLDRNYSTIRN 154 >BZIP Basic Zipper domain (A DNA binding domain) Length = 89 Score = 20.6 bits (42), Expect = 3.4 Identities = 11/51 (21%), Positives = 25/51 (48%) Query: 49 SAMYCDELKLKSVPMVPPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDH 99 +A C + KL+ + + +K L +N+++ E V L+ +++H Sbjct: 23 AASKCRKRKLERIARLEEKVKTLKAQNSELASTANMLREQVAQLKQKVMNH 73 >MBL 
Metallo-betalactamase domain Length = 256 Score = 20.5 bits (42), Expect = 3.7 Identities = 9/106 (8%), Positives = 26/106 (24%), Gaps = 12/106 (11%) Query: 65 PPGIKYLYLRNNQIDHIDEKAFENVTDLQWLILDHNVLENSKIKG--RVFSKLKQLKKLH 122 I +Y+ + DH+ + + + + + + + L + Sbjct: 54 HRDITDIYVSHLHSDHVG-----GLEYVGFSTMFDPNCGKPNLYLSQDIAADLWERSLAG 108 Query: 123 INHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQH 168 + + + L + + I L H Sbjct: 109 GMEAIEGG-MTEVDSYFQIHALGPGETFTWENVN----FQLIKLNH 149 >SH3 Src Homology domain 3 Length = 90 Score = 20.1 bits (42), Expect = 4.1 Identities = 7/16 (43%), Positives = 9/16 (55%) Query: 57 KLKSVPMVPPGIKYLY 72 +K +P PP K LY Sbjct: 70 IIKPLPQPPPQCKALY 85 >UBHYD Ubiquitin C-terminal hydrolase domain Length = 884 Score = 19.6 bits (40), Expect = 6.0 Identities = 10/70 (14%), Positives = 23/70 (32%), Gaps = 5/70 (7%) Query: 266 LKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKI-----KHLRLDGNRISETS 320 L + + + E + K + ++ L +S K + G ++ Sbjct: 14 LFFTNQLRKAVYMMPTEGDDSSKSVPLALQRVFYELQHSDKPVGTKKLTKSFGWETLDSF 73 Query: 321 LPPDMYECLR 330 + D+ E R Sbjct: 74 MQHDVQELCR 83 >HECT A ubiquitin conjugating enzyme domain Length = 255 Score = 19.3 bits (40), Expect = 7.7 Identities = 17/121 (14%), Positives = 35/121 (28%), Gaps = 15/121 (12%) Query: 209 TLYLDNNKISNIPDEYFKRFNALQYLRLSHNELADSGIP--------GNSFNVSSLVELD 260 + + + FK L + + + D + + L E+D Sbjct: 65 RFLFNPSACLDEHLMQFKFLGILMGVAIRTKKPLDLHLAPLVWKQLCCVPLTLEDLEEVD 124 Query: 261 LS-YNKLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISET 319 L L +I L + + ++ +G + K+ + GN I T Sbjct: 125 LLYVQTLNSI------LHIEDSGITEESFHEMIPLDSFVGQSADGKMVPIIPGGNSIPLT 178 Query: 320 S 320 Sbjct: 179 F 179 >SH2 Src Homology domain 2 Length = 119 Score = 19.0 bits (39), Expect = 8.3 Identities = 9/61 (14%), Positives = 20/61 (32%), Gaps = 7/61 (11%) Query: 60 SVPMVPPGIKYLYLRNNQIDH---IDEKAFENVTDLQWLILDHNVLENSKIKGRVFSKLK 116 PM IK+ +R+ F ++D+ + H ++ + R+ Sbjct: 50 YDPMHGDVIKHYKIRSLDNGGYYISPRITFPCISDM----IKHYQKQSDGLCRRLEKACI 105 Query: 117 Q 117 Sbjct: 106 S 106 
>KR Kringle domain (Adhesion module) Length = 86 Score = 19.1 bits (39), Expect = 9.4 Identities = 3/30 (10%), Positives = 8/30 (26%) Query: 75 NNQIDHIDEKAFENVTDLQWLILDHNVLEN 104 N+ + + + + H N Sbjct: 28 NSDLLYQELHVDSVGAAALLGLGPHAYCRN 57 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 13 Number of calls to ALIGN: 18 Length of query: 338 Total length of test sequences: 20182 Effective length of test sequences: 16536.0 Effective search space size: 5016881.8 Initial X dropoff for ALIGN: 25.0 bits Y. Wolf's SCOP PSSM IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= tem37_gi|1708878|sp|P51884|LUM_HUMAN LUMICAN PRECURSOR (LUM) (KERATAN SULFATE PROTEOGLYCAN) (338 letters) Searching.................................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value gi|1082610 [223..548] Leucine-rich repeats 103 6e-24 gi|2623618 [35..332] Leucine-rich repeats 91 2e-20 gi|1418519 [191..423] Leucine-rich repeats 88 2e-19 gi|1881738 [246..599] Leucine-rich repeats 80 5e-17 gi|132575 [1..456] Leucine-rich repeats 70 6e-14 gi|730152 [47..205] Cupredoxins 25 2.2 gi|687687 [74..361] Heme-linked catalases 25 2.2 gi|1173145 [248..328] the C-terminal domain of RNA polymeras... 
25 2.9 gi|115682 [1..213] CoA-dependent acetyltransferases 23 5.6 gi|1788027 [59..564] Heme-linked catalases 23 7.2 gi|1742164 [73..325] Periplasmic binding protein-like I 23 8.8 >gi|1082610 [223..548] Leucine-rich repeats Length = 326 Score = 103 bits (254), Expect = 6e-24 Identities = 40/304 (13%), Positives = 79/304 (25%), Gaps = 41/304 (13%) Query: 57 KLKSVPMVPPGIKYLYLRNNQIDHIDEKAFEN--------VTDLQWLILDHNVLENSKIK 108 L + ++ L L N + + ++ L+ L L ++ Sbjct: 9 TLCHLLSSWVSLESLTLSYNGLGSNIFRLLDSLRALSGQAGCRLRALHLSDLFSPLPILE 68 Query: 109 G--RVFSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHL 166 + L L+ L I ++ ++ P ++ L L Sbjct: 69 LTRAIVRALPLLRVLSIRVDHPSQRDNPGVPGNAGPPSHIIGDEEI-PENCLEQLEMGFP 127 Query: 167 QHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVSLLTLYLDNNKISNIPDEYFK 226 + + + + K SL+ L L + + Sbjct: 128 RGAQPAPL-LCSVLKASGSLQQLSLDSATF---------------ASPQDFGLVLQTLKE 171 Query: 227 RFNALQYLRLSHNELAD-SGIPGNSFNVSSLVELDLSYNKLKNIPTV------------N 273 AL+ L LAD +L E+ S+ +L N Sbjct: 172 YNLALKRLSFHDMNLADCQSEVLFLLQNLTLQEITFSFCRLFEKRPAQFLPEMVAAMKGN 231 Query: 274 ENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLRVAN 333 L+ L N+L + + + S S + L + N I + + L Sbjct: 232 STLKGLRLPGNRLGNAGLLALADVFSEDSSSSLCQLDISSNCIKPDG-LLEFAKRLERWG 290 Query: 334 EVTL 337 Sbjct: 291 RGAF 294 >gi|2623618 [35..332] Leucine-rich repeats Length = 298 Score = 91.1 bits (223), Expect = 2e-20 Identities = 39/287 (13%), Positives = 84/287 (28%), Gaps = 39/287 (13%) Query: 66 PGIKYLYLRNNQIDHIDE----KAFENVTDLQWLILDHNVLEN---------SKIKGRVF 112 GI+ L L N I KA E+ Q + + Sbjct: 14 EGIQSLKLNGNTIGVEAAQALAKALESKPQFQRARWSDMFTGRLRSEIPPALMSLGAGIM 73 Query: 113 SKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRLK 172 + L ++ ++ N + L + ++ + GL + + Sbjct: 74 TAGAHLVEIDLSDNAFGPDGVKAVRELLESSSCYSLREMRFNNNGLGIGGKLMAEALITC 133 Query: 173 EDAVSAAFKGLKSLEYLDLSFNQIARLPS-------GLPVSLLTLYLDNNKISNIPD--- 222 S +L+ N++ + + +L + L N I+ Sbjct: 134 H-EKSTKAGKPLALKVFIAGRNRLENPGATVLAKAFKIIGTLEEIALPQNGINYEGITAL 192 Query: 223 
-EYFKRFNALQYLRLSHNELADSG---IPGNSFNVSSLVELDLSYNKLKNIPTV------ 272 E + + L+ L L+ N G + N+S L ++ +++ Sbjct: 193 AEAVEYSHNLKILNLNDNTFTARGAKPMAKAIKNLSKLEVINFGDCLVRSEGADAIANSL 252 Query: 273 ---NENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRI 316 +L+ L +++K + + + L L+GN I Sbjct: 253 REGVPSLKELNLAFGEIKKEAAVRVAESMDTK--PHLTLLDLNGNNI 297 >gi|1418519 [191..423] Leucine-rich repeats Length = 233 Score = 88.2 bits (216), Expect = 2e-19 Identities = 33/248 (13%), Positives = 74/248 (29%), Gaps = 46/248 (18%) Query: 112 FSKLKQLKKLHINHNNLTESVGPLPKSLEDLQLTHNKITKLGSFEGLVNLTFIHLQHNRL 171 L+ L++ + NL + + + K+ + +LT +HL++ ++ Sbjct: 2 IRHAVSLQMLNLRYTNLND-------------RSIPALCKMARAQPSASLTCLHLENTQM 48 Query: 172 K---EDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPV-------SLLTLYLDNNKISNIP 221 + A K L L L N + SL L L NN I + Sbjct: 49 SGKNLLVLICALKNNTGLRELYLGDNGLQPTDGSHIYQLITSNSSLQLLDLRNNSIGDSG 108 Query: 222 DE---------YFKRFNALQYLRLSHNELADSG---IPGNSFNVSSLVELDLSYNKLKNI 269 ++L + L +N + + + + + L++ N L Sbjct: 109 VRHICDGLRHREAVEKSSLSAMVLWNNNVTGASMDSLAEALIENTKIETLNIGNNNLGVE 168 Query: 270 PTV--------NENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSL 321 N +L L+ + + + + + + N I+ Sbjct: 169 GIARLKPALASNSHLHRLGLQNTGINCEGAIILAECIADN--IALLRVDIRDNPIALA-G 225 Query: 322 PPDMYECL 329 ++ + Sbjct: 226 LLALHSAM 233 Score = 24.7 bits (53), Expect = 2.4 Identities = 9/47 (19%), Positives = 16/47 (33%), Gaps = 4/47 (8%) Query: 66 PGIKYLYLRNNQIDHIDEK----AFENVTDLQWLILDHNVLENSKIK 108 + + L NN + A T ++ L + +N L I Sbjct: 125 SSLSAMVLWNNNVTGASMDSLAEALIENTKIETLNIGNNNLGVEGIA 171 Score = 22.8 bits (48), Expect = 8.1 Identities = 11/39 (28%), Positives = 17/39 (43%), Gaps = 4/39 (10%) Query: 66 PGIKYLYLRNNQI----DHIDEKAFENVTDLQWLILDHN 100 + L+L N Q+ + A +N T L+ L L N Sbjct: 36 ASLTCLHLENTQMSGKNLLVLICALKNNTGLRELYLGDN 74 >gi|1881738 [246..599] Leucine-rich repeats Length = 354 Score = 80.2 bits (195), Expect = 5e-17 Identities = 49/328 (14%), Positives = 89/328 (26%), Gaps = 56/328 (17%) Query: 49 
SAMYCDELKLKSVPMVPPGIKYLYLRNNQIDH-----IDEKAFENVTDLQWLILDHNVLE 103 +A +EL + + NN I + L + L Sbjct: 21 TASAFEELGQAIAKNRNSALTSIDWSNNLIKDAGVAALAAAVASMGHGLTSISLKGGDAT 80 Query: 104 N------SKIKGRVFSKLKQLKKLHINHNNLTESVGPL-------PKSLEDLQLTHNKIT 150 + + L L++ N L P +L+ L ++ Sbjct: 81 KKGTVALCTAFKKNVEMSRTLTVLNLAGNRLDSDGTSALAAFVSGPNALQTLNISGTAAN 140 Query: 151 KLGS----FEGLVNLTFIHLQHNRLKEDAVSAAFKGLKSLEYLDLSFNQIARLPSGLPVS 206 G L ++ HN++ K L+S L + +P + Sbjct: 141 LEMLLPAVMRGCTELEKFNISHNKVTAKTGPELKKFLQSCGRLSELHMRDTAVPVQVVRD 200 Query: 207 LLTL----------------YLDNNKISNIPDEYFKRFNALQYLRLSHNELADSG---IP 247 ++ +N+ ++ L L+ N+ D G I Sbjct: 201 VIKAIIGNNFITDFQLDLAANKLGVLGANMLAGLAAEITTIKSLDLTDNDFGDEGMSIIA 260 Query: 248 GNSFNVSSLVELDLSYNKLKNIPTV----NENLENYYLEVNQLEKFDIK------SFCKI 297 + SSL EL L N +N +NL L K D+ Sbjct: 261 DGLCHNSSLRELHLGDNWTRNKTKARSQAVDNLIELISSECPLHKLDLSCKVADNQIKTD 320 Query: 298 LGPL-----SYSKIKHLRLDGNRISETS 320 + P + +K L + GN + + Sbjct: 321 ILPFIYSLATNDTLKELDISGNAMGDKV 348 >gi|132575 [1..456] Leucine-rich repeats Length = 456 Score = 69.8 bits (168), Expect = 6e-14 Identities = 54/312 (17%), Positives = 92/312 (29%), Gaps = 49/312 (15%) Query: 71 LYLRNNQI-DHIDEKAFENVTDLQWLILDHNVLENSKIKG--RVFSKLKQLKKLHINHNN 127 L ++ Q+ D + + Q + LD L + K L +L + N Sbjct: 3 LDIQCEQLSDARWTELLPLIQQYQVVRLDDCGLTEVRCKDIRSAIQANPALTELSLRTNE 62 Query: 128 LT--------ESVGPLPKSLEDLQLTHNKITKLG------SFEGLVNLTFIHLQHNRLK- 172 L + + ++ L L + +T+ G L L +HL N L Sbjct: 63 LGDAGVGLVLQGLQNPTCKIQKLSLQNCSLTEAGCGVLPDVLRSLSTLRELHLNDNPLGD 122 Query: 173 ---EDAVSAAFKGLKSLEYLDLSFNQIARLPS-------GLPVSLLTLYLDNNKISNIPD 222 + LE L L + + + L L NN Sbjct: 123 EGLKLLCEGLRDPQCRLEKLQLEYCNLTATSCEPLASVLRVKPDFKELVLSNNDFHEAGI 182 Query: 223 E-----YFKRFNALQYLRLSHNELADSG---IPGNSFNVSSLVELDLSYNKLKNIPTVN- 273 L+ L+L + + + + + +SL ELDL NKL N Sbjct: 183 HTLCQGLKDSACQLESLKLENCGITSANCKDLCDVVASKASLQELDLGSNKLGNTGIAAL 242 Query: 274 --------ENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDM 325 L +L + K C++L +K L L GN + + + Sbjct: 243 
CSGLLLPSCRLRTLWLWDCDVTAEGCKDLCRVLRAK--QSLKELSLAGNELKD--EGAQL 298 Query: 326 YECLRVANEVTL 337 + L Sbjct: 299 LCESLLEPGCQL 310 Score = 50.3 bits (118), Expect = 5e-08 Identities = 23/146 (15%), Positives = 41/146 (27%), Gaps = 20/146 (13%) Query: 93 QWLILDHNVLENSKIK--GRVFSKLKQLKKLHINHNNLTESVGP--------LPKSLEDL 142 + L + L + V +K L +L ++ N L +S L L Sbjct: 311 ESLWVKTCSLTAASCPHFCSVLTKNSSLFELQMSSNPLGDSGVVELCKALGYPDTVLRVL 370 Query: 143 QLTHNKITKLG------SFEGLVNLTFIHLQHNRLK----EDAVSAAFKGLKSLEYLDLS 192 L +T G +L + L +N + + + + L+ L L Sbjct: 371 WLGDCDVTDSGCSSLATVLLANRSLRELDLSNNCMGDNGVLQLLESLKQPSCILQQLVLY 430 Query: 193 FNQIARLPSGLPVSLLTLYLDNNKIS 218 +L IS Sbjct: 431 DIYWTDEVEDQLRALEEERPSLRIIS 456 Score = 41.4 bits (95), Expect = 2e-05 Identities = 21/120 (17%), Positives = 36/120 (29%), Gaps = 15/120 (12%) Query: 223 EYFKRFNALQYLRLSHNELADSG----IPGNSFNVSSLVELDLSYNKLKNIPTV------ 272 + ++L L++S N L DSG + + L L L + + Sbjct: 330 SVLTKNSSLFELQMSSNPLGDSGVVELCKALGYPDTVLRVLWLGDCDVTDSGCSSLATVL 389 Query: 273 --NENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDGNRISETSLPPDMYECLR 330 N +L L N + + + L ++ L L T D L Sbjct: 390 LANRSLRELDLSNNCMGDNGVLQLLESL-KQPSCILQQLVLYDIYW--TDEVEDQLRALE 446 >gi|730152 [47..205] Cupredoxins Length = 159 Score = 24.9 bits (54), Expect = 2.2 Identities = 9/36 (25%), Positives = 11/36 (30%) Query: 17 SGQYYDYDFPPSIYGQSSPNCAPECNCPESYPSAMY 52 G F + G +CAP P S M Sbjct: 109 PGDTAVLRFKATKAGVFVYHCAPAGMVPWHVTSGMN 144 >gi|687687 [74..361] Heme-linked catalases Length = 288 Score = 24.8 bits (54), Expect = 2.2 Identities = 1/17 (5%), Positives = 3/17 (16%) Query: 271 TVNENLENYYLEVNQLE 287 T+ + Sbjct: 240 TLTAASPQPGAACEGIN 256 >gi|1173145 [248..328] the C-terminal domain of RNA polymerase alpha subunit Length = 81 Score = 24.5 bits (53), Expect = 2.9 Identities = 11/49 (22%), Positives = 22/49 (44%), Gaps = 5/49 (10%) Query: 258 ELDLS---YNKLK--NIPTVNENLENYYLEVNQLEKFDIKSFCKILGPL 301 +L+L+ N LK I + + ++ +E+ + KS +I L Sbjct: 11 DLELTVRSANCLKAETIHYIGDLVQRTEVELLKTPNLGKKSLTEIKDVL 59 
>gi|115682 [1..213] CoA-dependent acetyltransferases Length = 213 Score = 23.3 bits (50), Expect = 5.6 Identities = 22/109 (20%), Positives = 39/109 (35%), Gaps = 15/109 (13%) Query: 213 DNNKISNIPDEYFKRFNALQYLRLS--------HNELADSGIPGNSFNVSSLVELDLSYN 264 + S + YF + + +P N N+SSL + Sbjct: 93 ETETFSALSCRYFPDLSEFMAGYNAVTAEYQHDTRLFPQGNLPENHLNISSLPWVSFDGF 152 Query: 265 KLKNIPTVNENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDG 313 L NI ++Y+ V + KF + ++L P+S ++ H DG Sbjct: 153 NL-NI----TGNDDYFAPVFTMAKF-QQEGDRVLLPVSV-QVHHAVCDG 194 >gi|1788027 [59..564] Heme-linked catalases Length = 506 Score = 22.9 bits (49), Expect = 7.2 Identities = 5/17 (29%), Positives = 9/17 (52%) Query: 271 TVNENLENYYLEVNQLE 287 +N N +N++ E Q Sbjct: 316 VLNRNPDNFFAENEQAA 332 >gi|1742164 [73..325] Periplasmic binding protein-like I Length = 253 Score = 22.6 bits (48), Expect = 8.8 Identities = 13/105 (12%), Positives = 33/105 (31%), Gaps = 15/105 (14%) Query: 215 NKISNIPDEYFKRFNALQYLRLSHN-----ELADSGIPGNSFNVSSLVELDLSYNKLKNI 269 + K + L + +L IP +V + + ++ Sbjct: 37 DLQKCESKIKQKMIKGIIMLSSPADESFFAQLDKYDIP--------VVVIGKVEGQYAHV 88 Query: 270 PTVN-ENLENYYLEVNQLEKFDIKSFCKILGPLSYSKIKHLRLDG 313 +V+ +N + + L + ++ + PL R++G Sbjct: 89 YSVDTDNFGDSIALTDALIESGHQNIACLHAPLDVHVSV-DRVNG 132 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 1187 Number of sequences better than 10.0: 11 Number of calls to ALIGN: 15 Length of query: 338 Total length of test sequences: 256703 Effective length of test sequences: 207231.0 Effective search space size: 61545361.0 Initial X dropoff for ALIGN: 25.0 bits