analysis of sequence from tem44 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ >gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLGPVAPTREGVENACFSSEEHE THFQNPGNTRLGSSPSPPGGVSSLPRSQRDDLSLHSEEGPALEPVSRPVDYGFVSALVFLVSGILLVVTA YAIPREARVNPDTVTAREMERLEMYYARLGSHLDRCIIAGLGLLTVGGMLLSVLLMVSLCKGELYRRRTF VPGKGSRKTYGSINLRMRQLNGDGGQALVENEVVQVSETSHTLQRS ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ sec.str. with predator > gi|8922927|ref|NP_060824.1| . . . . . 1 MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLG 50 ____HHHHHHHH___________________EEEE_______________ . . . . . 51 PVAPTREGVENACFSSEEHETHFQNPGNTRLGSSPSPPGGVSSLPRSQRD 100 _______HHHHHHH____________________________________ . . . . . 101 DLSLHSEEGPALEPVSRPVDYGFVSALVFLVSGILLVVTAYAIPREARVN 150 ______________________HHHHH__HHHHHEEEE_EEE________ . . . . . 151 PDTVTAREMERLEMYYARLGSHLDRCIIAGLGLLTVGGMLLSVLLMVSLC 200 __HHHHHHHHHHHHHHHHH_____HHHHH_________HHHHHHHHHHHH . . . . . 
201 KGELYRRRTFVPGKGSRKTYGSINLRMRQLNGDGGQALVENEVVQVSETS 250 _____________________HHHHHHHHH______EEEE___EEEE___ 251 HTLQRS 256 ______ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ method : 1 alpha-contents : 18.4 % beta-contents : 15.8 % coil-contents : 65.8 % class : mixed method : 2 alpha-contents : 0.0 % beta-contents : 15.3 % coil-contents : 84.7 % class : beta ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ GPI: learning from metazoa -8.05 0.00 0.00 0.00 0.00 0.00 0.00 -0.54 0.00 -6.13 -2.27 -12.00 -12.00 0.00 -12.00 0.00 -52.99 -4.54 0.00 0.00 0.00 0.00 0.00 0.00 -0.04 -0.06 -5.36 -2.27 -12.00 -12.00 0.00 -12.00 0.00 -48.27 ID: gi|8922927|ref|NP_060824.1| AC: xxx Len: 256 1:I 233 Sc: -48.27 Pv: 2.059361e-01 NO_GPI_SITE GPI: learning from protozoa -29.03 0.00 0.00 0.00 -4.00 0.00 0.00 -0.15 0.00 -4.71 -8.43 -12.00 -12.00 0.00 -12.00 0.00 -82.32 -20.14 0.00 0.00 0.00 -4.00 0.00 0.00 -0.55 -0.02 -6.88 -8.43 -12.00 -12.00 0.00 -12.00 0.00 -76.04 ID: gi|8922927|ref|NP_060824.1| AC: xxx Len: 256 1:I 235 Sc: -76.04 Pv: 4.399766e-01 NO_GPI_SITE ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ # SignalP euk predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? gi|8922927| 0.942 143 Y 0.829 143 Y 0.975 187 Y 0.161 N # SignalP gram- predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? gi|8922927| 0.446 143 N 0.556 143 Y 0.980 132 Y 0.197 N # SignalP gram+ predictions # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? 
gi|8922927| 0.875 143 Y 0.705 143 Y 0.976 131 Y 0.213 N ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ low complexity regions: SEG 12 2.2 2.5 >gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] 1-79 MPPAQGYEFAAAKGPRDELGPSFPMASPPG LELKTLSNGPQAPRRSAPLGPVAPTREGVE NACFSSEEHETHFQNPGNT rlgsspsppggvsslprs 80-97 98-179 QRDDLSLHSEEGPALEPVSRPVDYGFVSAL VFLVSGILLVVTAYAIPREARVNPDTVTAR EMERLEMYYARLGSHLDRCIIA glglltvggmllsvllmvsl 180-199 200-256 CKGELYRRRTFVPGKGSRKTYGSINLRMRQ LNGDGGQALVENEVVQVSETSHTLQRS low complexity regions: SEG 25 3.0 3.3 >gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] 1-80 MPPAQGYEFAAAKGPRDELGPSFPMASPPG LELKTLSNGPQAPRRSAPLGPVAPTREGVE NACFSSEEHETHFQNPGNTR lgsspsppggvsslprsqrddlslhseegp 81-120 alepvsrpvd 121-179 YGFVSALVFLVSGILLVVTAYAIPREARVN PDTVTAREMERLEMYYARLGSHLDRCIIA glglltvggmllsvllmvsl 180-199 200-256 CKGELYRRRTFVPGKGSRKTYGSINLRMRQ LNGDGGQALVENEVVQVSETSHTLQRS low complexity regions: SEG 45 3.4 3.75 >gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] 1-38 MPPAQGYEFAAAKGPRDELGPSFPMASPPG LELKTLSN gpqaprrsaplgpvaptregvenacfssee 39-171 hethfqnpgntrlgsspsppggvsslprsq rddlslhseegpalepvsrpvdygfvsalv flvsgillvvtayaiprearvnpdtvtare merlemyyarlgs 172-256 HLDRCIIAGLGLLTVGGMLLSVLLMVSLCK GELYRRRTFVPGKGSRKTYGSINLRMRQLN GDGGQALVENEVVQVSETSHTLQRS low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLGPVAPTREGVE NACFSSEEHETHFQNPGNTRLGSSPSPPGGVSSLPRSQRDDLSLHSEEGPALEPVSRPVD YGFVSALVFLVSGILLVVTAYAIPREARVNPDTVTAREMERLEMYYARLGSHLDRCIIAG LGLLTVGGMLLSVLLMVSLCKGELYRRRTFVPGKGSRKTYGSINLRMRQLNGDGGQALVE NEVVQVSETSHTLQRS 1 - 256 MPPAQGYEFA AAKGPRDELG PSFPMASPPG LELKTLSNGP QAPRRSAPLG PVAPTREGVE NACFSSEEHE THFQNPGNTR LGSSPSPPGG VSSLPRSQRD DLSLHSEEGP ALEPVSRPVD 
YGFVSALVFL VSGILLVVTA YAIPREARVN PDTVTAREME RLEMYYARLG SHLDRCIIAG LGLLTVGGML LSVLLMVSLC KGELYRRRTF VPGKGSRKTY GSINLRMRQL NGDGGQALVE NEVVQVSETS HTLQRS low complexity regions: DUST >gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLGPVAPTREGVE NACFSSEEHETHFQNPGNTRLGSSPSPPGGVSSLPRSQRDDLSLHSEEGPALEPVSRPVD YGFVSALVFLVSGILLVVTAYAIPREARVNPDTVTAREMERLEMYYARLGSHLDRCIIAG LGLLTVGGMLLSVLLMVSLCKGELYRRRTFVPGKGSRKTYGSINLRMRQLNGDGGQALVE NEVVQVSETSHTLQRS ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for gi|8922927|ref|NP_060824.1| sequence: 256 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 60 MPPAQGYEFA AAKGPRDELG PSFPMASPPG LELKTLSNGP QAPRRSAPLG PVAPTREGVE ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 120 NACFSSEEHE THFQNPGNTR LGSSPSPPGG VSSLPRSQRD DLSLHSEEGP ALEPVSRPVD ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
180 YGFVSALVFL VSGILLVVTA YAIPREARVN PDTVTAREME RLEMYYARLG SHLDRCIIAG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~1333333 3333333333 3333311~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~33333 333333333~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 240 LGLLTVGGML LSVLLMVSLC KGELYRRRTF VPGKGSRKTY GSINLRMRQL NGDGGQALVE ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . 
NEVVQVSETS HTLQRS ~~~~~~~~~~ ~~~~~~ ---------- ------ ~~~~~~~~~~ ~~~~~~ ~~~~~~~~~~ ~~~~~~ ~~~~~~~~~~ ~~~~~~ ~~~~~~~~~~ ~~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** tem44.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem44.___inter___ (1 sequences) MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLG PVAPTREGVENACFSSEEHETHFQNPGNTRLGSSPSPPGGVSSLPRSQRD DLSLHSEEGPALEPVSRPVDYGFVSALVFLVSGILLVVTAYAIPREARVN PDTVTAREMERLEMYYARLGSHLDRCIIAGLGLLTVGGMLLSVLLMVSLC KGELYRRRTFVPGKGSRKTYGSINLRMRQLNGDGGQALVENEVVQVSETS HTLQRS (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 122 142 2.242 Certain 2 180 200 2.221 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 2 Loop length 121 37 56 K+R profile + 10.00 6.00 CYT-EXT prof 0.82 - - For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 4.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -3.00 (NEG-POS)/(NEG+POS): 0.2000 NEG: 15.0000 POS: 10.0000 -> Orientation: N-out CYT-EXT difference: 0.82 -> Orientation: N-out ---------------------------------------------------------------------- "tem44" 256 122 142 #t 2.24167 180 200 #t 2.22083 ************************************ *TOPPREDM with prokaryotic function* ************************************ tem44.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem44.___inter___ (1 sequences) MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLG PVAPTREGVENACFSSEEHETHFQNPGNTRLGSSPSPPGGVSSLPRSQRD DLSLHSEEGPALEPVSRPVDYGFVSALVFLVSGILLVVTAYAIPREARVN PDTVTAREMERLEMYYARLGSHLDRCIIAGLGLLTVGGMLLSVLLMVSLC KGELYRRRTFVPGKGSRKTYGSINLRMRQLNGDGGQALVENEVVQVSETS HTLQRS (p)rokaryotic or (e)ukaryotic: p Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 122 142 2.242 Certain 2 180 200 2.221 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 2 Loop length 121 37 56 K+R profile + 10.00 6.00 CYT-EXT prof 0.82 - - For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 4.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): -3.00 (NEG-POS)/(NEG+POS): 0.2000 NEG: 15.0000 POS: 10.0000 -> Orientation: N-out CYT-EXT difference: 0.82 -> Orientation: N-out ---------------------------------------------------------------------- "tem44" 256 122 142 #t 2.24167 180 200 #t 2.22083 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ NOW EXECUTING: /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem44.___saps___ SAPS. Version of April 11, 1996. Date run: Tue Oct 31 18:38:45 2000 File: /people/maria/tem44.___saps___ ID gi|8922927|ref|NP_060824.1| DE hypothetical protein FLJ11190 [Homo sapiens] number of residues: 256; molecular weight: 27.6 kdal 1 MPPAQGYEFA AAKGPRDELG PSFPMASPPG LELKTLSNGP QAPRRSAPLG PVAPTREGVE 61 NACFSSEEHE THFQNPGNTR LGSSPSPPGG VSSLPRSQRD DLSLHSEEGP ALEPVSRPVD 121 YGFVSALVFL VSGILLVVTA YAIPREARVN PDTVTAREME RLEMYYARLG SHLDRCIIAG 181 LGLLTVGGML LSVLLMVSLC KGELYRRRTF VPGKGSRKTY GSINLRMRQL NGDGGQALVE 241 NEVVQVSETS HTLQRS -------------------------------------------------------------------------------- COMPOSITIONAL ANALYSIS (extremes relative to: swp23s) A : 18( 7.0%); C : 3( 1.2%); D : 7( 2.7%); E : 19( 7.4%); F : 7( 2.7%) G : 26(10.2%); H : 5( 2.0%); I- : 5( 2.0%); K : 5( 2.0%); L : 30(11.7%) M : 7( 2.7%); N : 8( 3.1%); P : 23( 9.0%); Q : 8( 3.1%); R : 21( 8.2%) S : 25( 9.8%); T : 12( 4.7%); V : 20( 7.8%); W : 0( 0.0%); Y : 7( 2.7%) KR : 26 ( 10.2%); ED : 26 ( 10.2%); AGP : 67 ( 26.2%); KRED : 52 ( 20.3%); KR-ED : 0 ( 0.0%); FIKMNY- : 39 ( 15.2%); LVIFM : 69 ( 27.0%); ST : 37 ( 14.5%). 
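The compositional figures above can be cross-checked with a few lines of Python. This is a minimal sketch, not part of SAPS: the helper names `composition` and `group_count` are hypothetical, and the percentages are taken simply as count / length * 100, which is assumed to be how the report derives them.

```python
from collections import Counter

# Sequence copied from the report (gi|8922927, 256 residues).
SEQ = (
    "MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLGPVAPTREGVENACFSSEEHE"
    "THFQNPGNTRLGSSPSPPGGVSSLPRSQRDDLSLHSEEGPALEPVSRPVDYGFVSALVFLVSGILLVVTA"
    "YAIPREARVNPDTVTAREMERLEMYYARLGSHLDRCIIAGLGLLTVGGMLLSVLLMVSLCKGELYRRRTF"
    "VPGKGSRKTYGSINLRMRQLNGDGGQALVENEVVQVSETSHTLQRS"
)

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return {residue: (count, percent)} for the 20 standard residues."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: (counts[aa], 100.0 * counts[aa] / n) for aa in AMINO_ACIDS}


def group_count(seq, letters):
    """Count residues belonging to a group such as 'KR' or 'ED'."""
    return sum(seq.count(aa) for aa in letters)


if __name__ == "__main__":
    for aa, (count, pct) in composition(SEQ).items():
        print(f"{aa} : {count:3d} ({pct:4.1f}%)")
    for group in ("KR", "ED"):
        n = group_count(SEQ, group)
        print(f"{group} : {n} ({100.0 * n / len(SEQ):4.1f}%)")
```

The same helpers can be pointed at any other group in the report (for example `group_count(SEQ, "LVIFM")`) to compare against the listed totals.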
-------------------------------------------------------------------------------- CHARGE DISTRIBUTIONAL ANALYSIS 1 0000000-00 00+00+--00 0000000000 0-0+000000 000++00000 00000+-00- 61 000000--0- 000000000+ 0000000000 00000+00+- -00000--00 00-000+00- 121 0000000000 0000000000 0000+-0+00 0-0000+-0- +0-0000+00 000-+00000 181 0000000000 0000000000 +0-00+++00 000+00++00 00000+0+00 00-000000- 241 0-00000-00 0000+0 A. CHARGE CLUSTERS. Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none Negative charge clusters (cmin = 9/30 or 12/45 or 15/60): none Mixed charge clusters (cmin = 14/30 or 19/45 or 24/60): none B. HIGH SCORING (UN)CHARGED SEGMENTS. There are no high scoring positive charge segments. There are no high scoring negative charge segments. There are no high scoring mixed charge segments. There are no high scoring uncharged segments. C. CHARGE RUNS AND PATTERNS. pattern (+)| (-)| (*)| (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)| lmin0 4 | 4 | 6 | 38 | 9 | 9 | 12 | 11 | 11 | 14 | 6 | 8 | lmin1 6 | 6 | 8 | 46 | 11 | 11 | 14 | 13 | 13 | 17 | 8 | 9 | lmin2 7 | 7 | 9 | 51 | 12 | 12 | 16 | 15 | 15 | 19 | 9 | 11 | (Significance level: 0.010000; Minimal displayed length: 6) There are no charge runs or patterns exceeding the given minimal lengths. Run count statistics: + runs >= 3: 1, at 206; - runs >= 3: 0 * runs >= 4: 0 0 runs >= 25: 1, at 176; -------------------------------------------------------------------------------- DISTRIBUTION OF OTHER AMINO ACID TYPES 1. HIGH SCORING SEGMENTS. 
__________________________________ High scoring hydrophobic segments: 2.00 (LVIFM) 1.00 (AGYCW) 0.00 (BZX) -2.00 (PH) -4.00 (STNQ) -8.00 (KEDR) Expected score/letter: -1.922 M_0.01= 22.86; M_0.05= 18.66 1) From 121 to 143: length= 23, score=21.00 * 121 YGFVSALVFL VSGILLVVTA YAI L: 4(17.4%); A: 3(13.0%); V: 5(21.7%); 2) From 176 to 197: length= 22, score=26.00 ** 176 CIIAGLGLLT VGGMLLSVLL MV L: 7(31.8%); G: 4(18.2%); V: 3(13.6%); ____________________________________ High scoring transmembrane segments: 5.00 (LVIF) 2.00 (AGM) 0.00 (BZX) -1.00 (YCW) -2.00 (ST) -6.00 (P) -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED) Expected score/letter: -3.391 M_0.01= 56.08; M_0.05= 45.35; M_0.30= 32.58 1) From 122 to 143: length= 22, score=68.00 ** 122 GFVSALVFLV SGILLVVTAY AI L: 4(18.2%); A: 3(13.6%); V: 5(22.7%); 2) From 177 to 199: length= 23, score=73.00 ** 177 IIAGLGLLTV GGMLLSVLLM VSL L: 8(34.8%); G: 4(17.4%); V: 3(13.0%); 2. SPACINGS OF C. H2N-62-C-112-C-23-C-56-COOH 2*. SPACINGS OF C and H. (additiona l deluxe function for ALEX) H2N-62-C-5-H-2-H-32-H-66-H-3-C-23-C-50-H-5-COOH -------------------------------------------------------------------------------- REPETITIVE STRUCTURES. A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet. Repeat core block length: 4 Aligned matching blocks: [ 27- 30] SPPG [ 86- 89] SPPG ______________________________ [ 80- 83] RLGS [ 168- 171] RLGS B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet. (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C) Repeat core block length: 8 -------------------------------------------------------------------------------- MULTIPLETS. A. AMINO ACID ALPHABET. 1. Total number of amino acid multiplets: 23 (Expected range: 5-- 30) 2. Histogram of spacings between consecutive amino acid multiplets: (1-5) 10 (6-10) 4 (11-20) 7 (>=21) 3 3. Clusters of amino acid multiplets (cmin = 14/30 or 18/45 or 22/60): none B. CHARGE ALPHABET. 1. 
Total number of charge multiplets: 7 (Expected range: 0-- 12) 3 +plets (f+: 10.2%), 4 -plets (f-: 10.2%) Total number of charge altplets: 7 (Critical number: 14) 2. Histogram of spacings between consecutive charge multiplets: (1-5) 0 (6-10) 2 (11-20) 1 (>=21) 5 -------------------------------------------------------------------------------- PERIODICITY ANALYSIS. A. AMINO ACID ALPHABET (core: 4; !-core: 5) Location Period Element Copies Core Errors There are no periodicities of the prescribed length. B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6) and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 8) Location Period Element Copies Core Errors 115- 138 4 i... 6 6 0 189- 200 2 i. 6 6 0 -------------------------------------------------------------------------------- SPACING ANALYSIS. There are no unusual spacings. ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Pfam (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/Pfam
Sequence file: tem44
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/PfamFrag
Sequence file: tem44
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: tem44
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Prosite
---------------------------------------------------------
|          ppsearch (c) 1994 EMBL Data Library          |
|      based on MacPattern (c) 1990-1994 R. Fuchs       |
---------------------------------------------------------

PROSITE pattern search started: Tue Oct 31 18:41:49 2000
Sequence file: tem44
----------------------------------------
Sequence gi|8922927|ref|NP_060824.1| (256 residues):

Matching pattern PS00004 CAMP_PHOSPHO_SITE:
  206: RRRT
Total matches: 1

Matching pattern PS00005 PKC_PHOSPHO_SITE:
   97: SQR
  155: TAR
  216: SRK
Total matches: 3

Matching pattern PS00006 CK2_PHOSPHO_SITE:
   65: SSEE
   97: SQRD
  155: TARE
  171: SHLD
Total matches: 4

Matching pattern PS00008 MYRISTYL:
   58: GVENAC
   89: GGVSSL
  188: GMLLSV
  215: GSRKTY
Total matches: 4

Total no of hits in this sequence: 12
========================================
1314 pattern(s) searched in 1 sequence(s), 256 residues.
Total no of hits in all sequences: 12.
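
The four PROSITE hits listed above are short, frequently occurring patterns, so they are easy to reproduce by hand. The sketch below is not ppsearch; it assumes the usual PROSITE definitions of these four entries ([RK](2)-x-[ST], [ST]-x-[RK], [ST]-x(2)-[DE] and G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}), translated manually into Python regular expressions. The names `PATTERNS` and `scan` are hypothetical.

```python
import re

# Sequence copied from the report (gi|8922927, 256 residues).
SEQ = (
    "MPPAQGYEFAAAKGPRDELGPSFPMASPPGLELKTLSNGPQAPRRSAPLGPVAPTREGVENACFSSEEHE"
    "THFQNPGNTRLGSSPSPPGGVSSLPRSQRDDLSLHSEEGPALEPVSRPVDYGFVSALVFLVSGILLVVTA"
    "YAIPREARVNPDTVTAREMERLEMYYARLGSHLDRCIIAGLGLLTVGGMLLSVLLMVSLCKGELYRRRTF"
    "VPGKGSRKTYGSINLRMRQLNGDGGQALVENEVVQVSETSHTLQRS"
)

# Hand translation of the PROSITE syntax (assumption: standard definitions).
PATTERNS = {
    "PS00004 CAMP_PHOSPHO_SITE": r"[RK]{2}.[ST]",
    "PS00005 PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "PS00006 CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
    "PS00008 MYRISTYL": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def scan(seq, regex):
    """Return (1-based start, matched text) for every match, overlaps included."""
    # The lookahead makes each match zero-width so overlapping sites are kept.
    lookahead = re.compile(f"(?=({regex}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq)]


if __name__ == "__main__":
    for name, regex in PATTERNS.items():
        print(f"Matching pattern {name}:")
        for pos, text in scan(SEQ, regex):
            print(f"  {pos:3d}: {text}")
```

Because patterns this short match by chance in most proteins, such hits are normally read as candidate sites rather than evidence of function.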
Search time: 00:00 min

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with motif search against own library
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P argv[2]=-m /data/patterns/own/motif.fa argv[4]=-seq tem44
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 256 units

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with HMM-search search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm.lib
Sequence file: tem44
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: tem44
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB
IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
"PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.
Query= gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] (256 letters) Searching..................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value INSL Insulinase like Metallo protease domain 20 3.4 JAB JAB1 associated domain involved in proteolysis 19 5.2 RASGAP RAS-type GTPase GTP hydrolysis activating protein 19 5.2 DHHC Novel zinc finger domain with DHHC signature 19 5.5 DNASE1 DNASE-1/Sphingomyelinase like domain 19 7.6 ACTIN Actin ATPase/ Cytoskeletal ATPase domain 19 7.7 LRR Leucine rich repeats 19 8.9 >INSL Insulinase like Metallo protease domain Length = 433 Score = 19.9 bits (41), Expect = 3.4 Identities = 4/30 (13%), Positives = 10/30 (33%) Query: 24 PMASPPGLELKTLSNGPQAPRRSAPLGPVA 53 + + L ++ L NG + + Sbjct: 14 QVLTAQELYIRDLPNGAKLIVKPRDDTEAV 43 >JAB JAB1 associated domain involved in proteolysis Length = 136 Score = 19.4 bits (40), Expect = 5.2 Identities = 9/37 (24%), Positives = 17/37 (45%) Query: 133 GILLVVTAYAIPREARVNPDTVTAREMERLEMYYARL 169 +L V ++A+P + D+V + + LE Y Sbjct: 39 KVLDVSNSFAVPFDEDDKDDSVWFLDHDYLENMYGMF 75 >RASGAP RAS-type GTPase GTP hydrolysis activating protein Length = 292 Score = 19.3 bits (39), Expect = 5.2 Identities = 8/17 (47%), Positives = 10/17 (58%) Query: 114 PVSRPVDYGFVSALVFL 130 P +R V Y VS +FL Sbjct: 192 PSNREVRYSVVSGFIFL 208 >DHHC Novel zinc finger domain with DHHC signature Length = 217 Score = 19.3 bits (39), Expect = 5.5 Identities = 13/61 (21%), Positives = 21/61 (34%), Gaps = 6/61 (9%) Query: 83 SSPSPPGGVSSLPRSQRDDLSLHSEEGPALEPVSRPVDYGFVSALVFLVSGILLVVTAYA 142 ++P+ P G S + QR S P ++ L F V G ++V Sbjct: 22 TAPAQPSGPSPELQGQR------SRRNGWSWPPHPLQIVAWLLYLFFAVIGFGILVPLLP 75 Query: 143 I 143 Sbjct: 76 H 76 >DNASE1 DNASE-1/Sphingomyelinase like domain Length = 388 Score = 18.9 bits (38), Expect = 7.6 Identities = 9/93 (9%), Positives = 19/93 (19%), Gaps = 5/93 (5%) Query: 89 GGVSSLPRSQRDDLSLHSEEGPALEPVSRPVDYGFVSALVFLVSG-ILLVVTAYAIPREA 147 G + 
R L ++ + + L +G L + R Sbjct: 191 DGCALFFLQDRFQLVNSAKIRLSARTLKT-NQVAIAETLQCCETGRQLCFAVTHLKARTG 249 Query: 148 RVNPDTVTAREMERLEMYYARLGSHLDRCIIAG 180 + L + +I Sbjct: 250 WER---FRLAQGSDLLDNLESITQGATVPLIIC 279 >ACTIN Actin ATPase/ Cytoskeletal ATPase domain Length = 376 Score = 19.0 bits (39), Expect = 7.7 Identities = 8/23 (34%), Positives = 10/23 (42%) Query: 32 ELKTLSNGPQAPRRSAPLGPVAP 54 EL+ +S P APL P Sbjct: 95 ELRVVSEEPXVLSXEAPLNPKVN 117 >LRR Leucine rich repeats Length = 339 Score = 18.8 bits (38), Expect = 8.9 Identities = 8/32 (25%), Positives = 9/32 (28%), Gaps = 1/32 (3%) Query: 77 GNTRLGSSPSPPGGVSSLPRSQRDDLSLHSEE 108 N L P P L D SL + Sbjct: 301 ENENLVMPPKPNDARKKL-AFYNIDFSLEHQR 331 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 7 Number of calls to ALIGN: 7 Length of query: 256 Total length of test sequences: 20182 Effective length of test sequences: 16536.0 Effective search space size: 3664766.2 Initial X dropoff for ALIGN: 25.0 bits Y. Wolf's SCOP PSSM IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= gi|8922927|ref|NP_060824.1| hypothetical protein FLJ11190 [Homo sapiens] (256 letters) Searching.................................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value gi|129541 [240..563] N-terminal nucleophile aminohydrolases ... 
23 3.8 gi|2384758 [390..431] FAD/NAD(P)-binding domain 23 6.5 >gi|129541 [240..563] N-terminal nucleophile aminohydrolases (Ntn hydrolases) Length = 324 Score = 23.4 bits (50), Expect = 3.8 Identities = 7/26 (26%), Positives = 9/26 (33%) Query: 229 QLNGDGGQALVENEVVQVSETSHTLQ 254 Q DG A NE V+ + Sbjct: 82 QFAEDGRTARFGNEFEPVAWRRDRIA 107 >gi|2384758 [390..431] FAD/NAD(P)-binding domain Length = 42 Score = 22.6 bits (48), Expect = 6.5 Identities = 8/23 (34%), Positives = 12/23 (51%), Gaps = 2/23 (8%) Query: 54 PTREGVENACFSSEEHE--THFQ 74 P E + CF S ++ THF+ Sbjct: 1 PVNEPSLDNCFVSTSYDATTHFE 23 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 1187 Number of sequences better than 10.0: 2 Number of calls to ALIGN: 2 Length of query: 256 Total length of test sequences: 256703 Effective length of test sequences: 208388.0 Effective search space size: 44829786.3 Initial X dropoff for ALIGN: 25.0 bits
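
As a rough plausibility check on the Expect values above, the textbook relation between a normalized bit score S and an effective search space N is E = N * 2^(-S). The sketch below applies it to the two SCOP PSSM hits; the function name `expect_from_bits` is hypothetical, and it is an assumption that IMPALA's reported values follow this relation closely. Exact agreement is not expected, since the printed bit scores are rounded and the program's own statistics may differ.

```python
def expect_from_bits(bit_score, search_space):
    """Approximate E-value from a bit score and an effective search space."""
    return search_space * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Effective search space size printed for the SCOP PSSM search.
    space = 44829786.3
    # (label, bit score, Expect value as printed in the report)
    hits = [
        ("gi|129541  Ntn hydrolases", 23.4, 3.8),
        ("gi|2384758 FAD/NAD(P)-binding domain", 22.6, 6.5),
    ]
    for label, bits, reported in hits:
        approx = expect_from_bits(bits, space)
        print(f"{label}: approx E = {approx:.2f}, reported E = {reported}")
```

Both estimates come out in the same range as the reported values, all well above the level at which a hit would usually be called significant, which is consistent with the report listing no convincing domain matches.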