analysis of sequence from T25032.fa
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
>T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.
MATPTGWQKI LYRKQPFPDN YSGGDAQFLK ELRKNVSVVH YDYKSAVFGC MNFLTHLDMI
TMYFVLFLNI LHSNWSINIL YSVFSLTIVL YLFFCKFLIP NPANAKEHAR TIFTLFIFAY
AFTPVIRTLT TSISTDTIYS TSIITAIFSC FFHDYGVKAP VVSYPTSVST GLSSAIFLLS
RLEGDTPTLL LLVVAFTLHA YGAEFRNRIF HVYPCLSSTI FCFLSLFSIY CISDFSLELS
ICFALLHIFI LFICPLILIL KQTGKCTIHG PWDEAVPLKS NT
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
sec.str. with predator

> T25032
              .         .         .         .         .
1    MATPTGWQKILYRKQPFPDNYSGGDAQFLKELRKNVSVVHYDYKSAVFGC   50
     ______HHHHHHH____________HHHHHHHHH__EEEE____EEEEEE
              .         .         .         .         .
51   MNFLTHLDMITMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIP   100
     ___HHHHHHHHHHHHHHHHHH______HHHHH_HHHHHHHHHHHHHH___
              .         .         .         .         .
101  NPANAKEHARTIFTLFIFAYAFTPVIRTLTTSISTDTIYSTSIITAIFSC   150
     _____HHHHHHHHHHHHHHHH____EEE_______________EEEE___
              .         .         .         .         .
151  FFHDYGVKAPVVSYPTSVSTGLSSAIFLLSRLEGDTPTLLLLVVAFTLHA   200
     ________EEEE___________HHHHHHHH______HHHHHHHHHHHHH
              .         .         .         .         .
201  YGAEFRNRIFHVYPCLSSTIFCFLSLFSIYCISDFSLELSICFALLHIFI   250
     HHHHHHHHHHH_______EEEHHHHHHHH______HHHHHHHHHHHHHHH
              .         .         .
251  LFICPLILILKQTGKCTIHGPWDEAVPLKSNT   282
     HH_HHHHHHHH_____________________
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method         :      1
alpha-contents :    1.0 %
beta-contents  :   80.8 %
coil-contents  :   18.1 %
class          :   beta

method         :      2
alpha-contents :    0.0 %
beta-contents  :   91.0 %
coil-contents  :    9.0 %
class          :   beta
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-19.38 -1.15 -0.29 -0.06 -4.00 0.00 0.00 0.00 -0.01 -6.82 -0.91 -12.00 -12.00 -4.00 -12.00 0.00 -72.62
-20.02 -0.43 0.00 0.00 -4.00 0.00 0.00 0.00 -0.38 -3.75 -0.91 -12.00 -12.00 0.00 -12.00 0.00 -65.48
ID: T25032 AC: xxx Len: 282 1:I 265 Sc: -65.48 Pv: 4.720264e-01 NO_GPI_SITE

GPI: learning from protozoa
-31.91 -1.25 -0.05 0.00 -4.00 0.00 0.00 0.00 0.00 -5.37 -4.30 -12.00 -12.00 0.00 -12.00 0.00 -82.88
-32.03 -0.41 -0.18 0.00 -4.00 0.00 0.00 0.00 -0.04 -3.36 -4.30 -12.00 -12.00 0.00 -12.00 0.00 -80.33
ID: T25032 AC: xxx Len: 282 1:I 265 Sc: -80.33 Pv: 5.148141e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name     Cmax   pos ?  Ymax   pos ?  Smax   pos ?  Smean ?
T25032     0.995  106 Y  0.491  267 Y  0.970  248 Y  0.432 N

# SignalP gram- predictions
# name     Cmax   pos ?  Ymax   pos ?  Smax   pos ?  Smean ?
T25032     0.790  106 Y  0.270  106 N  0.961  224 Y  0.299 N

# SignalP gram+ predictions
# name     Cmax   pos ?  Ymax   pos ?  Smax   pos ?  Smean ?
T25032     0.954  106 Y  0.519  106 Y  0.989   91 Y  0.346 N
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
low complexity regions: SEG 12 2.2 2.5
>T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.
                                 1-125  MATPTGWQKILYRKQPFPDNYSGGDAQFLK
                                        ELRKNVSVVHYDYKSAVFGCMNFLTHLDMI
                                        TMYFVLFLNILHSNWSINILYSVFSLTIVL
                                        YLFFCKFLIPNPANAKEHARTIFTLFIFAY
                                        AFTPV
      irtlttsistdtiystsiitaifs 126-149
                               150-244  CFFHDYGVKAPVVSYPTSVSTGLSSAIFLL
                                        SRLEGDTPTLLLLVVAFTLHAYGAEFRNRI
                                        FHVYPCLSSTIFCFLSLFSIYCISDFSLEL
                                        SICFA
              llhifilficplilil 245-260
                               261-282  KQTGKCTIHGPWDEAVPLKSNT

low complexity regions: SEG 25 3.0 3.3
>T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.
                                 1-108  MATPTGWQKILYRKQPFPDNYSGGDAQFLK
                                        ELRKNVSVVHYDYKSAVFGCMNFLTHLDMI
                                        TMYFVLFLNILHSNWSINILYSVFSLTIVL
                                        YLFFCKFLIPNPANAKEH
artiftlfifayaftpvirtlttsistdti 109-149
                   ystsiitaifs
                               150-214  CFFHDYGVKAPVVSYPTSVSTGLSSAIFLL
                                        SRLEGDTPTLLLLVVAFTLHAYGAEFRNRI
                                        FHVYP
clsstifcflslfsiycisdfslelsicfa 215-260
              llhifilficplilil
                               261-282  KQTGKCTIHGPWDEAVPLKSNT

low complexity regions: SEG 45 3.4 3.75
>T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.
                                  1-78  MATPTGWQKILYRKQPFPDNYSGGDAQFLK
                                        ELRKNVSVVHYDYKSAVFGCMNFLTHLDMI
                                        TMYFVLFLNILHSNWSIN
ilysvfsltivlylffckflipnpanakeh  79-180
artiftlfifayaftpvirtlttsistdti
ystsiitaifscffhdygvkapvvsyptsv
                  stglssaiflls
                               181-214  RLEGDTPTLLLLVVAFTLHAYGAEFRNRIF
                                        HVYP
clsstifcflslfsiycisdfslelsicfa 215-260
              llhifilficplilil
                               261-282  KQTGKCTIHGPWDEAVPLKSNT

low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.
MATPTGWQKILYRKQPFPDNYSGGDAQFLKELRKNVSVVHYDYKSAVFGCMNFLTHLDMI
TMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIPNPANAKEHARTIFTLfifay
aftpvirtlttsistdtiystsiitaifscffHDYGVKAPVVSYPTSVSTGLSSAIFLLS
RLEGDTPTllllvvAFTLHAYGAEFRNRIFHVYPCLSSTIFCFLSLFSIYCISDFSLELS
ICFALLHIFILFICPLILILKQTGKCTIHGPWDEAVPLKSNT

1 - 115
    MATPTGWQKI LYRKQPFPDN YSGGDAQFLK ELRKNVSVVH YDYKSAVFGC MNFLTHLDMI
    TMYFVLFLNI LHSNWSINIL YSVFSLTIVL YLFFCKFLIP NPANAKEHAR TIFTL
116 - 152
    fifay aftpvirtlt tsistdtiys tsiitaifsc ff
153 - 188
    HDYGVKAP VVSYPTSVST GLSSAIFLLS RLEGDTPT
189 - 194
    ll llvv
195 - 282
    AFTLHA YGAEFRNRIF HVYPCLSSTI FCFLSLFSIY CISDFSLELS ICFALLHIFI LFIC
    PLILIL KQTGKCTIHG PWDEAVPLKS NT

low complexity regions: DUST
>T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.
MATPTGWQKILYRKQPFPDNYSGGDAQFLKELRKNVSVVHYDYKSAVFGCMNFLTHLDMI
TMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIPNPANAKEHARTIFTLFIFAY
AFTPVIRTLTTSISTDTIYSTSIITAIFSCFFHDYGVKAPVVSYPTSVSTGLSSAIFLLS
RLEGDTPTLLLLVVAFTLHAYGAEFRNRIFHVYPCLSSTIFCFLSLFSIYCISDFSLELS
ICFALLHIFILFICPLILILKQTGKCTIHGPWDEAVPLKSNT
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
coiled coil prediction for T25032
sequence: 282 amino acids, 0 residue(s) in coiled coil state

    .    |     .    |     .    |     .    |     .    |     .   60
MATPTGWQKI LYRKQPFPDN YSGGDAQFLK ELRKNVSVVH YDYKSAVFGC MNFLTHLDMI
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .   120
TMYFVLFLNI LHSNWSINIL YSVFSLTIVL YLFFCKFLIP NPANAKEHAR TIFTLFIFAY
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .   180
AFTPVIRTLT TSISTDTIYS TSIITAIFSC FFHDYGVKAP VVSYPTSVST GLSSAIFLLS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .   240
RLEGDTPTLL LLVVAFTLHA YGAEFRNRIF HVYPCLSSTI FCFLSLFSIY CISDFSLELS
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |
ICFALLHIFI LFICPLILIL KQTGKCTIHG PWDEAVPLKS NT
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~
---------- ---------- ---------- ---------- --
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
prediction of transmembrane regions with toppred2

***********************************
*TOPPREDM with eukaryotic function*
***********************************
T25032.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: T25032.fa.___inter___ (1 sequences)

MATPTGWQKILYRKQPFPDNYSGGDAQFLKELRKNVSVVHYDYKSAVFGC
MNFLTHLDMITMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIP
NPANAKEHARTIFTLFIFAYAFTPVIRTLTTSISTDTIYSTSIITAIFSC
FFHDYGVKAPVVSYPTSVSTGLSSAIFLLSRLEGDTPTLLLLVVAFTLHA
YGAEFRNRIFHVYPCLSSTIFCFLSLFSIYCISDFSLELSICFALLHIFI
LFICPLILILKQTGKCTIHGPWDEAVPLKSNT

(p)rokaryotic or (e)ukaryotic: e
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 1 structures are to be tested

Candidate membrane-spanning segments:

 Helix  Begin    End   Score  Certainity
     1     52     72   1.172  Certain
     2     79     99   1.882  Certain
     3    110    130   1.455  Certain
     4    133    153   1.369  Certain
     5    160    180   1.414  Certain
     6    185    205   1.422  Certain
     7    213    233   2.051  Certain
     8    240    260   2.321  Certain

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
Segment             1     2     3     4     5     6     7     8
Loop length     51     6    10     2     6     4     7     6    22
K+R profile   8.00  3.00  1.00  2.00  3.00  0.00  1.00  1.00  0.00
CYT-EXT prof     -     -     -     -     -     -     -     -     -
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: 15.00
Tm probability: 1.00
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): -0.2727
NEG: 4.0000
POS: 7.0000
-> Orientation: undecided

CYT-EXT difference: 0.00
-> Orientation: undecided

----------------------------------------------------------------------

"T25032" 282
  52  72 #t 1.17188
  79  99 #t 1.88229
 110 130 #t 1.45521
 133 153 #t 1.36875
 160 180 #t 1.41354
 185 205 #t 1.42188
 213 233 #t 2.05104
 240 260 #t 2.32083

************************************
*TOPPREDM with prokaryotic function*
************************************
T25032.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: T25032.fa.___inter___ (1 sequences)

MATPTGWQKILYRKQPFPDNYSGGDAQFLKELRKNVSVVHYDYKSAVFGC
MNFLTHLDMITMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIP
NPANAKEHARTIFTLFIFAYAFTPVIRTLTTSISTDTIYSTSIITAIFSC
FFHDYGVKAPVVSYPTSVSTGLSSAIFLLSRLEGDTPTLLLLVVAFTLHA
YGAEFRNRIFHVYPCLSSTIFCFLSLFSIYCISDFSLELSICFALLHIFI
LFICPLILILKQTGKCTIHGPWDEAVPLKSNT

(p)rokaryotic or (e)ukaryotic: p
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 1 structures are to be tested

Candidate membrane-spanning segments:

 Helix  Begin    End   Score  Certainity
     1     52     72   1.172  Certain
     2     79     99   1.882  Certain
     3    110    130   1.455  Certain
     4    133    153   1.369  Certain
     5    160    180   1.414  Certain
     6    185    205   1.422  Certain
     7    213    233   2.051  Certain
     8    240    260   2.321  Certain

----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
Segment             1     2     3     4     5     6     7     8
Loop length     51     6    10     2     6     4     7     6    22
K+R profile   8.00  3.00  1.00  2.00  3.00  0.00  1.00  1.00  0.00
CYT-EXT prof     -     -     -     -     -     -     -     -     -
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: 15.00
Tm probability: 1.00
-> Orientation: N-in

Charge-difference over N-terminal Tm (+-15 residues): 0.00
(NEG-POS)/(NEG+POS): -0.2727
NEG: 4.0000
POS: 7.0000
-> Orientation: undecided

CYT-EXT difference: 0.00
-> Orientation: undecided

----------------------------------------------------------------------

"T25032" 282
  52  72 #t 1.17188
  79  99 #t 1.88229
 110 130 #t 1.45521
 133 153 #t 1.36875
 160 180 #t 1.41354
 185 205 #t 1.42188
 213 233 #t 2.05104
 240 260 #t 2.32083
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
SAPS. Version of April 11, 1996.
Date run: Thu Feb 28 17:06:56 2002

File: /people/b_eisen/T25032.fa.___saps___
ID   T25032
DE   hypothetical protein T20D3.8 - Caenorhabditis elegans.

number of residues: 282; molecular weight: 32.0 kdal

    1  MATPTGWQKI LYRKQPFPDN YSGGDAQFLK ELRKNVSVVH YDYKSAVFGC MNFLTHLDMI
   61  TMYFVLFLNI LHSNWSINIL YSVFSLTIVL YLFFCKFLIP NPANAKEHAR TIFTLFIFAY
  121  AFTPVIRTLT TSISTDTIYS TSIITAIFSC FFHDYGVKAP VVSYPTSVST GLSSAIFLLS
  181  RLEGDTPTLL LLVVAFTLHA YGAEFRNRIF HVYPCLSSTI FCFLSLFSIY CISDFSLELS
  241  ICFALLHIFI LFICPLILIL KQTGKCTIHG PWDEAVPLKS NT

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  : 16( 5.7%); C  :  9( 3.2%); D  :  9( 3.2%); E- :  6( 2.1%); F++: 28( 9.9%)
G  : 10( 3.5%); H  :  9( 3.2%); I+ : 27( 9.6%); K  : 11( 3.9%); L  : 36(12.8%)
M  :  4( 1.4%); N  : 10( 3.5%); P  : 13( 4.6%); Q- :  4( 1.4%); R  :  7( 2.5%)
S  : 26( 9.2%); T  : 24( 8.5%); V  : 16( 5.7%); W  :  3( 1.1%); Y  : 14( 5.0%)

KR      :   18 (  6.4%);   ED   -  :   15 (  5.3%);   AGP     :   39 ( 13.8%);
KRED  - :   33 ( 11.7%);   KR-ED   :    3 (  1.1%);   FIKMNY  :   94 ( 33.3%);
LVIFM + :  111 ( 39.4%);   ST      :   50 ( 17.7%).
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

    1  00000000+0 00++0000-0 0000-0000+ -0++000000 0-0+000000 0000000-00
   61  0000000000 0000000000 0000000000 00000+0000 00000+-00+ 0000000000
  121  000000+000 00000-0000 0000000000 000-000+00 0000000000 0000000000
  181  +0-0-00000 0000000000 000-0+0+00 0000000000 0000000000 000-000-00
  241  0000000000 0000000000 +000+00000 00--0000+0 00

A. CHARGE CLUSTERS.

Positive charge clusters (cmin = 7/30 or 9/45 or 11/60): none
Negative charge clusters (cmin = 6/30 or 8/45 or 10/60): none
Mixed charge clusters (cmin = 10/30 or 13/45 or 16/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.

C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     4 |   3 |   5 |  65 |   8 |   8 |  10 |  10 |   9 |  12 |   7 |   9 |
lmin1     5 |   5 |   6 |  79 |  10 |   9 |  12 |  12 |  12 |  15 |   8 |  11 |
lmin2     6 |   6 |   7 |  88 |  11 |  11 |  13 |  14 |  13 |  17 |  10 |  12 |
(Significance level: 0.010000; Minimal displayed length: 6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

  + runs >=  3: 0
  - runs >=  3: 0
  * runs >=  3: 0
  0 runs >= 43: 0

--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.

There are no high scoring hydrophobic segments.
There are no high scoring transmembrane segments.

2. SPACINGS OF C.

H2N-49-C-44-C-54-C-64-C-6-C-8-C-10-C-11-C-11-C-16-COOH

2*. SPACINGS OF C and H. (additional deluxe function for ALEX)

H2N-39-H-9-C-5-H-15-H-22-C-12-H-41-C-2-H-45-H-11-H-3-C-6-C-8-C-10-C-4-H-6-C-11-C-2-H-13-COOH

--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 4

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length: 8

--------------------------------------------------------------------------------
MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets: 13 (Expected range: 5-- 31)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 2   (6-10) 3   (11-20) 3   (>=21) 6

3. Clusters of amino acid multiplets (cmin = 9/30 or 12/45 or 14/60): none

B. CHARGE ALPHABET.

1. Total number of charge multiplets: 3 (Expected range: 0-- 6)
   2 +plets (f+: 6.4%), 1 -plets (f-: 5.3%)
   Total number of charge altplets: 2 (Critical number: 8)

2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 0   (6-10) 1   (11-20) 2   (>=21) 1

--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core: 4; !-core: 5)

Location    Period  Element     Copies  Core    Errors
  80- 103        6  L.....           4     4    0
 189- 192        1  L                4     4    0
 232- 276        9  I........        5     5 !  0

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 5)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9)

Location    Period  Element     Copies  Core    Errors
  47- 100        3  i..             14     6    4
  53- 187        9  i..0...0.       12     6    /3/././5/./././4/./
  60-  71        2  i.               6     6    0
  84-  95        2  i0               6     6    /0/2/
 189- 194        1  i                6     6    0
 196- 276        9  i...0....        8     6    /1/./././3/././././
 220- 237        3  i.0              6     6    /0/./2/
 235- 260        2  i.              11     6    2
 245- 253        1  i                8     6    1

--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location    (Quartile)  Spacing    Rank       P-value  Interpretation
  27- 262   (3.)        Q( 235)Q    1 of  5   0.0035   large 1. maximal spacing
 153- 199   (3.)        H( 46)H     1 of 10   0.9922   small maximal spacing
 199- 211   (3.)        H( 12)H    10 of 10   0.0068   large minimal spacing
 262- 283   (4.)        Q( 21)Q     2 of  5   0.9962   small 2. maximal spacing
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Pfam (from /data/patterns/pfam)

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/pfam/Pfam
Sequence file: T25032.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model          Description                                 Score  E-value   N
--------       -----------                                 -----  -------  ---
hormone        Somatotropin hormone family                  -1.2       33    1
ATP-synt_8     ATP synthase protein 8                      -10.2       10    1
DUF6           Integral membrane protein DUF6              -17.8      3.6    1
Abi            CAAX amino terminal protease family         -38.1       65    1
UPF0005        Uncharacterized protein family UPF0005      -69.8       22    1
BPD_transp_2   Branched-chain amino acid transport sys    -139.8       36    1
UPF0073        Uncharacterised protein family (Hly-III    -149.0       23    1
ABC2_membrane  ABC-2 type transporter                     -149.6       52    1
ABC-3          ABC 3 transport family                     -213.3       62    1
Cyto_ox_2      Cytochrome oxidase subunit II              -252.3       22    1

Parsed for domains:
Model          Domain  seq-f  seq-t     hmm-f  hmm-t       score  E-value
--------       ------- -----  -----     -----  -----       -----  -------
ATP-synt_8       1/1      73    123 ..      1     56 []    -10.2       10
Abi              1/1      79    175 ..      1    119 []    -38.1       65
hormone          1/1     188    198 ..      1     11 [.     -1.2       33
UPF0005          1/1      61    234 ..      1    208 []    -69.8       22
DUF6             1/1     118    234 ..      1    126 []    -17.8      3.6
ABC-3            1/1      62    254 ..      1    267 []   -213.3       62
ABC2_membrane    1/1      29    266 ..      1    261 []   -149.6       52
Cyto_ox_2        1/1      59    267 ..      1    363 []   -252.3       22
BPD_transp_2     1/1      29    272 ..      1    370 []   -139.8       36
UPF0073          1/1      62    277 ..
1 287 [] -149.0 23 Alignments of top-scoring domains: ATP-synt_8: domain 1 of 1, from 73 to 123: score -10.2, E = 10 *->MPQLnpspWfliflsswltLliilqlKilshtfpnnpslkktklkkk s+W +++l s ++L i+l l + +++ pn+ ++k ++ ++ T25032 73 ------SNWSINILYSVFSLTIVLYLFFCKFLIPNPANAKEHARTIF 113 pn.pWnWkWT<-* + T T25032 114 TLfIFAYAFT 123 Abi: domain 1 of 1, from 79 to 175: score -38.1, E = 65 *->llilllvllaplaEElfFRGilltalerr.lkkrytlfgpllaiiis +l+ +++l + l lfF +l++ ++++ + r i + T25032 79 ILYSVFSLTIVLY--LFFCKFLIPNPANAkEHAR---------TIFT 114 sliFallHlanalellqllgnvliqpvlinwlqllytfllGlvlgllylr +iFa + ++ ++ l+ ++i+ ++ t ++ ++++ ++ T25032 115 LFIFAYAFTPVIRTLT----------TSISTDTIYSTSIITAIFSCFFH- 153 rtgsLlapilvHalnNligfill<-* g + ap++ + +++g+ + T25032 154 DYG-VKAPVVSYPTSVSTGLSSA 175 hormone: domain 1 of 1, from 188 to 198: score -1.2, E = 33 *->lLLLLlvSnlL<-* +LLLL+v++ L T25032 188 TLLLLVVAFTL 198 UPF0005: domain 1 of 1, from 61 to 234: score -69.8, E = 22 *->alassalvafvrtnfalvylslivslvlmisLmclpskrd.....rn ++++ + +++++n + +l ++sl ++ L+++ ++ +++ + + T25032 61 TMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIPnpanaKE 107 kpvnlillfiFTlltgvtlgpilshYaakaVlmalvIttavfgtavifal ++ +++ lfiF+ + ++ ++ +++++ ++++It+++ +++ +++ T25032 108 HARTIFTLFIFAYAFTPVIRTLTTSISTDTIYSTSIITAIFSCFFHDYGV 157 ltkydlttkkdllflgsmllalfillavvlalivllfflssafmlvyWPF ++ ++ + s++l+ +i f+ls++ + T25032 158 KAPVV-----SYPTSVSTGLSSAI------------FLLSRLEGDTP--- 187 QNallGailfvgylliDtqllmetaekidsnkyeispeey..IfaAlsLY +ll+++++ ++ +++ ++ +++ +++p +++If+ T25032 188 --TLLLLVVAFTLHAYGAE--FR------NRIFHVYPCLSstIFC----- 222 lDiiNlFlsLLrIfGisr<-* FlsL +I is T25032 223 ------FLSLFSIYCISD 234 DUF6: domain 1 of 1, from 118 to 234: score -17.8, E = 3.6 *->fiWalytvfskklle.spltftawrfliagilllilllflkkgppll f++a++ v+ ++++ s+ t+ + +++a + f+ ++ T25032 118 FAYAFTPVIRTLTTSiSTDTIYSTSIITAI-----FSCFFHDYG--- 156 allslk.ilallylgilgtalgyllyfyalkyvsaskasvlsslsPvftl ++ + +++ + g+ + + +l+ + ++ v+ ++ ++ + T25032 157 VKAPVVsYPTSVSTGLSS---AIFLLSRLEGDTPTLLLLVVAFTLHAYGA 203 
ilsvllLgEkltlkqllGivlillGvllisl<-* + +++ + l+ + +l+l++++ is T25032 204 EFRNRIFHVYPCLSSTIFCFLSLFSIYCISD 234 ABC-3: domain 1 of 1, from 62 to 254: score -213.3, E = 62 *->qyefmqrAllasilvglacgiLGsFlVLRRqSLmGDAiSHavLpGVA +y ++ +l s +S T25032 62 MYFVLFLNILHS-----------------NWS--------------- 76 LAffLginkSleipliGAflfgliaAvai.gylkrnsrlkeDtaiGIvfs in +l+++f ++++ +++ l n++ + a I + T25032 77 ------IN-----ILYSVFSLTIVLYLFFcKFLIPNPANAKEHARTIFTL 115 sflAlGl..llislikgsnaaskvdLdhyLFGniLgisqqDliqiaiita + +A ++++++ +l+ is + +++++iita T25032 116 FIFAYAFtpVIRTLTT-------------------SISTDTIYSTSIITA 146 iiLlllllfwke..LllitFDpdlAkviGlpvnflkllLl.....iLlal i ++ f+++ + ++ ++ ++ +Gl+ ++ l l++++++Ll l T25032 147 I----FSCFFHDygVKAPVVSYPTSVSTGLSSAIFLLSRLegdtpTLLLL 192 tiVvalqaVGvILViAlLitPAatArlltkslesm..lliAsaiGvvssv +++ l a G+ + ++ ++ ++l + ++ ++s++ T25032 193 VVAFTLHAYGA---------------EFRNRIFHVypCLSSTIFCFLSLF 227 aGlllS.YyfdtatGpvIVLiatllFlisflfa<-* + +S+++++++ ++++ll ++++++ T25032 228 SIYCISdFSLELS------ICFALLHIFILFIC 254 ABC2_membrane: domain 1 of 1, from 29 to 266: score -149.6, E = 52 *->lrslfryreLilqLlkrdiKtRYrgSaLGylWsfLnPLlmilvytfv l +l+++ +++ +d Y+ ++G + fL+ L mi y ++ T25032 29 LKELRKNVSVVH----YD----YKSAVFGCM-NFLTHLDMITMYFVL 66 FsfllrsrlpgddrlnyivylltGllpwqfFsealsrgtssvvanasLik F ++l+s+++ +++ y+v+ lt +l +f + + +++ ++ +++ T25032 67 FLNILHSNWS--INILYSVFSLTIVLYLFFCKFLIPNPANAKEHARTIFT 114 KlnfPreilplsavlselvvnflisliillivlilfgenp...fswnvll f ++p++++l+ +++ ++ +i +i+ ++ + ++ + v++ T25032 115 LFIFAYAFTPVIRTLTTS-ISTDTIYSTSIITAIFSCFFHdygVKAPVVS 163 lpllllllvlfslGlglilsalgvffRDigqilglllqllfflsPifYpl +p + +Gl+ sa+ + R g ++ + l+ + + l T25032 164 YPTSVS------TGLS---SAIFLLSRLEGD------TPTLLLLVVAFTL 198 stiPeqyrsillelNP.....lvhiiesyRdillgGawvpsdlesllyll ++ +++ r+ ++ P ++++++++ ++ +++ + +le + + T25032 199 HAYGAEFRNRIFHVYPclsstIFCFLSLFSIYCISD--F--SLELSICFA 244 lvslillliGlliFrkfekrfa<-* l+ + +l+i li + + T25032 245 LLHIFILFICPLILILKQTGKC 266 Cyto_ox_2: domain 1 of 1, from 59 to 267: score -252.3, E = 22 
*->elLplvWfvligvllfgYvvlDGFDlGVGmllpflakdeeERRillN +++Y+vl +l l ++ T25032 59 -------------MITMYFVL---------FLNILHSNW-------- 75 sIGPVWDGNEVWLvlaGGALFAAFPlaYAtllSglYlPl.ilvLvgLiFR ++lY +++++++ L+F T25032 76 ------------------------------SINILYSVFsLTIVLYLFF- 94 GVaFEyRgKiedakWkkvWDwaffiGSlvpalllGvafGnllqGlPFlvd K ++ ++ ++ + + + af+ +++ l+ + T25032 95 -------CKFLIPNP---ANAKEHARTIFTLFIFAYAFTPVIRTLTTSIS 134 adlrtsYaGsswdlLnRPfaLLcGlvlvslyalhGatflalKTeGeLqer d t Y + ++++++++ ++ T25032 135 TD--TIYS-------T---SIITAIFSCFF-------------------- 152 arklArylafvtLvavllvglwllyGiDGyvlvsiDtpatsaplakrVav +++++++ +v+ + gl+ +a+ + + T25032 153 -HDYGVKAPVVSYPTSVSTGLS---------------SAIFLLSRLE--- 183 eigaWwfnfprmpillalpvLgvvafllllvalrrgrygwaFiltlll.i + ++l+l v+++ + + ++ + +r ++++ ++++++ + T25032 184 ----------GDTPTLLLLVVAFTLHAYGAEFRNRIFHVYPCLSSTIFcF 223 alailgagislfPnvmPSsldpaysLTiwnAassplTLkiMLvialiflP ++++ ++ is f S +L i A ++ +++l++ P T25032 224 LSLFSIYCISDF-----S-----LELSICFA--------LLHIFILFICP 255 ivLgYtiwsYwVFRGKis<-* ++L + GK + T25032 256 LILI------LKQTGKCT 267 BPD_transp_2: domain 1 of 1, from 29 to 272: score -139.8, E = 36 *->lsqlilnglv........lgglyailAlgltlvyGvagipnfahGef l++l+ n +v + + ++ ++g + +l ++ + +n+ h+++ T25032 29 LKELRKNVSVvhydyksaVFGCMNFLTHLDMITMYFVLFLNILHSNW 75 yllgaylayvllslglsl...glailylGlvalltaafvglgayvaalll + ++y ++ l ++++ + + l+++ +++ a ++ +++a+++ T25032 76 SINILYSVFSLTIVLYLFfckFLIPN---PANAKEHARTIFTLFIFAYAF 122 kplrlrgllllsvAvitvGvvaaiatliigiilllllgkyleiltgirgl +p+++ + s+ + t++ + i+++++ ++ ++ g+ T25032 123 TPVIRTL--TTSI---------STDTIYSTSIITAIFSCFFHDY-GVKAP 160 sispfvfialefggieiggsyllsvpdaglelsrtvreggWdrfslifga ++s + ++ + + ++++ s+l +++ T25032 161 VVSYPTSVSTGLSSAIFLLSRLEGDTPT---------------------- 188 flspyyrkvilylviLlllaLvlilyrrLlnskfGralrAvredeelAra ll+Lv++++ ++G +r T25032 189 -----------------LLLLVVAFTL----HAYGAEFR----------- 206 lGinvekiklltFaiSsalAGlAGaLyalytGvisPeigdfavlilkafa ++ +++P +++ ++ f+ T25032 207 ---------------------------NRIF-HVYPCLSST----IFCFL 
224 ivvLGGlGniyGavlGtlglviglsleviltlyiaylfggplaLivvtvl + ++ i+ +++ l+l+i +++ T25032 225 S-----------------LFSIYCISDFSLELSICFALLH---------- 247 fnFFgivllkdvvflllliavLilkPqGl..fgkke<-* + +l + ++Lilk +G+ +++ ++ T25032 248 -----------IFILFICPLILILKQTGKctIHGPW 272 UPF0073: domain 1 of 1, from 62 to 277: score -149.0, E = 23 *->sSakNAfkkcfkSiFswHNETvNIWtykkekflerlvklsHLlGfil + f++ f++i ++ N ++NI ++f l T25032 62 MY----FVL-FLNILHS-NWSINILY----------------SVFSL 86 Fffllildflfllv.pilasvtshLyilqdrvvfgfftdlcvhdlagWpf ++l +++ fl+++p++a +r++f++f +++ T25032 87 TIVLYLFFCKFLIPnPANAK-------EHARTIFTLF-------IFA--- 119 yfl.gaflCLllSsiyHtfschSlekvsefflkl..DYlGIsllIvaSfi y +++++ L S + t++++S + +f+++ +DY G + v S+ T25032 120 YAFtPVIRTLTTSISTDTIYSTS--IITAIFSCFfhDY-GVKAPVV-SYP 165 piiYyaFychpffrtlYisiilvLGliaiyvslsdkFsspkfRkrRvplR + ++++ ++ +f+ + + +L l+++++ l+ + + R + T25032 166 TSVSTGLSSAIFLLSRLEGDTPTLLLLVVAFTLHAYGAEFRNR----IFH 211 agfFvllglsGviPllHalilfgghenlkvrialpwvllmallYivGavf + ++ + ++ ++l+ +++ +++ + + + ll ++ +++ ++ T25032 212 VYPCLSSTIFCFLSLFSIYCISDFSLE----LSICFALLHIFILFICPLI 257 YgtRIPERffrCPHaGKFDivGhSHQlFHvlVVlaafcHyravl<-* ++ + GK+ i G + av T25032 258 LILK-----Q--T--GKCTIHGP---------------WDEAVP 277 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. 
Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- PSI_8 Photosystem I reaction centre subunit VIII 1.7 99 1 DsbB Disulfide bond formation protein DsbB 1.6 26 1 ATP-synt_8 ATP synthase protein 8 1.4 27 1 UPF0073 Uncharacterised protein family (Hly-III / 1.0 39 1 UPF0118 Domain of unknown function DUF20 0.1 64 1 hormone Somatotropin hormone family -1.2 33 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- ATP-synt_8 1/1 73 114 .. 7 48 .. 1.4 27 UPF0118 1/1 109 129 .. 1 24 [. 0.1 64 hormone 1/1 188 198 .. 1 11 [. -1.2 33 UPF0073 1/1 217 229 .. 1 17 [. 1.0 39 PSI_8 1/1 241 250 .. 16 25 .] 1.7 99 DsbB 1/1 239 263 .. 140 164 .] 1.6 26 Alignments of top-scoring domains: ATP-synt_8: domain 1 of 1, from 73 to 114: score 1.4, E = 27 *->spWfliflsswltLliilqlKilshtfpnnpslkktklkkkp<-* s+W +++l s ++L i+l l + +++ pn+ ++k ++ ++ + T25032 73 SNWSINILYSVFSLTIVLYLFFCKFLIPNPANAKEHARTIFT 114 UPF0118: domain 1 of 1, from 109 to 129: score 0.1, E = 64 *->llliliflllilafipfinvietl<-* +++i++++++++af+p+i+ tl T25032 109 ARTIFTLFIFAYAFTPVIR---TL 129 hormone: domain 1 of 1, from 188 to 198: score -1.2, E = 33 *->lLLLLlvSnlL<-* +LLLL+v++ L T25032 188 TLLLLVVAFTL 198 UPF0073: domain 1 of 1, from 217 to 229: score 1.0, E = 39 *->sSakNAfkkcfkSiFsw<-* sS +++cf+S+Fs+ T25032 217 SS----TIFCFLSLFSI 229 PSI_8: domain 1 of 1, from 241 to 250: score 1.7, E = 99 *->itMalLFlYI<-* i +alL+++I T25032 241 ICFALLHIFI 250 DsbB: domain 1 of 1, from 239 to 263: score 1.6, E = 26 *->itmpqlslvaFililivlvLllisk<-* +++ +++l +Fil+++ l L+l++ T25032 239 LSICFALLHIFILFICPLILILKQT 263 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public 
License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: T25032.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score  E-value   N
-------- -----------                                    -----  -------  ---
         [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f  seq-t     hmm-f  hmm-t       score  E-value
-------- ------- -----  -----     -----  -----       -----  -------
         [no hits above thresholds]

Alignments of top-scoring domains:
         [no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Prosite

---------------------------------------------------------
|          ppsearch (c) 1994 EMBL Data Library          |
|      based on MacPattern (c) 1990-1994 R. Fuchs       |
---------------------------------------------------------

PROSITE pattern search started: Thu Feb 28 17:08:55 2002

Sequence file: T25032.fa

----------------------------------------
Sequence T25032 (282 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
   20: NYSG
   35: NVSV
   74: NWSI
Total matches: 3

Matching pattern PS00005 PKC_PHOSPHO_SITE:
  263: TGK
Total matches: 1

Matching pattern PS00006 CK2_PHOSPHO_SITE:
   22: SGGD
   55: THLD
  180: SRLE
Total matches: 3

Matching pattern PS00008 MYRISTYL:
  171: GLSSAI
Total matches: 1

Total no of hits in this sequence: 8

========================================

1314 pattern(s) searched in 1 sequence(s), 282 residues.
Total no of hits in all sequences: 8.
Search time: 00:00 min
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library

***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;

>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>ER-GOLGI-traffic signal is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M minimal SH3 binding is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name
>T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. ;LENGTH=282; DIRECT_SEQUENCE
n 1 solutions
m
%_D 185-185
%_XXXL 186-189
%_L 190-190
f
>STATISTICS Total : 1 solutions in 1 sequences, 282 units; out of 1 sequences, 282 units

>TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>EXTRACELL-M minimal furin protease cleavage site motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>EXTRACELL-M extended furin protease cleavage site motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>EXTRACELL-M zinc binding motif in MMPs is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>EXTRACELL-M g alpha binding go loco is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>NUCLEAR NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units

>NUCLEAR NLS SV40 LrgT experimentally determined is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282
units >NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; 
out of 1 sequences, 282 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 
units; out of 1 sequences, 282 units >NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hGlu.cort.experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name 
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Hunt.Dis.pro 
experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Nucloplasmin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units 
>NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 
units; out of 1 sequences, 282 units >NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS Max experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >PDZ domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units >WW domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 282 units ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~ ~~~ Start with HMM-search search against own library hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine 
HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/own/own-hmm.lib Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/own/own-hmm-f.lib Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ L. Aravind's signalling DB+ PSSM from other authors IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. 
Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. (282 letters) Searching..................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value UBHYD Ubiquitin C-terminal hydrolase domain 22 1.3 PDE cyclic NMP phosphodiesterase domain 20 3.6 ACYC Adenylyl/Guanylyl cyclase domain 20 4.0 AP2 A plant specific DNA binding domain (Apetala 2 like) 20 4.2 DHHC Novel zinc finger domain with DHHC signature 19 7.7 VWA Von Willebrand factor A domain 19 9.9 >UBHYD Ubiquitin C-terminal hydrolase domain Length = 884 Score = 21.5 bits (45), Expect = 1.3 Identities = 4/71 (5%), Positives = 4/71 (5%), Gaps = 11/71 (15%) Query: 48 FGCMNFLTHLDMITMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIPNPANAKE 107 Sbjct: 27 MPTEGDDSSKSVPLALQRVFYELQHSDKPVG------TKKLTKSFGWETLDSFMQHDVQE 80 Query: 108 HARTIFTLFIF 118 Sbjct: 81 -----LCRVLL 86 >PDE cyclic NMP phosphodiesterase domain Length = 350 Score = 20.1 bits (41), Expect = 3.6 Identities = 19/81 (23%), Positives = 19/81 (23%), Gaps = 1/81 (1%) Query: 104 NAKEHARTIFTLFIFAYAFTPVIRTLTTSI-STDTIYSTSIITAIFSCFFHDYGVKAPVV 162 Sbjct: 57 NALYRKNNRYHNFTHAFDVTQTVYTFLTSFNAAQYLTHLDIFALLISCMCHDLNHPGFNN 116 Query: 163 SYPTSVSTGLSSAIFLLSRLE 183 Sbjct: 117 TFQVNAQTELSLEYNDISVLE 137 Score = 18.5 bits (37), Expect = 9.9 Identities = 6/15 (40%), Positives = 6/15 (40%) Query: 48 FGCMNFLTHLDMITM 62 Sbjct: 86 FNAAQYLTHLDIFAL 100 >ACYC Adenylyl/Guanylyl cyclase domain Length = 244 Score = 20.2 bits (42), Expect = 4.0 Identities = 5/94 (5%), Positives = 5/94 (5%), Gaps = 25/94 (26%) Query: 29 LKELRKNVSVVHYDYKSAVFGCMNFLTHLDMITMYFVLFLNILHSNWSINILYSVFSLTI 88 Sbjct: 31 VLHERRLITVLVADMRNFTGMAQQ-VEEELLSML--------------IGNWFRQAGHIL 75 Query: 89 VLY----------LFFCKFLIPNPANAKEHARTI 112 Sbjct: 76 REAGSWVDKYIGDAVMAIWFHGYNEATPAEIIQI 109 >AP2 A plant 
specific DNA binding domain (Apetala 2 like) Length = 218 Score = 20.1 bits (41), Expect = 4.2 Identities = 6/14 (42%), Positives = 6/14 (42%) Query: 16 PFPDNYSGGDAQFL 29 Sbjct: 189 PYPPQWSEGDYQMI 202 >DHHC Novel zinc finger domain with DHHC signature Length = 217 Score = 18.9 bits (38), Expect = 7.7 Identities = 10/64 (15%), Positives = 10/64 (15%), Gaps = 3/64 (4%) Query: 74 NWSINILYSVFSLTIVLYLFFCKFLIPNPANAKEHARTIFTLFIFAYAFTPVIRTLTTSI 133 Sbjct: 45 SWPPHPL-QIVAWLLYLFFAVIGFGILVPLLPHHWVPAGYACMGAIFAG--HLVVHLTAV 101 Query: 134 STDT 137 Sbjct: 102 SIDP 105 >VWA Von Willebrand factor A domain Length = 255 Score = 18.7 bits (38), Expect = 9.9 Identities = 7/39 (17%), Positives = 7/39 (17%), Gaps = 4/39 (10%) Query: 23 GGDAQFLKELRKNVSVVHYDYKSAVFGCMNFLTHLDMIT 61 Sbjct: 197 SAQVAICKEL---CKATNYGD-ESFYKILLDETHLKELF 231 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 6 Number of calls to ALIGN: 7 Length of query: 282 Total length of test sequences: 20182 Effective length of test sequences: 16941.0 Effective search space size: 4264911.7 Initial X dropoff for ALIGN: 25.0 bits Y. Wolf's SCOP PSSM IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. 
(282 letters) Searching.................................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value gi|2136468 [19..195] Lipocalins 27 0.56 gi|120277 [2..148] Flavodoxin-like 25 1.7 gi|113882 [84..473] Serpins 24 3.7 gi|1197643 [1..285] NAD(P)-binding Rossmann-fold domains 24 4.1 gi|121213 [9..130] Globin-like 23 4.6 gi|2118666 [36..183] 4-helical cytokines 23 6.5 gi|114545 [80..382] P-loop containing nucleotide triphosphat... 23 6.6 gi|547785 [449..624] Protein kinases (PK), catalytic core 23 6.9 gi|1065375 [211..605] Aldehyde ferredoxin oxidoreductase, C-... 23 7.6 gi|135013 [1..310] Periplasmic binding protein-like II 23 7.6 gi|2842409 [43..367] beta/alpha (TIM)-barrel 23 9.2 >gi|2136468 [19..195] Lipocalins Length = 177 Score = 26.5 bits (58), Expect = 0.56 Identities = 10/53 (18%), Positives = 10/53 (18%), Gaps = 3/53 (5%) Query: 9 KILYRKQPFPDNYS---GGDAQFLKELRKNVSVVHYDYKSAVFGCMNFLTHLD 58 Sbjct: 74 VGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIIDTDYDTYAVQYSCRLLNLD 126 >gi|120277 [2..148] Flavodoxin-like Length = 147 Score = 24.9 bits (54), Expect = 1.7 Identities = 8/57 (14%), Positives = 8/57 (14%) Query: 42 DYKSAVFGCMNFLTHLDMITMYFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFL 98 Sbjct: 48 GFDLVLLGCSTWGDDSIELQDDFIPLFDSLEETGAQGRKVACFGCGDSSYEYFCGAV 104 >gi|113882 [84..473] Serpins Length = 390 Score = 23.7 bits (51), Expect = 3.7 Identities = 11/86 (12%), Positives = 11/86 (12%), Gaps = 8/86 (9%) Query: 128 TLTTSISTDTIYSTSIITAIFSCFFHDYGVKAPVVSYPTSVSTGLSSAIFLLSRLEGDTP 187 Sbjct: 211 DNSTSVSVPMLSGTGNF-----QHWSDAQNNFSVTRVPL---GESVTLLLIQPQCASDLD 262 Query: 188 TLLLLVVAFTLHAYGAEFRNRIFHVY 213 Sbjct: 263 RVEVLVFQHDFLTWIKNPPPRAIRLT 288 >gi|1197643 [1..285] NAD(P)-binding Rossmann-fold domains Length = 285 Score = 23.8 bits (50), Expect = 4.1 Identities = 14/165 (8%), Positives = 14/165 (8%), Gaps = 6/165 (3%) Query: 24 GDAQFLKELRKNVSVVHYDYKSAVFGCMNFLTHLDMITMYFVL-FLNILHSNWSINILYS 82 Sbjct: 52 
LDEKNLCSLVENINPDIVIHIASLTSVTHDYSTIENLLRSNIEFPTKLLEAMEVAGVKKF 111 Query: 83 VFSLTIVLYLFFCKFLIPNPANAKEHARTIFTLFIFAYAFTPVIRTLTTSISTDTIYSTS 142 Sbjct: 112 INTGTTWQNYNSADYEPVNLYAATKQAFEDILKYYIFAKGFSSITLRLFDTYGPNDTRKK 171 Query: 143 IITAIFSCFFH----DYGVKAPVVSYPTSVSTGLSSAIFLLSRLE 183 Sbjct: 172 LIPLLDRLAETKESLDMSEGNQEIEL-VHINDICSAYKTAILKLQ 215 >gi|121213 [9..130] Globin-like Length = 122 Score = 23.4 bits (50), Expect = 4.6 Identities = 3/15 (20%), Positives = 3/15 (20%) Query: 200 AYGAEFRNRIFHVYP 214 Sbjct: 15 EDGLKFYQTLFDSNS 29 >gi|2118666 [36..183] 4-helical cytokines Length = 148 Score = 22.9 bits (48), Expect = 6.5 Identities = 22/77 (28%), Positives = 22/77 (28%), Gaps = 7/77 (9%) Query: 8 QKILYRKQPFPDNYSGGDA-QFLKELRKNV-SVVHYDYKSAVFGCMNF-----LTHLDMI 60 Sbjct: 55 QDIIDETMRFKDNTPNRNATERLQELSNNLNSCFTKDYEEQNKACVRTFHETPLQLLEKI 114 Query: 61 TMYFVLFLNILHSNWSI 77 Sbjct: 115 KNFFNETKNLLEKDWNI 131 >gi|114545 [80..382] P-loop containing nucleotide triphosphate hydrolases Length = 303 Score = 23.1 bits (48), Expect = 6.6 Identities = 9/143 (6%), Positives = 9/143 (6%), Gaps = 6/143 (4%) Query: 11 LYRKQPFPDNYSGGDAQFLKELRKNVSVVHYDYKSAVFGCMNFLTHLDMITMYFVLFLNI 70 Sbjct: 93 NIAKGHGGLSVFAGVGERTREGNDLLREMLESGIIKYGDDFMHSMEEGGWDLSKVDKSVM 152 Query: 71 LHSNWSINILYSVFS------LTIVLYLFFCKFLIPNPANAKEHARTIFTLFIFAYAFTP 124 Sbjct: 153 KDSKATFVFGQMNEPPGARARVALSGLTIAEYFRDGAGEGQGKDVLFFVDNIFRFTQAGS 212 Query: 125 VIRTLTTSISTDTIYSTSIITAI 147 Sbjct: 213 EVSALLGRMPSAVGYQPTLATEM 235 >gi|547785 [449..624] Protein kinases (PK), catalytic core Length = 176 Score = 22.7 bits (47), Expect = 6.9 Identities = 10/89 (11%), Positives = 10/89 (11%), Gaps = 2/89 (2%) Query: 14 KQPFPDNYSGGDAQFLKELRKNVSVVHYDYKSAVFGCMNFLTHLDMITMYFV--LFLNIL 71 Sbjct: 30 KIVKFKGHQNIKKHVLREVAIWRTLKHNRILPLLDWKLDDNYAMYCLTERINDGTLYDLV 89 Query: 72 HSNWSINILYSVFSLTIVLYLFFCKFLIP 100 Sbjct: 90 ISWDEFKRSKIPFAERCRLTIFLSLQLLS 118 >gi|1065375 [211..605] Aldehyde ferredoxin oxidoreductase, C-terminal domains Length = 395 Score = 22.6 bits (48), 
Expect = 7.6 Identities = 7/33 (21%), Positives = 7/33 (21%) Query: 141 TSIITAIFSCFFHDYGVKAPVVSYPTSVSTGLS 173 Sbjct: 275 TALIDSAGLCLFTTFGLGADDYRDLLNAALGWD 307 >gi|135013 [1..310] Periplasmic binding protein-like II Length = 310 Score = 22.7 bits (48), Expect = 7.6 Identities = 8/36 (22%), Positives = 8/36 (22%), Gaps = 1/36 (2%) Query: 8 QKILYRKQPF-PDNYSGGDAQFLKELRKNVSVVHYD 42 Sbjct: 138 YLAAWGYALHHNNNDQAKAEDFVKALFKNVEVLDSG 173 >gi|2842409 [43..367] beta/alpha (TIM)-barrel Length = 325 Score = 22.6 bits (48), Expect = 9.2 Identities = 3/42 (7%), Positives = 3/42 (7%) Query: 201 YGAEFRNRIFHVYPCLSSTIFCFLSLFSIYCISDFSLELSIC 242 Sbjct: 93 PWGDCQAYFNNGFSSYKDFGFGPGTTYNGGSQEDCFKESHPN 134 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 1187 Number of sequences better than 10.0: 11 Number of calls to ALIGN: 11 Length of query: 282 Total length of test sequences: 256703 Effective length of test sequences: 214185.0 Effective search space size: 52710686.0 Initial X dropoff for ALIGN: 25.0 bits ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ calculation of internal repeats with prospero ***** PROSPERO v1.3 Thu Feb 28 17:09:29 2002 ***** Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford For help see http://www.well.ox.ac.uk/ariadne For usage use -help using gap penalty 11+1k using matrix BLOSUM62 printing all alignments with eval < 0.100000 using sequence1 T25032 using self-comparison ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ TIGRFAM hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/tigrfam/tigrfam.hmm Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- TIGR01065 hlyIII: channel protein, hemolysin III fami -146.5 97 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- TIGR01065 1/1 51 229 .. 1 224 [] -146.5 97 Alignments of top-scoring domains: TIGR01065: domain 1 of 1, from 51 to 229: score -146.5, E = 97 *->eEiaNaiTHg..iGavlsiialalLviyavdhgavavvgfsiYGiSL +N TH++ i + +++ +l + ++++ +s++ + + T25032 51 ---MNFLTHLdmITMYFVLFLNILHSNWSINIL------YSVFSL-T 87 ilLFlvSTlYHsipWkgskWPCLWqlrtaknwlrkiDHs.mIY.VLI.AG i+L+l+ + + + + ++ ak H+++I++++I A T25032 88 IVLYLFFCKFLIPN-P-AN---------AKE------HArTIFtLFIfAY 120 TYTPflllanldgplgwtvlviiWglAilGIilklffh....krpfrwLs TP+ ++ + ++ t+ ++ i+ i+ ffh+ + k+p ++s T25032 121 AFTPVIRTL-TTSISTDTIY----STSIITAIFSCFFHdygvKAP--VVS 163 lvlYLvMGWlvvlvikplyenLpgaglvlLalGGllYtvGaifYalkwki ++ + G++ ++++ + +e +++ l+l ++ t+ a Y+++ + T25032 164 YPTSVSTGLSSAIFLLSRLEGDTPTLLLLVVA----FTLHA--YGAEFRN 207 PPRRLGnfntfgyHaIWHlFVlgASacHfvailfyv<-* + I+H + S ++ lf++ T25032 208 R--------------IFHVYPCLSSTIFCFLSLFSI 229 // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/tigrfam/tigrfam.hmm-f Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. 
Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- TIGR00353 nrfE: c-type cytochrome biogenesis protein -1.5 43 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- TIGR00353 1/1 222 238 .. 1 17 [. -1.5 43 Alignments of top-scoring domains: TIGR00353: domain 1 of 1, from 222 to 238: score -1.5, E = 43 *->AfgvLvYsFaVnDFsVe<-* f+ L+ + ++DFs+e T25032 222 CFLSLFSIYCISDFSLE 238 // SMART hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/iprscan/data/smart.HMMs Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- LRRcap occurring C-terminal to leucine-rich repeats 2.6 1e+02 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- LRRcap 1/1 40 58 .. 1 19 [] 2.6 1e+02 Alignments of top-scoring domains: LRRcap: domain 1 of 1, from 40 to 58: score 2.6, E = 1e+02 *->laqYRekVirlLPqLrqLD<-* + +Y ++V+ + +L+ LD T25032 40 HYDYKSAVFGCMNFLTHLD 58 // COG hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/cogs/cogs.hmm Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- COG0671 -26.1 18 1 COG1563 -36.4 49 1 COG1687 -69.9 45 1 COG2891 -89.7 70 1 COG0705 -93.0 74 1 COG3371 -93.8 21 1 COG0839 -97.9 59 1 COG1684 -171.9 42 1 COG1108 -172.7 20 1 COG1173 -183.0 9 1 COG1968 -199.3 63 1 COG0395 -207.7 80 1 COG0109 -246.7 74 1 COG0601 -261.6 83 1 COG0733 -355.3 44 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- COG1563 1/1 45 131 .. 1 93 [] -36.4 49 COG2891 1/1 28 175 .. 1 173 [] -89.7 70 COG0839 1/1 55 208 .. 1 192 [] -97.9 59 COG1687 1/1 125 229 .. 1 118 [] -69.9 45 COG0705 1/1 46 234 .. 1 243 [] -93.0 74 COG0109 1/1 24 236 .. 1 331 [] -246.7 74 COG0671 1/1 63 245 .. 1 244 [] -26.1 18 COG1968 1/1 56 253 .. 1 285 [] -199.3 63 COG1108 1/1 50 260 .. 1 292 [] -172.7 20 COG0601 1/1 77 265 .. 1 356 [] -261.6 83 COG1173 1/1 83 265 .. 1 259 [] -183.0 9 COG0733 1/1 11 268 .. 1 491 [] -355.3 44 COG3371 1/1 116 270 .. 1 217 [] -93.8 21 COG0395 1/1 106 280 .. 1 304 [] -207.7 80 COG1684 1/1 59 282 .] 
1 272 [] -171.9 42 Alignments of top-scoring domains: COG1563: domain 1 of 1, from 45 to 131: score -36.4, E = 49 *->parkallffl..ldeiilyvivilllLaAllaliqRDLLkAvIasgv a+ ++fl++ld i y + +l +L + ++ I+ +v T25032 45 SAVFGCMNFLthLDMITMYFVLFLNILHSNWSI--------NILYSV 83 lsllia..llyylLlAPDVAlTEAaVGaglstalfavavrKterleeevk +sl i+ l + l P A A a + +lf +a + t++++ T25032 84 FSLTIVlyLFFCKFLIPNPA--NAKEHARTIFTLFIFAYAFTPVIRTLTT 131 <-* T25032 - - COG2891: domain 1 of 1, from 28 to 175: score -89.7, E = 70 *->mmtRfilnrwvilvsfllalvLqlvpwPyfvsyqvlrPdwLlLvLly ++ ++ +++ + ++v++ + + + ++++ +++l + T25032 28 FLKE-LRKNVSVVHYDYKSAVFGCMNFL--THLDMITMYFVLFL--- 68 wvlalphrVGigtgfimGllfDllyGslLGvhalglsiigYlvaknylrL ++l+ + +i+++ ++ l+i Yl+ ++l+ T25032 69 NILHSNWSINILYS------------------VFSLTIVLYLFFCKFLI- 99 rnn.vllplwqqaLvvillvflvlvLiflvellimnysgFvldrFsptll +n + ++ ++++ + +i + +i + + t++ T25032 100 PNPaNAKEHARTIFTLFIFAYAFTPVIRTLTTSISTDT------IYSTSI 143 lnillailLw.....PwvflLLrkvrkelrvr<-* + + +++++ + + +++v +v ++l + T25032 144 ITAIFSCFFHdygvkAPVVSYPTSVSTGLSSA 175 COG0839: domain 1 of 1, from 55 to 208: score -97.9, E = 59 *->mmietlaFylfaalaiasalgVVlakNpvYSalyLilsllsiAglff ++++ y++ +l i+ + + N++YS++ L ++l ++ f+ T25032 55 THLDMITMYFVLFLNILHSNWSI---NILYSVFSLTIVLYLFFCKFL 98 l...lgAeFlgvlqviVYvGAVmVLflFvvMmlnigveeikeeerrylss ++++ +A + + + Lf+F+ ++++ i++ + + T25032 99 IpnpANAKEHART--------IFTLFIFAYAFTPV----IRTLTTSIS-- 134 ivgllvapvaLvlililsivyilialpsplgiaaadinktgNlkaiGavL + ++ ++ ++ ++i+s + ++ + +p+++ ++ ++ + +++ + T25032 135 TDTIYSTSII---TAIFSCFFHDYGVKAPVVSYPTSVS-----TGLSSAI 176 FtdYLlpFElas...vLLLvAmVgAIaLarrkrlgrkktdrkkqdekqkt +l l +++++LLL +V+A++L + +++++ T25032 177 ----FLLSRLEGdtpTLLL--LVVAFTL-----HA--------YGAEFRN 207 k<-* + T25032 208 R 208 COG1687: domain 1 of 1, from 125 to 229: score -69.9, E = 45 *->mtMtLieqiitIlvi...IlttyfTRllPFmVFpskkPipdYVrYLG + tL+ +i t + +++I+t f + F + +k P Y T25032 125 VIRTLTTSISTDTIYstsIITAIFSCF--FHDYGVKAPVVSYPTSVS 169 kvLpcaviGmLVvYCfKdIeiLkGphG.IpelvAalsViLLHlwkKnmlL 
L++a++ +L+ G+ p l+ ++ LH + T25032 170 TGLSSAIF------------LLSRLEGdTPTLLLLVVAFTLHAYGAEFRN 207 SIalGTilYMvLVQlvflEKAFFNi<-* I +Y L +f F i T25032 208 RIFH---VYPCLSSTIFCFLSLFSI 229 COG0705: domain 1 of 1, from 46 to 234: score -93.0, E = 74 *->avliaplkfvqqrrtrptaplvtllliiltalvffislfaldrvisl av++ ++ + +++ ++ l+ + l + + +++s f l + T25032 46 AVFGCMNFLTHLDMITMYFVLFLNILHSNWSINILYSVFSLTI---V 89 lllsrqeqllllllpnpaladgqwwRvrvslitliTsmFLHagflHLlfN l+l+ + +++p a+ + t++T T25032 90 LYLF------FCKFLIPNPANAKEH-----ARTIFT-------------- 114 mlaLwiFGtalErilGssrflllyllsgllgglanlaqyllsgpaygana ++ + + +t++ r+l +s + + ++ +++++++ ++ +a T25032 115 LFIFAYAFTPVIRTLTTSISTDTIYSTSIIT---AIFSCFFHDYGV--KA 159 s.....faphlGASGaiFGllGallllgpfarigllllliipvvlllali + + +++++ G+S aiF ll ++ + +ll+ + T25032 160 PvvsypTSVSTGLSSAIF-------LLSRLEGD------TPTLLLLVVAF 196 alvfisiwaelpgiswagVAhlAHLgGlivGlllgyllsrkkrrrryllr l ++++ + +++ ++ +l + + + y ++ T25032 197 TLHAYGAE---FRNRIFH----------VYPCLSSTIFCFLSLFSIYCIS 233 s<-* + T25032 234 D 234 COG0109: domain 1 of 1, from 24 to 236: score -246.7, E = 74 *->lvdplvrksarssiaisesarvkasqqstlKdYlqLTKPriisLlli +d ++k+ r ++++ ++ a ++++l + L +i T25032 24 -GDAQFLKELRKNVSVVHYDYKSAVFG--CMNFL-------THLDMI 60 TtlgGmfLAsrglg.....ltgsvdplllvltliGgsLaaAsacalNnyi T++ fL + ++ + + l +++ ++++++ + +L T25032 61 TMYFVLFLNILHSNwsiniLYSVFSLTIVLYLFFCKFLIP---------- 100 DRDIDakMaRTrkRPlVtGkisPrnALaFGlvLgvlGlaiLaafvNplaA +P+nA ++ ++++l +++++af+ ++ T25032 101 ---------------------NPANAKEHARTIFTL-FIFAYAFTPVIRT 128 vLglaGlffYvvlYTlwlKRrtpqNiviGGlaGAmPPLiGWaAVtGsisi +++ ++ t++++ i +t +s T25032 129 LTTSIST-------------DTIYSTSI---------------ITAIFSC 150 gawllLfaIiFlWqPPHFWaLAifrkdDYrrAgIPmLPVvkGeevTkrqI ++ DY g+ k T25032 151 FFH------------------------DY---GV------------KAPV 161 llYtialfavslllpyllGlggyilYlvvAlvlg....awfLflAiklyr + Y + s +++Gl++ ++++ + g+++++ +L+ A+ l+ T25032 162 VSY-----PTS----VSTGLSSA--IFLLSRLEGdtptLLLLVVAFTLHA 200 qqrindaddrkwArklFkySiiYLal...lfvalvidsflvlllll<-* + ++ + ++ F+ +Y 
+l++++f++l+++s +++ T25032 201 -----YGAE--FRNRIFH---VYPCLsstIFCFLSLFSIYCISDFS 236 COG0671: domain 1 of 1, from 63 to 245: score -26.1, E = 18 *->slllvekaipflpgiilallgilllalfallalllffslredllrld ++ l ++ + ++ i ++ l++++ l+ + ++ + T25032 63 YFVLFLNILHSNWSINILYSVFSLTIVLYLFFCKFLIP--------- 100 ssvfdplilnplllrflallraifgvaflvlsmlmldpynnslilglalv + +++ ++r +++ +if++af T25032 101 -----NPANAKEHART-IFTLFIFAYAFTPVI------------------ 126 lairlalglltlivalrlllflllvltlllknlhgqlrslprlrarpipl +tl+ ++ ++++ ++++ + + ++ + +++p+ T25032 127 ---------RTLTTSISTDTIYSTSIITAI------FSCFFHDYGVKAPV 161 aglapvswvqhwklasgfsFPSgHaagafafalllalllp......lgli ++ P ++g++++ +ll++l +++++ l ++ T25032 162 VS-----------------YPTSVSTGLSSAIFLLSRLEGdtptllLLVV 194 llllallvglsRvylGvHypsDVlgGallgillallvlalyrllapfllr ++ l + + R + + yp+ l +++ +l ++ ++++ + + T25032 195 AFTLHAYGAEFRNRIFHVYPC--LSSTIFCFLSLFSIYCISDFSLELSIC 242 far<-* fa T25032 243 FAL 245 COG1968: domain 1 of 1, from 56 to 253: score -199.3, E = 63 *->mmdlmslfqAliLGiveGlTEFLPISSTGHLilvshlLgfiwedelg ++d+++ +++L + ++L+ w++ + T25032 56 HLDMITM--YFVL--------------------FLNILHSNWSI--N 78 ksFeivIQlGsiLAVvlyFrkdilrllkgfikyrltkreeskdfvlgliy + ++ s+ V++ F+ ++l + + ++k++ ++f+l+ T25032 79 ILYSVF----SLTIVLYLFFCKFL-----IPNPANAKEHARTIFTLF--- 116 iilatIPavvlGlLfkdliksyLfnslw..vVaiaLIvgGilLllaEkls ++ f+ i++ L +s++++++ + I+++i+ + ++ T25032 117 ---------IFAYAFTPVIRT-LTTSIStdTIYSTSIITAIFSCFFHDYG 156 ekkdrrtedvedltlkdAliiGlAQcLAlifPGiSRSGaTIsgGlllGln +k+ +++ ++ Gl + A++ ll l+ T25032 157 VKAPV-------VSYPTSVSTGL--SSAIF--------------LLSRLE 183 ReaAaeFSFLLaIPtmlGAgllaikDllksgellntsedlppliiGFItA + + LL + A +l +++++ ++ p + +I+ T25032 184 GDTPT----LL---LLVVAFTLHAYGAEFRNRIF----HVYPCLSSTIFC 222 FvvallaIkwLLkfikrhslipFaiYRiilGililallli<-* F+ +l++I+ +f + s i +++l +++l+i T25032 223 FL-SLFSIYCISDFSLELS--------ICFALLHIFILFI 253 COG1108: domain 1 of 1, from 50 to 260: score -172.7, E = 20 *->mmslLlellqyeflqnAllagllvslacgllGsflVlRRmaligDAl m +L l+++++ + +l T25032 50 
CMNFLTHLDMITMYFVLFLN--------------------------- 69 SHaaLpGvalgyLltgqlglfllginpllgaiafgllgAllig.flrrks + + +in+l++ + ++++ l+ +fl + T25032 70 -------ILHS----------NWSINILYSVFSLTIVLYLFFCkFLIPNP 102 kvkeDtaiGivfssglalGvvllsllpgfnnsavdlmnyLFGniLaVspt +++ a i+ + +a + + + + ++ ++s++ T25032 103 ANAKEHARTIFTLFIFAYAFTPVIRTLTT----------------SISTD 136 DLwliaivslGPsVlclvllllllfyrelllisFDpefAkvlGipvrllh ++ + i+++ ++++f++++ pv + T25032 137 TIYSTSIITA----------IFSCFFHDYGVK-----------APVVSYP 165 yllllLialtiVaavkaVG....viLvsaLLiiPaatArllsrslrsmli + + ++ +i ++ ++ G++++++L ++ +++ a A + ++ + + T25032 166 TSVSTGLSSAIFLLSRLEGdtptLLLLVVAFTLHAYGAE-FRNR--IFHV 212 iAillgllsgvlGvlllSyyldlppGpsivliatllfllslllrkkygvl + l ++++++l l++ y++ + +++ +ll ++ l++ + +l T25032 213 YPCLSSTIFCFLS-LFSIYCI-SDFSLELSICFALLHIFILFICPLILIL 260 <-* T25032 - - COG0601: domain 1 of 1, from 77 to 265: score -261.6, E = 83 *->MlkYilrRlllliptllgvstlvFlil..rlaPGdakdPLykalrla + ++ ++ ++++ +l+ + l+P P a T25032 77 ---------INILYSVFSLTIVLYLFFckFLIPN----P-------A 103 eailgeaspeviealreeyGLDkPlyvQYlnylknllqsGDFGttSfvsg +a ++ + ++ + + + l+ t ++ T25032 104 NAKEH-----------ARTIFTLFIFAYAFTPVIRTLT-TSIS-TDTIYS 140 prPVsdlIkerlPaTLeLallalilslliGipLGiiaAlkrnswiDylir ++++ I++ ++ +++ a ++s+ ++ G +A T25032 141 T-SIITAIFSCFFHD--YGVKAPVVSYPTSVSTGLSSA------------ 175 ilaligiSiPsFwlAllLillFavgvkLgwlPvgGrysplefldpstggd i+l + g +P+ + + T25032 176 ------------------IFLLSR--LEGDTPTLLLLVV----------- 194 ppiTglylidallsgswekfldvlkHliLPaltLglvslAgiaRltRnsm T25032 195 -----------------------AF------------------------- 196 levLnqDYirtArAKGLserrviykHaLRNAllPviTvlGlqlggllgGa t A G r i+ H++ P ++ + +++l++ T25032 197 ----------TLHAYGAEFRNRIF-HVY-----PCLSSTIFCFLSLFS-- 228 vitEtvFswPGlGrllvdAIlnrDYpvvqgsvliiallvllgNLivDllY ++ + + l +++ + l +++ ++ T25032 229 ------------------------IYCISDFSLELSICFALLHIFILFIC 254 ailDPRIRyeg<-* ++ + ++ T25032 255 PLILILKQTGK 265 COG1173: domain 1 of 1, from 83 to 265: score -183.0, E = 9 
*->villliivlalfaplllpgdpdaldlspllpskehlLGTDdlGGRDi v+ l+i+++ +f +l p + +a+ + R i T25032 83 VFSLTIVLYLFFCKFLIPNPANAK----------------EHA-RTI 112 fsRllyGaRiSLliGliavlisl....lIGillGliaGYfG..GwvDevi f+ +++ + +i + + is+++ i+ +++ +f + G v+ T25032 113 FTLFIFAYAFTPVIRTLTTSISTdtiySTSIITAIFSCFFHdyGVKAPVV 162 MRitdillaiPgllLlIllvailg.....llniilalglvgWpgyARvvR + t + + + +I l + l++++++ll++++a+ T25032 163 SYPTSVSTGLSS---AIFLLSRLEgdtptLLLLVVAF------------- 196 gqvLslrereYVeAAkalGasdlrIifkHiLPNvlspiivqatlsiggaI a Ga + if H+ P + s i+ +++l I T25032 197 -------------TLHAYGAEFRNRIF-HVYPCLSSTIFCFLSLFSIYCI 232 ltEAgLSFLGLGaqpPtpsWGamLsdgrnagnayl.gawWlllfPGlaIv + + L L + + +L+ + l + +l+ T25032 233 SDFS----LELSIC--FA----LLHIFI------LfICPLILIL------ 260 ltvLafnllGDGLRDAlDPrlrrk<-* ++ +k T25032 261 -------------------KQTGK 265 COG0733: domain 1 of 1, from 11 to 268: score -355.3, E = 44 *->mmmkrqWsSRlGFILAAaGSAVGLGNIWRFPymageNGGGAFlLpYL ++k+ P+ +GG A +L T25032 11 LYRKQ-------------------------PFPDNYSGGDAQFL--- 29 ialllvGiPlllaEfaIGRYGGrrtrknavdafrrLapkkgkkdsrkwew + rkn+ + ++ ++++ ++ T25032 30 ----------------------KELRKNV----SVVHYDYKS------AV 47 vGwfgvavafvIlsyYsViiGWilsYlvksitGalpgdtdaakfFgeyFq +G + +++ + + Y V+ ++ + +++ + + + ++F+ T25032 48 FGCMNFLTHLDMITMYFVL-------FLNILHSNWSI-N-ILY---SVFS 85 ssingpgdignpvlavfffllfmvitaliVssrGvkkGIEkankilMPlL ++i + +f+ ++i + + + E a +i T25032 86 LTIV-----------LYLFFCKFLIPNPANA-K------EHARTI----- 112 FvlfiiLviyalTLpGPGAmeGlkfllsPDfsklkddLpkvwlaAlGQiF + +++ +ya+T p l+ T25032 113 --FTLFIFAYAFT-PV--------------IRTLT--------------- 130 FsLSLGfGiMiTYaSYLpKkedlvksalsvvllNtlvslLAGlmiFpalF +++ +++++ + +i++a+F T25032 131 --------------------TSISTDTIYST------------SIITAIF 148 vfgaaggkpvsevssGsllaGpGLvFivLPavFnqlpailGtifgilFFl + + ++ +++ p + p+ T25032 149 SCFFH-----DYGVKA-----PVVSY----------PT-S---------- 167 lLvfAaLTSaISmlEvlvaaLidkfgisRkkatwlvggvifllGvpsals v +L+SaI l + L+ + + +++l++++++l T25032 168 --VSTGLSSAI----FLLSRLE-----GDTPTLLLLVVAFTLHA------ 200 
lggvwsdvliFGlslFDlvDffasnilmplgaLlivifvgWvfkkdklrk +g + f + i+ + +L + if ++ + + T25032 201 YG-----------------AEFRNRIFHVYPCLSSTIFCFLSLFSIYCIS 233 elnsgsptsdikvgkiWlylvrYfitPiiiaivlflsaigilse<-* ++ ++i + + +++ fi+P+i++ l + g + T25032 234 DFSL---ELSICFALLHIFIL--FICPLILI----LKQTGKCTI 268 COG3371: domain 1 of 1, from 116 to 270: score -93.8, E = 21 *->miknyamtpkmlkilgiigpliailgilisvllNrpWFSfTkNALSD i+ ya tp + ++i + + i + + f++ D T25032 116 FIFAYAFTPVIRTLTTSISTDTIYSTSIITAIFS----CFFH----D 154 LGggnLLqnGHvkapkpWvyNyGLIigGvlvllfSvdlgilalkksenlg G vkap + + +s +l T25032 155 YG---------VKAP----------------------VVSYPTSVSTGLS 173 ralliiSglFLaLIGvFpEGtGrpHvfvSilFFilmfiamlilSvrASLP +a++++S l EG + + +l a+ + T25032 174 SAIFLLSRL---------EGD--------TPTLLLLVVAFTL-------- 198 AWiRvygisrlir.......klaigifgLalfivyilvfiplgwvSlAv. +yg+ ++ r + + ++++++ L+lf +y + + l l + T25032 199 ---HAYGAEFRNRifhvypcLSSTIFCFLSLFSIYCISDFSLE---LSIc 242 pEligialIlaiviflglryllkrVkdg<-* + l i ++ + ++l+l+ k +g T25032 243 FALLHIFILFICPLILILKQTGKCTIHG 270 COG0395: domain 1 of 1, from 106 to 280: score -207.7, E = 80 *->rkillylfLilfaliilfPflwlvltSfkpdgntdsselfsgpptlf ++ + ++f +++ ++ + P++ +++tS + + ++s++++ T25032 106 KEHARTIFTLFIFAYAFTPVIRTLTTSIS------TDTIYSTSIITA 146 PstftlenyfrnYrkvfklttggnfpflraflNSlivalvttvlsvllss ++ + ++g + a ++S +++t + s ++ T25032 147 IFSCFFH------------DYG-----VKAPVVSYPTSVSTGLSSAIFL- 178 lAAYAlaRlrFkGrkllfllilatlMiPfqvlliPlYllirkLGLlnplG l+R G +ll+ ++ +L+ T25032 179 -----LSR--LEGDTPTLLLL----------------VVAFTLH------ 199 vlldTywGLILpyaagglpfnifllrqfFdtIPkELeEAAriDGAspfqi +y a +r+ + T25032 200 -----------AYGAE--------FRNRI--------------------- 209 FfrIvLPLskPglAtvaiftFigsWNdFlwpliflsdpndslnyplyTLp f+++ Ls +++ +++f+ i+ dF + l + T25032 210 -FHVYPCLSSTIFCFLSLFS-IYCISDFSLELSICFA------------- 244 vgLanlingeygtdlvtapewglimAaavlaaLPililFlffQkyfvkGl L +i l + P +++ k + G T25032 245 -LLHIFI----------------------LFICPLILILKQTGKCTIHGP 271 taGg..vKG<-* + ++K T25032 272 WDEAvpLKS 280 COG1684: domain 1 of 
1, from 59 to 282: score -171.9, E = 42 *->MmefinlqplslvsltFlLllvRilaflstaPffserlvPa.vvRlg M++ + ++l ++ +++ ++++ ++v+ l+ T25032 59 MITMYFVLFLNILHSNW------------------SINILYsVFSLT 87 LALflsfivlPtlpasppqvpllsalyfaLlllkEiLlGlllGFilql.. ++L+l f++ +++p +p +++ + f+L ++ ++ ++ +++++++ T25032 88 IVLYL-FFCKFLIP-NPANAKEHARTIFTLFIFAYAFTPVIRTLTTSIst 135 .lFaAfqaAGeiIsf...QmGlgGfAsmvDpfsgeqtpliGqfltllalL + ++ ++ + i+s + G A +v+ + ++t l + ++ l T25032 136 dTIYSTSIITAIFSCffhDYGVK--APVVSYPTSVSTGLSSAIFLLSR-- 181 lFLslnGHllliliglldSfksiPvgsffpe.mnvlnenlfkfllkalsl L + + ll+l ++++++++ g f+++++ + + l + + +lsl T25032 182 --LEGDTPTLLLL-VVAFTLHAY--GAEFRNrIFHVYPCLSSTIFCFLSL 226 mFiialllalPiiialLlvdlvLGllnRaaPQlNvFviGfPLkilvGlll + i +++ f+L + + + l T25032 227 FSIYCIS-------------------------------DFSLELSICFAL 245 Lililpviaiqfknlfllafetlr....ellallgkp<-* L +++ +i + + l + t+ ++ +e ++l++ + T25032 246 LHIFILFICPLILILKQTGKCTIHgpwdEAVPLKSNT 282 // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/cogs/cogs.hmm-f Sequence file: T25032.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T25032 hypothetical protein T20D3.8 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- COG3124 2.8 6.9 1 COG2140 2.2 8.7 1 COG2704 -1.2 54 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- COG3124 1/1 51 57 .. 1 7 [. 2.8 6.9 COG2704 1/1 51 65 .. 1 15 [. -1.2 54 COG2140 1/1 157 167 .. 384 394 .] 
2.2 8.7

Alignments of top-scoring domains:
COG3124: domain 1 of 1, from 51 to 57: score 2.8, E = 6.9
                   *->MNFLAHL<-*
                      MNFL HL
      T25032    51    MNFLTHL    57

COG2704: domain 1 of 1, from 51 to 65: score -1.2, E = 54
                   *->iiFssllemsimlll<-*
                      + F++ l+m+ m ++
      T25032    51    MNFLTHLDMITMYFV    65

COG2140: domain 1 of 1, from 157 to 167: score 2.2, E = 8.7
                   *->kktPvvklptq<-*
                      +k Pvv++pt+
      T25032   157    VKAPVVSYPTS    167
//