analysis of sequence from T20374.fa
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN
GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM
GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS
TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG
DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS
CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS
KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD
LFDSPRKNGT NDKTSEKNVD PDYQ
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
sec.str. with predator
> T20374
              .         .         .         .         .
1    MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYG 50
     ___EEEEEEEEEE_______________HHHHHHHHHH__EEEEEE____
              .         .         .         .         .
51   NRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQII 100
     ___EEEEE____EEEE____________________HHHHHHHHHEEEEE
              .         .         .         .         .
101  HGHSTFSSLAHETLMIGGLMGLRTVFTDHSLFGFADASAILTNKLVLQYS 150
     E____HHHHHHHHHHHH_EEEEEEEEE______HHHHHHHHHHHHHHHHH
              .         .         .         .         .
151  LINVDQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQFFN 200
     H____EEEEEEE_____EEE_________________EEEE_________
              .         .         .         .         .
201  NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGDGPKRIELEE 250
     ___EEEE__EEEEE___HHHHHEEEEEEE____EEEEEE_____HHHHHH
              .         .         .         .         .
251  MLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS 300
     HHHHHHHHHHEEE________EEEEEE__EEEEE___HHHHHHHHHH___
              .         .         .         .         .
301  CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLM 350
     ___EEEEEEE____EEEE___EEEE________HHHHHHHHHHHHHHH__
              .         .         .         .         .
351  DPTEKHEAVSKMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIG 400
     ___HHHHHHHHH_____HHHHHHHHHHHHHH___________EEE_____
              .         .         .         .
401  FGIMYIVVSCIIIFWLTVLDLFDSPRKNGTNDKTSEKNVDPDYQ 444
     __EEEEEEEHHHHHHHHHH_________________________
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method : 1
alpha-contents : 0.8 %
beta-contents : 62.2 %
coil-contents : 37.0 %
class : beta

method : 2
alpha-contents : 0.0 %
beta-contents : 54.5 %
coil-contents : 45.5 %
class : beta
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-21.92 -0.11 -0.66 -0.35 -4.00 0.00 -8.00 0.00 -0.06 -13.36 -4.18 -12.00 -12.00 -4.00 -12.00 0.00 -92.64
-8.85 -0.67 -0.29 -1.35 0.00 0.00 -4.00 0.00 0.00 -13.36 -4.18 -12.00 -12.00 -4.00 -12.00 0.00 -72.72
ID: T20374 AC: xxx Len: 444 1:I 428 Sc: -72.72 Pv: 6.246781e-01 NO_GPI_SITE
GPI: learning from protozoa
-18.20 -2.35 -0.45 -0.17 -4.00 0.00 -16.00 0.00 0.00 -11.23 -13.79 -12.00 -12.00 0.00 -12.00 0.00 -102.19
-24.64 -0.69 -0.39 -0.13 -4.00 0.00 -4.00 0.00 0.00 -11.23 -13.79 -12.00 -12.00 -4.00 -12.00 0.00 -98.87
ID: T20374 AC: xxx Len: 444 1:I 428 Sc: -98.87 Pv: 8.473754e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name    Cmax   pos  ?   Ymax   pos  ?   Smax   pos  ?   Smean  ?
T20374    0.443  299  Y   0.399  430  Y   0.922  417  Y   0.156  N
# SignalP gram- predictions
# name    Cmax   pos  ?   Ymax   pos  ?   Smax   pos  ?   Smean  ?
T20374    0.553   24  Y   0.335   24  N   0.827    9  N   0.335  N
# SignalP gram+ predictions
# name    Cmax   pos  ?   Ymax   pos  ?   Smax   pos  ?   Smean  ?
T20374    0.513  299  Y   0.269  217  N   0.969  410  Y   0.250  N
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
low complexity regions: SEG 12 2.2 2.5
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
1-327 MSLKIGPYSIALVSDFFCPNAGGVETHIYF LAQCLIELGHRVVVITHGYGNRKGIRYLSN GLKVYYLPFIVAYNGATLGSIVGSMPWLRK VLLRENVQIIHGHSTFSSLAHETLMIGGLM GLRTVFTDHSLFGFADASAILTNKLVLQYS LINVDQTICVSYTSKENTVLRGKLDPNKVS TIPNAIETSLFTPDRNQFFNNPTTIVFLGR LVYRKGADLLCEIVPKVCARHKSVRFIIGG DGPKRIELEEMLERFKLHERVVILGMLPHN QVKRVLNQGQIFINTSLTEAFCMSIVEAAS CGLHVVSTRVGGVPEVLPIGEFISLEE pvpddlvdallkavd 328-342 343-444 RREKGLLMDPTEKHEAVSKMYNWPDVAART QVIYQKAVESEPTGRLGRLKGYYDQGIGFG IMYIVVSCIIIFWLTVLDLFDSPRKNGTND KTSEKNVDPDYQ low complexity regions: SEG 25 3.0 3.3 >T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. 1-301 MSLKIGPYSIALVSDFFCPNAGGVETHIYF LAQCLIELGHRVVVITHGYGNRKGIRYLSN GLKVYYLPFIVAYNGATLGSIVGSMPWLRK VLLRENVQIIHGHSTFSSLAHETLMIGGLM GLRTVFTDHSLFGFADASAILTNKLVLQYS LINVDQTICVSYTSKENTVLRGKLDPNKVS TIPNAIETSLFTPDRNQFFNNPTTIVFLGR LVYRKGADLLCEIVPKVCARHKSVRFIIGG DGPKRIELEEMLERFKLHERVVILGMLPHN QVKRVLNQGQIFINTSLTEAFCMSIVEAAS C glhvvstrvggvpevlpigefisleepvpd 302-342 dlvdallkavd 343-444 RREKGLLMDPTEKHEAVSKMYNWPDVAART QVIYQKAVESEPTGRLGRLKGYYDQGIGFG IMYIVVSCIIIFWLTVLDLFDSPRKNGTND KTSEKNVDPDYQ low complexity regions: SEG 45 3.4 3.75 >T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. 1-444 MSLKIGPYSIALVSDFFCPNAGGVETHIYF LAQCLIELGHRVVVITHGYGNRKGIRYLSN GLKVYYLPFIVAYNGATLGSIVGSMPWLRK VLLRENVQIIHGHSTFSSLAHETLMIGGLM GLRTVFTDHSLFGFADASAILTNKLVLQYS LINVDQTICVSYTSKENTVLRGKLDPNKVS TIPNAIETSLFTPDRNQFFNNPTTIVFLGR LVYRKGADLLCEIVPKVCARHKSVRFIIGG DGPKRIELEEMLERFKLHERVVILGMLPHN QVKRVLNQGQIFINTSLTEAFCMSIVEAAS CGLHVVSTRVGGVPEVLPIGEFISLEEPVP DDLVDALLKAVDRREKGLLMDPTEKHEAVS KMYNWPDVAARTQVIYQKAVESEPTGRLGR LKGYYDQGIGFGIMYIVVSCIIIFWLTVLD LFDSPRKNGTNDKTSEKNVDPDYQ low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. 
MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN GLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGGLM GLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTVLRGKLDPNKVS TIPNAIETSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGG DGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLMDPTEKHEAVS KMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIGFGIMYIVVSCIIIFWLTVLD LFDSPRKNGTNDKTSEKNVDPDYQ 1 - 444 MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD LFDSPRKNGT NDKTSEKNVD PDYQ low complexity regions: DUST >T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSN GLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGGLM GLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTVLRGKLDPNKVS TIPNAIETSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGG DGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLMDPTEKHEAVS KMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIGFGIMYIVVSCIIIFWLTVLD LFDSPRKNGTNDKTSEKNVDPDYQ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for T20374 sequence: 444 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 
60 MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 120 GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 180 GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
240 TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 300 DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~3336666 6666666666 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 360 CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
420 KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | LFDSPRKNGT NDKTSEKNVD PDYQ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ---------- ---------- ---- ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** T20374.fa.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: T20374.fa.___inter___ (1 sequences) MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYG NRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQII HGHSTFSSLAHETLMIGGLMGLRTVFTDHSLFGFADASAILTNKLVLQYS LINVDQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQFFN NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGDGPKRIELEE MLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLM DPTEKHEAVSKMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIG FGIMYIVVSCIIIFWLTVLDLFDSPRKNGTNDKTSEKNVDPDYQ (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 4 
structures are to be tested

Candidate membrane-spanning segments:

Helix  Begin  End  Score  Certainity
    1     66   86  1.301  Certain
    2    102  122  0.903  Putative
    3    288  308  0.902  Putative
    4    399  419  2.415  Certain
----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
Segment 1 2 4
Loop length 65 15 276 25
K+R profile + + 3.00 4.00
CYT-EXT prof 1.13 0.76 - -
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: -7.00
Tm probability: 0.76
-> Orientation: N-out
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 1.89
-> Orientation: N-out
----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
Segment 1 3 4
Loop length 65 201 90 25
K+R profile + + + 4.00
CYT-EXT prof 1.13 0.72 0.45 -
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: -4.00
Tm probability: 0.76
-> Orientation: N-out
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 1.40
-> Orientation: N-out
----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
Segment 1 4
Loop length 65 312 25
K+R profile + 4.00 +
CYT-EXT prof 1.13 - 0.71
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 4.00
Tm probability: 1.00
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 0.42
-> Orientation: N-out
----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
Segment 1 2 3 4
Loop length 65 15 165 90 25
K+R profile + + 4.00 3.00 +
CYT-EXT prof 1.13 0.46 - - 0.72
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: 1.00
Tm probability: 0.57
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 0.87
-> Orientation: N-out
----------------------------------------------------------------------
"T20374" 444
66 86 #t 1.30104
102 122 #f 0.903125
288 308 #f 0.902083
399 419 #t 2.41458

************************************
*TOPPREDM with prokaryotic function*
************************************
T20374.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: T20374.fa.___inter___ (1 sequences)
MSLKIGPYSIALVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYG
NRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQII
HGHSTFSSLAHETLMIGGLMGLRTVFTDHSLFGFADASAILTNKLVLQYS
LINVDQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQFFN
NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGDGPKRIELEE
MLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAAS
CGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRREKGLLM
DPTEKHEAVSKMYNWPDVAARTQVIYQKAVESEPTGRLGRLKGYYDQGIG
FGIMYIVVSCIIIFWLTVLDLFDSPRKNGTNDKTSEKNVDPDYQ
(p)rokaryotic or (e)ukaryotic: p
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower
cutoff for candidates: 0.6
Total of 4 structures are to be tested

Candidate membrane-spanning segments:

Helix  Begin  End  Score  Certainity
    1     66   86  1.301  Certain
    2    102  122  0.903  Putative
    3    288  308  0.902  Putative
    4    399  419  2.415  Certain
----------------------------------------------------------------------
Structure 1

Transmembrane segments included in this structure:
Segment 1 2 4
Loop length 65 15 276 25
K+R profile + + 3.00 4.00
CYT-EXT prof 1.13 0.76 - -
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: -7.00
Tm probability: 0.76
-> Orientation: N-out
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 1.89
-> Orientation: N-out
----------------------------------------------------------------------
Structure 2

Transmembrane segments included in this structure:
Segment 1 3 4
Loop length 65 201 90 25
K+R profile + + + 4.00
CYT-EXT prof 1.13 0.72 0.45 -
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: -4.00
Tm probability: 0.76
-> Orientation: N-out
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 1.40
-> Orientation: N-out
----------------------------------------------------------------------
Structure 3

Transmembrane segments included in this structure:
Segment 1 4
Loop length 65 312 25
K+R profile + 4.00 +
CYT-EXT prof 1.13 - 0.71
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 4.00
Tm probability: 1.00
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 0.42
-> Orientation: N-out
----------------------------------------------------------------------
Structure 4

Transmembrane segments included in this structure:
Segment 1 2 3 4
Loop length 65 15 165 90 25
K+R profile + + 4.00 3.00 +
CYT-EXT prof 1.13 0.46 - - 0.72
For CYT-EXT profile neg. values indicate cytoplasmic preference.

K+R difference: 1.00
Tm probability: 0.57
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -0.3333
NEG: 3.0000
POS: 6.0000
-> Orientation: N-in
CYT-EXT difference: 0.87
-> Orientation: N-out
----------------------------------------------------------------------
"T20374" 444
66 86 #t 1.30104
102 122 #f 0.903125
288 308 #f 0.902083
399 419 #t 2.41458
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
SAPS. Version of April 11, 1996.
Date run: Thu Nov 22 14:08:45 2001
File: /people/b_eisen/T20374.fa.___saps___
ID T20374
DE hypothetical protein D2085.6 - Caenorhabditis elegans.
number of residues: 444; molecular weight: 49.5 kdal

  1  MSLKIGPYSI ALVSDFFCPN AGGVETHIYF LAQCLIELGH RVVVITHGYG NRKGIRYLSN
 61  GLKVYYLPFI VAYNGATLGS IVGSMPWLRK VLLRENVQII HGHSTFSSLA HETLMIGGLM
121  GLRTVFTDHS LFGFADASAI LTNKLVLQYS LINVDQTICV SYTSKENTVL RGKLDPNKVS
181  TIPNAIETSL FTPDRNQFFN NPTTIVFLGR LVYRKGADLL CEIVPKVCAR HKSVRFIIGG
241  DGPKRIELEE MLERFKLHER VVILGMLPHN QVKRVLNQGQ IFINTSLTEA FCMSIVEAAS
301  CGLHVVSTRV GGVPEVLPIG EFISLEEPVP DDLVDALLKA VDRREKGLLM DPTEKHEAVS
361  KMYNWPDVAA RTQVIYQKAV ESEPTGRLGR LKGYYDQGIG FGIMYIVVSC IIIFWLTVLD
421  LFDSPRKNGT NDKTSEKNVD PDYQ
--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A : 21( 4.7%); C : 8( 1.8%); D : 20( 4.5%); E : 24( 5.4%); F : 20( 4.5%)
G : 36( 8.1%); H : 12( 2.7%); I : 34( 7.7%); K : 23( 5.2%); L : 48(10.8%)
M : 10( 2.3%); N : 20( 4.5%); P : 20( 4.5%); Q : 12( 2.7%); R : 23( 5.2%)
S : 27( 6.1%); T : 25( 5.6%); V : 42( 9.5%); W : 3( 0.7%); Y : 16( 3.6%)

KR : 46 ( 10.4%); ED : 44 ( 9.9%); AGP : 77 ( 17.3%);
KRED : 90 ( 20.3%); KR-ED : 2 ( 0.5%); FIKMNY : 123 ( 27.7%);
LVIFM : 154 ( 34.7%); ST : 52 ( 11.7%).
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

  1  000+000000 0000-00000 0000-00000 000000-000 +000000000 0++00+0000
 61  00+0000000 0000000000 00000000++ 000+-00000 0000000000 0-00000000
121  00+0000-00 00000-0000 000+000000 0000-00000 0000+-0000 +0+0-00+00
181  000000-000 000-+00000 000000000+ 000++00-00 0-000+000+ 0+00+00000
241  -00++0-0-- 00-+0+00-+ 0000000000 00++000000 00000000-0 000000-000
301  00000000+0 0000-00000 -0000--000 --00-000+0 0-++-+0000 -00-+0-000
361  +00000-000 +000000+00 -0-000+00+ 0+000-0000 0000000000 000000000-
421  00-00++000 0-+00-+00- 0-00

A. CHARGE CLUSTERS.
Positive charge clusters (cmin = 9/30 or 12/45 or 15/60): none
Negative charge clusters (cmin = 9/30 or 12/45 or 15/60): none
Mixed charge clusters (cmin = 14/30 or 19/45 or 24/60):

1) From 326 to 357:
   EEPVPDDLVDALLKAVDRREKGLLMDPTEKHE
   --000--00-000+00-++-+0000-00-+0-
   quartile: 3; size: 32, +count: 5, -count: 10, 0count: 17; t-value: 3.74
   L: 5 (15.6%); E: 5 (15.6%); D: 5 (15.6%); LVIFM: 9 (28.1%);

B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.

C. CHARGE RUNS AND PATTERNS.

pattern  (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     5 |   5 |   7 |  40 |   9 |   9 |  12 |  11 |  11 |  15 |   7 |   9 |
lmin1     6 |   6 |   8 |  49 |  11 |  11 |  15 |  14 |  14 |  18 |   8 |  11 |
lmin2     7 |   7 |  10 |  54 |  13 |  13 |  17 |  16 |  15 |  20 |  10 |  12 |
(Significance level: 0.010000; Minimal displayed length: 6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:

+ runs >= 3: 0
- runs >= 3: 0
* runs >= 4: 1, at 342;
0 runs >= 27: 0
--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.
__________________________________
High scoring hydrophobic segments:

2.00 (LVIFM) 1.00 (AGYCW) 0.00 (BZX) -2.00 (PH) -4.00 (STNQ) -8.00 (KEDR)

Expected score/letter: -1.640
M_0.01= 30.79; M_0.05= 25.32

1) From 398 to 416: length= 19, score=26.00 *
   398 GIGFGIMYIV VSCIIIFWL
   G: 3(15.8%); V: 2(10.5%); I: 6(31.6%); F: 2(10.5%);

____________________________________
High scoring transmembrane segments:

5.00 (LVIF) 2.00 (AGM) 0.00 (BZX) -1.00 (YCW) -2.00 (ST) -6.00 (P) -8.00 (H) -10.00 (NQ) -16.00 (KR) -17.00 (ED)

Expected score/letter: -2.921
M_0.01= 76.86; M_0.05= 62.67; M_0.30= 45.80

1) From 398 to 419: length= 22, score=66.00 *
   398 GIGFGIMYIV VSCIIIFWLT VL
   G: 3(13.6%); V: 3(13.6%); I: 6(27.3%);

2. SPACINGS OF C.

H2N-17-C-15-C-124-C-61-C-6-C-63-C-8-C-108-C-34-COOH

2*. SPACINGS OF C and H. (additional deluxe function for ALEX)

H2N-17-C-8-H-6-C-5-H-6-H-53-H-1-H-7-H-17-H-29-C-61-C-6-C-2-H-26-H-10-H-22-C-8-C-2-H-51-H-53-C-34-COOH
--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 4

Aligned matching blocks:
[ 208- 211] LGRL
[ 388- 391] LGRL

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length: 8
--------------------------------------------------------------------------------
MULTIPLETS.

A. AMINO ACID ALPHABET.

1. Total number of amino acid multiplets: 28 (Expected range: 10-- 41)

2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 10 (6-10) 5 (11-20) 8 (>=21) 6

3. Clusters of amino acid multiplets (cmin = 11/30 or 14/45 or 17/60): none

B. CHARGE ALPHABET.

1. Total number of charge multiplets: 10 (Expected range: 0-- 17)
   7 +plets (f+: 10.4%), 3 -plets (f-: 9.9%)
   Total number of charge altplets: 10 (Critical number: 20)

2.
Histogram of spacings between consecutive charge multiplets:
   (1-5) 2 (6-10) 0 (11-20) 2 (>=21) 7
--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core: 4; !-core: 5)

Location     Period  Element      Copies  Core  Errors
49- 80         8     Y.......       4      4    0

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6)
   and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core:10)

Location     Period  Element      Copies  Core  Errors
3- 51          7     i000...        7      7    /0/2/2/2/./././
64- 112        7     i00.0..        7      7    /0/2/1/./2/././
210- 233       4     *..0           6      6    0
232- 261       3     *..            9      7    1
283- 342      10     i0...0.0..     6      6    /0/1/./././1/./1/././
310- 327       3     i..            6      6    0
--------------------------------------------------------------------------------
SPACING ANALYSIS.

Location   (Quartile)  Spacing  Rank      P-value  Interpretation
0- 4       (1.)        K( 4)K   24 of 24  0.0037   large minimal spacing
433- 437   (4.)        K( 4)K   23 of 24  0.0037   matching minimum
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/Pfam
Sequence file: T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.
Scores for sequence family classification (score includes all domains):
Model            Description                             Score   E-value   N
--------         -----------                             -----   -------  ---
Glycos_transf_1  Glycosyl transferases group 1            91.6   2.9e-24    1
SRP54            SRP54-type protein, GTPase domain         0.1        68    1
fer4_NifH        4Fe-4S iron sulfur cluster binding pr    -1.0        95    1
KH-domain        KH domain                                -8.4        61    1
LysM             LysM domain                             -11.8        74    1
DUF196           Uncharacterized ACR, COG1343            -37.5        38    1

Parsed for domains:
Model            Domain  seq-f seq-t     hmm-f hmm-t      score  E-value
--------         ------- ----- -----     ----- -----      -----  -------
LysM               1/1     123   161 ..      1    44 []   -11.8       74
fer4_NifH          1/1     247   259 ..    269   281 .]    -1.0       95
KH-domain          1/1     226   265 ..      1    49 []    -8.4       61
DUF196             1/1     234   323 ..      1    93 []   -37.5       38
SRP54              1/1     325   340 ..    200   215 .]     0.1       68
Glycos_transf_1    1/1     188   346 ..      1   180 [.    91.6  2.9e-24

Alignments of top-scoring domains:
LysM: domain 1 of 1, from 123 to 161: score -11.8, E = 74
*->YtVKsGDTLwkIArkygisvqeLkslNpgLssdn..lyvGQkLkip<
tV +L++ A + i N +L ++ + v Q+++++
T20374 123 RTVFTDHSLFGFADASAIL------TN-KLVLQYslINVDQTICVS 161

-*

T20374 - -

fer4_NifH: domain 1 of 1, from 247 to 259: score -1.0, E = 95
*->eLeeLlvkfgimd<-*
eLee+l+ f + +
T20374 247 ELEEMLERFKLHE 259

KH-domain: domain 1 of 1, from 226 to 265: score -8.4, E = 61
*->evlvpasrvGliIGkgGsnIkeireetgakIdipddsegsverivti
+v +++ v++iIG G e++e++ ++++ + v i
T20374 226 KVCARHKSVRFIIGGDGPKRIELEEML-ERFKLHER--------VVI 263

tg<-*
+g
T20374 264 LG 265

DUF196: domain 1 of 1, from 234 to 323: score -37.5, E = 38
*->myvLVvYDvsvdeRvnrlkKfLrkfGLn.wVQnSaFEGELtkadler
+ ++ ++ + l L++f L+++V + G+L ++ r
T20374 234 VRFIIG---GDGPKRIELEEMLERFKLHeRV---VILGMLPHNQVKR 274

lkagidriid...eDrDsviIYkfrsRCSsaAvkrevlGl.EkspGeeev
+ + + +i+++ + + I s C v v G++E+ p + e+
T20374 275 VLNQGQIFINtslTEAFCMSIVEAAS-CGLHVVSTRVGGVpEVLP-IGEF 322

i<-*
i
T20374 323 I 323

SRP54: domain 1 of 1, from 325 to 340: score 0.1, E = 68
*->LepFdperfvsrLLgm<-*
Le+ p+ +v++LL
T20374 325 LEEPVPDDLVDALLKA 340

Glycos_transf_1: domain 1 of 1, from 188 to 346: score 91.6, E = 2.9e-24
*->dreeirkklgikedkkiilfvGRlvpeKGidllieAfkkLkkkpkll + + ++ +++++++ +i+f GRlv++KG dll e ++k++++ T20374 188 TSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCAR---- 230 klnpnlkLvivGgpYdsedgeeedelkklaeklglednviflGfvpdedl + ++++i G dg+++ el+++ e l ++v +lG +p++++ T20374 231 --HKSVRFIIGG------DGPKRIELEEMLERFKLHERVVILGMLPHNQV 272 pelyksadvfvlPSryEgFGivllEAmAcGlPVIatncvgGipEvvkdge + ++++ +f+ +S +E+F+++++EA +cGl V++t vgG+pEv+ ge T20374 273 KRVLNQGQIFINTSLTEAFCMSIVEAASCGLHVVSTR-VGGVPEVLPIGE 321 tGllvepgqdpealaeaiekllkdeekkdllel<-* l ep p++l++a++k+ + e+ T20374 322 FISLEEPV--PDDLVDALLKAVDRR------EK 346 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: T20374.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- Glycos_transf_1 Glycosyl transferases group 1 91.6 2.9e-24 1 PK_C Pyruvate kinase, alpha/beta domain 2.3 24 1 RuvA RuvA N terminal domain 0.7 31 1 Bac_export_1 Bacterial export proteins, family 1 0.4 41 1 PA_phosphat_N Purple acid phosphatase, N-terminal i 0.4 92 1 SRP54 SRP54-type protein, GTPase domain 0.1 68 1 denso_VP4 Capsid protein VP4 -0.3 25 1 fer4_NifH 4Fe-4S iron sulfur cluster binding pr -1.0 95 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- PA_phosphat_N 1/1 9 22 .. 1 14 [. 0.4 92 denso_VP4 1/1 74 81 .. 435 442 .] -0.3 25 RuvA 1/1 129 134 .. 63 68 .] 0.7 31 fer4_NifH 1/1 247 259 .. 269 281 .] -1.0 95 PK_C 1/1 288 299 .. 1 12 [. 
2.3 24 SRP54 1/1 325 340 .. 200 215 .] 0.1 68 Glycos_transf_1 1/1 188 346 .. 1 180 [. 91.6 2.9e-24 Bac_export_1 1/1 399 427 .. 227 255 .] 0.4 41 Alignments of top-scoring domains: PA_phosphat_N: domain 1 of 1, from 9 to 22: score 0.4, E = 92 *->dmpldsdvFrvppG<-* ++ l+sd+F+ ++G T20374 9 SIALVSDFFCPNAG 22 denso_VP4: domain 1 of 1, from 74 to 81: score -0.3, E = 25 *->ngAtLGnv<-* ngAtLG++ T20374 74 NGATLGSI 81 RuvA: domain 1 of 1, from 129 to 134: score 0.7, E = 31 *->hlLYGF<-* h+L+GF T20374 129 HSLFGF 134 fer4_NifH: domain 1 of 1, from 247 to 259: score -1.0, E = 95 *->eLeeLlvkfgimd<-* eLee+l+ f + + T20374 247 ELEEMLERFKLHE 259 PK_C: domain 1 of 1, from 288 to 299: score 2.3, E = 24 *->tEaiAmSAVrAA<-* tEa +mS V+AA T20374 288 TEAFCMSIVEAA 299 SRP54: domain 1 of 1, from 325 to 340: score 0.1, E = 68 *->LepFdperfvsrLLgm<-* Le+ p+ +v++LL T20374 325 LEEPVPDDLVDALLKA 340 Glycos_transf_1: domain 1 of 1, from 188 to 346: score 91.6, E = 2.9e-24 *->dreeirkklgikedkkiilfvGRlvpeKGidllieAfkkLkkkpkll + + ++ +++++++ +i+f GRlv++KG dll e ++k++++ T20374 188 TSLFTPDRNQFFNNPTTIVFLGRLVYRKGADLLCEIVPKVCAR---- 230 klnpnlkLvivGgpYdsedgeeedelkklaeklglednviflGfvpdedl + ++++i G dg+++ el+++ e l ++v +lG +p++++ T20374 231 --HKSVRFIIGG------DGPKRIELEEMLERFKLHERVVILGMLPHNQV 272 pelyksadvfvlPSryEgFGivllEAmAcGlPVIatncvgGipEvvkdge + ++++ +f+ +S +E+F+++++EA +cGl V++t vgG+pEv+ ge T20374 273 KRVLNQGQIFINTSLTEAFCMSIVEAASCGLHVVSTR-VGGVPEVLPIGE 321 tGllvepgqdpealaeaiekllkdeekkdllel<-* l ep p++l++a++k+ + e+ T20374 322 FISLEEPV--PDDLVDALLKAVDRR------EK 346 Bac_export_1: domain 1 of 1, from 399 to 427: score 0.4, E = 41 *->vglllLvlylpyilplfkeelsllfdlls<-* +g+ ++++ ++ i++++ +l+l++++ + T20374 399 IGFGIMYIVVSCIIIFWLTVLDLFDSPRK 427 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description  Score  E-value  N
-------- -----------  -----  -------  ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t  hmm-f hmm-t  score  E-value
-------- ------- ----- -----  ----- -----  -----  -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Prosite
---------------------------------------------------------
| ppsearch (c) 1994 EMBL Data Library |
| based on MacPattern (c) 1990-1994 R. Fuchs |
---------------------------------------------------------
PROSITE pattern search started: Thu Nov 22 14:10:45 2001
Sequence file: T20374.fa
----------------------------------------
Sequence T20374 (444 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
  284: NTSL
  428: NGTN
Total matches: 2

Matching pattern PS00005 PKC_PHOSPHO_SITE:
  2: SLK
  142: TNK
  163: TSK
  233: SVR
  307: STR
  353: TEK
  385: TGR
  424: SPR
  435: SEK
Total matches: 9

Matching pattern PS00006 CK2_PHOSPHO_SITE:
  163: TSKE
  286: SLTE
  294: SIVE
  324: SLEE
  417: TVLD
Total matches: 5

Matching pattern PS00008 MYRISTYL:
  22: GGVETH
  50: GNRKGI
  75: GATLGS
  79: GSIVGS
  117: GGLMGL
  398: GIGFGI
Total matches: 6

Total no of hits in this sequence: 22
========================================
1314 pattern(s) searched in 1 sequence(s), 444 residues.
Total no of hits in all sequences: 22.
Search time: 00:00 min
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>ER-GOLGI-traffic signal is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M minimal SH3 binding is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>EXTRACELL-M minimal furin protease cleavage site motif is the MOTIF name
>T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. ;LENGTH=444; DIRECT_SEQUENCE
n 1 solutions m %_RXXR 387-390 f
>STATISTICS Total : 1 solutions in 1 sequences, 444 units; out of 1 sequences, 444 units
>EXTRACELL-M extended furin protease cleavage site motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>EXTRACELL-M zinc binding motif in MMPs is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>EXTRACELL-M g alpha binding go loco is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS SV40 LrgT experimentally determined is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hGlu.cort.experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Nucloplasmin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS Max experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>PDZ domain binding motif science 278_2075_pawson is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
>WW domain binding motif science 278_2075_pawson is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 444 units
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~ ~~~
Start with HMM-search search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely
distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/own/own-hmm.lib
Sequence file: T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
         [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
         [no hits above thresholds]

Alignments of top-scoring domains:
         [no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:      /data/patterns/own/own-hmm-f.lib
Sequence file: T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
         [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
         [no hits above thresholds]

Alignments of top-scoring domains:
         [no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
L. Aravind's signalling DB+ PSSM from other authors
IMPALA version 1.1 [20-December-1999]
Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F.
Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. (444 letters) Searching..................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value ACYC Adenylyl/Guanylyl cyclase domain 22 1.9 PUM Pumilio repeat RNA binding domain 21 3.0 INSL Insulinase like Metallo protease domain 21 3.5 RASGAP RAS-type GTPase GTP hydrolysis activating protein 21 3.7 CALC Calcineurin like Phosphoesterase domain 21 4.3 ARM Armadillo repeat 21 4.4 MBL Metallo-betalactamase domain 20 4.4 CYCL cyclophilin like peptidyl prolyl isomerases 20 4.7 S1 S1 RNA binding domain 21 4.8 UBA Ubiquitin pathway associated domain 20 5.7 BRIGHT BRIGHT domain (Alpha helical DNA binding domain) 20 6.8 CYCLIN Cyclin/TFIIB domain 20 9.1 >ACYC Adenylyl/Guanylyl cyclase domain Length = 244 Score = 21.8 bits (46), Expect = 1.9 Identities = 5/41 (12%), Positives = 5/41 (12%) Query: 16 FFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIR 56 Sbjct: 94 WFHGYNEATPAEIIQILHAVNRLQAMTAKLNQKYELPFPLR 134 >PUM Pumilio repeat RNA binding domain Length = 337 Score = 21.3 bits (45), Expect = 3.0 Identities = 5/28 (17%), Positives = 5/28 (17%) Query: 25 ETHIYFLAQCLIELGHRVVVITHGYGNR 52 Sbjct: 153 PSKFGFIIDAIVEQNNIITISTHKHGCC 180 >INSL Insulinase like Metallo protease domain Length = 433 Score = 21.0 bits (44), Expect = 3.5 Identities = 23/124 (18%), Positives = 23/124 (18%), Gaps = 15/124 (12%) Query: 161 SYTSKENTVLRGKLDPNKVSTIPNAIET---SLFTP--DRNQFFNNPTTIVFLGRLVYRK 215 Sbjct: 96 AGTSKDYTYYHVEIAHPYW---KQALEVLYQLTMKATLDEEMIEKEKPIVIEELRRGKDN 152 Query: 216 GADLLCEIVPKVCARHKSVRFIIGG--DGPKRIELEEMLERFKLHER-----VVILGMLP 268 Sbjct: 153 PTTVLWEEFEKLVYKVSPYRFPIIGFEETIRKFTREKLLKFYKSFYQPRNMAVVIVGKVN 212 Query: 269 HNQV 272 Sbjct: 213 PKEV 216 Score = 19.5 bits (40), Expect = 9.3 Identities = 7/14 (50%), 
Positives = 7/14 (50%) Query: 55 IRYLSNGLKVYYLP 68 Sbjct: 23 IRDLPNGAKLIVKP 36 >RASGAP RAS-type GTPase GTP hydrolysis activating protein Length = 292 Score = 20.9 bits (43), Expect = 3.7 Identities = 8/14 (57%), Positives = 8/14 (57%) Query: 331 DDLVDALLKAVDRR 344 Sbjct: 25 DDLMNLLLESVDQR 38 >CALC Calcineurin like Phosphoesterase domain Length = 274 Score = 20.5 bits (42), Expect = 4.3 Identities = 9/39 (23%), Positives = 9/39 (23%) Query: 201 NPTTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIG 239 Sbjct: 54 EFDVILATGDLVQDSSDEGYIRFVEMMKPFNKPVFWIPG 92 >ARM Armadillo repeat Length = 532 Score = 20.6 bits (43), Expect = 4.4 Identities = 10/35 (28%), Positives = 10/35 (28%) Query: 269 HNQVKRVLNQGQIFINTSLTEAFCMSIVEAASCGL 303 Sbjct: 402 HDQIKYLVEQGCIKPLCDLLVCPDPRIITVCLEGL 436 >MBL Metallo-betalactamase domain Length = 256 Score = 20.5 bits (42), Expect = 4.4 Identities = 5/39 (12%), Positives = 5/39 (12%), Gaps = 3/39 (7%) Query: 229 ARHKSVRFIIGGDGPKR---IELEEMLERFKLHERVVIL 264 Sbjct: 218 PAAIKAKMWLYGYQPGPLPPALEDGFLGFVKRGQRFDLV 256 >CYCL cyclophilin like peptidyl prolyl isomerases Length = 165 Score = 20.5 bits (43), Expect = 4.7 Identities = 18/56 (32%), Positives = 18/56 (32%), Gaps = 12/56 (21%) Query: 238 IGGDGPKRIELEEMLERFKL-HERVVILGM---LPHNQVKRVLNQGQIFINTSLTE 289 Sbjct: 73 TGGKSIYGEKFED--ENFILKHTGPGILSMANAGPNT------NGSQFFICTAKTE 120 >S1 S1 RNA binding domain Length = 305 Score = 20.6 bits (43), Expect = 4.8 Identities = 19/129 (14%), Positives = 19/129 (14%), Gaps = 40/129 (31%) Query: 244 KRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAASCGL 303 Sbjct: 180 RRIQQAESMGKIAAGNIYE-------GKVAKIQPYG-VFVEIEGVTGL-----------L 220 Query: 304 HV---VSTRVGGVPEVLPIGEFISLEEPVPDDLVDALLKAVDRRE-------KGLLMDPT 353 Sbjct: 221 HVSQVSGTRVDSLNTLFAFG-----------QAISVYVQEIDEYKNRISLSTRILETYPG 269 Query: 354 EKHEAVSKM 362 Sbjct: 270 ELVEKFDEM 278 >UBA Ubiquitin pathway associated domain Length = 255 Score = 20.0 bits (41), Expect = 5.7 Identities = 15/81 (18%), Positives = 15/81 (18%), Gaps = 3/81 
(3%) Query: 332 DLVDALLKAVDRREKGLLMDPTEKHEAVSKMYNWPDVAARTQVIYQ--KAVESEPTGRLG 389 Sbjct: 147 EALAPLLENISARYPQLREHIMANPEVFVSMLLEAVGDNMQDVMEGADDMVEGEDIEVTG 206 Query: 390 RLKGY-YDQGIGFGIMYIVVS 409 Sbjct: 207 EAAAAGLGQGEGEGSFQVDYT 227 >BRIGHT BRIGHT domain (Alpha helical DNA binding domain) Length = 172 Score = 19.9 bits (41), Expect = 6.8 Identities = 13/66 (19%), Positives = 13/66 (19%), Gaps = 8/66 (12%) Query: 17 FCPNAGGVETHIYFLAQCLIELGHRVVV--------ITHGYGNRKGIRYLSNGLKVYYLP 68 Sbjct: 44 RLPIMAKSVLDLYELYNLVIARGGLVDVINKKLWQEIIKGLHLPSSITSAAFTLRTQYMK 103 Query: 69 FIVAYN 74 Sbjct: 104 YLYPYE 109 >CYCLIN Cyclin/TFIIB domain Length = 317 Score = 19.5 bits (40), Expect = 9.1 Identities = 11/94 (11%), Positives = 11/94 (11%), Gaps = 6/94 (6%) Query: 274 RVLNQGQIFINTSLTEAFCMS---IVEAASCGLHVVSTRVGGVPEVLPIGEFISLEEPV- 329 Sbjct: 180 ILRKTADDFLNRIALTDAYLLYTPSQIALTA-ILSSASRAGITMESYLSESLMLKENRTC 238 Query: 330 PDDLVDALLKAVDRREKGLLMDPTEKHEAVSKMY 363 Sbjct: 239 LSQLLDIMKSMRNLVKK-YEPPRSEEVAVLKQKL 271 Underlying Matrix: BLOSUM62 Number of sequences tested against query: 105 Number of sequences better than 10.0: 12 Number of calls to ALIGN: 13 Length of query: 444 Total length of test sequences: 20182 Effective length of test sequences: 16637.0 Effective search space size: 6828066.3 Initial X dropoff for ALIGN: 25.0 bits Y. Wolf's SCOP PSSM IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. Query= T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. 
(444 letters) Searching.................................................done Results from profile search Score E Sequences producing significant alignments: (bits) Value gi|585096 [110..367] Zn-dependent exopeptidases 26 0.94 gi|1174715 [19..356] Thiamin-binding 26 1.2 gi|1652715 [5..195] NAD(P)-binding Rossmann-fold domains 26 1.7 gi|1945717 [15..256] alpha/beta-Hydrolases 26 1.7 gi|398985 [18..447] PLP-dependent transferases 25 2.1 gi|2145124 [22..373] Serpins 25 2.2 gi|2414449 [42..312] alpha/beta-Hydrolases 25 2.6 gi|2597838 [186..359] Cupredoxins 25 3.0 gi|1652197 [32..277] alpha/beta-Hydrolases 24 4.1 gi|2127787 [36..400] Periplasmic binding protein-like I 24 4.5 gi|2246648 [5..196] NAD(P)-binding Rossmann-fold domains 24 4.7 gi|451954 [433..740] R1 subunit of ribonucleotide reductase,... 24 5.0 gi|1174715 [357..558] Thiamin-binding 24 5.3 gi|442927 [3..324] FAD/NAD(P)-binding domain 24 5.5 gi|1518938 [387..612] Heat shock protein 70kD (HSP70), C-ter... 24 6.7 gi|1345687 [59..410] Heme-dependent peroxidases 23 8.0 gi|2194045 [11..394] Ferritin-like 23 8.9 gi|416581 [8..168] Ribonuclease H-like motif 23 9.0 >gi|585096 [110..367] Zn-dependent exopeptidases Length = 258 Score = 26.4 bits (58), Expect = 0.94 Identities = 7/46 (15%), Positives = 7/46 (15%), Gaps = 6/46 (13%) Query: 23 GVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKVYYLP 68 Sbjct: 59 TTPIIMTFLNDYLLAL------TNQTTIRGLSMGPLYNQTTLSLVP 98 >gi|1174715 [19..356] Thiamin-binding Length = 338 Score = 26.1 bits (57), Expect = 1.2 Identities = 15/110 (13%), Positives = 15/110 (13%), Gaps = 20/110 (18%) Query: 275 VLNQGQIFINTSLTEAFCMSIVE-AASCGLHVVSTRVGGVPEVLPIGEFISLEEPVPDDL 333 Sbjct: 185 FYDHNQISIEGDTKITLCEDTAARYRAYGWHVQ--EVEGGEN--------------VVGI 228 Query: 334 VDALLKAVDRREKGLLMD-PTEKHEAVSKMYNWPDVAARTQVIYQKAVES 382 Sbjct: 229 EEAIANAKAATDRPSFISLRTIIGYPAPTLINTG--KAHGAALGEDEVAA 276 >gi|1652715 [5..195] NAD(P)-binding Rossmann-fold domains Length = 191 Score = 25.8 bits (55), Expect = 1.7 Identities = 7/52 (13%), 
Positives = 7/52 (13%), Gaps = 3/52 (5%) Query: 21 AGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKVYYLPFIVA 72 Sbjct: 14 NRGIGKVL---VESFLEHGAAKVYAAVRKLESAAFLVDKYGNKIVPILIDLA 62 >gi|1945717 [15..256] alpha/beta-Hydrolases Length = 242 Score = 25.5 bits (54), Expect = 1.7 Identities = 10/81 (12%), Positives = 10/81 (12%), Gaps = 16/81 (19%) Query: 39 GHRVVVITHGYGN-----RKGIRYLSNGLKVYYL----------PFIVAYNGATLGSIVG 83 Sbjct: 3 GKASIMFAPGFGCDQSVWNAVAPAFEEDHRVILFDYVGSGHSDLRAYDLNRYQTLDGYAQ 62 Query: 84 SMPWLRKVLLRENVQIIHGHS 104 Sbjct: 63 DVLDVCEALDLKETVFV-GHS 82 >gi|398985 [18..447] PLP-dependent transferases Length = 430 Score = 25.2 bits (54), Expect = 2.1 Identities = 21/183 (11%), Positives = 21/183 (11%), Gaps = 19/183 (10%) Query: 6 GPYSIA-LVSDFFCPNAGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKV 64 Sbjct: 241 DAYLLRLCLNVNKYPNWSNGIFLCQSFAKNMGLYGERVGSL---SVITPATANNGKFNPL 297 Query: 65 YYLPFIVAYNGATLGSIVGSMP-------------WLRKVLLRENVQIIHGHSTFSSLAH 111 Sbjct: 298 QQKNSLQQNIDSQLKKIVRGMYSSPPGYGSRVVNVVLSDFKLKQQWFKDVDFMVQRLHHV 357 Query: 112 ETLMIGGL--MGLRTVFTDHSLFGFADASAILTNKLVLQYSLINVDQTICVSYTSKENTV 169 Sbjct: 358 RQEMFDRLGWPDLVNFAQQHGMFYYTRFSPKQVEILRNNSFVYLTGDGRLSLSGVNDSNV 417 Query: 170 LRG 172 Sbjct: 418 DYL 420 >gi|2145124 [22..373] Serpins Length = 352 Score = 25.3 bits (55), Expect = 2.2 Identities = 22/149 (14%), Positives = 22/149 (14%), Gaps = 21/149 (14%) Query: 163 TSKENTVLRGKLDPNKVSTIPNAIE-----TSLFTPD--RNQFFNNPTTIV------FLG 209 Sbjct: 127 SGMSNVVDSTMLDDNTLWTIINTIYFKGTWQCPFDIAKTHNASFTNKYGTKTVPMMNVVT 186 Query: 210 RLVYRKGAD--LLCEIVPKVCARHKSVRFIIGGDGPKRIELE---EMLERF--KLHERVV 262 Sbjct: 187 KLQGNTITVDDEEYDMARLPYKDTNISMYLAIGDNMTHFTDSITAAKLDYWSSQLGNKMY 246 Query: 263 ILGMLPHNQVKRVLNQGQIFINTSLTEAF 291 Sbjct: 247 NL-KLPRFSIENKRDIKSIAEMIAPGMFN 274 >gi|2414449 [42..312] alpha/beta-Hydrolases Length = 271 Score = 24.9 bits (54), Expect = 2.6 Identities = 21/133 (15%), Positives = 21/133 (15%), Gaps = 29/133 (21%) Query: 103 
HSTFSSLAHE---TLMIGGLMGLRTVFTDHSLFGFADAS---AILTNKLVLQYSLINV-- 154 Sbjct: 92 RSGHEKTWQYVQDALSISQYRNYDVYVTGHSL-GGALAGLCAPRIVHDGLRQSQKIKVVT 150 Query: 155 -------DQTICVSYTSKENTVLRGKLDPNKVSTIPNAIETSLFTPD------------- 194 Sbjct: 151 FGEPRVGNIEFSRAYDQLVPYSFRVVHSGDVVPHLPGCVKDLSYTPPAGSDGSMPCDPVS 210 Query: 195 RNQFFNNPTTIVF 207 Sbjct: 211 TNGGYHHAIEIWY 223 >gi|2597838 [186..359] Cupredoxins Length = 174 Score = 24.8 bits (53), Expect = 3.0 Identities = 17/51 (33%), Positives = 17/51 (33%), Gaps = 6/51 (11%) Query: 68 PFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGHSTFSSLAHETLMIGG 118 Sbjct: 49 PSHVVFNGK-VGALTGKNALTANV--GENVLIVHSQANRDSRPH---LIGG 93 >gi|1652197 [32..277] alpha/beta-Hydrolases Length = 246 Score = 24.3 bits (51), Expect = 4.1 Identities = 7/89 (7%), Positives = 7/89 (7%), Gaps = 16/89 (17%) Query: 17 FCPNAGGVETHIYFLAQCLIELGHRVVVITH-GYGNRKGIRYLSNGLKVYYLPFIVAYNG 75 Sbjct: 2 LLHGLPSQSLCWTGVMPLLAEKGLTAIAPDWLGFGFSDILD--------------KRDFA 47 Query: 76 ATLGSIVGSMPWLRKVLLRENVQIIHGHS 104 Sbjct: 48 YTTAAYEQALGEFFQSLELAKIFLV-VQG 75 >gi|2127787 [36..400] Periplasmic binding protein-like I Length = 365 Score = 24.2 bits (52), Expect = 4.5 Identities = 4/46 (8%), Positives = 4/46 (8%) Query: 233 SVRFIIGGDGPKRIELEEMLERFKLHERVVILGMLPHNQVKRVLNQ 278 Sbjct: 175 DEIPYDPNIGDWSPIIQTTTNKIAGKGNDTGVIFIGYEEVATLLSQ 220 >gi|2246648 [5..196] NAD(P)-binding Rossmann-fold domains Length = 192 Score = 24.2 bits (51), Expect = 4.7 Identities = 4/52 (7%), Positives = 4/52 (7%), Gaps = 3/52 (5%) Query: 21 AGGVETHIYFLAQCLIELGHRVVVITHGYGNRKGIRYLSNGLKVYYLPFIVA 72 Sbjct: 6 SNGLGRCW---TESVIHEYGDRVIGITRSVEAAQEMTALYPEHFIPCIADVR 54 >gi|451954 [433..740] R1 subunit of ribonucleotide reductase, C-terminal domain Length = 308 Score = 24.1 bits (52), Expect = 5.0 Identities = 10/51 (19%), Positives = 10/51 (19%), Gaps = 5/51 (9%) Query: 116 IG-GLMGLRTVFTDHSLFGFADASAILTNKLV---LQYSLINVDQTICVSY 162 Sbjct: 84 LGICVTGLHSVFMTVGL-SYAHPDARRLYRMICEHIYYTCVRTSVDCCMKG 133 >gi|1174715 [357..558] Thiamin-binding Length = 202 Score = 
24.1 bits (52), Expect = 5.3
Identities = 5/23 (21%), Positives = 5/23 (21%)

Query: 104 STFSSLAHETLMIGGLMGLRTVF 126
Sbjct: 109 LQFSDYMRPSVRLASLMDIDTIY 131

>gi|442927 [3..324] FAD/NAD(P)-binding domain
Length = 322

Score = 23.8 bits (51), Expect = 5.5
Identities = 10/57 (17%), Positives = 10/57 (17%), Gaps = 4/57 (7%)

Query: 47  HGYGNRKGIRYLSNGLKVYYLPFIVAYNGATLGSIVGSMPWLRKVLLRENVQIIHGH 103
Sbjct: 194 RGVPTKKDFGC-GDPHGVSMFPNTLHEDQVRSDAARE---WLLPNYQRPNLQVLTGQ 246

>gi|1518938 [387..612] Heat shock protein 70kD (HSP70), C-terminal, substrate-binding fragment
Length = 226

Score = 23.7 bits (51), Expect = 6.7
Identities = 11/35 (31%), Positives = 11/35 (31%)

Query: 163 TSKENTVLRGKLDPNKVSTIPNAIETSLFTPDRNQ 197
Sbjct: 162 ASVEDDKVGGKLSAEDKKTILDKCSESLSWLDNNH 196

>gi|1345687 [59..410] Heme-dependent peroxidases
Length = 352

Score = 23.2 bits (49), Expect = 8.0
Identities = 14/39 (35%), Positives = 14/39 (35%)

Query: 340 AVDRREKGLLMDPTEKHEAVSKMYNWPDVAARTQVIYQK 378
Sbjct: 291 AVDPDEKDLAPDAEDPSKKVPTMMMTTDLALRFDPEYEK 329

>gi|2194045 [11..394] Ferritin-like
Length = 384

Score = 23.0 bits (49), Expect = 8.9
Identities = 6/34 (17%), Positives = 6/34 (17%)

Query: 331 DDLVDALLKAVDRREKGLLMDPTEKHEAVSKMYN 364
Sbjct: 110 ARYTQRFLAAYSSEGSIRTIDPYWRDEILNKYFG 143

>gi|416581 [8..168] Ribonuclease H-like motif
Length = 161

Score = 23.2 bits (49), Expect = 9.0
Identities = 14/46 (30%), Positives = 14/46 (30%), Gaps = 3/46 (6%)

Query: 43 VVITHGYGNRKGIRYLSNGLKVYYLPFIVAYN--GATLGSIVGSMP 86
Sbjct: 1  IIMDNGTGYSK-LGYAGNDAPSYVFPTVIATRSAGASSGPAVSSKP 45

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 18
Number of calls to ALIGN: 18
Length of query: 444
Total length of test sequences: 256703
Effective length of test sequences: 209547.0
Effective search space size: 84727213.0
Initial X dropoff for ALIGN: 25.0 bits

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
calculation of internal repeats with prospero

***** PROSPERO v1.3 Thu Nov 22 14:11:26 2001 *****
Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford
For help see http://www.well.ox.ac.uk/ariadne
For usage use -help
using gap penalty 11+1k
using matrix BLOSUM62
printing all alignments with eval < 0.100000
using sequence1 T20374
using self-comparison

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
TIGRFAM

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/tigrfam/tigrfam.hmm
Sequence file: T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model     Description                                  Score  E-value  N
--------  -----------                                  -----  -------  ---
TIGR00045 TIGR00045: conserved hypothetical protein T   -0.5       45  1
TIGR00118 acolac_lg: acetolactate synthase, biosynthe -511.5       78  1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
TIGR00045   1/1     271   280 ..   372   381 .]    -0.5       45
TIGR00118   1/1      88   378 ..
1 593 [] -511.5 78 Alignments of top-scoring domains: TIGR00045: domain 1 of 1, from 271 to 280: score -0.5, E = 45 *->NvAqvLaigq<-* +v++vL +gq T20374 271 QVKRVLNQGQ 280 TIGR00118: domain 1 of 1, from 88 to 378: score -511.5, E = 78 *->msGAeaiveSLkdeGVetVFGYPGGAiLPiYDaLYrfetdsgieHIL + + L +e+V ++ G + t s + H T20374 88 ------LRKVLLRENVQIIHG-------------H--STFSSLAH-- 111 vRHEQgAvHAADGYARASGKvGVvlaTSGPGATNlVTGIAtAYmDSvPlV T20374 - -------------------------------------------------- - VfTGQVpTslIGsDAFQEaDilGItmPiTKHS.fqVksaeDlP...riik T +IG G+ T HS f +a+ +++++ ++ T20374 112 ------ETLMIGG-------LMGLRTVFTDHSlFGFADASAILtnkLVLQ 148 eAFhIAtTGRPGPVlvDLPKDvttaeiefpyddPekvnLPGYkPtveGhp + vD +t+ + e + L+G ++ +++ T20374 149 YSL----------INVD-----QTICVSYTSK--ENTVLRG---KLDPNK 178 lQDeFvmqsIkKAaeLiekAkKPVilvGGGvIniagAseeLkelAErlqi i +A e + + +r q T20374 179 VS------------------------------TIPNAIETSLFTPDRNQF 198 PVttTLmGlGsFPedHPlsLGMLGMHGTktANlAvhEcDLlIAVGaRFDD + + P +L G+++ +DLl + T20374 199 F-------------NNPTTIVFL---GRLV---YRKGADLLCEI------ 223 RvTGNlakFAPnAKRaaaeGRGGIIHIDIDPaeIGKnVrvdIPIVGDArn + k + + K Vr I G r+ T20374 224 -----VPKVCARHK----------------------SVRFIIGGDGPKRI 246 VLeeLlkklekekalkerseeqaWleqInkWKkeyplaYmdyteegkiKP Lee+l++ + er +l +++ ++ T20374 247 ELEEMLERFKLH----ERVV---------------ILGMLPHNQVK---- 273 QqVIeeisrvtkdigreAiVTTDVGQHQMWAAqFypfkkPRkfItSGGLG rv++ GQ F ++ T20374 274 --------RVLNQ-----------GQ------IFINTS------------ 286 TMGFGlPAAiGAkVAkPeetVicitGDGSFqMnlQELsTivqYdiPVkiv + e+ F M + E +++ ++ V+ T20374 287 ---------------LTEA----------FCMSIVEAASC---GLHVV-- 306 ILNNryLGMVrQWQeLFYeeRySethmgselPDFvkLAEaYGikGirIek t g ++P + E + + T20374 307 -----------------------STRVG-GVPEVLPIGE---FISLEEPV 329 peEldeKLkEAleskrnNePVllDvvVDkseeVyPMV..aPGggLdEmig p++l ++L A+ + + l+D + + e V M + + ++ ++ i T20374 330 PDDLVDALLKAVDRR--EKGLLMDPTEK-HEAVSKMYnwPDVAARTQVIY 376 ek<-* k T20374 377 QK 378 // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright 
(C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/tigrfam/tigrfam.hmm-f
Sequence file: T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model     Description                                  Score  E-value  N
--------  -----------                                  -----  -------  ---
TIGR00008 infA: translation initiation factor IF-1       1.7       26  1
TIGR00490 aEF-2: translation elongation factor aEF-2    -0.2       17  1
TIGR00045 TIGR00045: conserved hypothetical protein T   -0.5       33  1
TIGR00282 TIGR00282: conserved hypothetical protein T   -0.5       48  1
TIGR00381 cdhD: CO dehydrogenase/acetyl-CoA synthase,   -1.4       37  1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
TIGR00008   1/1     208   215 ..    62    69 .]     1.7       26
TIGR00282   1/1     203   241 ..     1    39 [.    -0.5       48
TIGR00045   1/1     271   280 ..   372   381 .]    -0.5       33
TIGR00490   1/1     327   351 ..   701   724 .]    -0.2       17
TIGR00381   1/1     337   353 ..   265   281 ..
-1.4 37 Alignments of top-scoring domains: TIGR00008: domain 1 of 1, from 208 to 215: score 1.7, E = 26 *->rGRIiyRl<-* +GR +yR+ T20374 208 LGRLVYRK 215 TIGR00282: domain 1 of 1, from 203 to 241: score -0.5, E = 48 *->ikvlflGdvyGkaGrkivkenlpklknkykpdlviange<-* ++flG ++ + G +++ e +pk+ ++k i+ g+ T20374 203 TTIVFLGRLVYRKGADLLCEIVPKVCARHKSVRFIIGGD 241 TIGR00045: domain 1 of 1, from 271 to 280: score -0.5, E = 33 *->NvAqvLaigq<-* +v++vL +gq T20374 271 QVKRVLNQGQ 280 TIGR00490: domain 1 of 1, from 327 to 351: score -0.2, E = 17 *->EkvPrelqeelvkev.RkRKGLklE<-* E vP +l + l+k v+R+ KGL ++ T20374 327 EPVPDDLVDALLKAVdRREKGLLMD 351 TIGR00381: domain 1 of 1, from 337 to 353: score -1.4, E = 37 *->LLkrglkpedSIVMDPT<-* LLk++ +e MDPT T20374 337 LLKAVDRREKGLLMDPT 353 // SMART hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/iprscan/data/smart.HMMs Sequence file: T20374.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- LysM Lysin motif -10.4 49 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- LysM 1/1 122 161 .. 
1 45 [] -10.4 49

Alignments of top-scoring domains:
LysM: domain 1 of 1, from 122 to 161: score -10.4, E = 49
*->tYtVkkGDTLssIArkygvsvkdLlklNpilnpdnlyvGQkLkip<-
tV +L++ A + + +++L + + v+Q+++++
T20374 122 LRTVFTDHSLFGFADASAILTNKLVLQY-----SLINVDQTICVS 161
*
T20374 - -
//

COG

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/cogs/cogs.hmm
Sequence file: T20374.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans.

Scores for sequence family classification (score includes all domains):
Model     Description   Score  E-value  N
--------  -----------   -----  -------  ---
COG0438                 126.3  5.6e-34  1
COG0297                -262.0     0.11  1

Parsed for domains:
Model     Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------  ------- ----- -----    ----- -----      -----  -------
COG0297     1/1       6   384 ..     1   556 []  -262.0     0.11
COG0438     1/1     201   384 ..
1 255 [] 126.3 5.6e-34 Alignments of top-scoring domains: COG0297: domain 1 of 1, from 6 to 384: score -262.0, E = 0.11 *->lnsqdryserMkILfvasEvtPfvKvGGLADVlgaLPkaLkklGhdV ++ys + ++s ++ +GG ++ L++ L+ lGh V T20374 6 ----GPYS-----IALVSDFFC-PNAGGVETHIYFLAQCLIELGHRV 42 rVlLPkYgriqgepieqlykvsegetvavvgreqqfdvlesyldGt.vgl +V++ Yg +++++ + + ++++ + ++ ++ + G+ vg T20374 43 VVITHGYGN-RKGIRYLS---NGLKVYYLP----FIVAYNGATLGSiVGS 84 ylidKndyyfnregnPYhDanlygypDnaeRFafFsaAalelldgldpfw + ++ + + ++ ++ ++ + + l ++ T20374 85 MPWLRKVLLRENVQ------------------IIHGHSTFSSLAHET--- 113 qPDiVHaHDWhTGLvpalLKteyrklPFfervKtVFTIHNLaYQGEmIEY ++++l+ +l+ tVFT H L G T20374 114 ------------LMIGGLM-----GLR------TVFTDHSLF--G----- 133 GEVmTFLifpahylhellglplylfhyeglefpGqinflKaGivfaDhVT F ++a +++ l+l l + +++i T20374 134 -----FADASAILTNK-LVLQYSLIN-----VDQTI-------------- 158 TVSPTYAqEIqTpeygygLeglLkarssegklsGILNGIDyeiWnPetDp VS y + +L+ + + k+s I N I ++ + P T20374 159 CVS-----------YTSKENTVLRGKLDPNKVSTIPNAIETSLFTPD--- 194 ylaanYdagsledpvlFkkKaeNKtaLqeelGLpedddaPligiVsRLte ++++ ++ + i + +RL+ T20374 195 -------RNQFF------------------------NNPTTIVFLGRLVY 213 QKGvdLlleiideLlekEFqdaqlViLGtGdPeLE.nafrnlaerhpdsg KG dLl ei + + + ++++i+G G +E +++++ + h T20374 214 RKGADLLCEIVPKVCAR-HKSVRFIIGGDGPKRIElEEMLERFKLHE--- 259 nvavligfdepLArriYAGaDfilMPSrFEPCGLtQLiaMrYGTvPIVRe +v++l ++ + r+ +++ S E++ + +a G ++ ++ T20374 260 RVVILGMLPHNQVKRVLNQGQIFINTSLTEAFCMSIVEAASCGLHVVSTR 309 TGGLaDTVvdldydeenleekgtGflFkepdaeallnalsRAla.....l +GG V++ ++ g ++ +ep + +l +al +A+ +++++l T20374 310 VGG----VPE--VLPI-----GEFISLEEPVPDDLVDALLKAVDrrekgL 348 YrqelNEICmFmQYIRYCPHpdewqnlvtraMaNCYYHVFadfSWdkSPA ++ p e+ v+++++ W + A T20374 349 LMD-----------------PTEKHEAVSKMYN-----------WPDV-A 369 keYvelYegllaktrd<-* + + +Y++++ + ++ T20374 370 ARTQVIYQKAVES-EP 384 COG0438: domain 1 of 1, from 201 to 384: score 126.3, E = 5.6e-34 *->dkpvilfvGRlvpeKgldllieafaklkeeipellpdlklvivGgts ++ +i+f+GRlv++Kg dll e+++k+++++ ++++i G T20374 201 
NPTTIVFLGRLVYRKGADLLCEIVPKVCARH----KSVRFIIGG--- 240 yiaaeacdGpeeerlrlleklakklglednVeflGfvpdprvldeelpel dGp++ + le++ ++ +l ++V+ lG +p + ++ ++ T20374 241 -------DGPKRIE---LEEMLERFKLHERVVILGMLP-----HNQVKRV 275 lkaadvfvlPSrysekrgedrEgfglvllEAmAaGtPViatdvgslelga l++ +f+ +S++ E+f+++++EA ++G+ V++t+vg T20374 276 LNQGQIFINTSLT--------EAFCMSIVEAASCGLHVVSTRVG------ 311 neereladkGipEvvedgarylfgenGregkrrlnlllvdpgdeddidsi G+pEv+ + G++ + ++ +d T20374 312 ---------GVPEVL-P--------IGEF-------ISLEEPVPD----- 331 ealaeaierlledpelreregvsllgrearrrvaerfswekiakrllkly +l++a++++ + +e+ ++ e+ ++v++ ++w +a r++ +y T20374 332 -DLVDALLKAVDRREKGLLMD----PTEKHEAVSKMYNWPDVAARTQVIY 376 eellekre<-* ++++e ++ T20374 377 QKAVESEP 384 // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/cogs/cogs.hmm-f Sequence file: T20374.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T20374 hypothetical protein D2085.6 - Caenorhabditis elegans. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- COG0438 124.6 7.6e-38 1 COG0564 2.0 9.9 1 COG0212 0.6 40 1 COG0840 0.5 52 1 COG0334 0.1 29 1 COG3201 -0.3 54 1 COG2823 -0.6 90 1 COG2119 -1.0 81 1 COG0524 -1.1 88 1 COG2403 -1.6 87 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- COG0334 1/1 29 45 .. 236 253 .. 0.1 29 COG2403 1/1 31 47 .. 161 177 .. -1.6 87 COG0524 1/1 18 48 .. 224 253 .. -1.1 88 COG3201 1/1 77 101 .. 233 257 .] -0.3 54 COG2119 1/1 119 133 .. 244 258 .] -1.0 81 COG0840 1/1 247 257 .. 383 393 .] 0.5 52 COG2823 1/1 272 284 .. 198 210 .] -0.6 90 COG0564 1/1 319 345 .. 315 341 .] 2.0 9.9 COG0438 1/1 201 384 .. 
1 255 [] 124.6 7.6e-38 COG0212 1/1 387 397 .. 149 159 .. 0.6 40 Alignments of top-scoring domains: COG0334: domain 1 of 1, from 29 to 45: score 0.1, E = 29 *->qyaAeklleesGAkVVav<-* +++A++l+e+ G++VV++ T20374 29 YFLAQCLIEL-GHRVVVI 45 COG2403: domain 1 of 1, from 31 to 47: score -1.6, E = 87 *->vAqlLrelGyrVvavRH<-* +Aq L elG rVv++ H T20374 31 LAQCLIELGHRVVVITH 47 COG0524: domain 1 of 1, from 18 to 48: score -1.1, E = 88 *->desad.l.r.aeaaaarlllnekgaklVvvTlG<-* +++a +++++++ a++l e+g ++Vv+T+G T20374 18 CPNAGgVeThIYFLAQCLI--ELGHRVVVITHG 48 COG3201: domain 1 of 1, from 77 to 101: score -0.3, E = 54 *->tflailglriWlrdaalResrAlkq<-* t+ i g+ Wlr lRe++ + T20374 77 TLGSIVGSMPWLRKVLLRENVQIIH 101 COG2119: domain 1 of 1, from 119 to 133: score -1.0, E = 81 *->lfALlllwdvaegvs<-* l++L++ +++ ++++ T20374 119 LMGLRTVFTDHSLFG 133 COG0840: domain 1 of 1, from 247 to 257: score 0.5, E = 52 *->eLqelverFkv<-* eL+e++erFk+ T20374 247 ELEEMLERFKL 257 COG2823: domain 1 of 1, from 272 to 284: score -0.6, E = 90 *->VKkVvklfkkyvn<-* VK+V + +++++n T20374 272 VKRVLNQGQIFIN 284 COG0564: domain 1 of 1, from 319 to 345: score 2.0, E = 9.9 *->ngeemefeaplpedflellvkllkeei<-* +ge++ +e p+p+d++ +l k +++ T20374 319 IGEFISLEEPVPDDLVDALLKAVDRRE 345 COG0438: domain 1 of 1, from 201 to 384: score 124.6, E = 7.6e-38 *->dkpvilfvGRlvpeKgldllieafaklkeeipellpdlklvivGgts ++ +i+f+GRlv++Kg dll e+++k+++++ ++++i G T20374 201 NPTTIVFLGRLVYRKGADLLCEIVPKVCARH----KSVRFIIGG--- 240 yiaaeacdGpeeerlrlleklakklglednVeflGfvpdprvldeelpel dGp++ + le++ ++ +l ++V+ lG +p + ++ ++ T20374 241 -------DGPKRIE---LEEMLERFKLHERVVILGMLP-----HNQVKRV 275 lkaadvfvlPSrysekrgedrEgfglvllEAmAaGtPViatdvgslelga l++ +f+ +S++ E+f+++++EA ++G+ V++t+vg T20374 276 LNQGQIFINTSLT--------EAFCMSIVEAASCGLHVVSTRVG------ 311 neereladkGipEvvedgarylfgenGregkrrlnlllvdpgdeddidsi G+pEv+ + G++ + ++ +d T20374 312 ---------GVPEVL-P--------IGEF-------ISLEEPVPD----- 331 ealaeaierlledpelreregvsllgrearrrvaerfswekiakrllkly +l++a++++ + +e+ ++ e+ ++v++ ++w +a r++ +y T20374 332 
-DLVDALLKAVDRREKGLLMD----PTEKHEAVSKMYNWPDVAARTQVIY 376 eellekre<-* ++++e ++ T20374 377 QKAVESEP 384 COG0212: domain 1 of 1, from 387 to 397: score 0.6, E = 40 *->RLGyGgGYYDR<-* RLG+ +GYYD+ T20374 387 RLGRLKGYYDQ 397 //