analysis of sequence from T00731.fa ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ >T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR PPRESGEA ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ sec.str. with predator > T00731 . . . . . 1 MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFS 50 ___HHHHHHHHHH________EEE___EEEEEEE______HHHHHHHHHH . . . . . 51 LLGVSRTVKRLGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYG 100 H_____EEEE_____HHHHHHHH___HHHHH____EEEE___________ . . . . . 101 DNVEVDYRGYEVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGH 150 ___________EEEHHHHHHH__________HHHHH_____EEEEEE___ . . . . . 151 GGDEFLKFQDAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQ 200 ____EEE_HHHHHHHH_HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH . . . . . 201 LQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTLAFFERLNIYD 250 H_____EEE_________EEEEE_____________HHHHHHHHHH____ . . . . . 251 NASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD 300 ______________HHHHHHHH_______EEEEEE_____HHHHHHH___ . . . . . 301 SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVY 350 __HHHHHHHHHHHHHH_________HHHHHHHHH________HHHHHHEE . . . . . 351 TLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIE 400 EE_____EEEHHHHHHHH___EEEE_____HHHHHHHHHHHHH_____EE . . 
 401 IPCYVVVAEAKMPIFTNAGRPPRESGEA   428
     E__EEEEEE___________________

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

method : 1
alpha-contents : 29.0 %
beta-contents : 44.0 %
coil-contents : 27.0 %
class : mixed

method : 2
alpha-contents : 33.1 %
beta-contents : 31.2 %
coil-contents : 35.7 %
class : mixed

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

GPI: learning from metazoa
-17.80 -0.33 -0.56 -0.03 -4.00 0.00 0.00 0.00 -1.47 -10.02 -3.65 -12.00 -12.00 -8.00 -12.00 0.00 -81.86
2.19 0.00 0.00 0.00 0.00 0.00 -24.00 0.00 -0.44 -11.85 -3.65 -12.00 -12.00 0.00 -12.00 0.00 -73.75
ID: T00731 AC: xxx Len: 428 1:I 417 Sc: -73.75 Pv: 6.472730e-01 NO_GPI_SITE

GPI: learning from protozoa
-13.59 0.00 -0.08 0.00 -4.00 0.00 -24.00 0.00 -0.06 -10.01 -12.33 -12.00 -12.00 0.00 -12.00 0.00 -100.07
-18.10 -0.55 0.00 0.00 -4.00 0.00 -28.00 0.00 0.00 -10.01 -12.33 -12.00 0.00 0.00 -12.00 0.00 -96.99
ID: T00731 AC: xxx Len: 428 1:I 418 Sc: -96.99 Pv: 8.183256e-01 NO_GPI_SITE

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

# SignalP euk predictions
# name    Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T00731    0.765 399 Y  0.708 399 Y  0.968   5 Y  0.200 N

# SignalP gram- predictions
# name    Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T00731    0.472 209 N  0.305  19 N  0.992   9 Y  0.882 Y

# SignalP gram+ predictions
# name    Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
T00731    0.514 385 Y  0.431  61 Y  0.991  45 Y  0.725 Y

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

low complexity regions: SEG 12 2.2 2.5
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
1-379 MKILTLVMLLCYSFVSSTGDTTIHTNNWAV LVCTSRFCSLHSLVLTFIFSLLGVSRTVKR LGIPDERIILMLADDMACNARNEYPAQVFN NENHKLNLYGDNVEVDYRGYEVTVENFLRV LTGRHENAVPRSKRLLSDEGSHILLYMTGH GGDEFLKFQDAEELQSHDLADAVKQMKEKR RFKELMIMVDTCQAATLFNQLQSPGVLAIG SSLKGENSYSHHLDSDIGVSVVDRFTYYTL AFFERLNIYDNASLNSLFRSYDPRLLMSTA YYRTDLYQPHLVEVPVTNFFGSVMETIHTD SAYKAFSSKISERKINSEMPFNQLSEHDLK EELENTNIPNDELIAEVTVYTLFPGLSYFG LSTLLRYMNLSRVRVLSMI ddvfafwlvfvllld 380-394 395-428 STNRIEIPCYVVVAEAKMPIFTNAGRPPRE SGEA low complexity regions: SEG 25 3.0 3.3 >T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. 1-428 MKILTLVMLLCYSFVSSTGDTTIHTNNWAV LVCTSRFCSLHSLVLTFIFSLLGVSRTVKR LGIPDERIILMLADDMACNARNEYPAQVFN NENHKLNLYGDNVEVDYRGYEVTVENFLRV LTGRHENAVPRSKRLLSDEGSHILLYMTGH GGDEFLKFQDAEELQSHDLADAVKQMKEKR RFKELMIMVDTCQAATLFNQLQSPGVLAIG SSLKGENSYSHHLDSDIGVSVVDRFTYYTL AFFERLNIYDNASLNSLFRSYDPRLLMSTA YYRTDLYQPHLVEVPVTNFFGSVMETIHTD SAYKAFSSKISERKINSEMPFNQLSEHDLK EELENTNIPNDELIAEVTVYTLFPGLSYFG LSTLLRYMNLSRVRVLSMIDDVFAFWLVFV LLLDSTNRIEIPCYVVVAEAKMPIFTNAGR PPRESGEA low complexity regions: SEG 45 3.4 3.75 >T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. 1-428 MKILTLVMLLCYSFVSSTGDTTIHTNNWAV LVCTSRFCSLHSLVLTFIFSLLGVSRTVKR LGIPDERIILMLADDMACNARNEYPAQVFN NENHKLNLYGDNVEVDYRGYEVTVENFLRV LTGRHENAVPRSKRLLSDEGSHILLYMTGH GGDEFLKFQDAEELQSHDLADAVKQMKEKR RFKELMIMVDTCQAATLFNQLQSPGVLAIG SSLKGENSYSHHLDSDIGVSVVDRFTYYTL AFFERLNIYDNASLNSLFRSYDPRLLMSTA YYRTDLYQPHLVEVPVTNFFGSVMETIHTD SAYKAFSSKISERKINSEMPFNQLSEHDLK EELENTNIPNDELIAEVTVYTLFPGLSYFG LSTLLRYMNLSRVRVLSMIDDVFAFWLVFV LLLDSTNRIEIPCYVVVAEAKMPIFTNAGR PPRESGEA low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. 
MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFSLLGVSRTVKR LGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRV LTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR RFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTL AFFERLNIYDNASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVYTLFPGLSYFG LSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIEIPCYVVVAEAKMPIFTNAGR PPRESGEA 1 - 428 MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR PPRESGEA low complexity regions: DUST >T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFSLLGVSRTVKR LGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRV LTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR RFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTL AFFERLNIYDNASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVYTLFPGLSYFG LSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIEIPCYVVVAEAKMPIFTNAGR PPRESGEA ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for T00731 sequence: 428 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 
60 MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 120 LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 180 LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~3 3333333333 3333333333 * 21 M'95 -w border ---------- ---------- ---------- ---------b cdefgabcde fgabcdefga * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~1 1111111111 1111111111 * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~1 1111111111 1111111111 * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~2555 5555555555 * 14 M'95 -w local . | . | . | . | . | . 
240 RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL 2~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border b--------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar 1~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class 1111111~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. 5222~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 300 AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 360 SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~1111 1111111111 11111111~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ------bcde fgabcdefga bcdefgab-- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~4444 4444444444 444444431~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~111 1111111111 11111111~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~11111111 111111~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
420 LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . PPRESGEA ~~~~~~~~ -------- ~~~~~~~~ ~~~~~~~~ ~~~~~~~~ ~~~~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** T00731.fa.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: T00731.fa.___inter___ (1 sequences) MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFS LLGVSRTVKRLGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYG DNVEVDYRGYEVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGH GGDEFLKFQDAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQ LQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTLAFFERLNIYD NASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVY TLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIE IPCYVVVAEAKMPIFTNAGRPPRESGEA (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 4 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 1 21 1.555 Certain 2 37 57 1.664 Certain 3 193 213 
0.619 Putative 4 345 365 1.446 Certain 5 375 395 0.997 Putative ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 2 4 5 Loop length 0 15 287 9 33 K+R profile 2.00 + 4.00 1.00 3.00 CYT-EXT prof - 0.50 - - - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 2.00 Tm probability: 0.99 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: 0.50 -> Orientation: N-out ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 1 2 3 4 Loop length 0 15 135 131 63 K+R profile 2.00 + + 1.00 + CYT-EXT prof - -0.28 0.93 - 0.80 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 1.00 Tm probability: 0.05 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -0.16 -> Orientation: N-in ---------------------------------------------------------------------- Structure 3 Transmembrane segments included in this structure: Segment 1 2 4 Loop length 0 15 287 63 K+R profile 2.00 + 1.00 + CYT-EXT prof - 0.50 - 0.93 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 1.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -0.43 -> Orientation: N-in ---------------------------------------------------------------------- Structure 4 Transmembrane segments included in this structure: Segment 1 2 3 4 5 Loop length 0 15 135 131 9 33 K+R profile 2.00 + 3.00 1.00 + 4.00 CYT-EXT prof - -0.28 - - 0.80 - For CYT-EXT profile neg. 
values indicate cytoplasmic preference. K+R difference: 0.00 Tm probability: 0.05 -> Orientation: undecided Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -1.09 -> Orientation: N-in ---------------------------------------------------------------------- "T00731" 428 1 21 #t 1.55521 37 57 #t 1.66354 193 213 #f 0.61875 345 365 #t 1.44583 375 395 #f 0.996875 ************************************ *TOPPREDM with prokaryotic function* ************************************ T00731.fa.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: T00731.fa.___inter___ (1 sequences) MKILTLVMLLCYSFVSSTGDTTIHTNNWAVLVCTSRFCSLHSLVLTFIFS LLGVSRTVKRLGIPDERIILMLADDMACNARNEYPAQVFNNENHKLNLYG DNVEVDYRGYEVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGH GGDEFLKFQDAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQ LQSPGVLAIGSSLKGENSYSHHLDSDIGVSVVDRFTYYTLAFFERLNIYD NASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTD SAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVY TLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIE IPCYVVVAEAKMPIFTNAGRPPRESGEA (p)rokaryotic or (e)ukaryotic: p Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 4 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 1 21 1.555 Certain 2 37 57 1.664 Certain 3 193 213 0.619 Putative 4 345 365 1.446 Certain 5 375 395 0.997 Putative ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 2 3 4 5 Loop length 0 15 135 131 9 33 K+R profile 1.00 + 3.00 1.00 + 4.00 CYT-EXT prof - -0.28 - - 0.80 - 
For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: -1.00 Tm probability: 0.05 -> Orientation: N-out Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -1.09 -> Orientation: N-in ---------------------------------------------------------------------- Structure 2 Transmembrane segments included in this structure: Segment 1 2 4 5 Loop length 0 15 287 9 33 K+R profile 1.00 + 4.00 1.00 3.00 CYT-EXT prof - 0.50 - - - For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 1.00 Tm probability: 0.99 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: 0.50 -> Orientation: N-out ---------------------------------------------------------------------- Structure 3 Transmembrane segments included in this structure: Segment 1 2 3 4 Loop length 0 15 135 131 63 K+R profile 1.00 + + 1.00 + CYT-EXT prof - -0.28 0.93 - 0.80 For CYT-EXT profile neg. values indicate cytoplasmic preference. K+R difference: 0.00 Tm probability: 0.05 -> Orientation: undecided Charge-difference over N-terminal Tm (+-15 residues): 2.00 (NEG-POS)/(NEG+POS): -1.0000 NEG: 0.0000 POS: 1.0000 -> Orientation: N-in CYT-EXT difference: -0.16 -> Orientation: N-in ---------------------------------------------------------------------- Structure 4 Transmembrane segments included in this structure: Segment 1 2 4 Loop length 0 15 287 63 K+R profile 1.00 + 1.00 + CYT-EXT prof - 0.50 - 0.93 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 0.00
Tm probability: 1.00
-> Orientation: undecided

Charge-difference over N-terminal Tm (+-15 residues): 2.00
(NEG-POS)/(NEG+POS): -1.0000
NEG: 0.0000
POS: 1.0000
-> Orientation: N-in

CYT-EXT difference: -0.43
-> Orientation: N-in
----------------------------------------------------------------------

"T00731" 428
   1  21 #t 1.55521
  37  57 #t 1.66354
 193 213 #f 0.61875
 345 365 #t 1.44583
 375 395 #f 0.996875

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

SAPS. Version of April 11, 1996.
Date run: Thu Feb 21 12:37:25 2002

File: /people/b_eisen/T00731.fa.___saps___
ID   T00731
DE   hypothetical protein F22O13.26 - Arabidopsis thaliana.

number of residues: 428; molecular weight: 48.8 kdal

   1 MKILTLVMLL CYSFVSSTGD TTIHTNNWAV LVCTSRFCSL HSLVLTFIFS LLGVSRTVKR
  61 LGIPDERIIL MLADDMACNA RNEYPAQVFN NENHKLNLYG DNVEVDYRGY EVTVENFLRV
 121 LTGRHENAVP RSKRLLSDEG SHILLYMTGH GGDEFLKFQD AEELQSHDLA DAVKQMKEKR
 181 RFKELMIMVD TCQAATLFNQ LQSPGVLAIG SSLKGENSYS HHLDSDIGVS VVDRFTYYTL
 241 AFFERLNIYD NASLNSLFRS YDPRLLMSTA YYRTDLYQPH LVEVPVTNFF GSVMETIHTD
 301 SAYKAFSSKI SERKINSEMP FNQLSEHDLK EELENTNIPN DELIAEVTVY TLFPGLSYFG
 361 LSTLLRYMNL SRVRVLSMID DVFAFWLVFV LLLDSTNRIE IPCYVVVAEA KMPIFTNAGR
 421 PPRESGEA

--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A : 23( 5.4%); C :  6( 1.4%); D : 24( 5.6%); E : 29( 6.8%); F : 24( 5.6%)
G : 19( 4.4%); H : 12( 2.8%); I : 20( 4.7%); K : 15( 3.5%); L : 53(12.4%)
M : 14( 3.3%); N : 25( 5.8%); P : 14( 3.3%); Q :  9( 2.1%); R : 24( 5.6%)
S : 36( 8.4%); T : 26( 6.1%); V : 34( 7.9%); W :  2( 0.5%); Y : 19( 4.4%)

KR     :  39 (  9.1%); ED    :  53 ( 12.4%); AGP :  56 ( 13.1%);
KRED   :  92 ( 21.5%); KR-ED : -14 ( -3.3%);
FIKMNY : 117 ( 27.3%); LVIFM : 145 ( 33.9%); ST  :  62 ( 14.5%).
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

   1 0+00000000 000000000- 0000000000 00000+0000 0000000000 00000+00++
  61 0000--+000 000--00000 +0-0000000 0-00+00000 -00-0-0+00 -000-000+0
 121 000+0-0000 +0++000--0 0000000000 00--00+00- 0--0000-00 -00+00+-++
 181 +0+-00000- 0000000000 0000000000 000+0-0000 000-0-0000 00-+000000
 241 000-+0000- 00000000+0 0-0+000000 00+0-00000 00-0000000 0000-0000-
 301 000+0000+0 0-++000-00 00000-0-0+ --0-000000 --000-0000 0000000000
 361 00000+0000 0+0+00000- -000000000 000-000+0- 00000000-0 +00000000+
 421 00+-00-0

A. CHARGE CLUSTERS.

Positive charge clusters (cmin = 9/30 or 11/45 or 14/60): none
Negative charge clusters (cmin = 10/30 or 14/45 or 17/60): none
Mixed charge clusters (cmin = 15/30 or 20/45 or 25/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.

There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
________________________________
High scoring uncharged segments:
score=  1.00 frequency= 0.785 ( LAGSVTIPNFQYHMCW )
score=  0.00 frequency= 0.000 ( BZX )
score= -8.00 frequency= 0.215 ( KEDR )
Expected score/letter: -0.935
M_0.01= 42.90; M_0.05= 34.59

 1) From 3 to 55: length= 53, score=35.00 *
    3 ILTLVMLLCY SFVSSTGDTT IHTNNWAVLV CTSRFCSLHS LVLTFIFSLL
   53 GVS
 L: 10(18.9%); S: 8(15.1%); V: 6(11.3%); T: 7(13.2%);

C. CHARGE RUNS AND PATTERNS.

pattern  (+)| (-)| (*)| (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0     4 |  5 |  7 | 38 |   9 |  10 |  13 |  11 |  12 |  15 |   7 |   9 |
lmin1     6 |  6 |  9 | 46 |  11 |  12 |  15 |  13 |  14 |  18 |   8 |  11 |
lmin2     7 |  8 | 10 | 51 |  12 |  13 |  17 |  15 |  16 |  20 |  10 |  12 |
(Significance level: 0.010000; Minimal displayed length: 6)
There are no charge runs or patterns exceeding the given minimal lengths.
Run count statistics: + runs >= 3: 1, at 179; - runs >= 3: 0 * runs >= 5: 1, at 177; 0 runs >= 25: 0 -------------------------------------------------------------------------------- DISTRIBUTION OF OTHER AMINO ACID TYPES 1. HIGH SCORING SEGMENTS. There are no high scoring hydrophobic segments. There are no high scoring transmembrane segments. 2. SPACINGS OF C. H2N-10-C-21-C-4-C-39-C-113-C-210-C-25-COOH 2*. SPACINGS OF C and H. (additional deluxe function for ALEX) H2N-10-C-12-H-8-C-4-C-2-H-36-C-15-H-30-H-16-H-7-H-16-H-24-C-28-H-H-57-H-17-H-28-H-75-C-25-COOH -------------------------------------------------------------------------------- REPETITIVE STRUCTURES. A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet. Repeat core block length: 4 Aligned matching blocks: [ 22- 25] TIHT [ 296- 299] TIHT ______________________________ [ 111- 114] EVTV [ 346- 349] EVTV ______________________________ [ 198- 201] FNQL [ 321- 324] FNQL B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet. (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C) Repeat core block length: 8 -------------------------------------------------------------------------------- MULTIPLETS. A. AMINO ACID ALPHABET. 1. Total number of amino acid multiplets: 29 (Expected range: 10-- 40) 2. Histogram of spacings between consecutive amino acid multiplets: (1-5) 6 (6-10) 9 (11-20) 10 (>=21) 5 3. Clusters of amino acid multiplets (cmin = 12/30 or 15/45 or 18/60): none B. CHARGE ALPHABET. 1. Total number of charge multiplets: 12 (Expected range: 0-- 18) 4 +plets (f+: 9.1%), 8 -plets (f-: 12.4%) Total number of charge altplets: 8 (Critical number: 21) 2. Histogram of spacings between consecutive charge multiplets: (1-5) 2 (6-10) 3 (11-20) 3 (>=21) 5 -------------------------------------------------------------------------------- PERIODICITY ANALYSIS. A. AMINO ACID ALPHABET (core: 4; !-core: 5) Location Period Element Copies Core Errors 43- 78 9 L........ 
4 4 0 318- 349 8 E....... 4 4 0 343- 378 9 L........ 4 4 0 B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6) and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core:10) Location Period Element Copies Core Errors 23- 64 7 i.0.0.0 6 6 /0/./2/./1/./2/ 65- 109 9 -.0.000.0 5 5 /0/./1/./0/1/0/./0/ 168- 185 3 *.. 6 6 0 343- 423 9 i......0. 8 6 /1/././././././2/./ 361- 393 3 i.. 10 8 1 368- 397 5 i.... 6 6 0 375- 422 8 i..0.... 6 6 /0/././2/././././ 387- 393 1 i 7 7 0 -------------------------------------------------------------------------------- SPACING ANALYSIS. There are no unusual spacings. ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Pfam (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/Pfam Sequence file: T00731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- Peptidase_C13 Peptidase C13 family 627.4 8.1e-185 1 NodD_C_term NodD transcription activator carboxyl t 7.5 0.39 1 Birna_VP3 Birnavirus VP3 protein 3.9 3.5 1 MATH MATH domain 1.8 34 1 Peptidase_A3 Cauliflower mosaic virus peptidase (A3) 0.5 66 1 SAP SAP domain -3.9 53 1 DUF190 Uncharacterized ACR, COG1993 -42.9 56 1 CMAS Cyclopropane-fatty-acyl-phospholipid sy -106.7 32 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- Birna_VP3 1/1 165 179 .. 244 258 .] 3.9 3.5 DUF190 1/1 113 195 .. 1 103 [] -42.9 56 CMAS 1/1 103 239 .. 
1 174 [] -106.7 32 NodD_C_term 1/1 263 282 .. 1 20 [. 7.5 0.39 Peptidase_C13 1/1 2 329 .. 1 364 [] 627.4 8.1e-185 Peptidase_A3 1/1 308 340 .. 172 207 .. 0.5 66 MATH 1/1 331 351 .. 136 159 .] 1.8 34 SAP 1/1 321 351 .. 1 35 [] -3.9 53 Alignments of top-scoring domains: Birna_VP3: domain 1 of 1, from 165 to 179: score 3.9, E = 3.5 *->QmkdLrhlarqmkrr<-* Q dL ++++qmk + T00731 165 QSHDLADAVKQMKEK 179 DUF190: domain 1 of 1, from 113 to 195: score -42.9, E = 56 *->vkkklLrIYtsEddkfEGkplYkalverLkeSeGirGATVlrGIaGF + + Lr+ t+ +E v r k+ +G l G T00731 113 TVENFLRVLTGR---HE------NAVPRSKRLLSDEGSHILLYMTGH 150 GkkkevhsedlfrLsveLPVvvEvVDeeekIkrvLeeikel..iknhGLI G++ d + L + + ++ ++ ke++++k +I T00731 151 GGDEFLKFQDAEELQS------------HDLADAVKQMKEKrrFKE-LMI 187 TlEdvkVl<-* + ++++ T00731 188 MVDTCQAA 195 CMAS: domain 1 of 1, from 103 to 239: score -106.7, E = 32 *->kevlLqDwedfdepvDrIVSvGaFEHvGGhenYdtFFkklyrilpad +ev+++++e e++ r+ + G +++ +++r+l ++ T00731 103 VEVDYRGYEVTVENFLRVLT-------G---RHENAVPRSKRLLSDE 139 GlmLLHtItslhpkelserGlkltmslaRFlkFIdkyIFPGGeLPs...i G H + + e FlkF d + eL s++ T00731 140 GS---HILLYMTGHGGDE-----------FLKFQDAE-----ELQShdlA 170 emIvesaqeaGFt..vedvqsLrpHYAkTLdlWaenLqank..deAialg ++ + ++ F++ +++v+ + a++L ++++++++ a+g T00731 171 DAVKQMKEKRRFKelMIMVDTCQ----------AATLFNQLqsPGVLAIG 210 qsEevyrmymlYLtGCakaFRkGyidvhQftltK<-* s + + y + L +G v ft+ T00731 211 SSLKGENSYSHHLDS-----DIGVSVVDRFTYYT 239 NodD_C_term: domain 1 of 1, from 263 to 282: score 7.5, E = 0.39 *->pelfmssaHprakLFeerlV<-* p l ms+a+ r L++ +lV T00731 263 PRLLMSTAYYRTDLYQPHLV 282 Peptidase_C13: domain 1 of 1, from 2 to 329: score 627.4, E = 8.1e-185 *->avfllvvLlilavvaaRdnfgdnislpsEevkffrDddghTnnWAVL ++++lv+Ll+++ +++s++ ++++ hTnnWAVL T00731 2 KILTLVMLLCYS---------FVSSTG--DTTI------HTNNWAVL 31 VAGSnGwfNYRHqA.fifDVChaYqllkrlGipDEnIIvmmyDDIAcNer V++S++++ ++ + +fif+++++++++krlGipDE+II+m++DD+AcN+r T00731 32 VCTSRFCSLHSLVLtFIFSLLGVSRTVKRLGIPDERIILMLADDMACNAR 81 NPrPGvviNhpnngtDvYggdVpvDYrGeeVTveNFlrVLtGdksavtgg 
N++P++v+N++n+++++Yg++V+vDYrG+eVTveNFlrVLtG++++++++ T00731 82 NEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRVLTGRHENAVPR 131 SGKvLlSdpnDhIFIYyTDHGGpGvLkFPdseeLyakDLadalkkmhekk +K+LlSd+++hI++Y+T+HGG+++LkF+d+eeL+++DLada+k+m+ek+ T00731 132 -SKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR 180 rYkeLvfyiEACeSGSmFegllspdLNIyAtTASnagEsSYstycDgdip r+keL++++++C+++++F++l+sp +++A+++S++gE+SYs+++D+di+ T00731 181 RFKELMIMVDTCQAATLFNQLQSP--GVLAIGSSLKGENSYSHHLDSDIG 228 sPPpvyvtcLgDlYSvaWlEdsekHnlskeTLqqqYksvkkrtclynysy v+v++++++Y++a++E++++ +++++L+++++s+++r+ + T00731 229 ----VSVVDRFTYYTLAFFERLNI--YDNASLNSLFRSYDPRLLM----- 267 GSHVmqygDlyisklklvlftgffpavsNftivdepplrkplevvnqrDa S++++++Dly+++l++v++t+ff++v++ ti+++++ ++++++++++ T00731 268 -STAYYRTDLYQPHLVEVPVTNFFGSVME-TIHTDSA---YKAFSSKISE 312 dLhtlwrkyqlanngsek<-* +++++++++++ ++ +++ T00731 313 RKINSEMPFNQLSE-HDL 329 Peptidase_A3: domain 1 of 1, from 308 to 340: score 0.5, E = 66 *->cslnpgdelgEeeklfntiivkiqlIEpLlEkNVcS<-* +++++ +++ +e+ fn+++ + +l+E+L N+++ T00731 308 SKISE-RKIN-SEMPFNQLS-EHDLKEELENTNIPN 340 MATH: domain 1 of 1, from 331 to 351: score 1.8, E = 34 *->ddLeddyngylvdDsiiiEaeVkI<-* ++Le+ + +++D +i E++V T00731 331 EELENTN---IPNDELIAEVTVYT 351 SAP: domain 1 of 1, from 321 to 351: score -3.9, E = 53 *->lskLkVseLKeeLkkrGLstsGkKaeLveRLkeal<-* ++ L+ +LKeeL+ +++ eL+ + + T00731 321 FNQLSEHDLKEELENTNIPND----ELIAEVTVYT 351 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: T00731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. 
Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- Peptidase_C13 Peptidase C13 family 625.6 2.8e-184 1 NodD_C_term NodD transcription activator carboxyl t 7.5 0.39 1 Birna_VP3 Birnavirus VP3 protein 3.9 3.5 1 LEM LEM domain 3.6 25 1 MATH MATH domain 1.8 34 1 DUF140 Domain of unknown function DUF140 1.0 36 1 UCR_hinge Ubiquinol-cytochrome C reductase hinge 0.9 75 1 TP_methylase Tetrapyrrole (Corrin/Porphyrin) Methyla 0.5 41 1 Peptidase_A3 Cauliflower mosaic virus peptidase (A3) 0.5 66 1 Nitrophorin Nitrophorin -0.2 52 1 CMAS Cyclopropane-fatty-acyl-phospholipid sy -0.3 80 1 EAV_env_prot Equine arteritis virus small envelope g -1.7 89 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- EAV_env_prot 1/1 35 50 .. 1 17 [. -1.7 89 DUF140 1/1 36 54 .. 248 266 .] 1.0 36 TP_methylase 1/1 58 77 .. 207 226 .] 0.5 41 CMAS 1/1 103 111 .. 1 9 [. -0.3 80 Birna_VP3 1/1 165 179 .. 244 258 .] 3.9 3.5 UCR_hinge 1/1 189 202 .. 52 65 .] 0.9 75 Nitrophorin 1/1 249 257 .. 173 181 .] -0.2 52 NodD_C_term 1/1 263 282 .. 1 20 [. 7.5 0.39 Peptidase_C13 1/1 2 329 .. 1 364 [] 625.6 2.8e-184 LEM 1/1 318 339 .. 1 22 [. 3.6 25 Peptidase_A3 1/1 308 340 .. 172 207 .. 0.5 66 MATH 1/1 331 351 .. 136 159 .] 
1.8 34 Alignments of top-scoring domains: EAV_env_prot: domain 1 of 1, from 35 to 50: score -1.7, E = 89 *->FsflCylHWLLLLcfFs<-* s C lH L+L ++Fs T00731 35 -SRFCSLHSLVLTFIFS 50 DUF140: domain 1 of 1, from 36 to 54: score 1.0, E = 36 *->VtsllvifildfvlTaimf<-* ++++l ++l+f+++++++ T00731 36 RFCSLHSLVLTFIFSLLGV 54 TP_methylase: domain 1 of 1, from 58 to 77: score 0.5, E = 41 *->venatkpderilrgtLgeia<-* v++++ pderi+ + + ++a T00731 58 VKRLGIPDERIILMLADDMA 77 CMAS: domain 1 of 1, from 103 to 111: score -0.3, E = 80 *->kevlLqDwe<-* +ev+++++e T00731 103 VEVDYRGYE 111 Birna_VP3: domain 1 of 1, from 165 to 179: score 3.9, E = 3.5 *->QmkdLrhlarqmkrr<-* Q dL ++++qmk + T00731 165 QSHDLADAVKQMKEK 179 UCR_hinge: domain 1 of 1, from 189 to 202: score 0.9, E = 75 *->lDhCvaaKlFdsLK<-* +D C aa lF+ L T00731 189 VDTCQAATLFNQLQ 202 Nitrophorin: domain 1 of 1, from 249 to 257: score -0.2, E = 52 *->YDdvqltSL<-* YD+ +l SL T00731 249 YDNASLNSL 257 NodD_C_term: domain 1 of 1, from 263 to 282: score 7.5, E = 0.39 *->pelfmssaHprakLFeerlV<-* p l ms+a+ r L++ +lV T00731 263 PRLLMSTAYYRTDLYQPHLV 282 Peptidase_C13: domain 1 of 1, from 2 to 329: score 625.6, E = 2.8e-184 *->avfllvvLlilavvaaRdnfgdnislpsEevkffrDddghTnnWAVL ++++lv+Ll+++ +++s++ ++++ hTnnWAVL T00731 2 KILTLVMLLCYS---------FVSSTG--DTTI------HTNNWAVL 31 VAGSnGwfNYRHqA.fifDVChaYqllkrlGipDEnIIvmmyDDIAcNer V++S++++ ++ + +fif+++++++++krlGipDE+II+m++DD+AcN+r T00731 32 VCTSRFCSLHSLVLtFIFSLLGVSRTVKRLGIPDERIILMLADDMACNAR 81 NPrPGvviNhpnngtDvYggdVpvDYrGeeVTveNFlrVLtGdksavtgg N++P++v+N++n+++++Yg++V+vDYrG+eVTveNFlrVLtG++++++++ T00731 82 NEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRVLTGRHENAVPR 131 SGKvLlSdpnDhIFIYyTDHGGpGvLkFPdseeLyakDLadalkkmhekk +K+LlSd+++hI++Y+T+HGG+++LkF+d+eeL+++DLada+k+m+ek+ T00731 132 -SKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVKQMKEKR 180 rYkeLvfyiEACeSGSmFegllspdLNIyAtTASnagEsSYstycDgdip r+keL++++++C+++++F++l+sp +++A+++S++gE+SYs+++D+di+ T00731 181 RFKELMIMVDTCQAATLFNQLQSP--GVLAIGSSLKGENSYSHHLDSDIG 228 
sPPpvyvtcLgDlYSvaWlEdsekHnlskeTLqqqYksvkkrtclynysy v+v++++++Y++a++E++++ +++++L+++++s+++r+ + T00731 229 ----VSVVDRFTYYTLAFFERLNI--YDNASLNSLFRSYDPRLLM----- 267 GSHVmqygDlyisklklvlftgffpavsNftivdepplrkplevvnqrDa S++++++Dly+++l++v++t+ff++v++ ti+++++ ++++++++++ T00731 268 -STAYYRTDLYQPHLVEVPVTNFFGSVME-TIHTDSA---YKAFSSKISE 312 dLhtlwrkyqlanngsek<-* +++++++++++ ++ +++ T00731 313 RKINSEMPFNQLSE-HDL 329 LEM: domain 1 of 1, from 318 to 339: score 3.6, E = 25 *->mldvaqLsDaELrseLrkyGis<-* ++ qLs+ +L++eL +i+ T00731 318 EMPFNQLSEHDLKEELENTNIP 339 Peptidase_A3: domain 1 of 1, from 308 to 340: score 0.5, E = 66 *->cslnpgdelgEeeklfntiivkiqlIEpLlEkNVcS<-* +++++ +++ +e+ fn+++ + +l+E+L N+++ T00731 308 SKISE-RKIN-SEMPFNQLS-EHDLKEELENTNIPN 340 MATH: domain 1 of 1, from 331 to 351: score 1.8, E = 34 *->ddLeddyngylvdDsiiiEaeVkI<-* ++Le+ + +++D +i E++V T00731 331 EELENTN---IPNDELIAEVTVYT 351 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib Sequence file: T00731.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana. 
Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Prosite

---------------------------------------------------------
| ppsearch (c) 1994 EMBL Data Library                   |
| based on MacPattern (c) 1990-1994 R. Fuchs            |
---------------------------------------------------------

PROSITE pattern search started: Thu Feb 21 12:39:30 2002

Sequence file: T00731.fa

----------------------------------------
Sequence T00731 (428 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
  251: NASL
  369: NLSR
Total matches: 2

Matching pattern PS00005 PKC_PHOSPHO_SITE:
   34: TSR
   57: TVK
  122: TGR
  132: SKR
  212: SLK
  307: SSK
  311: SER
  396: TNR
Total matches: 8

Matching pattern PS00006 CK2_PHOSPHO_SITE:
   17: STGD
  230: SVVD
  292: SVME
  325: SEHD
  377: SMID
Total matches: 5

Matching pattern PS00008 MYRISTYL:
   53: GVSRTV
Total matches: 1

Total no of hits in this sequence: 16

========================================

1314 pattern(s) searched in 1 sequence(s), 428 residues.
Total no of hits in all sequences: 16.
Search time: 00:00 min

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with Profile Search

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

Start with motif search against own library

***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;

>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>ER-GOLGI-traffic signal is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M minimal SH3 binding is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
;LENGTH=428; DIRECT_SEQUENCE
n 1 solutions
m %_RSXXP 259-263 f
>STATISTICS Total : 1 solutions in 1 sequences, 428 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>EXTRACELL-M minimal furin protease cleavage site motif is the MOTIF name
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
;LENGTH=428; DIRECT_SEQUENCE
n 2 solutions
m %_RXXR 131-134 f
m %_RXXR 420-423 f
>STATISTICS Total : 2 solutions in 1 sequences, 428 units; out of 1 sequences, 428 units
>EXTRACELL-M extended furin protease cleavage site motif is the MOTIF name
>T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
;LENGTH=428; DIRECT_SEQUENCE
n 1 solutions
m %_RX 131-132 %_K 133-133 %_R 134-134 f
>STATISTICS Total : 1 solutions in 1 sequences, 428 units; out of 1 sequences, 428 units
>EXTRACELL-M zinc binding motif in MMPs is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>EXTRACELL-M g alpha binding go loco is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS SV40 LrgT experimentally determined is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hGlu.cort.experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Nucloplasmin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS Max experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>PDZ domain binding motif science 278_2075_pawson is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units
>WW domain binding motif science 278_2075_pawson is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 428 units

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~ ~~~

Start with HMM-search search against own library

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm.lib
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/own/own-hmm-f.lib
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

L. Aravind's signalling DB+ PSSM from other authors

IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
"PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
         (428 letters)

Searching..................................done

Results from profile search

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

CYCLIN Cyclin/TFIIB domain                                          25  0.14
MATH The Meprin associated TRAF homology domain                     22  1.2
SEC14D Sec14 related lipid binding domain                           21  3.6
INSL Insulinase like Metallo protease domain                        21  3.7
AN1 AN1 like cysteine rich zinc coordinating domain                 21  3.9
CATH Cathepsin like protease domain                                 21  4.6
KIN Protein kinase domain                                           20  5.6
UBHYD Ubiquitin C-terminal hydrolase domain                         20  5.7
FKBP FK506 binding protein (Peptidyl prolyl isomerase)              20  5.9
CYCL cyclophilin like peptidyl prolyl isomerases                    20  6.5
14-3-3 14-3-3 protein alpha Helical domain                          20  9.1

>CYCLIN Cyclin/TFIIB domain
          Length = 317

 Score = 25.4 bits (55), Expect = 0.14
 Identities = 8/44 (18%), Positives = 8/44 (18%)

Query: 229 VSVVDRFTYYTLAFFERLNIYDNASLNSLFRSYDPRLLMSTAYY 272
Sbjct: 61  FCSVFKPAMPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAF 104

>MATH The Meprin associated TRAF homology domain
          Length = 209

 Score = 22.1 bits (47), Expect = 1.2
 Identities = 7/34 (20%), Positives = 7/34 (20%)

Query: 321 FNQLSEHDLKEELENTNIPNDELIAEVTVYTLFP 354
Sbjct: 133 FKKFIRRDFLLDEANGLLPDDKLTLFCEVSVVQD 166

>SEC14D Sec14 related lipid binding domain
          Length = 248

 Score = 21.0 bits (44), Expect = 3.6
 Identities = 15/91 (16%), Positives = 15/91 (16%), Gaps = 10/91 (10%)

Query: 111 EVTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLA 170
Sbjct: 101 EITFDEILQAYCFILEKLLENEETQI--NGFCIIENFKG------FTMQQAASLRTSDLR 152

Query: 171 DAVKQMKEKRRFKELMIMVDTCQAATLFNQL 201
Sbjct: 153 KMVDMLQD--SFPARFKAIHFIHQPWYFTTT 181

>INSL Insulinase like Metallo protease domain
          Length = 433

 Score = 20.6 bits (43), Expect = 3.7
 Identities = 8/76 (10%), Positives = 8/76 (10%), Gaps = 12/76 (15%)

Query: 104 EVDYRGYEVTV-ENFLRVLTGRHENAVPRSKRLLSDEGS-----------HILLYMTGHG 151
Sbjct: 15  VLTAQELYIRDLPNGAKLIVKPRDDTEAVALHVWFRVGSVYEKYDEKGMAHFLEHMLFNG 74

Query: 152 GDEFLKFQDAEELQSH 167
Sbjct: 75  TEKYKYGEIDRIIESL 90

>AN1 AN1 like cysteine rich zinc coordinating domain
          Length = 57

 Score = 20.8 bits (43), Expect = 3.9
 Identities = 5/11 (45%), Positives = 5/11 (45%)

Query: 33  CTSRFCSLHSL 43
Sbjct: 27  CSRRYCLSHHL 37

>CATH Cathepsin like protease domain
          Length = 371

 Score = 20.6 bits (43), Expect = 4.6
 Identities = 9/89 (10%), Positives = 9/89 (10%), Gaps = 13/89 (14%)

Query: 15  VSSTGDTTIHTNNWAVLVCT-SRFCSLHSLVLTFIFSLLGVSRTVKRLGIPDERIILMLA 73
Sbjct: 235 VDVDNGLTVCKDGCEAIVDTGTSLITGPTDEIKQLQKAIGAKPIIKGQYMLP-------- 286

Query: 74  DDMACNARNEYPAQVFNNENHKLNLYGDN 102
Sbjct: 287 ----CDKLSSLPNVNLVLGGKSYALTPNQ 311

>KIN Protein kinase domain
          Length = 313

 Score = 20.0 bits (41), Expect = 5.6
 Identities = 11/51 (21%), Positives = 11/51 (21%)

Query: 290 FGSVMETIHTDSAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPN 340
Sbjct: 42  YGVVCSAKDNLTGEKVAIKKISKAFDNLKDTKRTLREIHLLRHFKHENLIS 92

 Score = 20.0 bits (41), Expect = 5.9
 Identities = 14/83 (16%), Positives = 14/83 (16%), Gaps = 9/83 (10%)

Query: 162 EELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGEN---- 217
Sbjct: 112 SELMDTDLHQ---IITSPQPLSDDHCQYFVYQMLRGLKHIHSANV--LHRDLKPSNLLIN 166

Query: 218 SYSHHLDSDIGVSVVDRFTYYTL 240
Sbjct: 167 EDCLLKICDLGLARVEDATHQGF 189

>UBHYD Ubiquitin C-terminal hydrolase domain
          Length = 884

 Score = 19.9 bits (41), Expect = 5.7
 Identities = 10/53 (18%), Positives = 10/53 (18%), Gaps = 6/53 (11%)

Query: 283 EVPVTNFFGSVMETI--HTDSAYKAFSSKISERKINSEMPFNQLSEHDLKEEL 333
Sbjct: 274 EAIEHNYGGHDDDLSVRHCTNAYMLV----YIRESKLSEVLQAVTDHDIPQQL 322

>FKBP FK506 binding protein (Peptidyl prolyl isomerase)
          Length = 149

 Score = 20.2 bits (42), Expect = 5.9
 Identities = 2/11 (18%), Positives = 2/11 (18%)

Query: 100 GDNVEVDYRGY 110
Sbjct: 8   NSAVLVHFTLK 18

>CYCL cyclophilin like peptidyl prolyl isomerases
          Length = 165

 Score = 20.1 bits (42), Expect = 6.5
 Identities = 9/28 (32%), Positives = 9/28 (32%), Gaps = 7/28 (25%)

Query: 96  LNLYGDNVEVDYRGYEVTVENFLRVLTG 123
Sbjct: 22  FELFADKV-------PKTAENFRALSTG 42

>14-3-3 14-3-3 protein alpha Helical domain
          Length = 270

 Score = 19.5 bits (40), Expect = 9.1
 Identities = 8/22 (36%), Positives = 8/22 (36%), Gaps = 1/22 (4%)

Query: 2   KILTLVM-LLCYSFVSSTGDTT 22
Sbjct: 217 KDSTLIMQLLRDNLTLWTSDAE 238

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 11
Number of calls to ALIGN: 12
Length of query: 428
Total length of test sequences: 20182
Effective length of test sequences: 16435.0
Effective search space size: 6453006.9
Initial X dropoff for ALIGN: 25.0 bits

Y. Wolf's SCOP PSSM

IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
"PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= T00731 hypothetical protein F22O13.26 - Arabidopsis thaliana.
         (428 letters)

Searching.................................................done

Results from profile search

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

gi|478310 [306..563] beta/alpha (TIM)-barrel                        31  0.045
gi|1790450 [741..896] Flavodoxin-like                               26  0.86
gi|999515 [1..176] NAD(P)-binding Rossmann-fold domains             26  1.2
gi|2088870 [834..1045] (Phosphotyrosine) protein phosphatase...     25  2.1
gi|730428 [2..438] P-loop containing nucleotide triphosphate...     25  2.4
gi|1942733 [1..168] Lysozyme-like                                   25  2.7
gi|2194029 [140..303] Ferredoxin reductase-like, C-terminal ...     24  3.6
gi|1172572 [1..540] Phosphoenolpyruvate carboxykinase (ATP-o...     24  4.7
gi|2707940 [1..184] Ribonuclease H-like motif                       24  5.0
gi|555731 [3..383] Serpins                                          24  5.7
gi|1817676 [157..389] Protein kinases (PK), catalytic core          24  6.3
gi|1902913 [26..315] Protein kinases (PK), catalytic core           24  7.5
gi|1708972 [46..317] FAD/NAD(P)-binding domain                      23  9.5

>gi|478310 [306..563] beta/alpha (TIM)-barrel
          Length = 258

 Score = 30.8 bits (69), Expect = 0.045
 Identities = 13/138 (9%), Positives = 13/138 (9%), Gaps = 14/138 (10%)

Query: 247 NIYDNASLNSLFRSYDPRLLMSTAYYRTDLYQPHLVEVPVTNFFGSVMETIHTDSAYKAF 306
Sbjct: 127 QSTQGGYFQTALNVKDILTVVNMQYYNSGTMLGC-----DGKVYAQGTVDFLTALACIQL 181

Query: 307 SSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTVYTLFPGLSYFGLSTLLR 366
Sbjct: 182 EGGLAPSQVGLGLPASTRA-------AGGGYVSPSVVNAALD--CLTKATNCGSFKPSKT 232

Query: 367 YMNLSRVRVLSMIDDVFA 384
Sbjct: 233 YPDLRGAMTWSTNWDATA 250

>gi|1790450 [741..896] Flavodoxin-like
          Length = 156

 Score = 26.5 bits (58), Expect = 0.86
 Identities = 15/101 (14%), Positives = 15/101 (14%), Gaps = 20/101 (19%)

Query: 98  LYGDN-VEVDYRGYEVTVENFLRV-------------LTGRHENAVPRSKRLLSDEGSHI 143
Sbjct: 29  VLQCNNYEIVDLGVMVPAEKILRTAKEVNADLIGLSGLITPSLDEMVNVAKEMERQGFTI 88

Query: 144 LLYMTGHGGDEFLKFQDAEELQSH------DLADAVKQMKE 178
Sbjct: 89  PLLIGGATTSKAHTAVKIEQNYSGPTVYVQNASRTVGVVAA 129

>gi|999515 [1..176] NAD(P)-binding Rossmann-fold domains
          Length = 176

 Score = 25.9 bits (56), Expect = 1.2
 Identities = 26/94 (27%), Positives = 26/94 (27%), Gaps = 19/94 (20%)

Query: 140 GSHILLYMTGHG------------GDEFL--KFQDAEELQSHDLADAVKQMKEKRRFKEL 185
Sbjct: 14  GQNLILNMNDHGFVVCAFNRTVSKVDDFLANEAKGTKVLGAHSLEEMVSKLKKPRR---I 70

Query: 186 MIMVDTCQAATLFNQLQSPGVLAIGS-SLKGENS 218
Sbjct: 71  ILLVKAGQAVDNFIEKLVP-LLDIGDIIIDGGNS 103

>gi|2088870 [834..1045] (Phosphotyrosine) protein phosphatases II
          Length = 212

 Score = 25.4 bits (55), Expect = 2.1
 Identities = 9/66 (13%), Positives = 9/66 (13%), Gaps = 2/66 (3%)

Query: 67  RIILMLADDM-ACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVENFLRVLTGRH 125
Sbjct: 49  TQLVMMCDFDEKCPGKTNSCARYYPESVGESMKFK-NLTVECKSKIAEKDFETRELEVKF 107

Query: 126 ENAVPR 131
Sbjct: 108 DGHEPH 113

>gi|730428 [2..438] P-loop containing nucleotide triphosphate hydrolases
          Length = 437

 Score = 24.9 bits (54), Expect = 2.4
 Identities = 19/90 (21%), Positives = 19/90 (21%), Gaps = 11/90 (12%)

Query: 67  RIILMLADDMACNAR-----NEYPAQVFNNENHKLNL------YGDNVEVDYRGYEVTVE 115
Sbjct: 18  KIVDLLTEDAKYVVRYQGGHNAGHTLVIDGEKTVLHLIPSGILRDNVKCVIGNGVVLSPE 77

Query: 116 NFLRVLTGRHENAVPRSKRLLSDEGSHILL 145
Sbjct: 78  ALLKEMKPLEERGIPVRERLFISEACPLIL 107

>gi|1942733 [1..168] Lysozyme-like
          Length = 168

 Score = 24.8 bits (53), Expect = 2.7
 Identities = 16/58 (27%), Positives = 16/58 (27%), Gaps = 3/58 (5%)

Query: 160 DAEELQSHDLADAVKQMKEKRRFKELMIMVDTCQAATLFN---QLQSPGVLAIGSSLK 214
Sbjct: 62  EAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNSLR 119

>gi|2194029 [140..303] Ferredoxin reductase-like, C-terminal NADP-linked domain
          Length = 164

 Score = 24.5 bits (53), Expect = 3.6
 Identities = 10/55 (18%), Positives = 10/55 (18%), Gaps = 5/55 (9%)

Query: 133 KRLLSDEGSHILLYMTGHGG--DEFLK-FQDAEELQSHDLADAVKQMKEKRRFKE 184
Sbjct: 109 WQLIKNQKTHT--YICGLRGMEEGIDAALSAAAAKEGVTWSDYQKDLKKAGRWHV 161

>gi|1172572 [1..540] Phosphoenolpyruvate carboxykinase (ATP-oxaloacetate carboxy-liase)
          Length = 540

 Score = 24.2 bits (52), Expect = 4.7
 Identities = 20/90 (22%), Positives = 20/90 (22%), Gaps = 7/90 (7%)

Query: 98  LYGDNVEVDYRGYE---VTVENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMT--GHGG 152
Sbjct: 32  LYQEELDPSLTGYERGVLTNLGAVAVDTGIFTGRSPKDKYIVRDDTTRDTFWWADKGKGK 91

Query: 153 DEFLKFQDAEELQSHDLADAVKQMKEKRRF 182
Sbjct: 92  NDNKPL--SPETWQHLKGLVTRQLSGKRLF 119

>gi|2707940 [1..184] Ribonuclease H-like motif
          Length = 184

 Score = 24.1 bits (52), Expect = 5.0
 Identities = 7/43 (16%), Positives = 7/43 (16%), Gaps = 6/43 (13%)

Query: 73  ADDMACNARNEYPAQVFNNENHKLNLYGDNVEVDYRGYEVTVE 115
Sbjct: 77  LDEYLNRLKPHYSVRLIKIGS------GLNETVSIGNFGGTVK 113

>gi|555731 [3..383] Serpins
          Length = 381

 Score = 23.8 bits (51), Expect = 5.7
 Identities = 10/50 (20%), Positives = 10/50 (20%), Gaps = 1/50 (2%)

Query: 292 SVMETIHTDSAYKAFSSKISERKINSEMP-FNQLSEHDLKEELENTNIPN 340
Sbjct: 247 GAIEVLNGNKILSHYVDKLEETSVSLKMPKFTLTKKLQLVGTLKSIGIKN 296

>gi|1817676 [157..389] Protein kinases (PK), catalytic core
          Length = 233

 Score = 23.5 bits (49), Expect = 6.3
 Identities = 11/112 (9%), Positives = 11/112 (9%), Gaps = 1/112 (0%)

Query: 290 FGSVMETIHTDSAYKAFSSKISERKINSEMPFNQLSEHDLKEELENTNIPNDELIAEVTV 349
Sbjct: 6   LGWIYLALDRNVNGRPVVLKGLVHSGDAEAQAMAMAERQFLAEVVHPSIVQIF-NFVEHT 64

Query: 350 YTLFPGLSYFGLSTLLRYMNLSRVRVLSMIDDVFAFWLVFVLLLDSTNRIEI 401
Sbjct: 65  DRHGDPVGYIVMEYVGGQSLKRSKGQKLPVAEAIAYLLEILPALSYLHSIGL 116

>gi|1902913 [26..315] Protein kinases (PK), catalytic core
          Length = 290

 Score = 23.5 bits (49), Expect = 7.5
 Identities = 19/147 (12%), Positives = 19/147 (12%), Gaps = 10/147 (6%)

Query: 115 ENFLRVLTGRHENAVPRSKRLLSDEGSHILLYMTGHGGDEFLKFQDAEELQSHDLADAVK 174
Sbjct: 56  AIIQEVCFLKKLSGHPNIVQFC-----SAASIGKEESDTGQAEFLLLTELCKGQLVEFLR 110

Query: 175 QMKEKRRFKELMIMVDTCQAATLFNQLQSPGVLAIGSSLKGENSYS----HHLDSDIGVS 230
Sbjct: 111 RVECKGPLSCDSILKIFYQTCRAVQHMHRQKPPIIHRDLKVENLLLSNQGTIKLCDFGSA 170

Query: 231 -VVDRFTYYTLAFFERLNIYDNASLNS 256
Sbjct: 171 TTISHYPDYSWSAQKRAMVEEEITRNT 197

>gi|1708972 [46..317] FAD/NAD(P)-binding domain
          Length = 272

 Score = 23.0 bits (48), Expect = 9.5
 Identities = 7/36 (19%), Positives = 7/36 (19%), Gaps = 4/36 (11%)

Query: 29  AVLVCTSRFCSLHSLVLTFIFSLLGVSRTVKRLGIP 64
Sbjct: 235 EVILSAGPIGSPQLLLLSGV----GPESYLTSLNIS 266

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 13
Number of calls to ALIGN: 13
Length of query: 428
Total length of test sequences: 256703
Effective length of test sequences: 207231.0
Effective search space size: 80033993.1
Initial X dropoff for ALIGN: 25.0 bits

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

calculation of internal repeats with prospero

***** PROSPERO v1.3 Thu Feb 21 12:40:10 2002 *****
Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford
For help see http://www.well.ox.ac.uk/ariadne
For usage use -help
using gap penalty 11+1k
using matrix BLOSUM62
printing all alignments with eval < 0.100000
using sequence1 T00731
using self-comparison

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

TIGRFAM

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
        [no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
        [no hits above thresholds]

Alignments of top-scoring domains:
        [no hits above thresholds]
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/tigrfam/tigrfam.hmm-f
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
	[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
	[no hits above thresholds]

Alignments of top-scoring domains:
	[no hits above thresholds]
//

SMART

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/iprscan/data/smart.HMMs
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
SAP      Putative DNA-binding (bihelical) motif predi    -6.1         85   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
SAP        1/1     321   351 ..     1    35 []    -6.1       85

Alignments of top-scoring domains:
SAP: domain 1 of 1, from 321 to 351: score -6.1, E = 85
                   *->lskLkVseLkdeLkkrGLstsGrKaeLvkRLleal<-*
                      ++ L+ +Lk+eL+ + + + eL++ ++ ++
      T00731   321    FNQLSEHDLKEELENTNIPND----ELIAEVTVYT    351
//

COG

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
COG2965                                                 -49.1         26   1
COG2143                                                 -96.1         76   1
COG1985                                                -112.1         69   1
COG2232                                                -206.6         54   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG2965    1/1      43   144 ..     1   103 []   -49.1       26
COG2143    1/1      28   181 ..     1   242 []   -96.1       76
COG1985    1/1      43   237 ..     1   234 []  -112.1       69
COG2232    1/1       1   336 [.     1   399 []  -206.6       54

Alignments of top-scoring domains:
COG2965: domain 1 of 1, from 43 to 144: score -49.1, E = 26
                   *->mGmtNrvsLsGvVekapvrrkSPSGIphcdfiLeHRStQeEaGfqRq
                      + +t sL G V++ +r+ GIp +iL + + +
      T00731    43    LVLTFIFSLLG-VSRTVKRL----GIPDERIIL-----MLADDMACN 79

                   vwlemPVriSGrqae.........eltqsitqGSkIlVeGFlaqhkrr..
                   + e P+++ +++ + + +++ e + G ++Ve Fl + r++
      T00731    80 ARNEYPAQVFNNENHklnlygdnvEVDY---RGYEVTVENFLRVLTGRhe 126

                   sGlpk.LvLhAeQiekID<-*
                   + +p++ +L + + I
      T00731   127 NAVPRsKRLLSDEGSHIL    144

COG2143: domain 1 of 1, from 28 to 181: score -96.1, E = 76
                   *->mFSLSYvmRvl.lilLliislFllAcksdNKDKLDENLLSSGsqSSK
                      L +R+ +l L + ++F+l s
      T00731    28    WAVLVCTSRFCsLHSLVLTFIFSLLGVSR------------------ 56

                   ELfekksnldKKSYAGLEDlvedlksikpedKYlllmFeseeCiYCeklK
                   ++ +p ++ l+ C +
      T00731    57 ---------------------TVKRLGIPDERIILMLADDMACNARNEYP 85

                   KdvfnkkrlrEylkehFsiveldikdsk.pvkfkvGdkg.NdEKeeklSe
                   vfn + + l + + ve+d+ +++ +v + ++ + E S
      T00731    86 AQVFNNENHKLNLYG--DNVEVDYRGYEvTVENFLRVLTgRHENAVPRS- 132

                   kELArkfkVrsTPtfvFfDkkGkkIlelPGYlPpeeFllvlkYVaeekyk
                   ++D +l++ G + ++
      T00731   133 -------------KRLLSDEGSHILLYMTGH-------------GGDEFL 156

                   dtktYLKKDDPFVGEPLiiEiFKEdeDfvkklkedikkkdtlskekrr<-
                   ++++ e + +d++k kekrr
      T00731   157 KFQD--------------------AEELQSHDLADAVKQM---KEKRR   181

                   *

      T00731     -     -

COG1985: domain 1 of 1, from 43 to 237: score -112.1, E = 69
                   *->rgrPfVilKlAmSLDGKtAtasGeSkwItgeeaRadVhrlRaesdAI
                      +l + SL G +s+ k+ ++ R+ ++
      T00731    43    -----LVLTFIFSLLG----VSRTVKRLGIPDERIILM--------- 71

                   lVGsgTVLaDn...........................PsLtvRwaelpe
                   LaD+ + +++ + + ++++++ + +++ + R++e+
      T00731    72 -------LADDmacnarneypaqvfnnenhklnlygdnVEVDYRGYEVTV 114

                   gtqryargasrqPlRVvlDsr.lrvppearvldtgeAptlvvtterapee
                   + lRV ++++ vp + r+l ++ +++l+ t++
      T00731   115 ----------ENFLRVLTGRHeNAVPRSKRLLSDEGSHILLYMTGH---- 150

                   rekkekledvgvevvvagdgrVDlkkllelLaerg.insvmVEGGgtLag
                   e l+ +e+++++d l+ ++++ +e+ +++++m+
      T00731   151 -GGDEFLKFQDAEELQSHD----LADAVKQMKEKRrFKELMI-------- 187

                   sflkegLVDElilyiAPkilGGddartlvdglgfrkladalqlakikeve
                   +++ + A ++ ++ +g+ ++ + l+ + +
      T00731   188 -MVDT--------CQAATLFN------QLQSPGVLAIGSSLKG-ENSYSH 221

                   qiGpdlkvtarvkpke<-*
                   +++ d+ v +++ ++
      T00731   222 HLDSDIGVSVVDRFTY    237

COG2232: domain 1 of 1, from 1 to 336: score -206.6, E = 54
                   *->mNNFtLFLFSCLYFisknekvLVlGvNtRpVveSakklGFeVYSvsy
                      m tL + C F+s + +N V ++ v
      T00731     1    MKILTLVMLLCYSFVSSTGDTTIHTNNW-AVLVCTSRFCSLHSLVLT 46

                   YvdaDLkaytERRcklversdeslGRlkENydeekLleiaedlaeevDai
                   ++ L + v r + + de +l a+d+a +
      T00731    47 FIFSLLGVS-----RTVKRLG--IP------DERIILMLADDMACNARNE 83

                   vvlsgafefetekVrGndNViGNGPKkvdevsnkYkkyk.rvkNLkfkip
                   + +f e+ k + d v+ Y+ y+ v N + +
      T00731    84 YP-AQVFNNENHKLNLYG----------DNVEVDYRGYEvTVEN-FLRVL 121

                   eTklikdklelyell.eeGekKyIlKPVvGaGGeeVvkieendkdfllqe
                   + +++ ll+ eG Il G GG +fl+ +
      T00731   122 TGRHENAVPRSKRLLsDEGSH--ILLYMTGHGG----------DEFLKFQ 159

                   yikGvPvsasvlarGesalavlisRnifatfkkqiiskFvYAGNmTPFiv
                   + +l a av + +fk+ +i m
      T00731   160 DAE-------ELQSHDLADAVKQM-KEKRRFKELMI--------MVD--- 190

                   eeelskeleeLaseviesf..eLkGssGVDfvl.kdkelYiveiNPRiqG
                   + L s ++ +++LkG + l+ d + +v+ R+
      T00731   191 TCQAATLFNQLQSPGVLAIgsSLKGENSYSHHLdSDIGVSVVD---RF-- 235

                   tyesvEaSldvNLvkvhleAfdgklaekvkPrky....avkrILFApadv
                   ty + N+ + +A + l +Pr ++ + + L+ p
      T00731   236 TYYTLAFFERLNIYD---NASLNSLFRSYDPRLLmstaYYRTDLYQP--- 279

                   kikenlakrdFvhDvPkkgavie..kgePLvtVLAkenskeaveslae.e
                   e + F g v e+ + + s ++s +
      T00731   280 HLVEVPVTN-FF------GSVMEtiHTDSAYKAFSSKISERKINSEMPfN 322

                   vlerekkkldleri<-*
                   l+ ++k++le+
      T00731   323 QLSEHDLKEELENT    336
//

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington
University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file:                 /data/patterns/cogs/cogs.hmm-f
Sequence file:            T00731.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:  T00731  hypothetical protein F22O13.26 - Arabidopsis thaliana.

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
COG1053                                                   2.8        1.4   1
COG1235                                                   2.7        8.9   1
COG1182                                                   2.5         25   1
COG1131                                                   2.4        8.3   1
COG2241                                                   0.1         37   1
COG2258                                                   0.1         58   1
COG0285                                                   0.0         34   1
COG0462                                                  -0.8         69   1
COG2920                                                  -0.9         98   1
COG3274                                                  -1.1         59   1
COG2414                                                  -1.8         78   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
COG2414    1/1      57    68 ..   674   685 .]    -1.8       78
COG2241    1/1      60    78 ..   216   234 ..     0.1       37
COG2920    1/1      90   110 ..     1    21 [.    -0.9       98
COG1235    1/1      99   115 ..   299   315 .]     2.7      8.9
COG0285    1/1     113   127 ..   472   486 .]     0.0       34
COG2258    1/1     100   128 ..   181   215 ..     0.1       58
COG3274    1/1     129   137 ..   357   365 .]    -1.1       59
COG1131    1/1      92   154 ..   267   325 .]     2.4      8.3
COG1053    1/1     149   202 ..   458   511 ..     2.8      1.4
COG1182    1/1     206   223 ..     1    21 [.     2.5       25
COG0462    1/1     246   259 ..   315   328 .]    -0.8       69

Alignments of top-scoring domains:
COG2414: domain 1 of 1, from 57 to 68: score -1.8, E = 78
                   *->tLkeLGledeva<-*
                      t k+LG++de++
      T00731    57    TVKRLGIPDERI    68

COG2241: domain 1 of 1, from 60 to 78: score 0.1, E = 37
                   *->rLtapdERitagtLkdlal<-*
                      rL+ pdERi +++d a+
      T00731    60    RLGIPDERIILMLADDMAC    78

COG2920: domain 1 of 1, from 90 to 110: score -0.9, E = 98
                   *->knvmtmLeyeGkeietDkdGY<-*
                      n +L++ G ++e D GY
      T00731    90    NNENHKLNLYGDNVEVDYRGY    110

COG1235: domain 1 of 1, from 99 to 115: score 2.7, E = 8.9
                   *->laeevevaydgmeiyli<-*
                      ++++vev+y+g+e+ +
      T00731    99    YGDNVEVDYRGYEVTVE    115

COG0285: domain 1 of 1, from 113 to 127: score 0.0, E = 34
                   *->lvgevlellqrkkdk<-*
                      +v ++l++l++++++
      T00731   113    TVENFLRVLTGRHEN    127

COG2258: domain 1 of 1, from 100 to 128: score 0.1, E = 58
                   *->GDplklverprepapTvlelnrllfsPHqikpknp<-*
                      GD++++++r +++Tv r+l++ + +n+
      T00731   100    GDNVEVDYRG--YEVTVENFLRVLTG----RHENA    128

COG3274: domain 1 of 1, from 129 to 137: score -1.1, E = 59
                   *->iprsnkLvs<-*
                      +prs++L+s
      T00731   129    VPRSKRLLS    137

COG1131: domain 1 of 1, from 92 to 154: score 2.4, E = 8.3
                   *->lvglkgveevvglgvgleveveeggnkvlvevd.ae.av.ellalli
                      ++ k+ ++++ + v+++ +++++n ++v++ ++e+av+ +ll+
      T00731    92    ENH-KLNLYGDNVEVDYRGYEVTVENFLRVLTGrHEnAVpRSKRLLS 137

                   .eginvlsi.veepsLE<-*
                   +eg +l + + + E
      T00731   138 dEGSHILLYmTGHGGDE    154

COG1053: domain 1 of 1, from 149 to 202: score 2.8, E = 1.4
                   *->GryaaeyakeaspskeaeseaeeerakkkeeeerldeLlkaeG.env
                      G + e++k +++++e++s++++++ k+++e++r++eL+ + ++ +
      T00731   149    GHGGDEFLK-FQDAEELQSHDLADAVKQMKEKRRFKELMIMVDtCQA 194

                   aairkelq<-*
                   a + lq
      T00731   195 ATLFNQLQ    202

COG1182: domain 1 of 1, from 206 to 223: score 2.5, E = 25
                   *->MskVLviksSirgeeSvSrqL<-*
                      VL+i sS ge+S+S+ L
      T00731   206    ---VLAIGSSLKGENSYSHHL    223

COG0462: domain 1 of 1, from 246 to 259: score -0.8, E = 69
                   *->rrihngeSVSsLFd<-*
                      ++i+ + S+ sLF+
      T00731   246    LNIYDNASLNSLFR    259
//