analysis of sequence from tem31_2 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ >tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens] AASKERFPGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWKVKLPPMMEIITAEQLMEYLG DYMLDAKPKEISEIQRLNYEQNMSDAMAILHKLQTGLDVNVRFTGVRVFEYTPECIVFDLLDIPLYHGWL VDPQIDDIVKAVGNCSYNQLVEKIISCKQSDNSELVSEGFVAEQFLNNTATQLTYHGLCELTSTVQEGEL CVFFRNNHFSTMTKYKGQLYLLVTDQGFLTEEKVVWESLHNVDGDGNFCDSEFHLRPPSDPETVYKGQQD QIDQDYLMALSLQQEQQSQEINWEQIPEGISDLELAKKLQEEEDRRASQYYQEQEQAAAAAAAASTQAQG QPAQASPSSGRQSGNSERKRKEPREKDKEKEKEKNSCVIL ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ sec.str. with predator > tem31.2t_gi|6330169|dbj|BAA86478.1| . . . . . 1 AASKERFPGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWK 50 ___________EEEEHHHHHHH____EEEE_______HHHHHHHHHHHHH . . . . . 51 VKLPPMMEIITAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAIL 100 H____HHHHHHHHHHHHHHH_________HHHHHHHHHHHHH_HHHHHHH . . . . . 101 HKLQTGLDVNVRFTGVRVFEYTPECIVFDLLDIPLYHGWLVDPQIDDIVK 150 HHHH____EEEEEE_EEEEE____EEEEEE_________________EEE . . . . . 151 AVGNCSYNQLVEKIISCKQSDNSELVSEGFVAEQFLNNTATQLTYHGLCE 200 EE_____HHHHHHHHHH____________HHHHHHH______________ . . . . . 201 LTSTVQEGELCVFFRNNHFSTMTKYKGQLYLLVTDQGFLTEEKVVWESLH 250 ________EEEEEEE____________EEEEEE_________EEEE____ . . . . . 251 NVDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQDYLMALSLQQEQQSQE 300 ___________EEE______________HHHHHHHHHHHHHHHHHHHH__ . . . . . 301 INWEQIPEGISDLELAKKLQEEEDRRASQYYQEQEQAAAAAAAASTQAQG 350 ___________HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH__ . . . . 
351 QPAQASPSSGRQSGNSERKRKEPREKDKEKEKEKNSCVIL 390
    _______________________HHHHHHHHHHHH_____
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method         : 1
alpha-contents : 64.1 %
beta-contents  : 3.8 %
coil-contents  : 32.1 %
class          : alpha

method         : 2
alpha-contents : 55.7 %
beta-contents  : 0.0 %
coil-contents  : 44.3 %
class          : alpha
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-20.34 -1.94 -2.54 -1.25 -4.00 0.00 0.00 0.00 -1.26 -3.47 -1.62 0.00 -12.00 0.00 -12.00 0.00 -60.41
-5.73 -1.81 -2.53 -4.08 -4.00 0.00 0.00 0.00 -2.20 -3.23 -1.62 0.00 -12.00 0.00 -12.00 0.00 -49.18
ID: tem31.2t_gi|6330169|dbj|BAA86478.1| AC: xxx Len: 350 1:I 327 Sc: -49.18 Pv: 2.160359e-01 NO_GPI_SITE
GPI: learning from protozoa
-11.53 -3.98 -5.11 -1.33 -4.00 0.00 0.00 0.00 -1.08 -3.48 -6.52 0.00 -12.00 0.00 -12.00 0.00 -61.04
-13.16 -2.03 -0.73 -1.49 -4.00 0.00 0.00 0.00 -0.97 -2.93 -6.52 0.00 -12.00 0.00 -12.00 0.00 -55.82
ID: tem31.2t_gi|6330169|dbj|BAA86478.1| AC: xxx Len: 350 1:I 327 Sc: -55.82 Pv: 1.850654e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem31.2t_gi 0.289 351 N 0.253 63 N 0.946 47 Y 0.268 N
# SignalP gram- predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem31.2t_gi 0.583 351 Y 0.192 356 N 0.773 224 N 0.093 N
# SignalP gram+ predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ?
tem31.2t_gi 0.392 356 N 0.305 356 N 0.961 47 Y 0.134 N ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ low complexity regions: SEG 12 2.2 2.5 >tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens] 1-37 AASKERFPGQSVYHIKWIQWKEENTPIITQ NENGPCP llailnvlll 38-47 48-289 AWKVKLPPMMEIITAEQLMEYLGDYMLDAK PKEISEIQRLNYEQNMSDAMAILHKLQTGL DVNVRFTGVRVFEYTPECIVFDLLDIPLYH GWLVDPQIDDIVKAVGNCSYNQLVEKIISC KQSDNSELVSEGFVAEQFLNNTATQLTYHG LCELTSTVQEGELCVFFRNNHFSTMTKYKG QLYLLVTDQGFLTEEKVVWESLHNVDGDGN FCDSEFHLRPPSDPETVYKGQQDQIDQDYL MA lslqqeqqsqe 290-300 301-326 INWEQIPEGISDLELAKKLQEEEDRR asqyyqeqeqaaaaaaaastqaqgqpaqas 327-360 pssg 361-366 RQSGNS erkrkeprekdkekekek 367-384 385-390 NSCVIL low complexity regions: SEG 25 3.0 3.3 >tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens] 1-277 AASKERFPGQSVYHIKWIQWKEENTPIITQ NENGPCPLLAILNVLLLAWKVKLPPMMEII TAEQLMEYLGDYMLDAKPKEISEIQRLNYE QNMSDAMAILHKLQTGLDVNVRFTGVRVFE YTPECIVFDLLDIPLYHGWLVDPQIDDIVK AVGNCSYNQLVEKIISCKQSDNSELVSEGF VAEQFLNNTATQLTYHGLCELTSTVQEGEL CVFFRNNHFSTMTKYKGQLYLLVTDQGFLT EEKVVWESLHNVDGDGNFCDSEFHLRPPSD PETVYKG qqdqidqdylmalslqqeqqsqeinweqip 278-386 egisdlelakklqeeedrrasqyyqeqeqa aaaaaaastqaqgqpaqaspssgrqsgnse rkrkeprekdkekekekns 387-390 CVIL low complexity regions: SEG 45 3.4 3.75 >tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens] 1-274 AASKERFPGQSVYHIKWIQWKEENTPIITQ NENGPCPLLAILNVLLLAWKVKLPPMMEII TAEQLMEYLGDYMLDAKPKEISEIQRLNYE QNMSDAMAILHKLQTGLDVNVRFTGVRVFE YTPECIVFDLLDIPLYHGWLVDPQIDDIVK AVGNCSYNQLVEKIISCKQSDNSELVSEGF VAEQFLNNTATQLTYHGLCELTSTVQEGEL CVFFRNNHFSTMTKYKGQLYLLVTDQGFLT EEKVVWESLHNVDGDGNFCDSEFHLRPPSD PETV ykgqqdqidqdylmalslqqeqqsqeinwe 275-384 qipegisdlelakklqeeedrrasqyyqeq eqaaaaaaaastqaqgqpaqaspssgrqsg nserkrkeprekdkekekek 385-390 NSCVIL low complexity regions: XNU # Score cutoff = 21, Search from offsets 1 to 4 # both members of each repeat flagged # lambda = 0.347, K = 0.200, H = 0.664 >tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo 
sapiens] AASKERFPGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWKVKLPPMMEII TAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHKLQTGLDVNVRFTGVRVFE YTPECIVFDLLDIPLYHGWLVDPQIDDIVKAVGNCSYNQLVEKIISCKQSDNSELVSEGF VAEQFLNNTATQLTYHGLCELTSTVQEGELCVFFRNNHFSTMTKYKGQLYLLVTDQGFLT EEKVVWESLHNVDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQDYLMALSLQQEQQSQE INWEQIPEGISDLELAKKLqeeedrrasqyyqeqeqaaaaaaaastqaqgqpaqaspssg rqsgnserkrkeprekdkekekeknSCVIL 1 - 319 AASKERFPGQ SVYHIKWIQW KEENTPIITQ NENGPCPLLA ILNVLLLAWK VKLPPMMEII TAEQLMEYLG DYMLDAKPKE ISEIQRLNYE QNMSDAMAIL HKLQTGLDVN VRFTGVRVFE YTPECIVFDL LDIPLYHGWL VDPQIDDIVK AVGNCSYNQL VEKIISCKQS DNSELVSEGF VAEQFLNNTA TQLTYHGLCE LTSTVQEGEL CVFFRNNHFS TMTKYKGQLY LLVTDQGFLT EEKVVWESLH NVDGDGNFCD SEFHLRPPSD PETVYKGQQD QIDQDYLMAL SLQQEQQSQE INWEQIPEGI SDLELAKKL 320 - 385 q eeedrrasqy yqeqeqaaaa aaaastqaqg qpaqaspssg rqsgnserkr keprekdke k ekekn 386 - 390 SCVIL low complexity regions: DUST >tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens] AASKERFPGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWKVKLPPMMEII TAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHKLQTGLDVNVRFTGVRVFE YTPECIVFDLLDIPLYHGWLVDPQIDDIVKAVGNCSYNQLVEKIISCKQSDNSELVSEGF VAEQFLNNTATQLTYHGLCELTSTVQEGELCVFFRNNHFSTMTKYKGQLYLLVTDQGFLT EEKVVWESLHNVDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQDYLMALSLQQEQQSQE INWEQIPEGISDLELAKKLQEEEDRRASQYYQEQEQNNNNNNNNSTQAQGQPAQASPSSG RQSGNSERKRKEPREKDKEKEKEKNSCVIL ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ coiled coil prediction for tem31.2t_gi|6330169|dbj|BAA86478.1| sequence: 350 amino acids, 0 residue(s) in coiled coil state . | . | . | . | . | . 
60 AASKERFPGQ SVYHIKWIQW KEENTPIITQ NENGPCPLLA ILNVLLLAWK VKLPPMMEII ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 120 TAEQLMEYLG DYMLDAKPKE ISEIQRLNYE QNMSDAMAIL HKLQTGLDVN VRFTGVRVFE ~~~~~~~~~~ ~~~~~~~~~~ ~~33333333 3333333333 33333333~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- --efgabcde fgabcdefga bcdefgab-- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~77777999 9999999999 999999997~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~22222333 3333333333 333333331~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~33333333 3333332222 2221~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 180 YTPECIVFDL LDIPLYHGWL VDPQIDDIVK AVGNCSYNQL VEKIISCKQS DNSELVSEGF ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 
240 VAEQFLNNTA TQLTYHGLCE LTSTVQEGEL CVFFRNNHFS TMTKYKGQLY LLVTDQGFLT ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local . | . | . | . | . | . 300 EEKVVWESLH NVDGDGNFCD SEFHLRPPSD PETVYKGQQD QIDQDYLMAL SLQQEQQSQE ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border ---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif. ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~11 1111111111 * 14 M'95 -w local . | . | . | . | . 
| INWEQIPEGI SDLELAKKLQ EEEDRRASQY YQEQEQAAAA AAAASTQAQG ~~~~~~~~~~ ~~11111111 1111111111 1111111111 1~~~~~~~~~ ---------- --abcdefga efgabcdefg abcdefgabc d--------- ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~3 3333333333 3333333333 3333333332 111111111~ 11~~~~~~~5 5569999999 9999999721 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ prediction of transmembrane regions with toppred2 *********************************** *TOPPREDM with eukaryotic function* *********************************** tem31_2.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem31_2.___inter___ (1 sequences) AASKERFPGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWK VKLPPMMEIITAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAIL HKLQTGLDVNVRFTGVRVFEYTPECIVFDLLDIPLYHGWLVDPQIDDIVK AVGNCSYNQLVEKIISCKQSDNSELVSEGFVAEQFLNNTATQLTYHGLCE LTSTVQEGELCVFFRNNHFSTMTKYKGQLYLLVTDQGFLTEEKVVWESLH NVDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQDYLMALSLQQEQQSQE INWEQIPEGISDLELAKKLQEEEDRRASQYYQEQEQAAAAAAAASTQAQG QPAQASPSSGRQSGNSERKRKEPREKDKEKEKEKNSCVIL (p)rokaryotic or (e)ukaryotic: e Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 33 53 1.003 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 Loop length 32 337 K+R profile 5.00 + CYT-EXT prof - -0.09 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 5.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 0.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 4.0000 POS: 4.0000 -> Orientation: undecided CYT-EXT difference: 0.09 -> Orientation: N-out ---------------------------------------------------------------------- "tem31_2" 390 33 53 #t 1.00312 ************************************ *TOPPREDM with prokaryotic function* ************************************ tem31_2.___inter___ is a single sequence Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok Using sequence file: tem31_2.___inter___ (1 sequences) AASKERFPGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWK VKLPPMMEIITAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAIL HKLQTGLDVNVRFTGVRVFEYTPECIVFDLLDIPLYHGWLVDPQIDDIVK AVGNCSYNQLVEKIISCKQSDNSELVSEGFVAEQFLNNTATQLTYHGLCE LTSTVQEGELCVFFRNNHFSTMTKYKGQLYLLVTDQGFLTEEKVVWESLH NVDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQDYLMALSLQQEQQSQE INWEQIPEGISDLELAKKLQEEEDRRASQYYQEQEQAAAAAAAASTQAQG QPAQASPSSGRQSGNSERKRKEPREKDKEKEKEKNSCVIL (p)rokaryotic or (e)ukaryotic: p Charge-pair energy: 0 Length of full window (odd number!): 21 Length of core window (odd number!): 11 Number of residues to add to each end of helix: 1 Critical length: 60 Upper cutoff for candidates: 1 Lower cutoff for candidates: 0.6 Total of 1 structures are to be tested Candidate membrane-spanning segments: Helix Begin End Score Certainity 1 33 53 1.003 Certain ---------------------------------------------------------------------- Structure 1 Transmembrane segments included in this structure: Segment 1 Loop length 32 337 K+R profile 5.00 + CYT-EXT prof - -0.09 For CYT-EXT profile neg. values indicate cytoplasmic preference. 
K+R difference: 5.00 Tm probability: 1.00 -> Orientation: N-in Charge-difference over N-terminal Tm (+-15 residues): 0.00 (NEG-POS)/(NEG+POS): 0.0000 NEG: 4.0000 POS: 4.0000 -> Orientation: undecided CYT-EXT difference: 0.09 -> Orientation: N-out ---------------------------------------------------------------------- "tem31_2" 390 33 53 #t 1.00312 ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ NOW EXECUTING: /bio_software/1D/stat/saps/saps-stroh/SAPS.SSPA/saps /people/maria/tem31_2.___saps___ SAPS. Version of April 11, 1996. Date run: Tue Oct 31 16:25:36 2000 File: /people/maria/tem31_2.___saps___ ID tem31.2t_gi|6330169|dbj|BAA86478.1| DE KIAA1164 protein [Homo sapiens] number of residues: 390; molecular weight: 44.5 kdal 1 AASKERFPGQ SVYHIKWIQW KEENTPIITQ NENGPCPLLA ILNVLLLAWK VKLPPMMEII 61 TAEQLMEYLG DYMLDAKPKE ISEIQRLNYE QNMSDAMAIL HKLQTGLDVN VRFTGVRVFE 121 YTPECIVFDL LDIPLYHGWL VDPQIDDIVK AVGNCSYNQL VEKIISCKQS DNSELVSEGF 181 VAEQFLNNTA TQLTYHGLCE LTSTVQEGEL CVFFRNNHFS TMTKYKGQLY LLVTDQGFLT 241 EEKVVWESLH NVDGDGNFCD SEFHLRPPSD PETVYKGQQD QIDQDYLMAL SLQQEQQSQE 301 INWEQIPEGI SDLELAKKLQ EEEDRRASQY YQEQEQAAAA AAAASTQAQG QPAQASPSSG 361 RQSGNSERKR KEPREKDKEK EKEKNSCVIL -------------------------------------------------------------------------------- COMPOSITIONAL ANALYSIS (extremes relative to: swp23s) A : 25( 6.4%); C : 8( 2.1%); D : 21( 5.4%); E+ : 40(10.3%); F : 12( 3.1%) G : 19( 4.9%); H : 7( 1.8%); I : 21( 5.4%); K : 24( 6.2%); L : 38( 9.7%) M : 8( 2.1%); N : 19( 4.9%); P : 17( 4.4%); Q+ : 35( 9.0%); R : 12( 3.1%) S : 25( 6.4%); T : 17( 4.4%); V : 22( 5.6%); W : 6( 1.5%); Y : 14( 3.6%) KR : 36 ( 9.2%); ED : 61 ( 15.6%); AGP : 61 ( 15.6%); KRED : 97 ( 24.9%); KR-ED - : -25 ( -6.4%); FIKMNY : 98 ( 25.1%); LVIFM : 101 ( 25.9%); ST : 42 ( 10.8%). 
-------------------------------------------------------------------------------- CHARGE DISTRIBUTIONAL ANALYSIS 1 000+-+0000 00000+0000 +--0000000 0-00000000 000000000+ 0+00000-00 61 00-000-000 -000-0+0+- 00-00+000- 0000-00000 0+00000-00 0+0000+00- 121 000-0000-0 0-00000000 0-000--00+ 0000000000 0-+0000+00 -00-000-00 181 00-0000000 000000000- 000000-0-0 0000+00000 000+0+0000 0000-00000 241 --+000-000 00-0-0000- 0-000+000- 0-000+000- 00-0-00000 0000-0000- 301 000-000-00 0-0-00++00 ----++0000 00-0-00000 0000000000 0000000000 361 +00000-+++ +-0+-+-+-+ -+-+000000 A. CHARGE CLUSTERS. Positive charge clusters (cmin = 9/30 or 11/45 or 14/60): none Negative charge clusters (cmin = 12/30 or 16/45 or 20/60): none Mixed charge clusters (cmin = 16/30 or 22/45 or 28/60): 1) From 367 to 386: ERKRKEPREKDKEKEKEKNS -++++-0+-+-+-+-+-+00 quartile: 4; size: 20, +count: 10, -count: 7, 0count: 3; t-value: 11.71 * K: 7 (35.0%); E: 6 (30.0%); R: 3 (15.0%); B. HIGH SCORING (UN)CHARGED SEGMENTS. There are no high scoring positive charge segments. There are no high scoring negative charge segments. ___________________________________ High scoring mixed charge segments: score= 1.00 frequency= 0.249 ( KEDR ) score= 0.00 frequency= 0.000 ( BZX ) score= -1.00 frequency= 0.751 ( LAGSVTIPNFQYHMCW ) Expected score/letter: -0.503 M_0.01= 8.57; M_0.05= 7.10 1) From 367 to 386: length= 20, score=14.00 ** 367 ERKRKEPREK DKEKEKEKNS K: 7(35.0%); E: 6(30.0%); R: 3(15.0%); There are no high scoring uncharged segments. C. CHARGE RUNS AND PATTERNS. pattern (+)| (-)| (*)| (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)| lmin0 4 | 6 | 7 | 32 | 9 | 10 | 13 | 10 | 12 | 15 | 6 | 8 | lmin1 6 | 7 | 9 | 39 | 11 | 13 | 16 | 12 | 15 | 18 | 8 | 10 | lmin2 7 | 8 | 11 | 43 | 12 | 14 | 18 | 14 | 17 | 20 | 9 | 11 | (Significance level: 0.010000; Minimal displayed length: 6) (*) 17(1,0,0); at 367- 384: ERKRKEPREKDKEKEKEK (4. 
quartile) -++++-0+-+-+-+-+-+ Run count statistics: + runs >= 3: 1, at 368; - runs >= 4: 1, at 321; * runs >= 5: 3, at 321; 367; 374; 0 runs >= 21: 1, at 336; -------------------------------------------------------------------------------- DISTRIBUTION OF OTHER AMINO ACID TYPES 1. HIGH SCORING SEGMENTS. There are no high scoring hydrophobic segments. There are no high scoring transmembrane segments. 2. SPACINGS OF C. H2N-35-C-88-C-29-C-11-C-31-C-11-C -47-C-127-C-3-COOH 2*. SPACINGS OF C and H. (additional deluxe function for ALEX) H2N-13-H-21-C-64-H-23-C-11-H-17-C-11-C-28-H-2-C-11-C-6-H-31-H-8-C-4-H-122-C-3-COOH -------------------------------------------------------------------------------- REPETITIVE STRUCTURES. A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet. Repeat core block length: 4 Aligned matching blocks: [ 225- 228] YKGQ [ 275- 278] YKGQ B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet. (i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C) Repeat core block length: 8 -------------------------------------------------------------------------------- MULTIPLETS. A. AMINO ACID ALPHABET. 1. Total number of amino acid multiplets: 27 (Expected range: 8-- 37) 2. Histogram of spacings between consecutive amino acid multiplets: (1-5) 10 (6-10) 6 (11-20) 7 (>=21) 5 3. Clusters of amino acid multiplets (cmin = 12/30 or 15/45 or 18/60): none 4. Long amino acid multiplets (>= 5; Letter/Length/Position): A/8/337 B. CHARGE ALPHABET. 1. Total number of charge multiplets: 7 (Expected range: 1-- 21) 3 +plets (f+: 9.2%), 4 -plets (f-: 15.6%) Total number of charge altplets: 9 (Critical number: 23) 2. Histogram of spacings between consecutive charge multiplets: (1-5) 2 (6-10) 0 (11-20) 1 (>=21) 5 3. Long charge altplets (>= 8; Letters/Length/Position): +-/11/374 -------------------------------------------------------------------------------- PERIODICITY ANALYSIS. A. 
AMINO ACID ALPHABET (core: 4; !-core: 5) Location Period Element Copies Core Errors 337- 344 1 A 8 8 ! 0 338- 357 5 A.... 4 4 0 376- 385 2 K. 5 5 ! 0 B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6) and HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9) Location Period Element Copies Core Errors 63- 86 4 *0.. 6 6 /0/2/././ 321- 326 1 * 6 6 0 367- 384 1 * 17 6 1 -------------------------------------------------------------------------------- SPACING ANALYSIS. Location (Quartile) Spacing Rank P-value Interpretation 165- 282 (3.) I( 117)I 1 of 22 0.0097 large 1. maximal spacing 274- 388 (4.) V( 114)V 1 of 23 0.0089 large maximal spacing 310- 389 (4.) I( 79)I 2 of 22 0.0029 large 2. maximal spacing ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ Start with Pfam (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/Pfam Sequence file: tem31_2 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens] Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- Dala_Dala_ligas D-ala D-ala ligase -0.4 70 1 SCAN SCAN domain -43.6 26 1 MCH Cyclohydrolase (MCH) -240.9 87 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- SCAN 1/1 8 114 .. 1 96 [] -43.6 26 Dala_Dala_ligas 1/1 156 166 .. 356 366 .] -0.4 70 MCH 1/1 11 181 .. 
1 321 [] -240.9 87 Alignments of top-scoring domains: SCAN: domain 1 of 1, from 8 to 114: score -43.6, E = 26 *->pspEifRqrFRqfrYqets.......GPrEALsrLReLCrqW...LR p++ + + + q ++t+ +++++GP L+ L L W+ +L tem31.2t_g 8 PGQSVYHIKWIQWKEENTPiitqnenGPCPLLAILNVLLLAWkvkLP 54 P..EvhTKEQILELLVLEQFLtILPkElQawVqehhPeSgEEaVtLlEdL P E+ T EQ E L +L + PkE+ + + + a ++l L tem31.2t_g 55 PmmEIITAEQLMEYLG-DYMLDAKPKEISEIQRLNYEQNMSDAMAILHKL 103 erelDepgqqV<-* + lD + tem31.2t_g 104 QTGLDVNVRFT 114 Dala_Dala_ligas: domain 1 of 1, from 156 to 166: score -0.4, E = 70 *->sysdLvdqLie<-* sy +Lv+++i+ tem31.2t_g 156 SYNQLVEKIIS 166 MCH: domain 1 of 1, from 11 to 181: score -240.9, E = 87 *->SVNelAleiVErMiedaeeLrieVakLENGAtViDCGVnapGSyeAG SV + + i ++ee + + ENG + tem31.2t_g 11 SVYHI------KWIQWKEENTPIITQNENGPCPL------------- 38 rlytevCLGGLAdVGisitpfelnglklpaVkvkTdhPaiAcLGsQkAGW LA + + +p+ + T+ tem31.2t_g 39 ----------LAIL--NVLLLAWKVKLPPMMEIITAE------------- 63 svkDeVGdsGYFAmGSGPARALAlKPketYE..EIgYeDdAdvAVLvLEs ++ + Gd Y L KPke E +++ Ye + A L tem31.2t_g 64 QLMEYLGD--YM---------LDAKPKEISEiqRLNYEQNMSDAMAIL-- 100 dkLPdEkVvefvAkeCgVdPENVyllVAPTaSlvGSvQiSaRVVEtGlyK kL +g+d+ NV + RV E y tem31.2t_g 101 HKL-----------QTGLDV-NVRFT-------------GVRVFE---YT 122 mleVgeFDvnkikygaGvAPIAPvvpddlqaMGrTNDavlYgGrvylyVk + + FD+ i G + P ++d ++a G tem31.2t_g 123 PECIV-FDLLDIPLYHGWL-VDPQIDDIVKAVGNC--------------- 155 sDeeddlkelvenlPSttSeDYGKPFyeiFkeAnYDFYKIDkglFAPAev + +lve++ S+ D tem31.2t_g 156 -----SYNQLVEKIISCKQSD----------------------------- 171 vVNDLkTGKtyraGklnpevLkqSFg<-* n e++ + F+ tem31.2t_g 172 ----------------NSELVSEGFV 181 // Start with PfamFrag (from /data/patterns/pfam) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/pfam/PfamFrag Sequence file: tem31_2 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens] Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- ras Ras family 2.3 14 1 Dala_Dala_ligas D-ala D-ala ligase -0.4 70 1 Ring_hydroxyl_A Ring hydroxylating alpha subunit (cat -0.8 88 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- Ring_hydroxyl_A 1/1 65 71 .. 1 7 [. -0.8 88 Dala_Dala_ligas 1/1 156 166 .. 356 366 .] -0.4 70 ras 1/1 313 390 .] 167 198 .] 2.3 14 Alignments of top-scoring domains: Ring_hydroxyl_A: domain 1 of 1, from 65 to 71: score -0.8, E = 88 *->LddYLGd<-* L++YLGd tem31.2t_g 65 LMEYLGD 71 Dala_Dala_ligas: domain 1 of 1, from 156 to 166: score -0.4, E = 70 *->sysdLvdqLie<-* sy +Lv+++i+ tem31.2t_g 156 SYNQLVEKIIS 166 ras: domain 1 of 1, from 313 to 390: score 2.3, E = 14 *->eelareilkkvse.................................. +ela+++ + ++++ ++ +++++ +++ ++++ + ++++ tem31.2t_g 313 LELAKKLQEEEDRrasqyyqeqeqaaaaaaaastqaqgqpaqaspss 359 ............vnvnldqpakkkkskCcil<-* ++++++++++++++ +d++++k k++C il tem31.2t_g 360 grqsgnserkrkEPREKDKEKEKEKNSCVIL 390 // Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm) hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: tem31_2
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value   N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Prosite
---------------------------------------------------------
| ppsearch (c) 1994 EMBL Data Library                   |
| based on MacPattern (c) 1990-1994 R. Fuchs            |
---------------------------------------------------------
PROSITE pattern search started: Tue Oct 31 16:27:13 2000
Sequence file: tem31_2
----------------------------------------
Sequence tem31.2t_gi|6330169|dbj|BAA86478.1| (390 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
   92: NMSD
  154: NCSY
  187: NNTA
Total matches: 3

Matching pattern PS00004 CAMP_PHOSPHO_SITE:
  325: RRAS
Total matches: 1

Matching pattern PS00005 PKC_PHOSPHO_SITE:
  166: SCK
  359: SGR
  366: SER
Total matches: 3

Matching pattern PS00006 CK2_PHOSPHO_SITE:
   29: TQNE
  105: TGLD
  204: TVQE
  269: SDPE
  311: SDLE
Total matches: 5

Matching pattern PS00008 MYRISTYL:
  106: GLDVNV
Total matches: 1

Matching pattern PS00294 PRENYLATION:
  387: CVIL
Total matches: 1

Total no of hits in this sequence: 14
========================================
1314 pattern(s) searched in 1 sequence(s), 390 residues.
Total no of hits in all sequences: 14.
Search time: 00:00 min
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
argv[1]=P argv[2]=-m /data/patterns/own/motif.fa argv[4]=-seq tem31_2
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS
Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 390 units
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with HMM-search search against own library
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm.lib
Sequence file: tem31_2
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value   N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/own/own-hmm-f.lib
Sequence file: tem31_2
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens]

Scores for sequence family classification (score includes all domains):
Model    Description Score E-value   N
-------- ----------- ----- ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
L. Aravind's signalling DB
IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
"PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens]
(390 letters)

Searching..................................done

Results from profile search

                                                                   Score    E
Sequences producing significant alignments:                        (bits) Value

KIN Protein kinase domain                                             23  0.70
HISDAC Histone deacetylase domain                                     23  1.1
SNARE Alpha helical domains which are involved in vesicle fu...       22  1.5
RRM RNA recognition motif domain                                      21  3.3
MIZFIN MIZ type Cysteine zinc DNA binding domain                      21  3.6
DHHC Novel zinc finger domain with DHHC signature                     20  3.6
DNASE1 DNASE-1/Sphingomyelinase like domain                           20  5.3
KELCH Kelch repeat- beta propeller like domain                        20  6.4
14-3-3 14-3-3 protein alpha Helical domain                            20  6.7
AP2 A plant specific DNA binding domain (Apetala 2 like)              20  7.5
CATH Cathepsin like protease domain                                   19  8.2
CBS cystathionine beta -synthase domain (A predicted ligand...
19 9.3 MBL Metallo-betalactamase domain 19 9.6 >KIN Protein kinase domain Length = 313 Score = 23.2 bits (49), Expect = 0.70 Identities = 5/40 (12%), Positives = 14/40 (34%), Gaps = 2/40 (5%) Query: 65 LMEYLGDYMLD--AKPKEISEIQRLNYEQNMSDAMAILHK 102 + E + + P+ +S+ + M + +H Sbjct: 111 VSELMDTDLHQIITSPQPLSDDHCQYFVYQMLRGLKHIHS 150 >HISDAC Histone deacetylase domain Length = 433 Score = 22.6 bits (48), Expect = 1.1 Identities = 8/44 (18%), Positives = 12/44 (27%), Gaps = 3/44 (6%) Query: 343 AASTQAQGQPAQASPSSG---RQSGNSERKRKEPREKDKEKEKE 383 A S Q P A + + K +D E + Sbjct: 386 APSVQLNHTPRDAEDLGDVEEDSAEAKDTKGGSQYARDLHVEHD 429 >SNARE Alpha helical domains which are involved in vesicle fusion Length = 254 Score = 21.8 bits (46), Expect = 1.5 Identities = 17/142 (11%), Positives = 39/142 (26%), Gaps = 36/142 (25%) Query: 204 TVQEGE-LCVFFRNNHFSTMTKYKGQLYLLVTDQGFLTE------EKVVWESLHNVDGDG 256 T++ G F + S Y+++ + + +++ E + + Sbjct: 52 TLKTGRHNINFISSLGVS---------YMMLCTENYPNVLAFSFLDELQKEFITTYNMM- 101 Query: 257 NFCDSEFHLRPPSDPETVYKGQQDQIDQDYLMALSLQQEQQSQEINWEQIPEGISDLELA 316 + +RP E D Q + S +IN + +++L Sbjct: 102 ---KTNTAVRPYCFIEF------DNFIQRTKQRYN-NPRSLSTKINLSDMQM---EIKLR 148 Query: 317 KKLQEEEDR------RASQYYQ 332 Q S + Sbjct: 149 PPYQIPMCELGSANGVTSAFSV 170 >RRM RNA recognition motif domain Length = 110 Score = 20.7 bits (43), Expect = 3.3 Identities = 7/41 (17%), Positives = 16/41 (38%) Query: 331 YQEQEQAAAAAAAASTQAQGQPAQASPSSGRQSGNSERKRK 371 + ++E + A + +G+ + P + G S R Sbjct: 69 FSDKESVRTSLALDESLFRGRQIKVIPKRTNRPGISTTDRG 109 >MIZFIN MIZ type Cysteine zinc DNA binding domain Length = 172 Score = 20.7 bits (43), Expect = 3.6 Identities = 14/76 (18%), Positives = 22/76 (28%), Gaps = 7/76 (9%) Query: 252 VDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQDYLMALSLQQEQQSQEINWEQIPEGIS 311 DG++C E + Q + S ++ + I Sbjct: 83 FQEDGSWCPMR------PKKEAMKVTSQPCTKVESSSVFSKPCSVTVASDASKKKIDVI- 135 Query: 312 DLELAKKLQEEEDRRA 327 DL + EEED A Sbjct: 136 DLTIESSSDEEEDPPA 151 >DHHC Novel zinc finger domain with DHHC signature 
Length = 217 Score = 20.5 bits (42), Expect = 3.6 Identities = 3/25 (12%), Positives = 6/25 (24%) Query: 244 VVWESLHNVDGDGNFCDSEFHLRPP 268 + S H + C+ Sbjct: 121 IFNRSQHAHVIEDLHCNLCNVDVSA 145 >DNASE1 DNASE-1/Sphingomyelinase like domain Length = 388 Score = 20.1 bits (41), Expect = 5.3 Identities = 18/123 (14%), Positives = 34/123 (27%), Gaps = 16/123 (13%) Query: 175 LVSEGFVAEQFLNNTATQLTYHGL--CELTSTVQE-------GELCVFFRNNHFSTMTKY 225 + E ++L Y + S + +FF + F + Sbjct: 149 CLQEVDHYFDTFQPILSRLGYQCTFLAKPWSPCLDVEHNNGPDGCALFFLQDRFQLVNSA 208 Query: 226 KGQLYLLVTDQGFLTEEKV-VWESLHNVDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQ 284 K + L +V + E+L + C + HL ++ Q Sbjct: 209 K-----IRLSARTLKTNQVAIAETLQCCETGRQLCFAVTHL-KARTGWERFRLAQGSDLL 262 Query: 285 DYL 287 D L Sbjct: 263 DNL 265 >KELCH Kelch repeat- beta propeller like domain Length = 319 Score = 19.9 bits (41), Expect = 6.4 Identities = 5/33 (15%), Positives = 11/33 (33%) Query: 194 TYHGLCELTSTVQEGELCVFFRNNHFSTMTKYK 226 + V ++ + + FS + YK Sbjct: 208 CPQPWRYTAAAVLGNQIFIMGGDTEFSACSAYK 240 >14-3-3 14-3-3 protein alpha Helical domain Length = 270 Score = 19.5 bits (40), Expect = 6.7 Identities = 9/39 (23%), Positives = 12/39 (30%), Gaps = 4/39 (10%) Query: 311 SDLELAKKLQEEEDRRASQYYQEQEQAAAAAAAASTQAQ 349 SD E + A QE + A A +A Sbjct: 235 SDAEYSAAAAGGNTEGA----QENAPSNAPEGEAEPKAD 269 >AP2 A plant specific DNA binding domain (Apetala 2 like) Length = 218 Score = 19.7 bits (40), Expect = 7.5 Identities = 8/32 (25%), Positives = 15/32 (46%) Query: 355 ASPSSGRQSGNSERKRKEPREKDKEKEKEKNS 386 AS S+ ++ R+ E +K K+ K+ Sbjct: 3 ASESTKSWEASAVRQENEEEKKKPVKDSGKHP 34 >CATH Cathepsin like protease domain Length = 371 Score = 19.4 bits (40), Expect = 8.2 Identities = 6/33 (18%), Positives = 14/33 (42%), Gaps = 1/33 (3%) Query: 145 IDDIVKAVGNCSYNQLVEKIISCKQSDNSELVS 177 I + KA+G + ++ C + + V+ Sbjct: 266 IKQLQKAIG-AKPIIKGQYMLPCDKLSSLPNVN 297 >CBS cystathionine beta -synthase domain (A predicted ligand binding domain) Length = 214 Score = 19.2 bits (39), Expect = 9.3 
 Identities = 5/48 (10%), Positives = 13/48 (26%), Gaps = 5/48 (10%)

Query: 285 DYLMALSLQQEQQSQEIN--WEQ---IPEGISDLELAKKLQEEEDRRA 327
           + L ++ E E+ + E + K + +
Sbjct: 52  ELLGISEKDFKKPITEFMRPVEEVITVYEDDEARNVVLKFVKYKVVSI 99

>MBL Metallo-betalactamase domain
          Length = 256

 Score = 19.3 bits (39), Expect = 9.6
 Identities = 7/38 (18%), Positives = 12/38 (31%), Gaps = 5/38 (13%)

Query: 97 MAI-LHKLQTGLDVNVRFTGVR----VFEYTPECIVFD 129
          MA L L +G V + + + + D
Sbjct: 1  MAANLTFLGSGSAFTVGADNFQSNAILTLDNGKKFLID 38

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 13
Number of calls to ALIGN: 13
Length of query: 390
Total length of test sequences: 20182
Effective length of test sequences: 16335.0
Effective search space size: 5778014.4
Initial X dropoff for ALIGN: 25.0 bits

Y. Wolf's SCOP PSSM
IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting,
Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999),
"IMPALA: Matching a Protein Sequence Against a Collection of
PSI-BLAST-Constructed Position-Specific Score Matrices",
Bioinformatics 15:1000-1011.

Query= tem31.2t_gi|6330169|dbj|BAA86478.1| KIAA1164 protein [Homo sapiens]
         (390 letters)

Searching.................................................done

Results from profile search

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

gi|1212831 [50..289] Protein kinases (PK), catalytic core         28  0.25
gi|1794259 [37..284] Ribosome inactivating proteins (RIP)         27  0.49
gi|451954 [433..740] R1 subunit of ribonucleotide reductase,...  27  0.59
gi|121832 [27..328] Glycosyltransferases of the superhelical...  27  0.75
gi|2132510 [25..215] Thiamin-binding                              26  0.81
gi|1708319 [66..432] Protein kinases (PK), catalytic core         25  1.7
gi|2291201 [100..343] Protein kinases (PK), catalytic core        25  1.9
gi|1170645 [26..287] Protein kinases (PK), catalytic core         25  1.9
gi|3836 [670..990] Protein kinases (PK), catalytic core           25  2.6
gi|1402841 [24..384] Heme-dependent peroxidases                   25  2.9
gi|547785 [449..624] Protein kinases (PK), catalytic core         24  3.7
gi|1546735 [56..315] (Phosphotyrosine) protein phosphatases ...  24  3.9
gi|438825 [248..397] Protein kinases (PK), catalytic core         24  4.2
gi|1132541 [56..176] Protein kinases (PK), catalytic core         24  4.7
gi|345362 [61..347] Protein kinases (PK), catalytic core          24  4.9
gi|1181440 [11..279] Protein kinases (PK), catalytic core         24  5.4
gi|1730038 [2..222] Protein kinases (PK), catalytic core          24  6.1
gi|2622591 [75..425] Tryptophan synthase, beta-subunit            23  6.3
gi|125641 [307..597] Protein kinases (PK), catalytic core         23  7.0
gi|555930 [44..303] Protein kinases (PK), catalytic core          23  7.2
gi|1673469 [117..360] Protein kinases (PK), catalytic core        23  7.2
gi|1345687 [59..410] Heme-dependent peroxidases                   23  7.9
gi|630459 [12..279] Protein kinases (PK), catalytic core          23  8.4
gi|1235958 [6..172] Protein kinases (PK), catalytic core          23  9.1
gi|118788 [27..221] Thymidylate synthase                          23  9.5
gi|1065289 [1..348] Isoprenyl diphosphate synthases               23  9.8

>gi|1212831 [50..289] Protein kinases (PK), catalytic core
          Length = 240

 Score = 28.2 bits (61), Expect = 0.25
 Identities = 14/105 (13%), Positives = 28/105 (26%), Gaps = 6/105 (5%)

Query: 4  KERFPGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWKVKL------PPMM 57
          + + IK + + E + L+ + KL +
Sbjct: 20 VDVDDRLNQLAIKAMDLSTVTRENSYKLELMVLQRVETLSEVEQTRFSKLIGNFIQDSSL 79

Query: 58 EIITAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHK 102
          + E + + + K S L M+ + LHK
Sbjct: 80 GFFVMHKEGECVDEVWMRNKSGRFSASNVLKIVHCMAHGLRSLHK 124

>gi|1794259 [37..284] Ribosome inactivating proteins (RIP)
          Length = 248

 Score = 27.0 bits (59), Expect = 0.49
 Identities = 35/155 (22%), Positives = 61/155 (38%), Gaps = 24/155 (15%)

Query: 228 QLYLLVTDQ---------GFLTEEKVVWESLHNVD--GDGNFCDSEFHL-RPPSDPETVY 275
           + Y+ + Q G ++ +W + N G G +S F + PP + ++
Sbjct: 45  REYVYIRLQFSDTQWVVLGMAAKDMYIWGYVDNRPGFGPGQPPESNFLMDSPPEARQRLF 104

Query: 276 KGQQDQIDQDYLMALSLQQEQQSQEINWEQIPEGISDLELAKK-------LQEEEDRRAS 328
           G +I SLQQ Q N + +P G++ L+ A K Q E +
Sbjct: 105 PGSNRRITDYGGNYNSLQQRAQR---NRDNVPLGLTSLDGALKSVYGKSTSQLNEGNAEA 161

Query: 329 QYYQEQEQAAAAAAAASTQAQGQPAQASPSSGRQS 363
           +++ Q A AA +G A P++ RQ+
Sbjct: 162 RFFLTAIQMVAEAARFKYMERG--ISAPPANFRQN 194

>gi|451954 [433..740] R1 subunit of ribonucleotide reductase, C-terminal domain
          Length = 308

 Score = 26.8 bits (59), Expect = 0.59
 Identities = 12/66 (18%), Positives = 21/66 (31%), Gaps = 3/66 (4%)

Query: 211 CVFFRNNHFSTMTKYKGQLYLLVTDQGFLTEEKVVWESLHNVDGDGNFCDSEFHLRPPSD 270
           + + +S+ K LY+ V + L + L V D+ D
Sbjct: 225 GLNDVTSVYSSELK---SLYIPVYNNLLLNRFNKHQQYLKTVGYRVLNVDTNLFTDKELD 281

Query: 271 PETVYK 276
           V+K
Sbjct: 282 DLAVFK 287

>gi|121832 [27..328] Glycosyltransferases of the superhelical fold
          Length = 302

 Score = 26.5 bits (58), Expect = 0.75
 Identities = 12/70 (17%), Positives = 21/70 (29%), Gaps = 4/70 (5%)

Query: 128 FDLLDIPLYHGW--LVDPQIDDIVKAVGNCSYNQLVEKIISCKQSDNSELVSEGFVAEQF 185
           +D + IPLY W + N + + + + G +A +
Sbjct: 213 YDAIRIPLYLYWYDAKTTALVPFQLYWRNYPRLTTPAWVDVLSSNTATYNMQGGLLAVRD 272

Query: 186 LNNTATQLTY 195
           L T L
Sbjct: 273 L--TMGNLDG 280

>gi|2132510 [25..215] Thiamin-binding
          Length = 191

 Score = 26.4 bits (58), Expect = 0.81
 Identities = 7/38 (18%), Positives = 11/38 (28%)

Query: 323 EDRRASQYYQEQEQAAAAAAAASTQAQGQPAQASPSSG 360
           E E AA AA ++ + + G
Sbjct: 40  ESAGLRWVGTCNELNAAYAADGYSRYSNKIGCLITTYG 77

>gi|1708319 [66..432] Protein kinases (PK), catalytic core
          Length = 367

 Score = 25.2 bits (53), Expect = 1.7
 Identities = 5/38 (13%), Positives = 9/38 (23%)

Query: 65  LMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHK 102
           ++ Y S Y + + AM
Sbjct: 107 VLPYYEHTDFRQYYSTFSYRDMSIYFRCLFQAMQQTQT 144

>gi|2291201 [100..343] Protein kinases (PK), catalytic core
          Length = 244

 Score = 25.1 bits (53), Expect = 1.9
 Identities = 9/56 (16%), Positives = 19/56 (33%), Gaps = 3/56 (5%)

Query: 49 WKVKLPPMMEIITAEQLMEYL---GDYMLDAKPKEISEIQRLNYEQNMSDAMAILH 101
          + +E + L +Y D++ I L+ +S A+ +H
Sbjct: 71 TVFRPMIALEWLPGGTLADYFQFKVREKDDSERSPIQLKDMLSILYQVSQALKYIH 126

>gi|1170645 [26..287] Protein kinases (PK), catalytic core
          Length = 262

 Score = 25.1 bits (53), Expect = 1.9
 Identities = 5/40 (12%), Positives = 14/40 (34%)

Query: 63 EQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHK 102
          + + + L++ + M + +A LH+
Sbjct: 85 PSIPIVADPPVQKYTNQLDVNRYSLSFFRQMVEGIAFLHE 124

>gi|3836 [670..990] Protein kinases (PK), catalytic core
          Length = 321

 Score = 24.8 bits (52), Expect = 2.6
 Identities = 7/46 (15%), Positives = 16/46 (34%)

Query: 56 MMEIITAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILH 101
          +E+ + D K E ++ + ++ +A LH
Sbjct: 75 ALELCNLNLQDLVESKNVSDENLKLQKEYNPISLLRQIASGVAHLH 120

>gi|1402841 [24..384] Heme-dependent peroxidases
          Length = 361

 Score = 24.5 bits (53), Expect = 2.9
 Identities = 11/77 (14%), Positives = 20/77 (25%), Gaps = 9/77 (11%)

Query: 228 QLYLLVTDQGFLTEEKVVWES-LHNVDGDGNFCDSEFHLRPPSDPETVYKGQQDQIDQDY 286
           + D G + +VV H+V + D P + D
Sbjct: 150 TILARFADAGNFSPFEVVSLLASHSV-ARADKVDPTLDAAP-------FDTTPFTFDTQI 201

Query: 287 LMALSLQQEQQSQEINW 303
           + + L+ N
Sbjct: 202 FLEVLLKGVGFPGLDNN 218

>gi|547785 [449..624] Protein kinases (PK), catalytic core
          Length = 176

 Score = 24.3 bits (51), Expect = 3.7
 Identities = 8/78 (10%), Positives = 17/78 (21%)

Query: 55  PMMEIITAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHKLQTGLDVNVRFT 114
           + T L+ ++ P + + A+ +H
Sbjct: 78  ERINDGTLYDLVISWDEFKRSKIPFAERCRLTIFLSLQLLSALKYMHSKTIVHGDIKLEN 137

Query: 115 GVRVFEYTPECIVFDLLD 132
           + E L D
Sbjct: 138 CLLQKEGKKSDWKVFLCD 155

>gi|1546735 [56..315] (Phosphotyrosine) protein phosphatases II
          Length = 260

 Score = 24.2 bits (52), Expect = 3.9
 Identities = 7/22 (31%), Positives = 9/22 (40%)

Query: 10  QSVYHIKWIQWKEENTPIITQN 31
           V+H K+ W E N P
Sbjct: 136 HQVHHYKFHGWTEFNLPKYEDF 157

>gi|438825 [248..397] Protein kinases (PK), catalytic core
          Length = 150

 Score = 24.1 bits (51), Expect = 4.2
 Identities = 4/37 (10%), Positives = 9/37 (23%)

Query: 65 LMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILH 101
          + + D + + Q + LH
Sbjct: 32 RLPLYEMTLRDFIADQRNHKQLALIATQTVQGIKELH 68

>gi|1132541 [56..176] Protein kinases (PK), catalytic core
          Length = 121

 Score = 23.9 bits (51), Expect = 4.7
 Identities = 4/41 (9%), Positives = 11/41 (26%)

Query: 63 EQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHKL 103
          E+ + E + +A+ +H+
Sbjct: 36 EKCKPLELYLKEAGLTESQKEFVVSWGMFQLLNALKFMHEA 76

>gi|345362 [61..347] Protein kinases (PK), catalytic core
          Length = 287

 Score = 23.6 bits (49), Expect = 4.9
 Identities = 7/81 (8%), Positives = 17/81 (20%)

Query: 21 KEENTPIITQNENGPCPLLAILNVLLLAWKVKLPPMMEIITAEQLMEYLGDYMLDAKPKE 80
          E + N +A + +
Sbjct: 37 GMEIEILKKLKGASNIVQYFGSNHTKMAPGSVTSETISFAMEYASSSLEAEMSSPKNHSG 96

Query: 81 ISEIQRLNYEQNMSDAMAILH 101
          +S ++ + S A++ L
Sbjct: 97 LSSNALIDLVVDCSMALSALR 117

>gi|1181440 [11..279] Protein kinases (PK), catalytic core
          Length = 269

 Score = 23.6 bits (49), Expect = 5.4
 Identities = 10/95 (10%), Positives = 25/95 (25%)

Query: 8  PGQSVYHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWKVKLPPMMEIITAEQLME 67
          G+ + + E+T +L + + + + + E L
Sbjct: 30 TGEKIAVKTIKTKRYESTLHARHIILREYEMLQMFDCENIIKPLGINIENEASMLLPLYT 89

Query: 68 YLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHK 102
          + E L ++ A+ +H
Sbjct: 90 NDILMHTIQSKYGMPEEDVLKVGFQLTRAVKCIHD 124

>gi|1730038 [2..222] Protein kinases (PK), catalytic core
          Length = 221

 Score = 23.6 bits (49), Expect = 6.1
 Identities = 10/92 (10%), Positives = 28/92 (29%), Gaps = 4/92 (4%)

Query: 13 YHIKWIQWKEENTPIITQNENGPCPLLAILNVLLLAWKVKLPPMMEIITAEQLMEYLGDY 72
          + E + I + N + ++L+ + ++E L ++L
Sbjct: 36 KTSEANMKNEYDVMKILSSCNPHPNICSMLDFYTDDSYYIM--VLEYCECGDLYDFLDIA 93

Query: 73 MLDA--KPKEISEIQRLNYEQNMSDAMAILHK 102
          + +I + + A++ H
Sbjct: 94 KSQGSPSSPSLIQIDMQKIIKQLCSAISFAHS 125

>gi|2622591 [75..425] Tryptophan synthase, beta-subunit
          Length = 351

 Score = 23.3 bits (50), Expect = 6.3
 Identities = 7/64 (10%), Positives = 14/64 (20%), Gaps = 12/64 (18%)

Query: 98  AILHKLQTGLDVNVRFTGVRVF--------EYTPECIVFDLLDIPL---YHGWLVDPQID 146
           L + D G+ ++ P + L LV +
Sbjct: 222 PTLTAGEYRYDFGD-TAGMTPLLKMYTLGHDFVPPSVHAGGLRYHGMSPQVALLVREGVI 280

Query: 147 DIVK 150
           +
Sbjct: 281 NARA 284

>gi|125641 [307..597] Protein kinases (PK), catalytic core
          Length = 291

 Score = 23.2 bits (48), Expect = 7.0
 Identities = 5/47 (10%), Positives = 14/47 (29%)

Query: 56 MMEIITAEQLMEYLGDYMLDAKPKEISEIQRLNYEQNMSDAMAILHK 102
          + + + DA + +E++ + LH+
Sbjct: 75 LGLMDLLSFTQQSFWQPGKDACDPLVKRYLARRFEKHTLLGLEHLHE 121

>gi|555930 [44..303] Protein kinases (PK), catalytic core
          Length = 260

 Score = 23.1 bits (48), Expect = 7.2
 Identities = 5/40 (12%), Positives = 14/40 (34%), Gaps = 2/40 (5%)

Query: 65 LMEYLGDYMLDA--KPKEISEIQRLNYEQNMSDAMAILHK 102
          +M Y + + + + E ++ A+ +H
Sbjct: 74 VMPYYKNDLFSFIQESDLLREDYIKVILHDLGLAIRHMHN 113

>gi|1673469 [117..360] Protein kinases (PK), catalytic core
          Length = 244

 Score = 23.4 bits (49), Expect = 7.2
 Identities = 7/41 (17%), Positives = 16/41 (38%), Gaps = 5/41 (12%)

Query: 65 LMEYLG----DYMLDAKPKEISEIQRLNYEQNMSDAMAILH 101
          ++ LG +Y K + + + + M D + +H
Sbjct: 8  QLDLLGLNLLEYAELYGGK-LEVPEAFHLARQMIDLLHTIH 47

>gi|1345687 [59..410] Heme-dependent peroxidases
          Length = 352

 Score = 23.2 bits (49), Expect = 7.9
 Identities = 11/41 (26%), Positives = 23/41 (55%)

Query: 304 EQIPEGISDLELAKKLQEEEDRRASQYYQEQEQAAAAAAAA 344
           +++P + +LA + E ++ A +++Q E+ A A A A
Sbjct: 308 KKVPTMMMTTDLALRFDPEYEKIARRFHQNPEEFAEAFARA 348

>gi|630459 [12..279] Protein kinases (PK), catalytic core
          Length = 268

 Score = 23.2 bits (48), Expect = 8.4
 Identities = 7/82 (8%), Positives = 26/82 (31%)

Query: 21  KEENTPIITQNENGPCPLLAILNVLLLAWKVKLPPMMEIITAEQLMEYLGDYMLDAKPKE 80
           + ++ I + E ++ V + ++T + +
Sbjct: 45  NDNSSEIEFRKEVEMLEKFRCNYIIHFYGAVIIQDNKCMVTEYAKYGSVQKMIESKPSNS 104

Query: 81  ISEIQRLNYEQNMSDAMAILHK 102
           +S+ ++ +++ + LH
Sbjct: 105 LSKSIKIKMLLDIARGIEYLHN 126

>gi|1235958 [6..172] Protein kinases (PK), catalytic core
          Length = 167

 Score = 22.9 bits (48), Expect = 9.1
 Identities = 9/82 (10%), Positives = 26/82 (30%), Gaps = 15/82 (18%)

Query: 65  LMEYLGDYMLD----AKPKEISEIQRLNYEQNMSDAMAILHKLQTGLDVNVRFTGVRVFE 120
           +M+ + + + +++S + + + +H+ GV +
Sbjct: 40  VMDKYLENLEQFRKRKEDEKLSPRVVIKLAFRLVSILEHIHRK-----------GVVHQD 88

Query: 121 YTPECIVFDLLDIPLYHGWLVD 142
           + +VF L+D
Sbjct: 89  IKLDNVVFGAKVGNKLDIVLID 110

>gi|118788 [27..221] Thymidylate synthase
          Length = 195

 Score = 22.9 bits (49), Expect = 9.5
 Identities = 4/21 (19%), Positives = 6/21 (28%)

Query: 237 GFLTEEKVVWESLHNVDGDGN 257
           G L +E + E
Sbjct: 24  GALNDEYIQRELEWYKSKSLF 44

>gi|1065289 [1..348] Isoprenyl diphosphate synthases
          Length = 348

 Score = 23.0 bits (49), Expect = 9.8
 Identities = 15/91 (16%), Positives = 25/91 (26%), Gaps = 4/91 (4%)

Query: 269 SDPETVYKGQQDQIDQD----YLMALSLQQEQQSQEINWEQIPEGISDLELAKKLQEEED 324
           DP K D D + L +Q Q + + + K+L E
Sbjct: 245 GDPALTGKVGTDIQDNKCSWLVVQCLQRVTPEQRQLLEDNYGRKEPEKVAKVKELYEAVG 304

Query: 325 RRASQYYQEQEQAAAAAAAASTQAQGQPAQA 355
           RA+ E+ + P +
Sbjct: 305 MRAAFQQYEESSYRRLQELIEKHSNRLPKEI 335

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 26
Number of calls to ALIGN: 26
Length of query: 390
Total length of test sequences: 256703
Effective length of test sequences: 206078.0
Effective search space size: 71521145.6
Initial X dropoff for ALIGN: 25.0 bits
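
Each hit record in the report follows the same pattern: a `>` line with the profile name, then `Length = N`, `Score = S bits (R)` and `Expect = E`. A minimal Python sketch for pulling those fields into records (for example, to filter hits by E-value) might look as follows. The names `Hit`, `HIT_RE` and `parse_hits` are hypothetical helpers introduced here for illustration; they are not part of IMPALA or any BLAST toolkit, and the regular expression assumes only the field layout visible in this report.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    """One hit header from the report (hypothetical record type)."""
    name: str      # text after '>' up to 'Length'
    length: int    # profile length
    bits: float    # bit score
    raw: int       # raw score in parentheses
    expect: float  # E-value


# Matches '>NAME Length = N Score = S bits (R), Expect = E'.
# \s+ tolerates both the flattened one-line form and the multi-line form.
HIT_RE = re.compile(
    r">(?P<name>[^>]+?)\s+Length\s*=\s*(?P<length>\d+)\s+"
    r"Score\s*=\s*(?P<bits>[\d.]+)\s+bits\s+\((?P<raw>\d+)\),\s+"
    r"Expect\s*=\s*(?P<expect>[\d.eE+-]+)"
)


def parse_hits(report: str) -> list[Hit]:
    """Return every hit header found in the report text, in order."""
    hits = []
    for m in HIT_RE.finditer(report):
        expect = m.group("expect")
        # BLAST-style output may abbreviate '1e-100' as 'e-100'.
        if expect.startswith(("e", "E")):
            expect = "1" + expect
        hits.append(
            Hit(
                name=" ".join(m.group("name").split()),
                length=int(m.group("length")),
                bits=float(m.group("bits")),
                raw=int(m.group("raw")),
                expect=float(expect),
            )
        )
    return hits
```

With the report text in a string `report`, something like `[h for h in parse_hits(report) if h.expect < 1.0]` would keep only the hits below an E-value of 1.0; none of the hits above would normally be treated as significant at the usual stricter thresholds.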