analysis of sequence from PIGL_RAT.fa
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
>PIGL_RAT
MEVVGLLCVA VAVLTWGFLR VWNSAERMRS PEQAGLPGAG SRALVVIAHP DDEAMFFAPT
ILGLARLKQQ VSLLCFSSGN YYNQGEIRKK ELLQSCAVLG IPPSRVMIID KREFPDDPEV
QWDTEHVAST ILQHIHANAT DLVVTFDAEG VSGHSNHIAL YKAVRALHSG GKLPEGCSVL
TLQSVNVLRK YVFLLDLPWT LLSPQGVLFV LTSKEVAQAK KAMSCHRSQL LWFRHLYTVF
SRYMSVNSLQ LL
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
sec.str. with predator
> PIGL_RAT
             .         .         .         .         .
  1 MEVVGLLCVAVAVLTWGFLRVWNSAERMRSPEQAGLPGAGSRALVVIAHP  50
    ___HHHHHHHEEE______EEE____________________EEEEEE__
             .         .         .         .         .
 51 DDEAMFFAPTILGLARLKQQVSLLCFSSGNYYNQGEIRKKELLQSCAVLG  100
    ___________HHHHHHHHHHHEEEE_________HHHHHHHHHHHHHH_
             .         .         .         .         .
101 IPPSRVMIIDKREFPDDPEVQWDTEHVASTILQHIHANATDLVVTFDAEG  150
    ____EEEEEE_____________HHHHHHHHHHHHH____EEEEEE____
             .         .         .         .         .
151 VSGHSNHIALYKAVRALHSGGKLPEGCSVLTLQSVNVLRKYVFLLDLPWT  200
    ______HHHHHHHHHHHH__________EEEEEEEHHHHHHHH_____EE
             .         .         .         .         .
201 LLSPQGVLFVLTSKEVAQAKKAMSCHRSQLLWFRHLYTVFSRYMSVNSLQ  250
    E_____EEEEEHHHHHHHHHHHHHHHHHHHHHHHHHHHHH__________
251 LL  252
    __
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
method         : 1
alpha-contents : 40.4 %
beta-contents  : 41.7 %
coil-contents  : 18.0 %
class          : mixed

method         : 2
alpha-contents : 28.7 %
beta-contents  : 43.2 %
coil-contents  : 28.0 %
class          : mixed
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
GPI: learning from metazoa
-16.65 -4.70 -2.20 -3.78 -4.00 0.00 0.00 0.00 -0.03 -2.40 0.00 -12.00 -12.00 0.00 0.00 0.00 -57.77
-16.07 -2.98 -2.95 -1.24 -4.00 0.00 0.00 -0.87 -2.75 -2.26 0.00 -12.00 -12.00 0.00 0.00 0.00 -57.12
ID: PIGL_RAT AC: xxx Len: 252 1:I 225 Sc: -57.12 Pv: 3.225255e-01 NO_GPI_SITE
GPI: learning from protozoa
-29.35 -4.14 -1.68 -0.58 -4.00 0.00 0.00 0.00 -4.38 -1.87 -0.32 -12.00 -12.00 0.00 0.00 0.00 -70.32
-20.85 -5.78 -2.89 -4.49 -4.00 0.00 0.00 -0.79 -3.51 -2.14 -0.32 -12.00 -12.00 0.00 0.00 0.00 -68.78
ID: PIGL_RAT AC: xxx Len: 252 1:I 228 Sc: -68.78 Pv: 3.290390e-01 NO_GPI_SITE
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
# SignalP euk predictions
# name      Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
PIGL_RAT    0.731 220 Y  0.426  25 Y  0.988   9 Y  0.863 Y
# SignalP gram- predictions
# name      Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
PIGL_RAT    0.464 140 N  0.224  84 N  0.856   9 N  0.250 N
# SignalP gram+ predictions
# name      Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?
PIGL_RAT    0.708 220 Y  0.416 220 Y  0.975  64 Y  0.305 N
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
low complexity regions: SEG 12 2.2 2.5
>PIGL_RAT
                                  1-2     ME
                    vvgllcvavavl  3-14
                                  15-252  TWGFLRVWNSAERMRSPEQAGLPGAGSRAL
                                          VVIAHPDDEAMFFAPTILGLARLKQQVSLL
                                          CFSSGNYYNQGEIRKKELLQSCAVLGIPPS
                                          RVMIIDKREFPDDPEVQWDTEHVASTILQH
                                          IHANATDLVVTFDAEGVSGHSNHIALYKAV
                                          RALHSGGKLPEGCSVLTLQSVNVLRKYVFL
                                          LDLPWTLLSPQGVLFVLTSKEVAQAKKAMS
                                          CHRSQLLWFRHLYTVFSRYMSVNSLQLL
low complexity regions: SEG 25 3.0 3.3
>PIGL_RAT
                                  1-252   MEVVGLLCVAVAVLTWGFLRVWNSAERMRS
                                          PEQAGLPGAGSRALVVIAHPDDEAMFFAPT
                                          ILGLARLKQQVSLLCFSSGNYYNQGEIRKK
                                          ELLQSCAVLGIPPSRVMIIDKREFPDDPEV
                                          QWDTEHVASTILQHIHANATDLVVTFDAEG
                                          VSGHSNHIALYKAVRALHSGGKLPEGCSVL
                                          TLQSVNVLRKYVFLLDLPWTLLSPQGVLFV
                                          LTSKEVAQAKKAMSCHRSQLLWFRHLYTVF
                                          SRYMSVNSLQLL
low complexity regions: SEG 45 3.4 3.75
>PIGL_RAT
                                  1-252   MEVVGLLCVAVAVLTWGFLRVWNSAERMRS
                                          PEQAGLPGAGSRALVVIAHPDDEAMFFAPT
                                          ILGLARLKQQVSLLCFSSGNYYNQGEIRKK
                                          ELLQSCAVLGIPPSRVMIIDKREFPDDPEV
                                          QWDTEHVASTILQHIHANATDLVVTFDAEG
                                          VSGHSNHIALYKAVRALHSGGKLPEGCSVL
                                          TLQSVNVLRKYVFLLDLPWTLLSPQGVLFV
                                          LTSKEVAQAKKAMSCHRSQLLWFRHLYTVF
                                          SRYMSVNSLQLL
low complexity regions: XNU
# Score cutoff = 21, Search from offsets 1 to 4
# both members of each repeat flagged
# lambda = 0.347, K = 0.200, H = 0.664
>PIGL_RAT
MEVVGLLCVAVAVLTWGFLRVWNSAERMRSPEQAGLPGAGSRALVVIAHPDDEAMFFAPT
ILGLARLKQQVSLLCFSSGNYYNQGEIRKKELLQSCAVLGIPPSRVMIIDKREFPDDPEV
QWDTEHVASTILQHIHANATDLVVTFDAEGVSGHSNHIALYKAVRALHSGGKLPEGCSVL
TLQSVNVLRKYVFLLDLPWTLLSPQGVLFVLTSKEVAQAKKAMSCHRSQLLWFRHLYTVF
SRYMSVNSLQLL
1 - 252
MEVVGLLCVA VAVLTWGFLR VWNSAERMRS PEQAGLPGAG SRALVVIAHP DDEAMFFAPT
ILGLARLKQQ VSLLCFSSGN YYNQGEIRKK ELLQSCAVLG IPPSRVMIID KREFPDDPEV
QWDTEHVAST ILQHIHANAT DLVVTFDAEG VSGHSNHIAL YKAVRALHSG GKLPEGCSVL
TLQSVNVLRK YVFLLDLPWT LLSPQGVLFV LTSKEVAQAK KAMSCHRSQL LWFRHLYTVF
SRYMSVNSLQ LL
low complexity regions: DUST
>PIGL_RAT
MEVVGLLCVAVAVLTWGFLRVWNSAERMRSPEQAGLPGAGSRALVVIAHPDDEAMFFAPT
ILGLARLKQQVSLLCFSSGNYYNQGEIRKKELLQSCAVLGIPPSRVMIIDKREFPDDPEV
QWDTEHVASTILQHIHANATDLVVTFDAEGVSGHSNHIALYKAVRALHSGGKLPEGCSVL
TLQSVNVLRKYVFLLDLPWTLLSPQGVLFVLTSKEVAQAKKAMSCHRSQLLWFRHLYTVF
SRYMSVNSLQLL
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
coiled coil prediction for PIGL_RAT
sequence: 252 amino acids, 0 residue(s) in coiled coil state

    .    |     .    |     .    |     .    |     .    |     .   60
MEVVGLLCVA VAVLTWGFLR VWNSAERMRS PEQAGLPGAG SRALVVIAHP DDEAMFFAPT
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  120
ILGLARLKQQ VSLLCFSSGN YYNQGEIRKK ELLQSCAVLG IPPSRVMIID KREFPDDPEV
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  180
QWDTEHVAST ILQHIHANAT DLVVTFDAEG VSGHSNHIAL YKAVRALHSG GKLPEGCSVL
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |     .    |     .    |     .    |     .    |     .  240
TLQSVNVLRK YVFLLDLPWT LLSPQGVLFV LTSKEVAQAK KAMSCHRSQL LWFRHLYTVF
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 -w border
---------- ---------- ---------- ---------- ---------- ---------- * 21 M'95 -w register
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 M'95 +w polar
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 21 MTK -w class
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ * 28 M'95 -w signif.
~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~1111111 1111111~~~ ~~~~~~~~~~ * 14 M'95 -w local

    .    |
SRYMSVNSLQ LL
~~~~~~~~~~ ~~
---------- --
~~~~~~~~~~ ~~
~~~~~~~~~~ ~~
~~~~~~~~~~ ~~
~~~~~~~~~~ ~~
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
prediction of transmembrane regions with toppred2
***********************************
*TOPPREDM with eukaryotic function*
***********************************
PIGL_RAT.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: PIGL_RAT.fa.___inter___ (1 sequences)
MEVVGLLCVAVAVLTWGFLRVWNSAERMRSPEQAGLPGAGSRALVVIAHP
DDEAMFFAPTILGLARLKQQVSLLCFSSGNYYNQGEIRKKELLQSCAVLG
IPPSRVMIIDKREFPDDPEVQWDTEHVASTILQHIHANATDLVVTFDAEG
VSGHSNHIALYKAVRALHSGGKLPEGCSVLTLQSVNVLRKYVFLLDLPWT
LLSPQGVLFVLTSKEVAQAKKAMSCHRSQLLWFRHLYTVFSRYMSVNSLQ
LL
(p)rokaryotic or (e)ukaryotic: e
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 1 structures are to be tested

Candidate membrane-spanning segments:
 Helix  Begin    End   Score  Certainity
     1      1     21   1.702  Certain
     2    193    213   1.067  Certain
----------------------------------------------------------------------
Structure 1
Transmembrane segments included in this structure:
     Segment        1       2
 Loop length    0      171      39
 K+R profile 1.00             6.00
                        +
CYT-EXT prof    -                -
                     1.00
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 7.00
Tm probability: 1.00
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): -2.00
(NEG-POS)/(NEG+POS): 1.0000
NEG: 1.0000
POS: 0.0000
-> Orientation: N-out
CYT-EXT difference: -1.00
-> Orientation: N-in
----------------------------------------------------------------------
"PIGL_RAT" 252
 1 21 #t 1.70208
 193 213 #t 1.06667
************************************
*TOPPREDM with prokaryotic function*
************************************
PIGL_RAT.fa.___inter___ is a single sequence
Using hydrophobicity file: /bio_software/2D/toppredm/lib/Engelman-Steitz.scale
Using cyt/ext file: /bio_software/2D/toppredm/lib/Cyt-Ext.prok
Using sequence file: PIGL_RAT.fa.___inter___ (1 sequences)
MEVVGLLCVAVAVLTWGFLRVWNSAERMRSPEQAGLPGAGSRALVVIAHP
DDEAMFFAPTILGLARLKQQVSLLCFSSGNYYNQGEIRKKELLQSCAVLG
IPPSRVMIIDKREFPDDPEVQWDTEHVASTILQHIHANATDLVVTFDAEG
VSGHSNHIALYKAVRALHSGGKLPEGCSVLTLQSVNVLRKYVFLLDLPWT
LLSPQGVLFVLTSKEVAQAKKAMSCHRSQLLWFRHLYTVFSRYMSVNSLQ
LL
(p)rokaryotic or (e)ukaryotic: p
Charge-pair energy: 0
Length of full window (odd number!): 21
Length of core window (odd number!): 11
Number of residues to add to each end of helix: 1
Critical length: 60
Upper cutoff for candidates: 1
Lower cutoff for candidates: 0.6
Total of 1 structures are to be tested

Candidate membrane-spanning segments:
 Helix  Begin    End   Score  Certainity
     1      1     21   1.702  Certain
     2    193    213   1.067  Certain
----------------------------------------------------------------------
Structure 1
Transmembrane segments included in this structure:
     Segment        1       2
 Loop length    0      171      39
 K+R profile 1.00             6.00
                        +
CYT-EXT prof    -                -
                     1.00
For CYT-EXT profile neg. values indicate cytoplasmic preference.
K+R difference: 7.00
Tm probability: 1.00
-> Orientation: N-in
Charge-difference over N-terminal Tm (+-15 residues): -2.00
(NEG-POS)/(NEG+POS): 1.0000
NEG: 1.0000
POS: 0.0000
-> Orientation: N-out
CYT-EXT difference: -1.00
-> Orientation: N-in
----------------------------------------------------------------------
"PIGL_RAT" 252
 1 21 #t 1.70208
 193 213 #t 1.06667
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
SAPS. Version of April 11, 1996.
Date run: Fri Mar 8 14:10:01 2002
File: /people/b_eisen/PIGL_RAT.fa.___saps___
ID   PIGL_RAT
DE   -
number of residues: 252; molecular weight: 28.0 kdal

    1  MEVVGLLCVA VAVLTWGFLR VWNSAERMRS PEQAGLPGAG SRALVVIAHP DDEAMFFAPT
   61  ILGLARLKQQ VSLLCFSSGN YYNQGEIRKK ELLQSCAVLG IPPSRVMIID KREFPDDPEV
  121  QWDTEHVAST ILQHIHANAT DLVVTFDAEG VSGHSNHIAL YKAVRALHSG GKLPEGCSVL
  181  TLQSVNVLRK YVFLLDLPWT LLSPQGVLFV LTSKEVAQAK KAMSCHRSQL LWFRHLYTVF
  241  SRYMSVNSLQ LL
--------------------------------------------------------------------------------
COMPOSITIONAL ANALYSIS (extremes relative to: swp23s)

A  : 21( 8.3%); C  :  5( 2.0%); D  :  9( 3.6%); E  : 12( 4.8%); F  : 10( 4.0%)
G  : 15( 6.0%); H  :  9( 3.6%); I  :  9( 3.6%); K  : 10( 4.0%); L+ : 35(13.9%)
M  :  6( 2.4%); N  :  7( 2.8%); P  : 11( 4.4%); Q  : 12( 4.8%); R  : 13( 5.2%)
S  : 21( 8.3%); T  : 10( 4.0%); V+ : 26(10.3%); W  :  5( 2.0%); Y  :  6( 2.4%)

KR     : 23 (  9.1%);  ED     : 21 (  8.3%);  AGP    : 47 ( 18.7%);
KRED   : 44 ( 17.5%);  KR-ED  :  2 (  0.8%);  FIKMNY : 48 ( 19.0%);
LVIFM  : 86 ( 34.1%);  ST     : 31 ( 12.3%).
--------------------------------------------------------------------------------
CHARGE DISTRIBUTIONAL ANALYSIS

    1  0-00000000 000000000+ 00000-+0+0 0-00000000 0+00000000 ---0000000
   61  00000+0+00 0000000000 00000-0+++ -000000000 0000+0000- ++-00--0-0
  121  00-0-00000 0000000000 -00000-0-0 0000000000 0+00+00000 0+00-00000
  181  00000000++ 00000-0000 0000000000 000+-0000+ +00000+000 000+000000
  241  0+00000000 00

A. CHARGE CLUSTERS.
Positive charge clusters (cmin = 9/30 or 11/45 or 14/60): none
Negative charge clusters (cmin = 8/30 or 11/45 or 13/60): none
Mixed charge clusters (cmin = 13/30 or 18/45 or 22/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.
There are no high scoring positive charge segments.
There are no high scoring negative charge segments.
There are no high scoring mixed charge segments.
There are no high scoring uncharged segments.

C. CHARGE RUNS AND PATTERNS.
pattern   (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0      4 |   4 |   6 |  44 |   9 |   8 |  11 |  10 |  10 |  13 |   7 |   9 |
lmin1      6 |   5 |   7 |  53 |  11 |  10 |  13 |  13 |  12 |  16 |   9 |  11 |
lmin2      7 |   6 |   8 |  59 |  12 |  12 |  15 |  14 |  14 |  18 |  10 |  12 |
(Significance level: 0.010000; Minimal displayed length: 6)
There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:
  + runs >=  3: 1, at 88;
  - runs >=  3: 1, at 51;
  * runs >=  4: 2, at 88; 110;
  0 runs >= 29: 0
--------------------------------------------------------------------------------
DISTRIBUTION OF OTHER AMINO ACID TYPES

1. HIGH SCORING SEGMENTS.
There are no high scoring hydrophobic segments.
____________________________________
High scoring transmembrane segments:
  5.00 (LVIF)    2.00 (AGM)     0.00 (BZX)    -1.00 (YCW)    -2.00 (ST)
 -6.00 (P)      -8.00 (H)     -10.00 (NQ)    -16.00 (KR)    -17.00 (ED)
Expected score/letter: -2.567
M_0.01= 75.57; M_0.05= 60.42; M_0.30= 42.40

1) From 3 to 19: length= 17, score=54.00
    3  VVGLLCVAVA VLTWGFL
   L: 4(23.5%); A: 2(11.8%); G: 2(11.8%); V: 5(29.4%);

2. SPACINGS OF C.
H2N-7-C-66-C-20-C-80-C-47-C-27-COOH

2*. SPACINGS OF C and H. (additional deluxe function for ALEX)
H2N-7-C-40-H-25-C-20-C-29-H-7-H-1-H-17-H-2-H-10-H-8-C-47-C-H-8-H-17-COOH
--------------------------------------------------------------------------------
REPETITIVE STRUCTURES.

A. SEPARATED, TANDEM, AND PERIODIC REPEATS: amino acid alphabet.
Repeat core block length: 4

B. SEPARATED AND TANDEM REPEATS: 11-letter reduced alphabet.
(i= LVIF; += KR; -= ED; s= AG; o= ST; n= NQ; a= YW; p= P; h= H; m= M; c= C)
Repeat core block length: 8

Aligned matching blocks:
[ 181- 188]   oinoinii
[ 245- 252]   oinoinii
--------------------------------------------------------------------------------
MULTIPLETS.

A. AMINO ACID ALPHABET.
1. Total number of amino acid multiplets: 21 (Expected range: 4-- 28)
2. Histogram of spacings between consecutive amino acid multiplets:
   (1-5) 10   (6-10) 5   (11-20) 3   (>=21) 4
3. Clusters of amino acid multiplets (cmin = 13/30 or 17/45 or 21/60): none

B. CHARGE ALPHABET.
1. Total number of charge multiplets: 6 (Expected range: 0-- 10)
   4 +plets (f+: 9.1%), 2 -plets (f-: 8.3%)
   Total number of charge altplets: 5 (Critical number: 11)
2. Histogram of spacings between consecutive charge multiplets:
   (1-5) 1   (6-10) 0   (11-20) 0   (>=21) 6
--------------------------------------------------------------------------------
PERIODICITY ANALYSIS.

A. AMINO ACID ALPHABET (core: 4; !-core: 5)
Location     Period  Element   Copies  Core  Errors
There are no periodicities of the prescribed length.

B. CHARGE ALPHABET ({+= KR; -= ED; 0}; core: 5; !-core: 6) and
   HYDROPHOBICITY ALPHABET ({*= KRED; i= LVIF; 0}; core: 6; !-core: 9)
Location     Period  Element   Copies  Core  Errors
 182- 211    5       i....     6       6     0
 188- 257    7       i..0...   9       7     /1/././3/./././
--------------------------------------------------------------------------------
SPACING ANALYSIS.
There are no unusual spacings.
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Pfam (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/Pfam
Sequence file: PIGL_RAT.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: PIGL_RAT

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
DUF158   Uncharacterized LmbE-like protein, COG2120     -51.9       0.01   1

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
DUF158     1/1      31   249 ..     1   292 []   -51.9     0.01

Alignments of top-scoring domains:
DUF158: domain 1 of 1, from 31 to 249: score -51.9, E = 0.01
                   *->ssenrLkdA.kkVLavhAHPDDesigmGATiakftdqGkrVlvvtlT
                      ++ L+ A+ + L v AHPDDe + + Ti ++ +V +++
    PIGL_RAT    31    PEQAGLPGAgSRALVVIAHPDDEAMFFAPTILGLARLKQQVSLLCFS    77

                      eGeegstlgsRWAqllaDsadeLaeiRreElaeAAriLGVkkhiqLglad
                      G ++ +eiR++El + +LG +
    PIGL_RAT    78    SG---NYYNQ-------------GEIRKKELLQSCAVLGI-PPSRVMII-    109

                      rgldkGdlgt.qlpDrgLaqsdLeevtkalVkviRelrPHvllVFdpngg
                      + pD+ +q d e v + + i ++++Fd g+
    PIGL_RAT   110    --------DKrEFPDDPEVQWDTEHVASTILQHIHANATDLVVTFDAEGV    151

                      yPYEgHPDHrrtHtVtaAaveaAgafagtpdfpgdpwtvpklYyvvlfLr
                      gH H+ ++ a a +++ ++p + l
    PIGL_RAT   152    S---GHSNHIA-------LYKAVRA---LHSGGKLPEGCSVL--------    180

                      ekvsklsaeflasfhRGPFEeWl.lkrdddidFfvsdDGiddliEakkqa
                      ++++ + + f l+l++ + s G+ + k+ a
    PIGL_RAT   181    -TLQSVNVLRKYVFL-------LdLPWT-----LLSPQGVLFVLTSKEVA    217

                      RlrklaAlfahatqsvsePllraiaellgereklykeEgfrlargsfp<-
                      ++ +A+ +h q +++ + ++ r++ +
    PIGL_RAT   218    --QAKKAMSCHRSQLL-----WFR------HL---YTVFSRYMSVNSL      249

                      *

    PIGL_RAT     -         -
//
Start with PfamFrag (from /data/patterns/pfam)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/pfam/PfamFrag
Sequence file: PIGL_RAT.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: PIGL_RAT

Scores for sequence family classification (score includes all domains):
Model          Description                               Score    E-value  N
--------       -----------                               -----    ------- ---
DUF158         Uncharacterized LmbE-like protein, COG      8.5       0.14   1
Acyl-CoA_hydro Cytosolic long-chain acyl-CoA thioeste      1.3         35   1
Topoisomer_I_N Eukaryotic DNA topoisomerase I, DNA bi     -0.9         65   1
DUF229         Protein of unknown function (DUF229)       -1.1         58   1

Parsed for domains:
Model          Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
--------       ------- ----- -----    ----- -----      -----  -------
DUF158           1/1      31   100 ..     1    85 [.     8.5     0.14
Acyl-CoA_hydro   1/1     107   139 ..     1    35 [.     1.3       35
DUF229           1/1     177   186 ..   552   561 .]    -1.1       58
Topoisomer_I_N   1/1     190   194 ..   223   227 .]    -0.9       65

Alignments of top-scoring domains:
DUF158: domain 1 of 1, from 31 to 100: score 8.5, E = 0.14
                   *->ssenrLkdA.kkVLavhAHPDDesigmGATiakftdqGkrVlvvtlT
                      ++ L+ A+ + L v AHPDDe + + Ti ++ +V +++
    PIGL_RAT    31    PEQAGLPGAgSRALVVIAHPDDEAMFFAPTILGLARLKQQVSLLCFS    77

                      eGeegstlgsRWAqllaDsadeLaeiRreElaeAAriLG<-*
                      G ++ +eiR++El + +LG
    PIGL_RAT    78    SG---NYYNQ-------------GEIRKKELLQSCAVLG    100

Acyl-CoA_hydro: domain 1 of 1, from 107 to 139: score 1.3, E = 35
                   *->msedstekdveqdlvtmlktklllrtiamPedtNa<-*
                      m++d e++ + + ++ t+++ +ti++ ++Na
    PIGL_RAT   107    MIIDKREFPDDPEVQWD--TEHVASTILQHIHANA    139

DUF229: domain 1 of 1, from 177 to 186: score -1.1, E = 58
                   *->CapLeLqkVe<-*
                      C+ L+Lq+V+
    PIGL_RAT   177    CSVLTLQSVN    186

Topoisomer_I_N: domain 1 of 1, from 190 to 194: score -0.9, E = 65
                   *->KYVfL<-*
                      KYVfL
    PIGL_RAT   190    KYVFL    194
//
Start with Repeat Library (from /data/patterns/repeats-Miguel-Andrade/hmm)
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /data/patterns/repeats-Miguel-Andrade/hmm/repeats.hmm-lib
Sequence file: PIGL_RAT.fa
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: PIGL_RAT

Scores for sequence family classification (score includes all domains):
Model    Description                                    Score    E-value  N
-------- -----------                                    -----    ------- ---
[no hits above thresholds]

Parsed for domains:
Model    Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
-------- ------- ----- -----    ----- -----      -----  -------
[no hits above thresholds]

Alignments of top-scoring domains:
[no hits above thresholds]
//
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Prosite
---------------------------------------------------------
| ppsearch (c) 1994 EMBL Data Library                   |
| based on MacPattern (c) 1990-1994 R. Fuchs            |
---------------------------------------------------------
PROSITE pattern search started: Fri Mar 8 14:11:54 2002
Sequence file: PIGL_RAT.fa
----------------------------------------
Sequence PIGL_RAT (252 residues):

Matching pattern PS00001 ASN_GLYCOSYLATION:
  138: NATD
Total matches: 1

Matching pattern PS00005 PKC_PHOSPHO_SITE:
  212: TSK
Total matches: 1

Matching pattern PS00006 CK2_PHOSPHO_SITE:
  212: TSKE
Total matches: 1

Matching pattern PS00008 MYRISTYL:
   35: GLPGAG
   79: GNYYNQ
  100: GIPPSR
Total matches: 3

Total no of hits in this sequence: 6
========================================
1314 pattern(s) searched in 1 sequence(s), 252 residues.
Total no of hits in all sequences: 6.
Search time: 00:00 min
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with Profile Search
~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~
Start with motif search against own library
***** bioMotif : Version V41a DB, 1999 Nov 11 *****
SeqTyp=2 : PROTEIN search;
>APC D-Box is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>ER-GOLGI-traffic signal is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M minimal SH3 binding is the MOTIF name
>PIGL_RAT ;LENGTH=252; DIRECT_SEQUENCE n 1 solutions m %_PXXP 115-118 f
>STATISTICS Total : 1 solutions in 1 sequences, 252 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M deubiquitinating enzyme SH3 domain binding motif (Kato, 2000) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M minimal class I consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M minimal class II consensus-SH3 binding motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M exact 14-3-3 binding consensus (Muslin 1996 Cell 84 889) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M 14-3-3 binding motif in RAF and others (Muslin 1996 Cell 84 889) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M WW domain binding motif in formins (Bedford 1997) is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>INTRA-SIGNAL-M PY motif for WW domain is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>TM-CYTOPLASMIC-M di-hydrophobic endocytosis motifs for internalized transmembrane proteins is the MOTIF name
>PIGL_RAT ;LENGTH=252; DIRECT_SEQUENCE n 1 solutions m %_E 2-2 %_XXXL 3-6 %_L 7-7 f
>STATISTICS Total : 1 solutions in 1 sequences, 252 units; out of 1 sequences, 252 units
>TM-CYTOPLASMIC-M tyrosine-based endocytosis motif for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>TM-EXTRACELL-M Endocytosis signal for internalized transmembrane proteins is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>EXTRACELL-M minimal furin protease cleavage site motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>EXTRACELL-M extended furin protease cleavage site motif is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>EXTRACELL-M zinc binding motif in MMPs is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>EXTRACELL-M g alpha binding go loco is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS PDX-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS QKI-5 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS HCDA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS SV40 LrgT experimentally determined is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS H2B experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS v-Rel experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Amida experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS RanBP3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Pho4p experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS DNAhelicaseQ1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS LEF-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS TCF-1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR p53-NLS1 NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hum-Ku70 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS GAL4 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS act/inh betaA experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS BDV-P experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS TR2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS THOV NP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS polyomaVP1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS HIV-1 Tat experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS HIV-1 Rev experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Rex experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS SRY experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS SOX9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS NS5A experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS DNAse EBV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS adenovE1a experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS ystDNApolalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hVDR experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS CPV capsid experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hGlu.cort.experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS cFOS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS cJUN experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hDNApolalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hDNAtopoII experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hBLM experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hARNT experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS influenzaNP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS p54 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS hProTalpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Tst1/Oct6 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS protHsc9 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS protHsci experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS protHsc3 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Ta alpha experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Pax-QNR experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Hunt.Dis.pro experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS MyoD experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS opaque2 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS CTP experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS HCV experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS p110RB1 experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS VirD2-Nterm experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS VirD2-Cterm experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Nucloplasmin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units
>NUCLEAR NLS Nucleolin experimentally determined NLS is the MOTIF name
>STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of
1 sequences, 252 units >NUCLEAR NLS ICP-8 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS Nab2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS M9 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS lscMyc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS humKprotein experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS FluA experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS Mat-alpha experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS polyoma Lrg-T experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS SV40 VP1 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS SV40 VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS polyoma VP2 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS c-myb experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 
solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS N-myc experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS p53 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS c-erb-A experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS yeast SKI3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS L29 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS Max experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS L3 experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >NUCLEAR NLS dyskerin experimentally determined NLS is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >PDZ domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units >WW domain binding motif science 278_2075_pawson is the MOTIF name >STATISTICS Total : 0 solutions in 0 sequences, 0 units; out of 1 sequences, 252 units ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~ ~~~ Start with HMM-search search against own library hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington 
University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/own/own-hmm.lib Sequence file: PIGL_RAT.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: PIGL_RAT Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/own/own-hmm-f.lib Sequence file: PIGL_RAT.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: PIGL_RAT Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ L. Aravind's signalling DB+ PSSM from other authors IMPALA version 1.1 [20-December-1999] Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011. 
Query= PIGL_RAT (252 letters)

Searching..................................done

Results from profile search

                                                                 Score    E
Sequences producing significant alignments:                      (bits)  Value

ARR Arrestin domain                                                24   0.18
KIN Protein kinase domain                                          19   5.7
AAA AAA+ ATPase Module                                             19   6.1
PAP Papain/bleomycin hydrolase like domain                         19   7.3
FKBP FK506 binding protein (Peptidyl prolyl isomerase)             19   8.0

>ARR Arrestin domain
Length = 454

Score = 24.5 bits (53), Expect = 0.18
Identities = 17/95 (17%), Positives = 17/95 (17%), Gaps = 16/95 (16%)

Query: 65  ARLKQQVSLLCFSSGNYYNQGEIRK-KELLQS------CAVLGIPPSRVMIIDKREFP-D 116
Sbjct: 264 IYIIQVADICLFTTASYSCEVARIESNEGFPVGPGGTLSKVFAVCPLLSNNKDKRGLALD 323

Query: 117 DPEVQWDTEHVASTILQHIHANATDLVVTFDAEGV 151
Sbjct: 324 GQLKHEDTNLASSTILDS--------KTSKESLGI 350

>KIN Protein kinase domain
Length = 313

Score = 19.3 bits (39), Expect = 5.7
Identities = 10/83 (12%), Positives = 10/83 (12%), Gaps = 16/83 (19%)

Query: 71  VSLLCFSSGNYYNQGEIRKKELLQSCAVLGIPPSRVMI----------IDKREFPDDPEV 120
Sbjct: 222 CIFAELLGRKPLFQGKDYIHQITLIIETIGSPSEEDICNIANEQARQFIRSLNMGNQPKV 281

Query: 121 QWDTEHVASTILQHIHANATDLV 143
Sbjct: 282 NF------ANMFPKANPDAIDLL 298

>AAA AAA+ ATPase Module
Length = 298

Score = 19.0 bits (38), Expect = 6.1
Identities = 3/52 (5%), Positives = 3/52 (5%), Gaps = 1/52 (1%)

Query: 129 STILQHIHANATDLVVTFDAEGVSGHSNHIALYKAVRALHSGGKLPEGCSVL 180
Sbjct: 94  TLLARAVAHHTDCTFIRVSGSELVQKF-IGEGARMVRELFVMAREHAPSIIF 144

>PAP Papain/bleomycin hydrolase like domain
Length = 376

Score = 18.8 bits (38), Expect = 7.3
Identities = 7/29 (24%), Positives = 7/29 (24%)

Query: 6   LLCVAVAVLTWGFLRVWNSAERMRSPEQA 34
Sbjct: 10  LLALLVAGLAQGIRGPLRAQDLGPQPLEL 38

>FKBP FK506 binding protein (Peptidyl prolyl isomerase)
Length = 149

Score = 18.6 bits (38), Expect = 8.0
Identities = 13/74 (17%), Positives = 13/74 (17%), Gaps = 5/74 (6%)

Query: 85  GEIRKKELLQSCAVLGIP-PSRVMIIDKREFPDDPEVQWDTEHVASTILQHIHANATDLV 143
Sbjct: 58  GDKTTFSLEPDAAF-GVPSPDLIQYFSRREFMDAGEPEIGAIMLFTAMDGSEMPGV---I 113

Query: 144 VTFDAEGVSGHSNH 157
Sbjct: 114 REINGDSITVDFNH 127

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 105
Number of sequences better than 10.0: 5
Number of calls to ALIGN: 5
Length of query: 252
Total length of test sequences: 20182
Effective length of test sequences: 16738.0
Effective search space size: 3671940.0
Initial X dropoff for ALIGN: 25.0 bits

Y. Wolf's SCOP PSSM

IMPALA version 1.1 [20-December-1999]

Reference: Alejandro A. Schaffer, Yuri I. Wolf, Chris P. Ponting, Eugene V. Koonin, L. Aravind, Stephen F. Altschul (1999), "IMPALA: Matching a Protein Sequence Against a Collection of "PSI-BLAST-Constructed Position-Specific Score Matrices", Bioinformatics 15:1000-1011.

Query= PIGL_RAT (252 letters)

Searching.................................................done

Results from profile search

                                                                 Score    E
Sequences producing significant alignments:                      (bits)  Value

gi|127802 [1..266] Glucosamine 6-phosphate deaminase               25   1.5
gi|116909 [47..259] S-adenosyl-L-methionine-dependent methyl...    25   1.7
gi|2707940 [1..184] Ribonuclease H-like motif                      24   3.3
gi|1051321 [353..924] P-loop containing nucleotide triphosph...    24   3.4
gi|1361319 [311..442] Aldehyde oxidoreductase, molybdemum co...    23   4.6
gi|554502 [2..57] Flavodoxin-like                                  23   5.6
gi|1351438 [723..850] Aldehyde oxidoreductase, molybdemum co...    23   5.6
gi|585716 [2..158] Phosphotyrosine protein phosphatases I          23   6.3
gi|1653663 [65..327] beta-Lactamase/D-ala carboxypeptidase         23   6.4
gi|2635745 [372..520] Aldehyde oxidoreductase, molybdemum co...    23   7.1
gi|1495469 [3..289] P-loop containing nucleotide triphosphat...    22   7.8
gi|1351438 [967..1131] Aldehyde oxidoreductase, molybdemum c...    23   7.9
gi|134540 [478..645] Lysozyme-like                                 22   8.5
gi|2842409 [43..367] beta/alpha (TIM)-barrel                       22   8.9
gi|887816 [375..530] Aldehyde oxidoreductase, molybdemum cof...    22   9.9

>gi|127802 [1..266] Glucosamine 6-phosphate deaminase
Length = 266

Score = 24.9 bits (54), Expect = 1.5
Identities = 10/54 (18%), Positives = 10/54 (18%), Gaps = 5/54 (9%)

Query: 121 QWDTEHVASTILQHIHANATDLVVTFDAEGVSGHSNHIALYKAVRALHSGGKLP 174
Sbjct: 14  KWAARHIVNRINAFKPTADRPFVL-----GLPTGGTPMTTYKALVEMHKAGQVS 62

>gi|116909 [47..259] S-adenosyl-L-methionine-dependent methyltransferases
Length = 213

Score = 24.9 bits (53), Expect = 1.7
Identities = 12/120 (10%), Positives = 12/120 (10%), Gaps = 10/120 (8%)

Query: 61  ILGLARLKQQVSLLCFSSGNYYNQGEIRKKELLQSCAVLGIPPSRVMIIDKRE----FPD 116
Sbjct: 47  MDAVIREYSPSLVLELGAYCGYSAVRMARL-LQPGARLLTMEMNPDYAAITQQMLNFAGL 105

Query: 117 DPEVQW---DTEHVASTILQHIHANATDLVVTFDAEGVSGHSNHIALYKAVRALHSGGKL 173
Sbjct: 106 QDKVTILNGASQDLIPQLKKKYDVDTLDMVFLDHWKDRYLPDT--LLLEKCGLLRKGTVL 163

>gi|2707940 [1..184] Ribonuclease H-like motif
Length = 184

Score = 23.7 bits (51), Expect = 3.3
Identities = 9/88 (10%), Positives = 9/88 (10%), Gaps = 16/88 (18%)

Query: 72  SLLCFSSGNYYNQGEIRKKELLQSCAVLGIPPSRVMI-------IDKREFPDDPEVQWDT 124
Sbjct: 33  YLYLFSDSNHMTFGYEAES-------LMSNEKVKGSFYRDLKRWVGCDSSNLDEYLNRLK 85

Query: 125 EHVASTILQHIHANATDLVVTFDAEGVS 152
Sbjct: 86  PHYSVRLIK--IGSGLNETVSIGNFGGT 111

>gi|1051321 [353..924] P-loop containing nucleotide triphosphate hydrolases
Length = 572

Score = 23.7 bits (51), Expect = 3.4
Identities = 12/53 (22%), Positives = 12/53 (22%), Gaps = 9/53 (16%)

Query: 66  RLKQQVSLLCFSSGNYYNQGEIRK---------KELLQSCAVLGIPPSRVMII 109
Sbjct: 121 EMKAKFGIKGLQKFFYINQGNSSENIQHDVNRFKHLESALHVLGFSDDHCMSI 173

>gi|1361319 [311..442] Aldehyde oxidoreductase, molybdemum cofactor-binding domain
Length = 132

Score = 23.3 bits (50), Expect = 4.6
Identities = 2/15 (13%), Positives = 2/15 (13%)

Query: 98  VLGIPPSRVMIIDKR 112
Sbjct: 92  GVGLEPDQLVLVANP 106

>gi|554502 [2..57] Flavodoxin-like
Length = 56

Score = 22.9 bits (48), Expect = 5.6
Identities = 10/32 (31%), Positives = 10/32 (31%)

Query: 39  AGSRALVVIAHPDDEAMFFAPTILGLARLKQQ 70
Sbjct: 1   AVRRALIVLAHAERTSFNYAMKEAAVEALKKK 32

>gi|1351438 [723..850] Aldehyde oxidoreductase, molybdemum cofactor-binding domain
Length = 128

Score = 22.8 bits (49), Expect = 5.6
Identities = 6/15 (40%), Positives = 6/15 (40%)

Query: 98  VLGIPPSRVMIIDKR 112
Sbjct: 86  ALGVPSNRIVVRVKR 100

>gi|585716 [2..158] Phosphotyrosine protein phosphatases I
Length = 157

Score = 22.9 bits (49), Expect = 6.3
Identities = 5/31 (16%), Positives = 5/31 (16%), Gaps = 1/31 (3%)

Query: 119 EVQWDTEHVASTILQHIHANATDLVVTFDAE 149
Sbjct: 65  NHGINTAHKARQVTKEDFVTF-DYILCMDES 94

>gi|1653663 [65..327] beta-Lactamase/D-ala carboxypeptidase
Length = 263

Score = 22.5 bits (48), Expect = 6.4
Identities = 11/60 (18%), Positives = 11/60 (18%), Gaps = 2/60 (3%)

Query: 140 TDLVVTFDAEGVSGHSNHIALYKAVRALHSGGKLPEGCSVLTLQSVNVLRKYVFLLDLPW 199
Sbjct: 196 GNLSWAGAAGGLV--ANTEDIINWVRALFVEDQLLSTEQKTQLTRLVSLITGKLIDGTTS 253

>gi|2635745 [372..520] Aldehyde oxidoreductase, molybdemum cofactor-binding domain
Length = 149

Score = 22.6 bits (48), Expect = 7.1
Identities = 4/20 (20%), Positives = 4/20 (20%), Gaps = 1/20 (5%)

Query: 92  LLQSCA-VLGIPPSRVMIID 110
Sbjct: 104 IEQIVMEELGCAAEDISIVI 123

>gi|1495469 [3..289] P-loop containing nucleotide triphosphate hydrolases
Length = 287

Score = 22.3 bits (46), Expect = 7.8
Identities = 4/81 (4%), Positives = 4/81 (4%), Gaps = 1/81 (1%)

Query: 90  KELLQSCAVLGIPPSRVMIIDKREFPDDPEVQWDTEHVASTILQHIHANATDLVVTFDAE 149
Sbjct: 131 CGGFAMPIRENKAQ-EIYIVMSGEMMAMYAANNISKGILKYANSGGVRLGGLICNERKTD 189

Query: 150 GVSGHSNHIALYKAVRALHSG 170
Sbjct: 190 KELELATALAAKLNSKLIHFV 210

>gi|1351438 [967..1131] Aldehyde oxidoreductase, molybdemum cofactor-binding domain
Length = 165

Score = 22.5 bits (48), Expect = 7.9
Identities = 7/20 (35%), Positives = 7/20 (35%), Gaps = 1/20 (5%)

Query: 92  LLQSCA-VLGIPPSRVMIID 110
Sbjct: 109 MIQVASRSLGIPTSKIYISE 128

>gi|134540 [478..645] Lysozyme-like
Length = 168

Score = 22.2 bits (47), Expect = 8.5
Identities = 17/58 (29%), Positives = 17/58 (29%), Gaps = 2/58 (3%)

Query: 99  LGIPPSRVMIIDKREFPDDPEVQWDTEHVASTILQHIHANATDLVVTFDAEGVSGHSN 156
Sbjct: 14  KEIPQSYAMAIARQESAWNPKVK--SPVGASGLMQIMPGTATHTVKMFSIPGYSSPGQ 69

>gi|2842409 [43..367] beta/alpha (TIM)-barrel
Length = 325

Score = 22.2 bits (47), Expect = 8.9
Identities = 6/42 (14%), Positives = 6/42 (14%)

Query: 69  QQVSLLCFSSGNYYNQGEIRKKELLQSCAVLGIPPSRVMIID 110
Sbjct: 51  TAYDKIIFGFTGIVGDKGANQYKIEQAAAWTGKKQYEMTILD 92

>gi|887816 [375..530] Aldehyde oxidoreductase, molybdemum cofactor-binding domain
Length = 156

Score = 22.2 bits (47), Expect = 9.9
Identities = 11/49 (22%), Positives = 11/49 (22%), Gaps = 3/49 (6%)

Query: 63  GLARLKQQVSLLCFSSGNYYNQGEIRKKELLQSCA-VLGIPPSRVMIID 110
Sbjct: 81  ARLLMNQDGTINVQSGATEIGQGADTV--FSQMVAETVGVPVSDVRVIS 127

Underlying Matrix: BLOSUM62
Number of sequences tested against query: 1187
Number of sequences better than 10.0: 15
Number of calls to ALIGN: 15
Length of query: 252
Total length of test sequences: 256703
Effective length of test sequences: 210706.0
Effective search space size: 44924394.3
Initial X dropoff for ALIGN: 25.0 bits

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

calculation of internal repeats with prospero

***** PROSPERO v1.3 Fri Mar 8 14:12:28 2002 *****
Copyright 2000, Richard Mott, Wellcome Trust Centre for Human Genetics, University of Oxford
For help see http://www.well.ox.ac.uk/ariadne
For usage use -help
using gap penalty 11+1k
using matrix BLOSUM62
printing all alignments with eval < 0.100000
using sequence1 PIGL_RAT
using self-comparison

~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~

TIGRFAM

hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/tigrfam/tigrfam.hmm Sequence file: PIGL_RAT.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: PIGL_RAT Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/tigrfam/tigrfam.hmm-f Sequence file: PIGL_RAT.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: PIGL_RAT Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- [no hits above thresholds] Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- [no hits above thresholds] Alignments of top-scoring domains: [no hits above thresholds] // SMART hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/iprscan/data/smart.HMMs Sequence file: PIGL_RAT.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: PIGL_RAT Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- 53EXOc 5'-3' exonuclease -147.2 17 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- 53EXOc 1/1 104 252 .] 1 276 [] -147.2 17 Alignments of top-scoring domains: 53EXOc: domain 1 of 1, from 104 to 252: score -147.2, E = 17 *->kkptLlLvDGsSLayRAyfAlpnqkepLknskGepTnAvyGFlrmLl + ++++D R +p+ +p + e+ +++++l PIGL_RAT 104 SR--VMIIDK-----R---EFPD--DPEVQWDTEH------VASTIL 132 kllkeekkPtyvavvFDsakgktFRhelYpeYKanRpepq....ktPde. + + + t ++v FD a+g + + + YKa R+ ++++k P+ PIGL_RAT 133 QHIHANA--TDLVVTFD-AEGVSGHSNHIALYKAVRA--LhsggKLPEGc 177 ..LieQ.iplikelldCalGipvleveGyEADDvIATLAkkaeaeGfeVr + L Q++ ++++++ +l +p ++ PIGL_RAT 178 svLTLQsVNVLRKYVF-LLDLP-WT------------------------- 200 IvSgDKDllQLvsdkvsvldptkgikdfedlytpenVieKfyGvtPeQii L+s+ +++++ + + e PIGL_RAT 201 ----------LLSPQGVLFVL-----TSK---------EV---------- 216 DylALmGDsSDNIPGVpGIGeKTAakLLkeyGSLEnilenldelktlakk + k PIGL_RAT 217 -----------------------------------------AQAK----- 220 lrekLlahkEdAkLSrkLatietdvplevdledlrlk<-* ++ +h+++++ +r L t + + + +l+l PIGL_RAT 221 --KAMSCHRSQLLWFRHL---YTVFSRYMSVNSLQLL 252 // COG hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/cogs/cogs.hmm Sequence file: PIGL_RAT.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: PIGL_RAT Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- COG2120 -33.2 1.2e-05 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- COG2120 1/1 11 251 .. 1 317 [] -33.2 1.2e-05 Alignments of top-scoring domains: COG2120: domain 1 of 1, from 11 to 251: score -33.2, E = 1.2e-05 *->mfeinspleeaffkllr.eVlemtstsplaselrvLavfpHPDDEsi + + ++ +r e +++ ++++ r L+v +HPDDE+ PIGL_RAT 11 VAVLTWGFLRVWNSAERmRSPEQAGL--PGAGSRALVVIAHPDDEAM 55 GcGGTLakyaaaGveVglvclTlGEmGenltqplldthetlgeiRreEla + T+ +a+ +V l+c+ +++ +++++++ ++El PIGL_RAT 56 FFAPTILGLARLKQQVSLLCF---------SSGNYYNQGEIR---KKELL 93 eAArvLGvekrillglgsrypDggLetepddqearvdlvqaqtalva.td +++vLG+++ + ++ ++++ pdd+e+ d + va+t PIGL_RAT 94 QSCAVLGIPPSRVMIID----KREF---PDDPEVQWDTEH-----VAsTI 131 LravirelrPhvVltpdPwnGgdgHPDHrathelavaAvasagipkrpnd L + + + ++V+t+d G+ gH H+a +++++a + ++p + PIGL_RAT 132 LQHIHANAT-DLVVTFDA-EGVSGHSNHIALYKAVRALHSGGKLPEGCSV 179 wgvsaavayytdlgasplqagsryqlnatldPdvllltsdelavtqkvvd +++ ++ y + +l+ ++l + + +++ v PIGL_RAT 180 L------------TLQSVNVLRKYVFLLDLPWTLL--SPQGVLF----VL 211 isavaevKlaAiraHrtQfadtalferafplenerlelrveaaltwygle +s+ +++ ++A +Hr+Q+ + ++ +++ + + +++ PIGL_RAT 212 TSKEVAQAKKAMSCHRSQLLW--FRHLYTVFSR-YMSVN----------- 247 agvtyaegfrgeeslllailgl<-* +l+l PIGL_RAT 248 ------------------SLQL 251 // hmmpfam - search a single seq against HMM database HMMER 2.1.1 (Dec 1998) Copyright (C) 1992-1998 Washington University School of Medicine HMMER is freely distributed under the GNU General Public License (GPL). 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - HMM file: /data/patterns/cogs/cogs.hmm-f Sequence file: PIGL_RAT.fa - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: PIGL_RAT Scores for sequence family classification (score includes all domains): Model Description Score E-value N -------- ----------- ----- ------- --- COG2120 10.9 0.029 1 COG0394 7.7 0.49 1 COG0633 2.4 19 1 COG2881 0.2 75 1 COG1893 -0.2 57 1 COG0472 -0.5 63 1 COG0083 -0.6 99 1 COG0519 -3.1 98 1 Parsed for domains: Model Domain seq-f seq-t hmm-f hmm-t score E-value -------- ------- ----- ----- ----- ----- ----- ------- COG0472 1/1 3 21 .. 393 432 .] -0.5 63 COG2881 1/1 42 69 .. 181 208 .] 0.2 75 COG2120 1/1 44 103 .. 35 106 .. 10.9 0.029 COG0633 1/1 106 125 .. 1 20 [. 2.4 19 COG0519 1/1 121 135 .. 1 15 [. -3.1 98 COG0394 1/1 119 151 .. 63 95 .. 7.7 0.49 COG0083 1/1 123 157 .. 291 329 .] -0.6 99 COG1893 1/1 159 175 .. 311 327 .] -0.2 57 Alignments of top-scoring domains: COG0472: domain 1 of 1, from 3 to 21: score -0.5, E = 63 *->iislilaliglatlllaavgvllavifaflrfviwlklrl<-* +++l++++++++t w +lr+ PIGL_RAT 3 VVGLLCVAVAVLT---------------------WGFLRV 21 COG2881: domain 1 of 1, from 42 to 69: score 0.2, E = 75 *->lklaavfaaiplaiylllivlalvrlll<-* ++l+++ + +a +++ ++l+l rl + PIGL_RAT 42 RALVVIAHPDDEAMFFAPTILGLARLKQ 69 COG2120: domain 1 of 1, from 44 to 103: score 10.9, E = 0.029 *->LavfpHPDDEsiGcGGTLakyaaaGveVglvclTlGEmGenltqpll L+v +HPDDE+ + T+ +a+ +V l+c+ +++ PIGL_RAT 44 LVVIAHPDDEAMFFAPTILGLARLKQQVSLLCF---------SSGNY 81 dthetlgeiRreElaeAArvLGvek<-* +++++++ ++El +++vLG+++ PIGL_RAT 82 YNQGEIR---KKELLQSCAVLGIPP 103 COG0633: domain 1 of 1, from 106 to 125: score 2.4, E = 19 *->mmanttiefpdgkeleiesd<-* +m ++ +efpd+ e++++++ PIGL_RAT 106 VMIIDKREFPDDPEVQWDTE 125 COG0519: domain 1 of 1, from 121 to 135: score -3.1, E = 98 *->nWtmenfieeaieeI<-* +W++e+ + ++ ++I PIGL_RAT 121 QWDTEHVASTILQHI 135 COG0394: domain 1 of 1, from 119 to 151: score 7.7, E = 0.49 
*->ehGidisgltsrqlteedfdefDliiamdgenk<-* e+ +d+++++s+ l ++++++ Dl++++d e++ PIGL_RAT 119 EVQWDTEHVASTILQHIHANATDLVVTFDAEGV 151 COG0083: domain 1 of 1, from 123 to 157: score -0.6, E = 99 *->daeeiaaalrevftknGnidaevgvltidgdGakverap<-* d+e +a+ + ++++ n a v+t+d++G++++++ PIGL_RAT 123 DTEHVASTILQHIHAN----ATDLVVTFDAEGVSGHSNH 157 COG1893: domain 1 of 1, from 159 to 175: score -0.2, E = 57 *->tLyalvkaleaeggkvk<-* +Ly++v+al+++g + PIGL_RAT 159 ALYKAVRALHSGGKLPE 175 //