ID   POLG_POL3L     STANDARD;      PRT;  2206 AA.
AC   P03302; Q84783; Q84784; Q84785; Q84786; Q84787; Q84788; Q84789;
AC   Q84790; Q98592; Q98593; Q98594;
DT   21-JUL-1986 (Rel. 01, Created)
DT   21-JUL-1986 (Rel. 01, Last sequence update)
DT   30-MAY-2000 (Rel. 39, Last annotation update)
DE   GENOME POLYPROTEIN [CONTAINS: COAT PROTEINS VP1 TO VP4; CORE PROTEINS
DE   P2A TO P2C, P3A; GENOME-LINKED PROTEIN VPG; PICORNAIN 3C
DE   (EC 3.4.22.28) (PROTEASE 3C) (P3C); RNA-DIRECTED RNA POLYMERASE P3D
DE   (EC 2.7.7.48)].
OS   Poliovirus type 3 (strains P3/Leon/37 and P3/Leon 12A[1]B).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Picornaviridae;
OC   Enterovirus.
OX   NCBI_TaxID=12088;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=P3/LEON/37;
RX   MEDLINE=84170338; PubMed=6324200;
RA   Stanway G., Hughes P.J., Mountford R.C., Reeve P., Minor P.D.,
RA   Schild G.C., Almond J.W.;
RT   "Comparison of the complete nucleotide sequences of the genomes of the
RT   neurovirulent poliovirus P3/Leon/37 and its attenuated Sabin vaccine
RT   derivative P3/Leon 12a1b.";
RL   Proc. Natl. Acad. Sci. U.S.A. 81:1539-1543(1984).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=P3/LEON 12A[1]B;
RX   MEDLINE=83299239; PubMed=6310508;
RA   Stanway G., Cann A.J., Hauptmann R., Hughes P., Clarke L.D.,
RA   Mountford R.C., Minor P.D., Schild G.C., Almond J.W.;
RT   "The nucleotide sequence of poliovirus type 3 leon 12 a1b: comparison
RT   with poliovirus type 1.";
RL   Nucleic Acids Res. 11:5629-5643(1983).
RN   [3]
RP   X-RAY CRYSTALLOGRAPHY (2.9 ANGSTROMS) OF 1-878.
RX   MEDLINE=95120467; PubMed=7820548;
RA   Grant R.A., Hiremath C.N., Filman D.J., Syed R., Andries K.,
RA   Hogle J.M.;
RT   "Structures of poliovirus complexes with anti-viral drugs:
RT   implications for viral stability and drug design.";
RL   Curr. Biol. 4:784-797(1994).
RN   [4]
RP   X-RAY CRYSTALLOGRAPHY (2.4 ANGSTROMS) OF 1-878.
RA   Syed R., Filman D.J., Hogle J.M.;
RL   Submitted (MAR-1995) to the PDB data bank.
CC   -!- FUNCTION: P3C POLYPEPTIDE IS A PROTEASE THAT CLEAVES AT CERTAIN
CC       Q/G SITES IN THE POLYPROTEIN. IT MAY BE A CYSTEINE PROTEASE.
CC   -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC       EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC       VP3, AND VP4.
CC   -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC   -!- MISCELLANEOUS: THE SEQUENCE OF STRAIN SABIN VACCINE P3/LEON/37 IS
CC       SHOWN.
CC   -!- MISCELLANEOUS: THE STRAIN SABIN VACCINE P3/LEON/37 IS THE
CC       PROGENITOR OF THE STRAIN SABIN VACCINE P3/LEON 12A[1]B.
CC   -!- SIMILARITY: THE PROTEASE BELONGS TO PEPTIDASE FAMILY C3.
CC   --------------------------------------------------------------------------
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC   --------------------------------------------------------------------------
DR   EMBL; K01392; AAA46914.1; -.
DR   EMBL; X00925; CAA25444.1; -.
DR   PIR; A03900; GNNY4P.
DR   PDB; 1PIV; 20-JUL-95.
DR   PDB; 1PVC; 15-SEP-95.
DR   PDB; 1VBA; 11-JUL-96.
DR   PDB; 1VBB; 11-JUL-96.
DR   PDB; 1VBC; 11-JUL-96.
DR   PDB; 1VBE; 11-JUL-96.
DR   MEROPS; C03.001; -.
DR   MEROPS; C03.020; -.
DR   INTERPRO; IPR000081; -.
DR   INTERPRO; IPR000199; -.
DR   INTERPRO; IPR000605; -.
DR   INTERPRO; IPR001205; -.
DR   INTERPRO; IPR001676; -.
DR   INTERPRO; IPR002527; -.
DR   PFAM; PF00548; Cys-protease-3C; 1.
DR   PFAM; PF00947; Pico_P2A; 1.
DR   PFAM; PF01552; Pico_P2B; 1.
DR   PFAM; PF00680; RNA_dep_RNA_pol; 1.
DR   PFAM; PF00910; RNA_helicase; 1.
DR   PFAM; PF00073; rhv; 3.
KW   Polyprotein; Coat protein; Core protein; Transferase;
KW   RNA-directed RNA polymerase; Hydrolase; Thiol protease; Myristate;
KW   3D-structure.
FT   CHAIN         2     69       COAT PROTEIN VP4.
FT   CHAIN        70    340       COAT PROTEIN VP2.
FT   CHAIN       341    578       COAT PROTEIN VP3.
FT   CHAIN       579    878       COAT PROTEIN VP1.
FT   CHAIN       879   1027       CORE PROTEIN P2A.
FT   CHAIN      1028   1124       CORE PROTEIN P2B.
FT   CHAIN      1125   1453       CORE PROTEIN P2C.
FT   CHAIN      1454   1540       CORE PROTEIN P3A.
FT   CHAIN      1541   1562       GENOME-LINKED PROTEIN VPG.
FT   CHAIN      1563   1745       PICORNAIN 3C.
FT   CHAIN      1746   2206       RNA-DIRECTED RNA POLYMERASE P3D.
FT   LIPID         2      2       MYRISTATE.
FT   ACT_SITE   1709   1709       PROTEASE (POTENTIAL).
FT   ACT_SITE   1723   1723       PROTEASE (POTENTIAL).
FT   VARIANT     431    431       S -> F (IN P3/LEON 12A[1]B).
FT   VARIANT     864    864       K -> R (IN P3/LEON 12A[1]B).
FT   VARIANT     908    908       T -> A (IN P3/LEON 12A[1]B).
SQ   SEQUENCE   2206 AA;  246163 MW;  4766B15C861F66D3 CRC64;
     MGAQVSSQKV GAHENSNRAY GGSTINYTTI NYYKDSASNA ASKQDYSQDP SKFTEPLKDV
     LIKTAPALNS PNVEACGYSD RVLQLTLGNS TITTQEAANS VVAYGRWPEF IRDDEANPVD
     QPTEPDVATC RFYTLDTVMW GKESKGWWWK LPDALRDMGL FGQNMYYHYL GRSGYTVHVQ
     CNASKFHQGA LGVFAIPEYC LAGDSDKQRY TSYANANPGE RGGKFYSQFN KDNAVTSPKR
     EFCPVDYLLG CGVLLGNAFV YPHQIINLRT NNSATIVLPY VNALAIDSMV KHNNWGIAIL
     PLSPLDFAQD SSVEIPITVT IAPMCSEFNG LRNVTAPKFQ GLPVLNTPGS NQYLTSDNHQ
     SPCAIPEFDV TPPIDIPGEV KNMMELAEID TMIPLNLEST KRNTMDMYRV TLSDSADLSQ
     PILCLSLSPA SDPRLSHTML GEVLNYYTHW AGSLKFTFLF CGSMMATGKI LVAYAPPGAQ
     PPTSRKEAML GTHVIWDLGL QSSCTMVVPW ISNVTYRQTT QDSFTEGGYI SMFYQTRIVV
     PLSTPKSMSM LGFVSACNDF SVRLLRDTTH ISQSALPQGI EDLISEVAQG ALTLSLPKQQ
     DSLPDTKASG PAHSKEVPAL TAVETGATNP LAPSDTVQTR HVVQRRSRSE STIESFFARG
     ACVAIIEVDN EQPTTRAQKL FAMWRITYKD TVQLRRKLEF FTYSRFDMEF TFVVTANFTN
     ANNGHALNQV YQIMYIPPGA PTPKSWDDYT WQTSSNPSIF YTYGAAPARI SVPYVGLANA
     YSHFYDGFAK VPLKTDANDQ IGDSLYSAMT VDDFGVLAVR VVNDHNPTKV TSKVRIYMKP
     KHVRVWCPRP PRAVPYYGPG VDYKNNLDPL SEKGLTTYGF GHQNKAVYTA GYKICNYHLA
     TKEDLQNTVS IMWNRDLLVV ESKAQGTDSI ARCNCNAGVY YCESRRKYYP VSFVGPTFQY
     MEANDYYPAR YQSHMLIGHG FASPGDCGGI LRCQHGVIGI VTAGGEGLVA FSDIRDLYAY
     EEEAMEQGIS NYIESLGAAF GSGFTQQIGD KISELTSMVT STITEKLLKN LIKIISSLVI
     ITRNYEDTTT VLATLALLGC DVSPWQWLKK KACDTLEIPY VIRQGDSWLK KFTEACNAAK
     GLEWVSNKIS KFIDWLRERI IPQARDKLEF VTKLKQLEML ENQISTIHQS CPSQEHQEIL
     FNNVRWLSIQ SKRFAPLYAL EAKRIQKLEH TINNYIQFKS KHRIEPVCLL VHGSPGTGKS
     VATNLIARAI AEKENTSTYS LPPDPSHFDG YKQQGVVIMD DLNQNPDGAD MKLFCQMVST
     VEFIPPMASL EEKGILFTSN YVLASTNSSR ITPPTVAHSD ALARRFAFDM DIQVMGEYSR
     DGKLNMAMAT ETCKDCHQPA NFKRCCPLVC GKAIQLMDKS SRVRYSVDQI TTMIINERNR
     RSNIGNCMEA LFQGPLQYKD LKIDIKTRPP PECINDLLQA VDSQEVRDYC EKKGWIVNIT
     SQVQTERNIN RAMTILQAVT TFAAVAGVVY VMYKLFAGHQ GAYTGLPNKR PNVPTIRAAK
     VQGPGFDYAV AMAKRNIVTA TTSKGEFTML GVHDNVAILP THASPGESIV IDGKEVEILD
     AKALEDQAGT NLEITIITLK RNEKFRDIRQ HIPTQITETN DGVLIVNTSK YPNMYVPVGA
     VTEQGYLNLG GRQTARILMY NFPTRAGQCG GVITCTGKVI GMHVGGNGSH GFAAALKRSY
     FTQSQGEIQW MRPSKEAGYP IINAPTKTKL EPSAFHYVFE GVKEPAVLTK NDPRLKTDFE
     EAIFSKYVGN KITEVDEYMK EAVDHYAGQL MSLDISTEQM CLEDAMYGTD GLEALDLSTS
     AGYPYVAMGK KKRDILNKQT RDTKEMQRLL DAYGINLPLV TYVKDELRSK TKVEQGKSRL
     IEASSLNDSV AMRMAFGNLY AAFHRNPGVV TGSAVGCDPD LFWSKIPVLM EEKLFAFDYT
     GYDASLSPAW FEALKMVLEK IGFGDRVDYI DYLNHSHHLY KNKIYCVKGG MPSGCSGTSI
     FNSMINNLII RTLLLKTYKG IDLDHLKMIA YGDDVIASYP HEVDASLLAQ SGKDYGLTMT
     PADKSATFET VTWENVTFLK RFFRADEKYP FLIHPVMPMK EIHESIRWTK DPRNTQDHVR
     SLCLLAWHNG EEEYNKFLAK IRSVPIGRAL LLPEYSTLYR RWLDSF
//