ID   POLG_HE71M     STANDARD;      PRT;  2193 AA.
AC   Q66479;
DT   01-NOV-1997 (Rel. 35, Created)
DT   01-NOV-1997 (Rel. 35, Last sequence update)
DT   15-DEC-1998 (Rel. 37, Last annotation update)
DE   GENOME POLYPROTEIN [CONTAINS: COAT PROTEINS VP1 TO VP4; CORE PROTEINS
DE   P2A TO P2C, P3A; GENOME-LINKED PROTEIN VPG; PICORNAIN 3C
DE   (EC 3.4.22.28) (PROTEASE 3C) (P3C); RNA-DIRECTED RNA POLYMERASE P3D
DE   (EC 2.7.7.48)].
OS   Human enterovirus 71 (strain 7423/MS/87) (Ev 71).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Picornaviridae;
OC   Enterovirus.
OX   NCBI_TaxID=103922;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=96434998; PubMed=8837884;
RA   Brown B.A., Pallansch M.A.;
RT   "Complete nucleotide sequence of enterovirus 71 is distinct from
RT   poliovirus.";
RL   Virus Res. 39:195-206(1995).
CC   -!- FUNCTION: P3C POLYPEPTIDE IS A PROTEASE THAT CLEAVES AT CERTAIN
CC       Q/G SITES IN THE POLYPROTEIN. IT MAY BE A CYSTEINE PROTEASE.
CC   -!- SUBUNIT: THE VIRUS CAPSID IS COMPOSED OF 60 ICOSAHEDRAL UNITS,
CC       EACH OF WHICH IS COMPOSED OF ONE COPY EACH OF PROTEINS VP1, VP2,
CC       VP3, AND VP4.
CC   -!- PTM: SPECIFIC ENZYMATIC CLEAVAGES IN VIVO YIELD MATURE PROTEINS.
CC   -!- SIMILARITY: THE PROTEASE BELONGS TO PEPTIDASE FAMILY C3.
CC   --------------------------------------------------------------------------
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC   --------------------------------------------------------------------------
DR   EMBL; U22522; AAB39969.1; -.
DR   HSSP; P03299; 1POV.
DR   INTERPRO; IPR000081; -.
DR   INTERPRO; IPR000199; -.
DR   INTERPRO; IPR000605; -.
DR   INTERPRO; IPR001205; -.
DR   INTERPRO; IPR001676; -.
DR   INTERPRO; IPR002527; -.
DR   PFAM; PF00548; Cys-protease-3C; 1.
DR   PFAM; PF00947; Pico_P2A; 1.
DR   PFAM; PF01552; Pico_P2B; 1.
DR   PFAM; PF00680; RNA_dep_RNA_pol; 1.
DR   PFAM; PF00910; RNA_helicase; 1.
DR   PFAM; PF00073; rhv; 3.
KW   Polyprotein; Coat protein; Core protein; Transferase;
KW   RNA-directed RNA polymerase; Hydrolase; Thiol protease; Myristate.
FT   CHAIN         2     69       COAT PROTEIN VP4 (P1A).
FT   CHAIN        70    323       COAT PROTEIN VP2 (P1B).
FT   CHAIN       324    565       COAT PROTEIN VP3 (P1C).
FT   CHAIN       566    862       COAT PROTEIN VP1 (P1D).
FT   CHAIN       863   1012       CORE PROTEIN P2A.
FT   CHAIN      1013   1111       CORE PROTEIN P2B.
FT   CHAIN      1112   1440       CORE PROTEIN P2C.
FT   CHAIN      1441   1526       CORE PROTEIN P3A.
FT   CHAIN      1527   1548       GENOME-LINKED PROTEIN VPG (P3B).
FT   CHAIN      1549   1731       PICORNAIN 3C.
FT   CHAIN      1732   2193       RNA-DIRECTED RNA POLYMERASE P3D.
FT   LIPID         2      2       MYRISTATE (BY SIMILARITY).
FT   ACT_SITE   1695   1695       PROTEASE (POTENTIAL).
FT   ACT_SITE   1709   1709       PROTEASE (POTENTIAL).
SQ   SEQUENCE   2193 AA;  242656 MW;  35E1B3CFF88A50EF CRC64;
     MGSQVSTQRS GSHENSNSAT EGSTINYTTI NYYKDSYAAT AGKQSLKQDP DKFANPVKDI
     FTEMAAPLKS PSAEACGYSD RVAQLTIGNS TITTQEAANI IVGYGEWPSY CSDDDATAVD
     KPTRPDVSVN RFYTLDTKLW EKSSKGWYWK FPDVLTETGV FGQNAQFHYL YRSGFCIHVQ
     CNASKFHQGA LLVAILPEYV IGTVAGGTGT EDSHPPYKQT QPGADGFELQ HPYVLDAGIP
     ISQLTVCPHQ WINLRTNNCA TIIVPYMNTL PFDSALNHCN FGLLVVPISP LDFDQGATPV
     IPITITLAPM CSEFGGLRQA VTQGFPTELK PGTNQFLTTD DGVSAPILPN FHPTPCIHIP
     GEVRNLLELC QVETILEVNN VPTNATSLME RLRFPVSAQA GKGELCAVFR ADPGRDGPWQ
     STMLGQLCGY YTQWSGSLEV TFMFTGSFMA TGKMLIAYTP PGGPLPKDRA TAMLGTHVIW
     DFGLQSSVTL VIPWISNTHY RAHARDGVFD YYTTGLVSIW YQTNYVVPIG APNTAYILAL
     AAAQKNFTMK LCKDTSHILQ TASIQGDRVA DVIESSIGDS VSRALTQALP APTGQNTQVS
     SHRLDTGEVP ALQAAEIGAS SNTSDESMIE TRCVLNSHST AETTLDSFFS RAGLVGEIDL
     PLEGTTNPNG YANWDIDITG YAQMRRKVEL FTYMRFDAEF TFVACTPTGE VVPQLLQYMF
     VPPGAPKPES RESLAWQTAT NPSVFVKLTD PPAQVSVPFM SPASAYQWFY DGYPTFGEHK
     QEKDLEYGAC PNNMMGTFSV RTVGSSKSKY PLVVRIYMRM KHVRAWIPRP MRNQNYLFKA
     NPNYAGNSIK PTGTSRNAIT TLGKFGQQSG AIYVGNFRVV NRHLATHNDW ANLVWEDSSR
     DLLVSSTTAQ GCDTIARCNC QTGVYYCNSK RKHYPVSFSK PSLIYVEASE YYPARYQSHL
     MLAAGHSESG DCGGILRCQH GVVGIASTGG NGLVGFADVR DLLWLDEEAM EQGVSDYIKG
     LGDAFGTGFT DAVSREVEAL RNHLIGSDGA VEKILKNLIK LISALVIVIR SDYDMVTLTA
     TLALIGCHGS PWAWIKAKTA SILGIPIAQK QSASWLKKFN DMASAAKGLE WISNKISKFI
     DWLREKIVPA AKEKAEFLTN LKQFPLLENQ ITHLEQSAAS QEDLEAMFGN VSYLAHFCRK
     FQPLYATEAK RVYVLEKRMN NYMQFKSTHR IEPVCLIIRG SPGTGKSLAT GIIARAIADK
     YHSSVYSLPP DPDHFDGYKQ QVVTVMDDLC QNPDGKDMSL FYQMVSTVDI IPPMASLEEK
     GVSFTSKFVI ASTNASNIIV PTVSDSDAIR RRFYMDCDIE VTDSSKTDLG RLDAGRAAKL
     CSENNTANFK RCSPLVCGKA IQLRDRKSKV RYSVDTVVSE LIREYNSRSA IGNTIEALFQ
     GPPKFRPIRI SLEEKPAPDA ISDLLASVDS EEVRQYCREQ GWIIPETPTN VERHLNRAVL
     VMQSIATVVA VVSLVYVIYK LFAGFQGAYS GAPNQVLKKP VLRTATVQGP SLDFALSLLR
     RNIRQVQTDQ GHFTMLGVRD RLAVLPRHSQ PGKTIWVEHK LVNILDAAEL VDEQGVNLEL
     TLVTLDTNEK FRDITKFIPE TISGASDATL VINTEHMPSM FVPVGDVVQY GFLNLSGKPT
     HRTMMYNFPT KAGQCGGVVT SVGKIIGIHI GGNGRQGFCA GLKRSYFASE QGEIQWVKSN
     KETGRLNING PTRTKLEPSV FHDVFEANKE PAVLTSKDPR LEVDFEQALF SKYVGNVLHE
     PDEYVHQAAL HYANQLKQLD INTKKMSMEE ACYGTDNLEA IDLHTSAGYP YSALGIKKRD
     ILDPATRDVS KMKSYMDKYG LDLPYSTYVK DELRSLDKIK KGKSRLIEAS SLNDSVYLRM
     TFGHLYEVFH ANPGTVTGSA VGCNPDVFWS KLPILLPGSL FAFDYSGYDA SLSPVWFRAL
     EVVLREIGYS EEAVSLIEGI NHTHHIYRNK TYCVLGGMPS GCSGTSIFNS MINNIIIRTL
     LIKTFKGIDL DELNMVAYGD DVLASYPFPI DCLELAKTGK EYGLTMTPAG KSPCFNEVTW
     ENATFLKRGF LPDHQFPFLI HPTMPMKEIH ESIRWTKDAR NTQDHVRSLC LLAWHNGKDE
     YEKFVSTIRS VPVGKALAIP NFENLRRNWL ELF
//