IMP Bioinformatics Group

ProflAP - A superfamily of profilin-like adaptor proteins

This page is an extended supplementary material accompanying the article:

Kurzbauer R, Teis D, de Araujo ME, Maurer-Stroh S, Eisenhaber F, Bourenkov GP, Bartunik HD, Hekman M, Rapp UR, Huber LA, Clausen T.
Crystal structure of the p14/MP1 scaffolding complex: how a twin couple
attaches mitogen-activated protein kinase signaling to late endosomes.

Proc Natl Acad Sci U S A. 2004 Jul 27;101(30):10984-9.

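It is divided into the following subsections: extension of a sequence superfamily, functional implications of sequence superfamily relationships, functional implications of structure superfamily relationships, and methods.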

Extension of a sequence superfamily

p14 was already known to be part of an ancient superfamily (ref.) that extends from a eukaryotic family of Roadblock dynein light chains (KOG4115) to several archaeal and bacterial proteins grouped as the MglB-family (COG2018). The progress of sequencing projects has enlarged the searchable protein sequence space, which allows further homology relationships between proteins and protein families to be detected. Extensive iterative sequence searches with the PSI-BLAST program (ref.), starting from all members of the superfamily as previously defined and run against the non-redundant database at the National Center for Biotechnology Information (NCBI), establish significant connections of the MglB-family not only with the large family of Profilins (KOG1755) but also with the p14 interaction partner MP1 (for details see Methods section).

It has been noted that conserved predicted secondary structural elements among the family members point to a conserved fold (ref.). This also applies to the extended superfamily, with the exception of the Roadblock dynein light chains, which apparently lack the C-terminal helix. The linkage of the protein families by sequence similarity is further strengthened by similarity at the structural level: the folds of Profilins correspond closely to those of p14 and MP1 described in this work. We therefore extend the superfamily to include the following families: KOG4115 (Roadblock, etc.), COG2018 (MglB, etc.), KOG4107 (p14), KOG1755 (Profilins) and MP1s (no KOG at time of publication). It is interesting to note that plant Profilins are the first hits at the sequence level with bacterial members of the superfamily and also appear to resemble the fold of p14 and MP1 more closely in structure. A multiple alignment of the ProflAP superfamily sequences and the three available structures follows below (for details see Methods section):

Multiple alignment of the ProflAP superfamily.


Functional implications of sequence superfamily relationships

Several proteins of the MglB-family occur in the same genetic context, for example in the same operon, as a GTPase from the Ras/Rab/Rho superfamily. Dynein light chains form complexes with the dynein heavy chains, which are ATPases. Hence, a common role for the superfamily in NTPase regulation has been suggested (ref.). Profilins also fit this context of NTPase interaction, albeit indirectly: they are regulated by Ras/Rab/Rho GTPases through interaction with other adaptor proteins. For example, N-WASP binds both Profilin and the GTPase Cdc42 (ref.). Another common feature of several superfamily members appears to be a connection to subcellular scaffold structures. Roadblock dynein light chains form part of the complex that connects cargo to dynein motors for transport along microtubules (ref.). Profilin is involved in regulating actin polymerization, and rearrangement of the actin cytoskeleton plays a role in eukaryotic cell motility (ref.). Mutations in MglB and the neighboring GTPase MglA affect gliding motility in Myxococcus xanthus. Although the mechanisms behind the general term ‘gliding’ are diverse in bacteria, they often involve subcellular scaffold structures (ref.).

With the extension of the superfamily it becomes clear that the functions of the different proteins might cover a much broader spectrum. For example, a gene for a helix-turn-helix (HTH) protein (corresponding to the domain of unknown function DUF742) appears conserved between the GTPase and MglB-family genes in the genomes of at least six different actinobacteria (Mycobacterium tuberculosis, Mycobacterium bovis, Streptomyces coelicolor, Streptomyces griseus, Streptomyces avermitilis, Thermobifida fusca). Furthermore, a structure exists for a family of bacterial transcriptional regulator proteins (IclR-family = COG1414) consisting of an N-terminal HTH domain and a C-terminal Profilin-fold domain (ref.). No significant sequence similarity is currently found between this Profilin-fold domain and the superfamily described here, but a significant sequence link can be made between DUF742 and IclR-family members (for details see Methods section). This relationship would favor a fusion of the neighboring DUF742 (HTH-fold) and MglB-family (Profilin-fold) genes, resulting in a domain arrangement as in the IclR-family (HTH-fold + Profilin-fold). This example adds transcriptional regulation as yet another facet of the possible functions of superfamily members.


Functional implications of structure superfamily relationships

The structure of Profilin, which among proteins of known structure is most closely related to p14 and MP1, shows some characteristic differences, mainly in the region that forms the binding interface of the p14/MP1 complex. Loop b3 appears extended and helix b is displaced along the helix axis, while additional residues between helix b and beta strand 3 form a complex, 30-amino-acid mixed alpha-beta scaffold that also builds up parts of the large-area actin binding site (ref.). Aromatic residues intercalating between helices a and c form the binding groove for proline-rich peptides in Profilins (ref.). Furthermore, clusters of positive charges on the surface are likely sites of interaction with PIP2 (ref.). This last feature, PIP2 binding via positively charged surface regions, could be conserved in the p14/MP1 complex and influence its subcellular localization by targeting it to membranes.

Besides the members of the sequence superfamily, several more proteins share similarities with the fold of Profilin, p14 and MP1, as revealed by searches against the PDB database using the DALI structure comparison tool (ref.). We cannot find significant sequence similarity between the structurally related Profilin-fold domain of the IclR transcriptional regulator and the sequence superfamily, although the genetic context would suggest an evolutionary relationship. This nurtures the idea that other structural neighbors, too, could share more than just the basic scaffold. Of course, conservation of molecular and cellular function among proteins that appear related merely through structural similarity fades over the aeons of evolution. However, it is interesting that in the IclR structure the Zn2+ signal for transcriptional regulation is presumed to bind to a site (ref.) whose location corresponds to the center of the dimerization interface of the p14/MP1 complex. As p14 and MP1 act as adaptors that localize MAPK signaling to late endosomes, several proteins that share the Profilin fold and are also related to endosomal functions or localizations have caught our attention. For example, vesicle trafficking protein Sec22b, the N-terminal domain of the nonsyntaxin SNARE protein Ykt6, transport protein particle component SEDL and adaptor-related protein complex 2 (AP2) subunits alpha-2 and N-mu-2 all belong to the Profilin-like fold family. It is striking, however, that all of these structures are circular permutations of the original Profilin fold, in which loop b1 is broken and helices a and c are connected. Therefore, helix a appears at the C-terminus instead of the N-terminus in the sequences.
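The circular permutation described above can be pictured as a rotation of the order of secondary structure elements. The following is only a minimal sketch: the element list is a simplified, hypothetical ordering (the number of strands is a placeholder and loops are omitted), and the function name is introduced here for illustration. It assumes that loop b1 links helix a to the first beta strand.

```python
def circular_permutation(elements, new_start):
    """Rotate a linear element order so that `new_start` comes first.

    The old C-terminal and N-terminal elements become neighbors,
    which models joining the termini and cutting the chain elsewhere.
    """
    i = elements.index(new_start)
    return elements[i:] + elements[:i]


# Hypothetical, simplified element order (N- to C-terminus); not taken
# from the structures themselves.
PROFILIN_LIKE = [
    "helix a",
    "strand 1", "strand 2",
    "helix b",
    "strand 3", "strand 4", "strand 5",
    "helix c",
]

# Breaking the connection after helix a (assumed position of loop b1)
# makes strand 1 the new N-terminal element.
permuted = circular_permutation(PROFILIN_LIKE, "strand 1")

print(" -> ".join(permuted))
```

In the rotated order, helix c is directly followed by helix a, which now sits at the C-terminal end, matching the description of the permuted structures.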

Similarly to p14 and MP1, Sec22b has conserved surface residues in the b3-pocket and loop b2*, as well as conserved hydrophobic residues in the region corresponding to the p14/MP1 dimerization interface (ref.). As biochemical data do not support homodimerization of the protein, heterodimerization in a similar style as in the p14/MP1 complex, or interactions of this region with other binding partners, are plausible. In this context it is interesting that the mammalian Roadblock dynein light chains Robl1 and Robl2, which are predicted to share the fold of the superfamily, have been found to be capable of forming both homo- and heterodimers (ref.). Ykt6 also features a conserved stretch of hydrophobic residues in the region corresponding to the p14/MP1 dimerization interface, which is presumed to be required for regulatory interaction with its C-terminal SNARE core domain (ref.). A similar conserved, mainly hydrophobic patch can be found in the SEDL structure, and it is interesting to note that a mutation responsible for the X-linked skeletal disorder spondyloepiphyseal dysplasia tarda maps to this surface region (ref.). Additionally, there appears to be another potential solvent-exposed hydrophobic protein interaction site in the loop that is formed between helices a and c (p14 nomenclature) after the circular permutation. In the structural alignment this loop corresponds exactly to the conserved site where the N-terminus of p14 tries to bridge the gap to the flexible but conserved C-terminus. In contrast, the AP2 subunits alpha-2 and N-mu-2 do not appear to share conserved interaction locations with p14 and MP1, as they seem to undergo more large-area interactions with other subunits and perform a scaffolding role in the AP2 complex (ref.).

The pattern of binding sites for diverse interaction partners located around the center of the interaction interface between p14 and MP1 continues with other, more distantly related proteins that share the Profilin fold but, unlike the previous examples, lack the circular permutation. For instance, the GAF domains of 3’,5’-cyclic nucleotide phosphodiesterase 2a bind cGMP in this location, which is furthermore embraced by an extended loop between adjacent beta sheets (ref.). The active or binding sites of d-Ala carboxypeptidase, beta-lactamase and penicillin-binding proteins are more hidden by an additional, mainly helical scaffold (ref.).

Another structural neighbor that shares the circular permutation with the SNARE-related proteins mentioned above is the signal recognition particle receptor alpha subunit homolog. The fold similarity is of special interest since the SRX domain of the alpha subunit is an effector for the beta subunit, which is a GTPase (ref.). Moreover, the interaction site of the GTPase with the Profilin-fold protein corresponds to the b2 loop, which in most members of the superfamily has a conserved negative charge followed by a glycine and has been proposed to be involved in GTPase regulation (ref.). On the other hand, this loop is also conserved in several more of the merely structurally related proteins with GTPase-independent functions, which could indicate that its role lies more in stabilizing the structure in this region.



Methods

Extension of the ProflAP superfamily:

A fan-like search strategy (ref.) using PSI-BLAST (ref.) was applied to find new superfamily members in the non-redundant database (NR) at the National Center for Biotechnology Information (NCBI). Accession numbers of representative pairs linking families, together with their expectation values (E-values) at first encounter in the PSI-BLAST procedure, are listed below.
COG2018 (MglB-family) finds KOG1755 (Profilins): ZP_00057416.1 (residues 11 to 126; Thermobifida fusca) hits O65809 (residues 1 to 126; Glycine max) in round 1 with an E-value of 0.003.
COG2018 (MglB-family) finds MP1s: NP_294339.1 (residues 56 to 166; Deinococcus radiodurans) hits AAP06349.1 (residues 7 to 124; Schistosoma japonicum) in round 5 with an E-value of 0.002.
Helix-turn-helix (HTH) motif of DUF742 finds HTH motif of COG1414 (IclR): NP_627109.1 (residues 49 to 107; Streptomyces coelicolor) hits ZP_00019631.1 (residues 44 to 102; Chloroflexus aurantiacus) in round 3 with an E-value of 0.002.
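
For readers who want to work with these links programmatically, the representative pairs can be written down as simple records and filtered by E-value. This is only a sketch: the record type, the function name and the cutoff values are assumptions made for illustration and are not part of the original search procedure. The accession numbers, rounds and E-values are those listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyLink:
    query_family: str
    hit_family: str
    query_accession: str
    hit_accession: str
    psi_blast_round: int  # round in which the hit was first encountered
    e_value: float        # E-value at first encounter


# Representative pairs as listed in the text.
LINKS = [
    FamilyLink("COG2018 (MglB-family)", "KOG1755 (Profilins)",
               "ZP_00057416.1", "O65809", 1, 0.003),
    FamilyLink("COG2018 (MglB-family)", "MP1s",
               "NP_294339.1", "AAP06349.1", 5, 0.002),
    FamilyLink("DUF742 (HTH motif)", "COG1414 (IclR, HTH motif)",
               "NP_627109.1", "ZP_00019631.1", 3, 0.002),
]


def links_at_or_below(links, cutoff):
    """Return the links whose first-encounter E-value is <= cutoff."""
    return [link for link in links if link.e_value <= cutoff]


# Hypothetical cutoff chosen only for illustration; the text does not
# state which threshold was used.
ILLUSTRATIVE_CUTOFF = 0.005

for link in links_at_or_below(LINKS, ILLUSTRATIVE_CUTOFF):
    print(f"{link.query_family} -> {link.hit_family}: "
          f"round {link.psi_blast_round}, E = {link.e_value}")
```

With the illustrative cutoff of 0.005 all three links pass. A stricter, equally hypothetical cutoff of 0.0025 would retain only the two links reported with an E-value of 0.002.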

Multiple alignment of the ProflAP superfamily:

Selected sequences were aligned to a profile derived from a structural alignment of p14, MP1 and profilin (secondary structures are shown on top of the corresponding sequences) using ClustalX (ref.) with secondary-structure-specific gap penalties. Family, organism and accession numbers are given. Numbers preceding or following the sequences denote residue extensions. Standard ClustalX coloring is used (in brief: hydrophobic – cyan, polar – green, negative – red background). Abbreviations: Prof – profilin family, MglB – archaeal or bacterial MglB-family member, Robl – Roadblock/LC7 family; Hs – Homo sapiens, Dm – Drosophila melanogaster, Ce – Caenorhabditis elegans, Dre – Danio rerio, Xl – Xenopus laevis, VV – Variola virus, Gma – Glycine max, Dd – Dictyostelium discoideum, Ac – Acanthamoeba castellanii, Sco – Streptomyces coelicolor A3(2), Mt – Mycobacterium tuberculosis H37Rv, Mj – Methanococcus jannaschii, Ca – Chloroflexus aurantiacus, Mx – Myxococcus xanthus, Gme – Geobacter metallireducens, Dra – Deinococcus radiodurans, Tf – Thermobifida fusca, Cr – Chlamydomonas reinhardtii.




Multiple alignment of the ProflAP superfamily (Clustal format)

Calibrated Hidden Markov Model of the ProflAP superfamily (HMMer format)