The orthologous groups of Hop2 and Mnd1 proteins are well defined in public ortholog databases (OrthoMCL-DB) and family collections (Pfam). Using the full-length Hop2 and Mnd1 sequences extracted from these resources, a common core domain architecture can be defined for these protein families. In the two independently analyzed protein sets, the highest conservation is observed in a segment of approximately 70-80 amino acids (aa), typically found at the very N-terminus of the proteins. A 60-80 aa region with a high likelihood of forming coiled-coil structures lies adjacent to the conserved domain in both Hop2 and Mnd1 proteins (known to be involved in hetero-dimerization with Mnd1 and potential homo-dimerization: 17426123). A shorter C-terminal segment with high helical content follows (required for efficient DNA binding: 15192114).
Hop2 and Mnd1 protein families are typically represented by one member per species. Unexpectedly, two Hop2 and two Mnd1 proteins are found when the Tetrahymena proteome is searched with profile hidden Markov models (HMMs) (using the HMMER 2 package, Sean Eddy, http://hmmer.wustl.edu/):
- An HMM derived from the Pfam PF07106/TBPIP/Hop2 seed alignment and searched against the Tetrahymena proteome produces two significant hits: TTHERM_01190440 (E=1.2e-41) and TTHERM_00794620 (E=3.1e-06) (hmmsearch versus tta1_oct07, Tt-10.24.06).
Both proteins are already reported as Hop2 family members in the Pfam Hop2/TBPIP protein set (Q239Q1_TETTH, Q23VW9_TETTH), and their Hop2 relatedness can also be confirmed using BLAST searches. TTHERM_01190440, the predicted sequence with the better correspondence to the Hop2 HMM, is also found to be the reciprocal best proteome BLAST hit of Hop2 proteins.
- HMM searches of the Tetrahymena proteome performed with a model generated from an alignment of non-Tetrahymena Mnd1 sequences also yield two significant hits: TTHERM_00382290 (E=1.7e-61) and TTHERM_00300660 (E=7.3e-10) (hmmsearch versus tta1_oct07, Tt-10.24.06). Both hits can be confirmed as Mnd1 family members using BLAST searches. TTHERM_00382290, the predicted sequence with the better correspondence to the Mnd1 HMM, is also found to be the reciprocal best proteome BLAST hit of Mnd1 proteins.
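The reciprocal best hit criterion used above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually run for this analysis: the hit tables, the identifiers (`queryA`, `protX`, `protY`) and the E-values are hypothetical placeholders, and parsing of real BLAST output is left out.

```python
from typing import Dict, List, Optional, Tuple

# A hit is (subject identifier, E-value); lower E-values are better.
Hit = Tuple[str, float]


def best_hit(hits: List[Hit]) -> Optional[str]:
    """Return the subject with the lowest E-value, or None if there are no hits."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def is_reciprocal_best_hit(
    query: str,
    forward: Dict[str, List[Hit]],
    reverse: Dict[str, List[Hit]],
) -> bool:
    """True if the best forward hit of `query` has `query` as its best reverse hit."""
    target = best_hit(forward.get(query, []))
    if target is None:
        return False
    return best_hit(reverse.get(target, [])) == query


# Hypothetical toy data: placeholder identifiers and E-values, not real results.
forward_hits = {"queryA": [("protX", 1e-30), ("protY", 1e-5)]}
reverse_hits = {
    "protX": [("queryA", 1e-28), ("queryB", 1e-3)],
    "protY": [("queryB", 1e-10), ("queryA", 1e-4)],
}

rbh = is_reciprocal_best_hit("queryA", forward_hits, reverse_hits)
```

In this toy case `protX` is the best forward hit of `queryA` and points back to `queryA`, whereas `protY` points to a different query and would not qualify.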
Using sensitive sequence analysis techniques, we find a distant similarity of the Hop2 protein family to the Mnd1 protein family:
- When the Pfam PF07106/TBPIP/Hop2 full alignment is used as input to an HHpred search (max. PSI-BLAST iterations set to 0) against all available Pfam domains, PF03962/Mnd1 is listed with E=1.9E-05 as the only significant hit besides PF07106/TBPIP/Hop2 (query results).
- In the reciprocal HHpred search, started with an Mnd1 family alignment against the Pfam database, PF07106/TBPIP/Hop2 is listed with E=3E-06 as the only significant hit besides PF03962/Mnd1 (input based on Interproscan entry Mnd1/IPR005647, hhpred search results).
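The logic of the two reciprocal HHpred searches can be summarized as a mutual significance check. The sketch below uses the two E-values reported above; the cutoff of 1e-3 and the function name `mutually_significant` are assumptions made for illustration only and are not taken from HHpred or from the original analysis.

```python
from typing import Dict

# Assumed significance cutoff for illustration; not a value stated in the text.
E_CUTOFF = 1e-3

# E-values reported above: query family -> {hit family: E-value}.
hhpred_results: Dict[str, Dict[str, float]] = {
    "PF07106": {"PF03962": 1.9e-05},  # Hop2/TBPIP alignment finds Mnd1
    "PF03962": {"PF07106": 3e-06},    # Mnd1 alignment finds Hop2/TBPIP
}


def mutually_significant(
    family_a: str,
    family_b: str,
    results: Dict[str, Dict[str, float]],
    cutoff: float = E_CUTOFF,
) -> bool:
    """True if each family reports the other as a hit at or below the cutoff."""
    a_to_b = results.get(family_a, {}).get(family_b)
    b_to_a = results.get(family_b, {}).get(family_a)
    if a_to_b is None or b_to_a is None:
        return False
    return a_to_b <= cutoff and b_to_a <= cutoff


hop2_mnd1_related = mutually_significant("PF07106", "PF03962", hhpred_results)
```

Requiring significance in both directions guards against a one-sided match driven by a biased or low-complexity query profile.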
The likely homologous N-terminal conserved domain of Hop2 and Mnd1 proteins is predicted to belong to the "winged helix" DNA-binding domain superfamily. One of several approaches that illustrate this similarity is shown here:
- The N-terminal conserved segment of the mouse Hop2 protein (NP_032975, region 10-82) was submitted to a FFAS03 search against the SCOP database (results). Only proteins of the '"winged helix" DNA-binding domain' superfamily are found among the similarities reported in this search, with the top hit yielding a FFAS03 score of -16.500 (values below -9.5 are considered significant by the method).
A similarity between HTH proteins and the N-terminal conserved region of TBPIP proteins can also be established through profile HMM comparisons using HHpred against the SCOP database (input sequence: region 10-82 of mouse TBPIP NP_032975, PSI-BLAST rounds: 2). Hits to the superfamily of "winged helix" DNA-binding domains are found with a significant E-value of 1.7E-05 (results). In fact, an assignment of TBPIP/PF07106 to HTH proteins is also reported in the Pfam domain database and was obtained on the basis of profile-profile comparisons of all Pfam domains.
- The N-terminal conserved segment of the mouse Mnd1 protein was submitted to a FFAS03 search against the SCOP database. Only proteins of the '"winged helix" DNA-binding domain' superfamily are found among the similarities reported in this search, with the top hit yielding a FFAS03 score of -11.60 (values below -9.5 are considered significant) (results).
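Because FFAS03 scores are negative and more negative means more similar, the direction of the significance test is easy to get wrong. The short sketch below applies the -9.5 threshold quoted above to the two top-hit scores reported in this section; the function and variable names are illustrative and not part of FFAS03.

```python
# Threshold quoted in the text: scores below -9.5 are considered significant.
FFAS_SIGNIFICANCE_THRESHOLD = -9.5


def is_significant_ffas_score(score: float) -> bool:
    """FFAS03 scores are negative; more negative values indicate stronger similarity."""
    return score < FFAS_SIGNIFICANCE_THRESHOLD


# Top-hit scores reported above for the two N-terminal segments.
top_hit_scores = {
    "mouse Hop2 (NP_032975, region 10-82)": -16.5,
    "mouse Mnd1 N-terminal segment": -11.6,
}

significant = {
    name: is_significant_ffas_score(score)
    for name, score in top_hit_scores.items()
}
```

Both reported scores fall below the threshold, consistent with the statement that the winged helix assignments are significant by the method's own criterion.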