*This document is not part of the official presentation of the Institute of Molecular Pathology. The contents of this page do not represent the opinions of IMP representatives.

Database Search History of Brix Domain Proteins

This page supplies supplementary material for the manuscript

"The Brix domain protein family - a key to the ribosomal biogenesis pathway"
by Frank Eisenhaber, Christian Wechselberger, Guenther Kreil

Acknowledgements to:
M. Breitenbach, F.M. Jantsch, G. Lepperdinger
for sharing experimental data with respect to Brix and yol077c with us prior to publication, for discussion of the sequence analysis results and for carefully reading the manuscript.

On this page, you will find

the method for collecting the whole superfamily
the method for collecting each of the six families individually and their nearest neighbors
the graph of qualitative family relationships rooted at the Archaea/Eukarya division

Protocols of database searches in NR (proteins) with BLAST/PSIBLAST as of Feb. 14th, 2001

Conditions: standard LCR filters, inclusion E-value < 0.002, with composition-based statistical correction

Brix from X. laevis lies almost at the center of the superfamily in sequence space. A PSIBLAST search started with this sequence collects all six families and almost the whole superfamily (except for a few archaeal sequences and one alternative human sequence from the Peter Pan clade, AAH00535.1, group II). This automatic, standard search with the command-line version converged after six iterations.
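As an illustration only, the search conditions stated above could be assembled into a command line roughly as follows. This is a minimal sketch that assumes the present-day NCBI BLAST+ `psiblast` program, not necessarily the command-line version used for the searches reported here in 2001. The query file name `brix_xlaevis.fasta` and the iteration cap are hypothetical, and mapping "standard LCR filters" to `-seg yes` is an assumption.

```python
def build_psiblast_command(query_fasta, database="nr", max_iterations=20,
                           inclusion_evalue=0.002, composition_based=True):
    """Return an argument list for the BLAST+ `psiblast` program.

    The inclusion threshold (0.002) and the composition-based correction
    come from the conditions stated on this page. The iteration cap is a
    hypothetical safety limit; psiblast stops earlier on convergence.
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-num_iterations", str(max_iterations),
        "-inclusion_ethresh", str(inclusion_evalue),
        # 1 = composition-based statistics, 0 = switched off
        "-comp_based_stats", "1" if composition_based else "0",
        # Assumption: SEG masking stands in for the "standard LCR filters"
        "-seg", "yes",
    ]


# Hypothetical query file; the command is only printed, not executed.
print(" ".join(build_psiblast_command("brix_xlaevis.fasta")))
```

Passing `composition_based=False` gives the variant without composition-based statistical correction, as used in the additional family VI searches.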

The following data allow the collection of the six families to be reconstructed and their neighbor relationships to the other families to be detected.
  1. Collection of family I (archaea) and its closest non-family neighbors
    start with M. jannaschii Q58012 round 1
    start with M. jannaschii Q58012 round 2
    Round 1 collects all members except for the sequence of S. solfataricus; the next iteration completes the archaeal family and reveals the stage-specific peptide 24 of T. cruzi and other members of group IV as closest neighbors. The gap in E-value is 18 orders of magnitude (if the sequence of S. solfataricus is excluded).
  2. Collection of family II (Peter Pan, SSF1/2 and homologues) and its closest non-family neighbors
    start with D. melanogaster AAD16459.1 round 1
    start with D. melanogaster AAD16459.1 round 2
    Round 1 collects all family members. Round 2 detects proteins of A. thaliana from group III as closest neighbors, with E-values 55 orders of magnitude higher than that of the lowest-ranking family member, AAH00535.1 from human.
  3. Collection of family III (YHR088wp and homologues) and its closest non-family neighbors
    start with S. cerevisiae P38805 single blastp
    start with G. theta CAC27105.1 single blastp
    Almost 10 orders of E-value magnitude between AAG38541.1 (P. carinii) and AAF56395.1 (D. melanogaster) divide the members of group III from their closest neighbors in group IV. The D. melanogaster CG6712 (AAF53162.1) protein is the closest neighbor of the G. theta CAC27105.1 hypothetical protein.
  4. Collection of family IV (IMP4 and homologues) and its closest non-family neighbors
    start with S. cerevisiae P53941 single blastp
    Almost 30 orders of E-value magnitude (between the stage-specific peptide 24 of T. cruzi and the A. thaliana CAB77726.1 protein) separate family IV members from their closest relatives in family III. The human homologue of IMP4 can be delineated from EST data: the true orthologue is represented by UniGene cluster Hs.91579, whereas the true orthologue for the family III version is Hs.287863.
  5. Collection of family V (Brix, YOL077cp and homologues) and its closest non-family neighbors
    start with X. laevis Brix round 1
    start with X. laevis Brix round 2
  6. Collection of family VI (YKR081cp and homologues) and its closest non-family neighbors
    start with S. cerevisiae P36160 round 1
    start with S. cerevisiae P36160 round 2
    start with S. cerevisiae P36160 round 1 without composition-based statistical correction
    start with S. cerevisiae P36160 round 2 without composition-based statistical correction
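
The E-value gaps quoted in the protocols above are differences on a log10 scale between the last family member and the first non-family hit in a hit list. A minimal sketch of that arithmetic follows. The function name and the two example E-values are hypothetical and are not taken from the actual search outputs.

```python
import math


def evalue_gap_orders(last_family_evalue, first_neighbor_evalue):
    """Return the gap between two E-values in orders of magnitude.

    The result is positive when the first non-family hit has a higher
    (weaker) E-value than the last family member.
    """
    if last_family_evalue <= 0 or first_neighbor_evalue <= 0:
        # Very strong hits may be reported as 0.0; log10 is undefined there.
        raise ValueError("E-values must be positive to take a logarithm")
    return math.log10(first_neighbor_evalue) - math.log10(last_family_evalue)


# Hypothetical values: last family member at 1e-40, first neighbor at 1e-10.
gap = evalue_gap_orders(1e-40, 1e-10)
print(f"gap of about {gap:.0f} orders of magnitude")
```

A large gap of this kind is what separates a family from its nearest neighbors in the protocols listed above.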


Putative evolutionary relationship between Brix domain protein families

The relative positions of the families in sequence space can be inferred from the first non-family hit in database searches started with one of the family members. The resulting graph of family connectivity has been rooted at the division between Archaea and Eukaryota. This figure does not aim at quantifying evolutionary distances but is solely an illustration of nearest neighbor relationships between these protein families.
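
The connectivity idea can be sketched in a few lines. The mapping below encodes only the nearest-neighbor statements made in the protocol notes on this page (families I to IV); families V and VI are left out because no neighbor is stated for them in the text. The function names are hypothetical, and the sketch is not a reconstruction of the figure itself.

```python
# First non-family hit per family, as stated in the protocol notes.
NEAREST_NEIGHBOR = {
    "I": "IV",    # archaeal family: group IV members are closest
    "II": "III",  # Peter Pan family: A. thaliana proteins of group III
    "III": "IV",
    "IV": "III",
}


def undirected_edges(nearest):
    """Collapse directed nearest-neighbor links into unique undirected edges."""
    return sorted({tuple(sorted(pair)) for pair in nearest.items()})


def reciprocal_pairs(nearest):
    """Return family pairs that are each other's first non-family hit."""
    return sorted({tuple(sorted((a, b)))
                   for a, b in nearest.items()
                   if nearest.get(b) == a})


print(undirected_edges(NEAREST_NEIGHBOR))
print(reciprocal_pairs(NEAREST_NEIGHBOR))
```

In this encoding, families III and IV are reciprocal nearest neighbors, while families I and II each attach to one of them.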








Last modified: Feb 2001