IMP GPI Lipid Anchor Project IMP-Bioinformatics

Occurrence of potentially GPI-modified proteins in the kingdoms of life

Birgit Eisenhaber (MDC/IMP)
Peer Bork (MDC/EMBL)
Frank Eisenhaber (IMP)

Complete genomes are searched with the recently developed big-PI predictor for proprotein sequences that are expected to be post-translationally GPI-modified. The results indicate that GPI-anchoring is present among Eukaryota and possibly among a subgroup of the Archaea, but it appears to be absent in the other Archaea and in all Eubacteria studied. The chromosomal distribution of GPI-modified proteins is uneven. Dozens of proteins previously annotated only as hypothetical could be characterized as GPI-anchored.
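The genome-wide tally described above can be sketched as a small script. This is only an illustrative outline, not the project's actual pipeline: the names `read_fasta` and `summarize_genomes` are hypothetical, and the predictor is passed in as an arbitrary callable because the big-PI scoring function is not reproduced here. The sketch assumes each genome is available as FASTA-formatted protein sequences.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_genomes(
    genomes: Dict[str, Iterable[str]],
    predictor: Callable[[str], bool],
) -> Dict[str, Tuple[int, int]]:
    """Return {genome name: (predicted hits, total proteins)}.

    `predictor` is a placeholder (assumption): any function that takes a
    protein sequence and returns True if it is predicted to be GPI-modified.
    """
    summary = {}
    for name, lines in genomes.items():
        total = 0
        hits = 0
        for _header, sequence in read_fasta(lines):
            total += 1
            if predictor(sequence):
                hits += 1
        summary[name] = (hits, total)
    return summary


if __name__ == "__main__":
    # Toy input and a toy rule with no biological meaning, for illustration only.
    toy_genomes = {"toy genome": [">p1", "MKT", "LLL", ">p2", "MKTA"]}
    for name, (hits, total) in summarize_genomes(
        toy_genomes, lambda seq: seq.endswith("LLL")
    ).items():
        print(f"{name}: {hits} of {total} proteins predicted")
```

Keeping the predictor as a parameter means the same tally could be run per chromosome or per genome with whatever prediction tool is actually available, and the per-kingdom comparison then reduces to comparing the resulting (hits, total) pairs.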

Previous work:

  1. Sequence analysis of GPI-anchored proteins on the proprotein level
    An analysis of the physical properties of amino acid residues at given sequence positions in the vicinity of the GPI-modification site allowed the construction of a model of the active site of the putative transamidase complex (Eisenhaber et al., Prot. Eng., 11, 1155-1161, 1998).
  2. The GPI-modification motif and its recognition in proprotein sequences
    For the first time, a new prediction technique locating potential GPI-modification sites in precursor sequences has been applied for large-scale protein sequence database searches.
    (Eisenhaber et al., JMB, 292 (3), 741-758, 1999)

Data sheets: Prediction of potentially GPI-modified proteins in complete genomes/chromosomes
(Nov 2000)

Summary Table

Eukaryota:

  Arabidopsis thaliana (Chr02+Chr04)
  Caenorhabditis elegans (wormpep 17)
  Caenorhabditis elegans (wormpep 31)
  Leishmania major (Chr01)
  Plasmodium falciparum (Chr02+Chr03)
  Saccharomyces cerevisiae (Oct 1999)
  Saccharomyces cerevisiae (Nov 2000)

Archaea:

  Aeropyrum pernix K1
  Archaeoglobus fulgidus
  Halobacterium sp. NRC-1
  Methanobacterium thermoautotrophicum
  Methanococcus jannaschii
  Pyrococcus abyssi
  Pyrococcus furiosus
  Pyrococcus horikoshii
  Thermoplasma acidophilum

Eubacteria:

  Aquifex aeolicus
  Bacillus halodurans C-125
  Bacillus subtilis
  Borrelia burgdorferi
  Buchnera sp. APS
  Campylobacter jejuni
  Chlamydia muridarum
  Chlamydia pneumoniae
  Chlamydia pneumoniae AR39
  Chlamydia pneumoniae J138
  Chlamydia trachomatis
  Chlamydia trachomatis MoPn
  Deinococcus radiodurans R1
  Escherichia coli
  Haemophilus influenzae Rd
  Helicobacter pylori
  Helicobacter pylori J99
  Mycobacterium tuberculosis
  Mycoplasma genitalium
  Mycoplasma pneumoniae
  Neisseria meningitidis serogroup A strain Z2491
  Neisseria meningitidis serogroup B strain MC58
  Pseudomonas aeruginosa PAO1
  Rickettsia prowazekii
  Synechocystis
  Thermotoga maritima
  Treponema pallidum
  Ureaplasma urealyticum
  Vibrio cholerae
  Xylella fastidiosa


The protein sequences for the complete genomes/chromosomes were taken from the following web pages:

All complete genomes for Archaea and Eubacteria:
with the exception of
Pyrococcus furiosus:
Campylobacter jejuni:
Chlamydia trachomatis MoPn:
Helicobacter pylori:

Last modified: 13th June 2002