TEM @ IMP Bioinformatics FAQs |
---|
For a detailed description of the SAGE technique, refer to the subsection SAGE (Supplementary).

SAGE Tag to UniGene Mapping: SAGEmap

As a first step, the present study maps SAGE tags to genes using the information made available by SAGEmap. The assignment of tag sequences to UniGene clusters in SAGEmap is automated and runs through the steps of:
Reliability of tag2gen mapping: A background noise removal method implemented in SAGEmap distinguishes "reliable" tag to gene mapping from "total" tag to CID (UniGene cluster ID) mapping. The method is based on an assumed 10% base error rate. A corresponding percentage of the "weakest" tag to gene connections is assumed to be most likely due to errors, so these connections are removed from the SAGEmap "reliable" tag to gene mapping. Only the clusters "reliably" linked with a particular tag have been considered in the present study.

Reliability of gen2tag mapping: SAGEmap offers a less restrictive estimation of "reliable" gene to tag mapping. All tags derived from well-characterized mRNA or CDS sequences, as well as the most frequently occurring tags derived from EST data, are accepted as "reliable". In the present study the CID->tag ratios are listed, as they show how frequently the particular tag occurs in the pool of EST or cDNA data associated with the UniGene cluster. A higher CID->tag ratio has been taken to indicate a more reliable gen2tag assignment.

UniGene to Gene Mapping

Gathering extensive information about a gene from the limited sequence information of a SAGE tag is restricted by missing, raw, or poorly annotated genomic and expressed sequence information. In cases where tag hits do not map to a well-defined gene whose mRNA sequence and intron/exon structure have been experimentally verified, EST and genome sequence data can sometimes be used to reconstruct this information (as has been done to varying degrees for Tem19, Tem35, Tem41). The UniGene database is generated by automatic clustering of ESTs.

- Most of the ESTs will align towards the end (3' part) of the consensus.
- Is there a poly(A) signal (AATAAA) somewhere towards the end of the alignment?
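The question of whether a poly(A) signal (AATAAA) lies towards the end of the alignment can be checked with a simple string search. The following is a minimal sketch, not the procedure used in the study: the function names `polya_signal_positions` and `has_polya_signal`, the default window of 50 bases, and the toy consensus sequence are all assumptions made for illustration.

```python
def polya_signal_positions(consensus, signal="AATAAA"):
    """Return the 0-based start positions of every occurrence of `signal`."""
    seq = consensus.upper()
    positions = []
    start = seq.find(signal)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping matches are also found.
        start = seq.find(signal, start + 1)
    return positions


def has_polya_signal(consensus, window=50, signal="AATAAA"):
    """Return True if `signal` lies entirely within the last `window` bases.

    `window` is an illustrative cut-off (assumption), not a value from the study.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of bases")
    tail = consensus.upper()[-window:]
    return signal in tail


# Toy consensus (hypothetical): 40 bases of filler, the signal, then 6 more bases.
toy_consensus = "ACGT" * 10 + "AATAAA" + "GCTTGC"
signal_starts = polya_signal_positions(toy_consensus)
near_end = has_polya_signal(toy_consensus, window=20)
```

Only the canonical hexamer is searched for here. A consensus that lacks it near the 3' end is a candidate for closer manual inspection, not proof that the cluster is incomplete.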
About the Graphical Representation of Proteins: The protein graphs contain information about the primary and secondary structure of the proteins. |