About:
Homologous Sequence Set input:
Sequences provided as a homologous sequence set will be made non-redundant, masked and then analysed via rps-blast. Subsignificant rps-blast hits will be further evaluated using DOUTfinder. The use of selected conserved regions is advisable. Input is limited to 75 protein sequences in FASTA format.
Depending on the length of your query, this analysis will take several minutes. It is faster than the single sequence option, which additionally requires a psi-blast search.
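The input constraints above (FASTA format, at most 75 sequences) can be checked before submitting. The following is a minimal sketch, not the server's own validation code; the function names parse_fasta and check_sequence_set are hypothetical and chosen for illustration only.

```python
MAX_SEQUENCES = 75  # input limit stated in this help text


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_sequence_set(text, limit=MAX_SEQUENCES):
    """Parse the input and raise ValueError if it is empty or too large."""
    records = parse_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    if len(records) > limit:
        raise ValueError(f"{len(records)} sequences given, limit is {limit}")
    return records
```

Calling check_sequence_set on the contents of your file returns the parsed records, or raises an error if the set is empty or exceeds the limit.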
Single Sequence input:
If a single sequence is provided, an initial psi-blast search will be performed to collect a homologous sequence set. This initial analysis step will take several minutes. All subsequent steps parallel the analysis of a homologous sequence set as described above.
Simple text file upload:
You can upload your sequences as a simple text file. To generate such a file, you can use any text-processing software and save the file as plain text.
Rps-Blast flag: SEG Filter
You can determine whether rps-blast will be run with a SEG low complexity filter switched on (T) or off (F). If filtering is turned off, false positive hits can increase in compositionally biased regions.
Input flag: Coil Filter
You can determine whether a coiled-coil filter should be applied to your input (T) or not (F). If filtering is turned off, false positive hits can increase in compositionally biased regions.
Input flag: Transmembrane Filter
You can determine whether HMMTOP-based transmembrane filtering should be applied to your input (T) or not (F). If filtering is turned off, false positive hits can increase in compositionally biased regions.
Protein set flag: maximum-sequence identity
CD-HIT is used to obtain a non-redundant protein set with a user-defined identity cut-off. The set is made non-redundant in order to reduce noise due to highly similar sequences.
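To illustrate what an identity cut-off does, here is a toy sketch of greedy redundancy removal in the spirit of CD-HIT: sequences are visited from longest to shortest, and a sequence is kept only if it is less similar than the cut-off to every sequence kept so far. This is not CD-HIT itself. The function names are hypothetical, and the identity measure (matching blocks from Python's difflib, divided by the shorter length) is an assumed stand-in for a real alignment-based identity.

```python
from difflib import SequenceMatcher


def approx_identity(a, b):
    """Rough identity proxy: matched residues divided by the shorter length.

    Assumption: difflib matching blocks stand in for a proper alignment.
    """
    if not a or not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_nonredundant(seqs, cutoff):
    """Keep a sequence only if it is below the cut-off against all kept ones."""
    representatives = []
    # Longest sequences are considered first and become representatives.
    for seq in sorted(seqs, key=len, reverse=True):
        if all(approx_identity(seq, rep) < cutoff for rep in representatives):
            representatives.append(seq)
    return representatives
```

With a cut-off of 0.9, two sequences differing at one position in sixteen would collapse into a single representative, while a cut-off of 0.95 would keep both. A lower cut-off therefore gives a smaller, more diverse set.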
DOUT-analysis flag: Expect
Subsignificant domain hits (E-value > 0.01) are considered as potential domain outliers only if their E-value is below this user-defined threshold. False positive results are rare with the default setting of 0.01. Higher E-value thresholds give more false positives, while lower thresholds increase reliability.
DOUT-analysis flag: Coverage
Subsignificant domain hits (E-value > 0.01) are considered as potential domain outliers only if the coverage of the domain is above this user-defined threshold.
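The combined effect of the Expect and Coverage thresholds can be sketched as a simple filter over domain hits. This is an illustrative sketch only, not DOUTfinder's implementation: the DomainHit record, the candidate_outliers function and the expression of coverage as a fraction between 0 and 1 are all assumptions made for the example.

```python
from dataclasses import dataclass

SIGNIFICANCE_CUTOFF = 0.01  # hits with a larger E-value count as subsignificant


@dataclass
class DomainHit:
    domain: str      # domain name (hypothetical field)
    evalue: float    # rps-blast E-value of the hit
    coverage: float  # covered fraction of the domain, 0.0 to 1.0 (assumed unit)


def candidate_outliers(hits, expect, min_coverage):
    """Return subsignificant hits that pass both user-defined thresholds."""
    return [
        hit
        for hit in hits
        if hit.evalue > SIGNIFICANCE_CUTOFF  # subsignificant hits only
        and hit.evalue < expect              # Expect threshold
        and hit.coverage > min_coverage      # Coverage threshold
    ]
```

Raising the expect argument or lowering min_coverage lets more hits through, which matches the trade-off described above: more candidates, but also more false positives.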
Single seq flag: Blastpgp rounds
If a single sequence is given as input, PSI-BLAST is used to obtain a sequence set. You can change the maximum number of PSI-BLAST passes (iterations) used in the multipass search.
Single seq flag: Inclusion threshold
If a single sequence is given as input, PSI-BLAST is used to obtain a sequence set. You can change the E-value threshold for inclusion in this initial search.
Single seq flag: Database choice
If a single sequence is given as input, PSI-BLAST is used to obtain a sequence set. You can run PSI-BLAST against two versions of the NCBI non-redundant database (mar06), both of which have been processed using CD-HIT. nr80d is an 80% non-redundant derivative of NCBI nr supplemented by Pfam 19 seed files and the CDD Smart and Pfam domain fasta files. nr90d is a 90% non-redundant derivative of NCBI nr supplemented by the same files.