Workflow for Remote Ortholog and Conserved Domain Prediction

 

Summary

The workflow finds probable remote orthologs for an input test sequence. It also identifies conserved domains among the sequences closely related to the same input.

 

Standard tools

·        BLASTP

·        ClustalW

·        PSI-BLAST

 

Parsers

·        Filter hits obtained from BLASTP

 

Custom tools

·        Orthologous cluster tool

·        Remote ortholog tool

 

 

Fig: Translated implementation

In the first step of the workflow, BLASTP identifies sequences similar to the query. A parser then separates the resulting hits at a 60% similarity cutoff. Hits with >= 60% similarity are subjected to multiple sequence alignment using ClustalW. The resulting .aln file is passed to a custom tool, the Orthologous cluster tool, which extracts the conserved regions from the given set of orthologous sequences.
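The two operations in this branch, separating hits at the cutoff and extracting conserved regions from an alignment, can be illustrated with a minimal Python sketch. It is not the code of the parser or of the Orthologous cluster tool. The names split_hits and conserved_regions, the (identifier, percent similarity) tuple layout, and the definition of "conserved" as a run of at least min_len gap-free columns in which every sequence carries the same residue are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical hit record: (hit identifier, percent similarity to the query).
Hit = Tuple[str, float]


def split_hits(hits: Iterable[Hit], cutoff: float = 60.0) -> Tuple[List[Hit], List[Hit]]:
    """Partition hits into (close, remote) lists at the similarity cutoff."""
    close: List[Hit] = []
    remote: List[Hit] = []
    for hit in hits:
        # Hits at or above the cutoff go to the alignment branch.
        (close if hit[1] >= cutoff else remote).append(hit)
    return close, remote


def conserved_regions(aligned: List[str], min_len: int = 3) -> List[Tuple[int, str]]:
    """Return (start column, segment) for each fully conserved run.

    Assumption: a column is conserved when it contains no gap and every
    aligned sequence has the same residue in it.
    """
    if not aligned:
        return []
    length = min(len(seq) for seq in aligned)
    first = aligned[0]
    regions: List[Tuple[int, str]] = []
    start = None
    # Iterate one position past the end so a trailing run is closed too.
    for i in range(length + 1):
        conserved = (
            i < length
            and first[i] != "-"
            and all(seq[i] == first[i] for seq in aligned)
        )
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                regions.append((start, first[start:i]))
            start = None
    return regions


if __name__ == "__main__":
    close, remote = split_hits([("a", 85.0), ("b", 60.0), ("c", 42.5)])
    print(close, remote)
    print(conserved_regions(["MKTAYIAK-QR", "MKTGYIAKLQR", "MKTAYIAK-QR"]))
```

The aligned sequences are passed as plain gapped strings of equal length; reading them out of a ClustalW .aln file is left out of the sketch.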

 

If the hits obtained are less than 60% similar, the query file is directed to PSI-BLAST to search for remotely similar sequences. The PAM250 scoring matrix is preferred over other matrices for remote ortholog searches. The PSI-BLAST output is then used as input for the Remote ortholog tool (ROT). ROT extracts the hits with 30% to 60% identity from each round of PSI-BLAST. These are probable remote orthologs and can be saved in FASTA format for further analysis.
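The identity window applied to each PSI-BLAST round can be sketched as follows. This is an illustration under stated assumptions, not ROT itself: the function names remote_candidates and to_fasta are hypothetical, the rounds are assumed to be already parsed into a mapping from round number to (identifier, percent identity) pairs, both bounds are treated as inclusive, and a hit found in several rounds is reported once, for the first round in which it falls inside the window.

```python
from typing import Dict, Iterable, List, Tuple


def remote_candidates(
    rounds: Dict[int, List[Tuple[str, float]]],
    low: float = 30.0,
    high: float = 60.0,
) -> Dict[str, int]:
    """Map each hit inside the identity window to the first round it appeared in."""
    first_seen: Dict[str, int] = {}
    for round_no in sorted(rounds):
        for hit_id, identity in rounds[round_no]:
            # Assumption: both ends of the 30% to 60% window are inclusive.
            if low <= identity <= high and hit_id not in first_seen:
                first_seen[hit_id] = round_no
    return first_seen


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (header, sequence) pairs as FASTA text with wrapped sequence lines."""
    lines: List[str] = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Made-up identifiers and values, for illustration only.
    parsed = {
        1: [("sp1", 72.0), ("sp2", 45.0)],
        2: [("sp2", 44.0), ("sp3", 30.0), ("sp4", 12.5)],
    }
    print(remote_candidates(parsed))
    print(to_fasta([("sp2", "MKTAYIAK")], width=4), end="")
```

Keeping the round number alongside each hit makes it possible to see how many iterations were needed before a candidate entered the window.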