A Platform for Biological Sequence Comparison on Parallel Computers

Authors: Deshpande, A, Department of Computer Science, University of Virginia; Richards, D, Department of Computer Science, University of Virginia

We have written two programs for searching biological sequence databases that run on Intel hypercube computers. PSCANLIB compares a single sequence against a sequence library, and PCOMPLIB compares all the entries in one sequence library against a second library. The programs provide a general framework for similarity searching; they include functions for reading in query sequences, search parameters, and library entries, and for reporting the results of a search. We have isolated the code for the specific function that calculates the similarity score between a query sequence and a library sequence; alternative searching algorithms can be implemented by editing two files. We have implemented the rapid FASTA sequence comparison algorithm and the more rigorous Smith-Waterman algorithm within this framework. The PSCANLIB program on a 16-node iPSC/2 80386-based hypercube can compare a 229 amino acid protein sequence with a 3.4 million residue sequence library in about 16 seconds with the FASTA algorithm. Using the Smith-Waterman algorithm, the same search takes 35 minutes. The PCOMPLIB program can compare a 0.8 million amino acid protein sequence library with itself in 5.3 minutes with FASTA on a third-generation 32-node Intel iPSC/860 hypercube.
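
The separation described above, a fixed search driver with an interchangeable scoring function, can be illustrated with a minimal serial sketch in Python. It is not the authors' code and does not model the hypercube parallelism: the names `scan_library` and `smith_waterman_score`, the identity-based match/mismatch scores, and the linear gap penalty are all assumptions made for illustration (a real protein search would normally use a substitution matrix and tuned gap penalties).

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: any function that maps (query, library sequence)
# to an integer similarity score can be plugged into the driver.
ScoreFn = Callable[[str, str], int]


def smith_waterman_score(query: str, target: str,
                         match: int = 2, mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The scoring parameters are illustrative assumptions, not values
    taken from the report.
    """
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            if cell > best:
                best = cell
        prev = curr
    return best


def scan_library(query: str,
                 library: Iterable[Tuple[str, str]],
                 score_fn: ScoreFn,
                 top: int = 10) -> List[Tuple[int, str]]:
    """Score the query against every (name, sequence) entry.

    Returns the highest-scoring entries as (score, name) pairs,
    ordered by descending score and then by name.
    """
    scored = [(score_fn(query, seq), name) for name, seq in library]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top]


if __name__ == "__main__":
    toy_library = [("a", "ACDE"), ("b", "WWWW"), ("c", "ACD")]
    for score, name in scan_library("ACDE", toy_library, smith_waterman_score):
        print(name, score)
```

In this sketch, changing the search algorithm means passing a different `score_fn`; the code that reads the library and reports results is left untouched, which mirrors the design goal stated in the abstract.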
Source Citation:

Deshpande, A, and D Richards. "A Platform for Biological Sequence Comparison on Parallel Computers." University of Virginia Dept. of Computer Science Tech Report (1990).

University of Virginia, Department of Computer Science
