Next: Acknowledgments Up: Efficiency of Shared-Memory Previous: Overhead

Conclusion

We conducted experiments on SMP machines to better understand their suitability for genetic sequence similarity computation. We ran BLASTX, BLASTN, and BLASTP, the three major variations of the BLAST algorithm, on three different SMP machines: the SGI Challenge, the Sun Sparc Center 2000, and the Cray CS6400.

We found that throughput scaled linearly with the number of processors, with little performance degradation: with one BLAST process per processor, throughput doubled when the number of processors was doubled.

Using multiple processors on a single input sequence significantly improved response time. For instance, a BLASTX process that took nearly 1.3 hours on a single processor completed in only 3.5 minutes with 24 processors on the Cray CS6400. We measured the serial components of the parallel algorithm, and the speedup predicted by Amdahl's law agrees closely with the measured speedup for BLASTN and BLASTP.
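As a rough illustration of the arithmetic behind these figures, the sketch below computes the speedup and parallel efficiency implied by the rounded BLASTX timings quoted above, and inverts Amdahl's law to obtain the serial fraction that would produce that speedup. The function names are our own, and the derived serial fraction is only what the rounded timings imply under Amdahl's law; it is not the serial component measured in the experiments.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def implied_serial_fraction(speedup: float, processors: int) -> float:
    """Invert Amdahl's law: the serial fraction that would yield `speedup`."""
    return (processors / speedup - 1.0) / (processors - 1)


# Rounded figures quoted in the text ("nearly 1.3 hours", "3.5 minutes").
t1_minutes = 1.3 * 60.0   # single-processor time, approximate
tp_minutes = 3.5          # time on 24 processors
p = 24

measured_speedup = t1_minutes / tp_minutes   # about 22x
efficiency = measured_speedup / p            # fraction of ideal linear speedup
f = implied_serial_fraction(measured_speedup, p)

print(f"speedup    = {measured_speedup:.1f}")
print(f"efficiency = {efficiency:.2f}")
print(f"implied serial fraction = {f:.4f}")
```

Because the single-processor time is quoted only approximately, these values should be read as order-of-magnitude checks rather than as reported results.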

  

(a) Predicted number of processors that could be used effectively as database size grows

(b) Predicted number of processors needed to keep the response time fixed for a 500 base sequence


Figure: Predicted applicability of SMP machines to genome sequence analysis

Knowing the efficiency of SMPs for BLAST computation, we can predict the number of processors that can be efficiently utilized as the genetic sequence database grows. Genetic databases are doubling in size approximately every 1.3 years, while processor speed is doubling roughly every 1.5 years. Since the database is growing faster than processor speed, more processors will be needed over time to keep the response time low. As panel (b) of the figure shows, 17 processors will soon be needed to keep the response time of BLASTX for a 500 base sequence at 15 seconds.
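The growth argument can be sketched numerically. The example below assumes, for illustration only, that search work grows in proportion to database size and that added processors are used with near-perfect efficiency; under those assumptions it computes the factor by which the processor count must grow to hold response time fixed. The function names are hypothetical, and the sketch does not reproduce the model behind the figure.

```python
DB_DOUBLING_YEARS = 1.3    # database size doubling period, from the text
CPU_DOUBLING_YEARS = 1.5   # processor speed doubling period, from the text


def relative_processors_needed(years: float) -> float:
    """Growth factor in processor count needed after `years` to keep
    response time fixed, assuming work is proportional to database size
    and ideal scaling (both are simplifying assumptions)."""
    db_growth = 2.0 ** (years / DB_DOUBLING_YEARS)
    cpu_growth = 2.0 ** (years / CPU_DOUBLING_YEARS)
    return db_growth / cpu_growth


def net_doubling_years() -> float:
    """Years after which the required processor count doubles."""
    return 1.0 / (1.0 / DB_DOUBLING_YEARS - 1.0 / CPU_DOUBLING_YEARS)


for years in (0, 5, 10):
    factor = relative_processors_needed(years)
    print(f"after {years:2d} years: {factor:.2f}x processors")
print(f"required processor count doubles every {net_doubling_years():.2f} years")
```

Under these assumptions the gap between the two doubling rates is small, so the required processor count grows slowly compared with either the database or processor speed alone.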

Moreover, for a fixed processor efficiency, more processors can be used to improve performance as the databases grow. As panel (a) of the figure shows, if we want to hold processor efficiency at 85%, then by the year 2005 we could use up to 22 processors for running BLASTX on a 500 base sequence.
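One way to see why a fixed efficiency target bounds the usable processor count is through Amdahl's law: with serial fraction f, efficiency on p processors is 1 / (p f + 1 - f), so requiring efficiency of at least E gives p <= 1 + (1/E - 1) / f. The sketch below evaluates that bound. The serial fractions used are hypothetical placeholders, not the measured values, and how the serial fraction changes with database size is taken from the experiments rather than modeled here.

```python
import math


def amdahl_efficiency(serial_fraction: float, processors: int) -> float:
    """Parallel efficiency (speedup / processors) under Amdahl's law."""
    return 1.0 / (processors * serial_fraction + (1.0 - serial_fraction))


def max_processors_at_efficiency(serial_fraction: float,
                                 target_efficiency: float) -> int:
    """Largest processor count whose Amdahl efficiency stays at or above
    `target_efficiency`, from p <= 1 + (1/E - 1) / f."""
    bound = 1.0 + (1.0 / target_efficiency - 1.0) / serial_fraction
    return math.floor(bound)


# Hypothetical serial fractions, chosen only to illustrate the trend:
# a smaller serial fraction allows more processors at the same efficiency.
for f in (0.02, 0.01, 0.005):
    p_max = max_processors_at_efficiency(f, 0.85)
    print(f"serial fraction {f}: up to {p_max} processors at 85% efficiency")
```

If the serial portion of a search stays roughly constant while the parallel portion grows with the database, the serial fraction falls and the bound rises, which is consistent with the trend described above.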

These predictions show that shared-memory multiprocessor architectures will offer significant performance benefits for genetic sequence similarity computation for years to come.






Ed H. Chi
Wed May 1 17:13:37 CDT 1996