The throughput performance of the SMP machines was striking. The figure below shows throughput improving linearly as processors are added, with almost no loss of efficiency. Adding more than 24 processors is likely to yield still higher throughput without encountering system bus limits. The measured results are close to optimum, which rejects Hypothesis II-A for the throughput case. They are explained by SMP architectures that provide sufficient memory bandwidth for each processor to work at nearly peak efficiency even when all other processors are also working at peak efficiency.
Figure: Throughput performance measured in Bases per second on Cray CS6400
Response time also decreases as more processors are added, but in this case there is some loss of efficiency. Using traditional scalability analysis, speedup is defined as $S = T_1 / T_p$, where $T_1$ is the one-processor serial run time and $T_p$ is the parallel run time on $p$ processors. For the three different architectures, the speedup curves for BLASTN, BLASTP, and BLASTX are plotted in the figure below.
Figure: Speedup vs. # of processors for all three algorithms on three SMP machines
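As an illustration of the speedup definition (serial run time divided by parallel run time), the following minimal Python sketch computes speedup and parallel efficiency from measured run times. The function names and the timing values are hypothetical and are not measurements from this study. Efficiency is assumed here to follow the usual convention of speedup divided by the number of processors.

```python
def speedup(t_serial, t_parallel):
    """Speedup: one-processor serial run time divided by parallel run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency, assumed here to be speedup / processor count."""
    if processors < 1:
        raise ValueError("processor count must be at least 1")
    return speedup(t_serial, t_parallel) / processors


# Hypothetical response times in seconds, keyed by processor count.
# These are placeholder values for illustration only, not measured data.
timings = {1: 240.0, 2: 125.0, 4: 66.0, 8: 36.0}

t_one = timings[1]
for p in sorted(timings):
    s = speedup(t_one, timings[p])
    e = efficiency(t_one, timings[p], p)
    print(f"p={p:2d}  speedup={s:5.2f}  efficiency={e:4.2f}")
```

An efficiency close to 1 corresponds to the near-linear scaling described for throughput, while an efficiency that falls as processors are added corresponds to a speedup curve that bends away from the ideal line.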
The speedup curves indicate that BLASTX and BLASTP achieve fairly good response-time speedup up to at least 24 processors. Moreover, the three different SMP architectures obtained similar speedups. BLASTN, on the other hand, has significantly poorer speedup. What caused this difference in performance between the algorithms? Let us analyze the efficiency and the overhead of each algorithm.