
Speedup

The throughput performance of the SMP machines was striking. The throughput figure below shows linear improvement as processors are added, with almost no loss of efficiency. Adding processors beyond 24 is likely to yield still higher throughput without encountering system bus limits. The measured results are close to optimum, which rejects Hypothesis II-A for the throughput case. They are explained by SMP architectures that provide sufficient memory bandwidth for each processor to work at nearly peak efficiency even when all other processors are also working at peak efficiency.

  


Figure: Throughput performance measured in Bases per second on Cray CS6400

Response time also decreases as more processors are added, but in this case there is some loss of efficiency. Using traditional scalability analysis, speedup is defined as $S = T_1 / T_p$, where $T_1$ is the one-processor serial run time and $T_p$ is the parallel run time on $p$ processors [9]. For the three different architectures, the speedup curves for BLASTN, BLASTP, and BLASTX are plotted in the speedup figure below.
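As an illustration of the speedup definition (serial run time divided by parallel run time), the following minimal Python sketch computes a speedup curve from a table of run times. The function names `speedup` and `speedup_curve` and the timing values are hypothetical, chosen only for this example; they are not measurements from this study.

```python
def speedup(serial_time, parallel_time):
    """Return the speedup: serial run time divided by parallel run time."""
    if serial_time <= 0 or parallel_time <= 0:
        raise ValueError("run times must be positive")
    return serial_time / parallel_time


def speedup_curve(run_times):
    """Map processor count -> speedup, given processor count -> run time.

    run_times must contain an entry for 1 processor (the serial baseline).
    """
    serial_time = run_times[1]
    return {p: speedup(serial_time, t) for p, t in sorted(run_times.items())}


# Hypothetical run times in seconds (illustration only, not measured data).
example_times = {1: 240.0, 2: 125.0, 4: 66.0, 8: 36.0}
curve = speedup_curve(example_times)

for p, s in curve.items():
    print(f"{p:2d} processors: speedup {s:.2f}")
```

A curve that stays close to the line S = p indicates near-linear speedup; the further it falls below that line, the more run time is being lost to overhead.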

  

Figure: Speedup vs. number of processors for all three algorithms on three SMP machines. Panels: (a) BLASTN, (b) BLASTP, (c) BLASTX.

The speedup curves indicate that BLASTX and BLASTP achieve fairly good response-time speedups up to at least 24 processors. Moreover, the three different SMP architectures obtained similar speedups. BLASTN, on the other hand, has significantly poorer speedup. What caused this difference in performance between the algorithms? Let us analyze the efficiency and the overhead of each algorithm.



Ed H. Chi
Wed May 1 17:13:37 CDT 1996