Wednesday, November 2, 2011

MKL parallel execution benchmark

In the following example, most of the computation time is probably spent in the LAPACK routine DSYEV, which is used to obtain all eigenvalues of 400x400 real symmetric matrices.

At this size, parallelization gives only a small speedup. (See also the next post.)


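FK_2.c: all eigenvalues of 400x400 real symmetric matrices are computed with DSYEV.
(I have not profiled this carefully, but a considerable part of the run time goes to the LAPACK eigenvalue computation.)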

The runs below treat a system of 20x20 sites: in each MC step the eigenvalue problem of the tight-binding Hamiltonian is solved 20x20x2 times, and 10 steps are computed.
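To make the workload concrete, here is a minimal sketch in C of how such a 400x400 matrix might be assembled, and how many diagonalizations the run implies. It is not taken from FK_2.c: the square lattice with nearest-neighbour hopping `t`, the periodic boundaries, the on-site energies `eps`, and the names `site`, `build_hamiltonian` and `total_diagonalizations` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LX 20            /* lattice size in x (from the post) */
#define LY 20            /* lattice size in y (from the post) */
#define NSITE (LX * LY)  /* 400 sites, hence a 400x400 matrix */

/* Map lattice coordinates to a site index.
   Periodic wrap-around is an assumption, not stated in the post. */
int site(int x, int y)
{
    return ((x + LX) % LX) + LX * ((y + LY) % LY);
}

/* Fill the dense NSITE x NSITE matrix h (row-major) with eps[i] on the
   diagonal and -t on every nearest-neighbour bond (hypothetical model). */
void build_hamiltonian(double *h, const double *eps, double t)
{
    assert(h != NULL && eps != NULL);

    for (int i = 0; i < NSITE * NSITE; i++)
        h[i] = 0.0;

    for (int y = 0; y < LY; y++) {
        for (int x = 0; x < LX; x++) {
            int i  = site(x, y);
            int jr = site(x + 1, y);  /* neighbour to the right */
            int ju = site(x, y + 1);  /* neighbour above */

            h[i * NSITE + i] = eps[i];
            /* Write both triangles so the matrix is explicitly symmetric. */
            h[i * NSITE + jr] = h[jr * NSITE + i] = -t;
            h[i * NSITE + ju] = h[ju * NSITE + i] = -t;
        }
    }
}

/* Diagonalizations in a run: 20x20x2 per MC step, as stated in the post. */
long total_diagonalizations(int mc_steps)
{
    return (long)LX * LY * 2 * mc_steps;
}
```

For the 10-step run, `total_diagonalizations(10)` gives 20x20x2x10 = 8000 eigenvalue problems, each on a dense 400x400 matrix. An array like `h` is what would be handed to DSYEV.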

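At this size the benefit of parallelization appears small: in the runs below, wall-clock time drops only from 209 s with smp 1 to 169 s with smp 8, while user CPU time rises from 209 s to 1314 s.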


**** smp 1 ****

Run time: 209 s
Command being timed: "./a.out"
User time (seconds): 209.29
System time (seconds): 0.00
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:29.47

**** smp 2 ****

Run time: 185 s
Command being timed: "./a.out"
User time (seconds): 364.18
System time (seconds): 4.30
Percent of CPU this job got: 199%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:04.85

**** smp 4 ****

Run time: 205 s
Command being timed: "./a.out"
User time (seconds): 483.49
System time (seconds): 12.41
Percent of CPU this job got: 242%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:24.66

**** smp 8 ****

Run time: 169 s
Command being timed: "./a.out"
User time (seconds): 1314.38
System time (seconds): 29.81
Percent of CPU this job got: 794%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:49.19
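
The timings above can be summarized as speedup and parallel efficiency. The following is a minimal sketch in C that does the arithmetic. It assumes that "smp N" means the run used N threads. The names `n_threads`, `wall_s`, `cpu_s`, `speedup`, `efficiency` and `print_scaling_table` are introduced here for illustration only.

```c
#include <assert.h>
#include <stdio.h>

#define NRUNS 4

/* Thread counts, assuming "smp N" means N threads. */
const int n_threads[NRUNS] = {1, 2, 4, 8};

/* Elapsed wall-clock times in seconds, converted from the m:ss values above. */
const double wall_s[NRUNS] = {209.47, 184.85, 204.66, 169.19};

/* User + system CPU seconds from the same reports. */
const double cpu_s[NRUNS] = {
    209.29 + 0.00, 364.18 + 4.30, 483.49 + 12.41, 1314.38 + 29.81
};

/* Speedup relative to the single-thread run: T(1) / T(n). */
double speedup(double t1, double tn)
{
    assert(tn > 0.0);
    return t1 / tn;
}

/* Parallel efficiency: speedup divided by the number of threads. */
double efficiency(double t1, double tn, int n)
{
    assert(n > 0);
    return speedup(t1, tn) / n;
}

/* Print one line per run: threads, wall time, CPU time, speedup, efficiency. */
void print_scaling_table(void)
{
    printf("%7s %9s %9s %8s %10s\n",
           "threads", "wall[s]", "cpu[s]", "speedup", "efficiency");
    for (int i = 0; i < NRUNS; i++) {
        printf("%7d %9.2f %9.2f %8.3f %10.3f\n",
               n_threads[i], wall_s[i], cpu_s[i],
               speedup(wall_s[0], wall_s[i]),
               efficiency(wall_s[0], wall_s[i], n_threads[i]));
    }
}
```

Under that reading, the 8-thread run is about 1.24 times faster than the single-thread run, a parallel efficiency of roughly 15%, while total CPU time grows more than sixfold.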
