ATLAS Timings
This page indexes some performance results for ATLAS. It will never hold
anything like a comprehensive series of timings. In particular, it is mainly
for timings that have been put into visual formats, so people can take in
the results from a graph at a glance.
Unless otherwise noted, all timings were obtained using the ATLAS timers
in ATLAS/bin, flushing at least three times the actual cache size.
A valuable resource for a greater variety of timings is the
ATLAS results mail archive.
Typical ATLAS asymptotic DGEMM performance
Once you've got an install, it can be helpful to see if you are getting
the expected performance. The following table gives a rough estimate
of ATLAS's asymptotic DGEMM performance as a percentage of peak for a variety
of systems. Some variance is expected; as CPU clock rate rises, you may expect
percent of peak to diminish slightly, unless caches are enlarged
or memory bus speed rises in proportion. Similarly, different models of
the same machine get greater or lesser percent of peak (e.g., some USIII get
roughly 87%, rather than the 82% shown below). To get an idea of what this
is for your system, run DGEMM for N in the range 1000-2000
(./xdmmtst -N 1000 2000 200), and take
the best number. If you are far below the percent of peak shown for
a similar platform, make sure you are using the
architectural defaults and default
flags; if you are and still get poor performance, enter a
help request.
ARCH | ATLAS | COMP | % Peak | PEAK (Gflop) | LINK |
2.4Ghz Core2 | 3.9.5 | gcc 4.2.3 | 89% | 9.6 | NO |
900Mhz Itanium2 | 3.6.0 | icc | 90% | 3.6 | YES |
1.6Ghz Opteron | 3.6.0 | gcc | 88% | 3.2 | YES |
1062Mhz UltraSPARC III | 3.7.8 | gcc 3.3 | 82% | 2.124 | NO |
600Mhz Athlon | 3.5.7 | gcc 2.95.3 | 80% | 1.2 | YES |
2.8Ghz Pentium4E | 3.7.3 | gcc 3.3.2 | 77% | 5.6 | YES |
2.6Ghz Pentium4 | 3.6.0 | gcc | 77% | 5.2 | YES |
1Ghz PentiumIII | 3.7.7 | gcc 2.95.3 | 76% | 1 | YES |
1Ghz Efficeon | 3.7.7 | gcc 3.2 | 60% | 2 | YES |
1.8Ghz PPC970FX (G5) | 3.7.10 | Apple gcc 3.3 | 69% | 7.2 | NO |
3.0Ghz P4E EM64T | 3.7.10 | gcc RH 3.2.3 | 78% | 6.0 | NO |
Note that these numbers reflect asymptotic DGEMM speed only, and having a
high percentage does not necessarily make the machine faster for real
computational tasks.
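To compare your own result against the table above, the arithmetic can be sketched as follows. This is a minimal illustration, not part of ATLAS: the function names are hypothetical, the 2*N^3 operation count is the usual convention for a square matrix multiply, and the 4 flops per cycle used for the 2.4Ghz Core2 is inferred from the table (9.6 Gflop peak at 2.4Ghz). The 1.87 second timing is an invented value used only to show the calculation.

```python
def dgemm_flops(n):
    """Conventional flop count for an n x n by n x n matrix multiply."""
    return 2.0 * n ** 3


def gflops_from_time(n, seconds):
    """Convert a wall-clock time for one size-n DGEMM into Gflop/s."""
    return dgemm_flops(n) / seconds / 1e9


def theoretical_peak_gflops(clock_ghz, flops_per_cycle):
    """Peak rate, assuming a fixed number of flops issued per cycle."""
    return clock_ghz * flops_per_cycle


def percent_of_peak(measured_gflops, peak_gflops):
    """Measured rate expressed as a percentage of theoretical peak."""
    return 100.0 * measured_gflops / peak_gflops


if __name__ == "__main__":
    # Assumption: 4 flops/cycle, inferred from the Core2 row (9.6 / 2.4).
    peak = theoretical_peak_gflops(2.4, 4)
    # Hypothetical measurement: N=2000 DGEMM taking 1.87 seconds.
    measured = gflops_from_time(2000, 1.87)
    print("peak Gflop/s:    %.2f" % peak)
    print("measured Gflop/s: %.2f" % measured)
    print("percent of peak:  %.1f%%" % percent_of_peak(measured, peak))
```

If your timer already reports a flop rate, only the last function is needed: divide the best rate over the N=1000-2000 range by the peak for your machine and compare with the closest row in the table.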
3.9 developer timings:
- Multiprocessor timings comparing new (3.9) and old (3.8) threading subsystems
- Timings showing performance of new threading system on 8 and 4 processor
  Core2 systems (Linux and Windows, respectively), and a 6-processor SiCortex
  MIPS node.
Old 3.7 developer timings:
- Efficeon and Pentium III timings for ATLAS 3.7.3
  - Serial [D,S]GEMM and [D,S]LU results.
- Pentium 4 and Pentium4E timings for ATLAS 3.7.3
  - Serial DGEMM and DLU results.
- Opteron 64 v 32 bit timings for ATLAS 3.7.1
  - Serial SGEMM and DGEMM results.
Here are the 3.6 timings:
- ATLAS 3.6.0 v 3.4.2 on a 1.6Ghz Opteron
  - LU, Cholesky and GEMM results.
- ATLAS 3.6.0 v 3.4.2 on a 2.6Ghz P4HT
  - LU, Cholesky and GEMM results, including a graph showing the effects of
    hyperthreading.
- ATLAS 3.6.0 v 3.4.2 on a 900Mhz Itanium 2
  - LU, Cholesky and GEMM results, including a graph showing the performance
    bug in TRSM that killed 3.4 performance.
Old 3.5 developer timings:
- ATLAS 3.5.6 on a Dual 1.6Ghz Opteron
  - Serial and dual threaded results for LU and Cholesky factorizations,
    and matrix multiply (double precision only).
- ATLAS 3.5.6 vs. ATLAS 3.4.1 on a 1.7Ghz P4
  - Double and single precision real results for LU and matrix multiply.
    Includes a graph showing the effects of kernel cleanup.
- ATLAS 3.5.6 vs. ATLAS 3.4.1 on my 1Ghz PIII laptop
  - Double and single precision real results for LU and matrix multiply.
- ATLAS 3.5.7 vs. ATLAS 3.5.6 on Athlon and Opteron
  - Double precision real results for 3.5.7's improved SYRK and Cholesky
    on an Athlon and Opteron.
- ATLAS 3.5.10 on Opteron
  - Multiprecision ATLAS 3.5.10 results for GEMM, SYRK, LU, and Cholesky on
    one processor of a 1.6Ghz Opteron.