Contrasting non-default install performance

If you do not install using the architectural defaults, make time prints only the Present columns. This gives you a good summary of ATLAS's library performance, but it can be hard to tell what is good and bad if you are not familiar with ATLAS on this hardware. Sometimes ATLAS has architectural defaults for your platform, but your install doesn't use them. This usually happens because the installer specified a non-default compiler, explicitly asked that the architectural defaults not be used, or overrode the architecture detection. In this case, make time does not compare against the architectural defaults, so only the Present columns are printed.

However, if you wish to ensure that your library is as good as one that uses the architectural defaults, you can manually tell the program called by make time (xatlbench) to do the comparison. The most common example is that you have switched to an unsupported compiler (e.g., the Intel compiler), and now want to see whether the library you built with it is as fast as or faster than the one built with the default compiler. Another example is comparing the performance of two closely related architectures. This is what we will do here, where we contrast the performance of the 32- and 64-bit versions of the library on my Core2Duo.

To manually compare a present install against any of the results stored in ATLAS's architectural defaults, perform the following steps:

  1. Issue make time in the BLDdir of your non-default install. This times the present build and stores the results in BLDdir/bin/INSTALL_LOG.
  2. cd SRCdir/CONFIG/ARCHS, and find the tarfile containing the results you wish to compare against. In our case, we choose Core2Duo32SSE3.tar.bz2 to compare against our own Core2Duo64SSE results.
  3. bunzip2 -c Core2Duo32SSE3.tar.bz2 | tar xvf - untars the selected architectural results (replace Core2Duo32SSE3.tar.bz2 with the tarfile you selected in step 2).
  4. cd BLDdir
  5. ./xatlbench -dp SRCdir/CONFIG/ARCHS/<ARCH> -dc BLDdir/bin/INSTALL_LOG
    xatlbench is the program that compares two sets of results, with -dp pointing to the previous (Refrenc) install result directory and -dc pointing to the current (Present) install result directory.
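
The steps above can be collected into a small shell sketch. It is illustrative only: the function name atlas_compare_cmds is hypothetical, it assumes the selected tarfile unpacks into a directory with the same name minus the .tar.bz2 suffix (the <ARCH> of step 5), and it assumes the paths contain no whitespace. The function only prints the commands, so you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of ATLAS): print the manual comparison
# commands for a given source dir, build dir, and architecture name.
# Assumes <arch>.tar.bz2 unpacks to CONFIG/ARCHS/<arch> and that the
# paths contain no whitespace.
atlas_compare_cmds() {
    srcdir="$1"   # SRCdir: top of the ATLAS source tree
    blddir="$2"   # BLDdir: build directory of the non-default install
    arch="$3"     # tarfile name without .tar.bz2, e.g. Core2Duo32SSE3

    cat <<EOF
cd $blddir && make time
cd $srcdir/CONFIG/ARCHS
bunzip2 -c $arch.tar.bz2 | tar xvf -
cd $blddir
./xatlbench -dp $srcdir/CONFIG/ARCHS/$arch -dc $blddir/bin/INSTALL_LOG
EOF
}

# Example: print the commands for the Core2Duo comparison in this section.
atlas_compare_cmds /home/whaley/TEST/ATLAS3.7.36.0 \
    /home/whaley/TEST/ATLAS3.7.36.0/obj64 Core2Duo32SSE3
```

Printing rather than executing keeps the sketch safe to try on a machine without a finished build. Once the output looks right for your directories, the lines can be pasted into a shell one at a time.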

The figure below shows me doing this on my Core2Duo, with SRCdir = /home/whaley/TEST/ATLAS3.7.36.0 and BLDdir = /home/whaley/TEST/ATLAS3.7.36.0/obj64, where we compare the present 64-bit install to the stored 32-bit install. We see that the 64-bit install, which gets to use 16 rather than 8 registers, is slightly faster for almost all kernels and precisions, as one might expect.

Figure: Comparing 32- and 64-bit libraries on a 2.4 GHz Core2Duo
[Figure content: terminal session listing, truncated in the source: core2.home.net. cd /home/wh... ....1 114.2 116.9 27.9 26.0 41.5 45.6]

R. Clint Whaley 2016-07-28