Introduction

This note provides a quick reference to installing and using ATLAS [#!atlas-hp!#,#!atlas_wn97!#,#!atlas_sc98!#,#!atlas_siam!#,#!WN147!#,#!whaley04!#]. ATLAS (Automatically Tuned Linear Algebra Software) is an empirical tuning system that produces a BLAS [#!blas3!#,#!blas2a!#,#!blas2b!#,#!blas1a!#,#!blas1b!#] (Basic Linear Algebra Subprograms) library specifically optimized for the platform on which you install ATLAS. The BLAS are a set of building-block routines which, when well tuned, allow more complicated linear algebra operations, such as solving linear equations or finding eigenvalues, to run extremely efficiently. This matters because these operations are computationally intensive. For a list of the BLAS routines, see the FORTRAN77 and C API quick reference guides available in the ATLAS tarfile at:
   ATLAS/doc/cblasqref.pdf
   ATLAS/doc/f77blasqref.pdf

ATLAS also natively provides a few routines from LAPACK [#!lug!#] (Linear Algebra PACKage). LAPACK is an extremely comprehensive FORTRAN package for solving the most commonly occurring problems in numerical linear algebra. LAPACK is available as an open source FORTRAN package from netlib [#!lapack-hp!#], and its size and complexity effectively rule out the idea of ATLAS providing a full implementation. Therefore, we add support for particular LAPACK routines only when we believe that the potential performance win we can offer makes the extra development and maintenance costs worthwhile. Presently, ATLAS provides most of the routines that involve the LU, QR and Cholesky factorizations. ATLAS's implementation uses a purely recursive version of LU and Cholesky based on the work of [#!Toledo-LU!#,#!RecFred!#,#!gustavson98A!#,#!WN146!#], and the QR version uses the hybrid algorithm with static outer blocking and panel recursion described in [#!RecQR!#]; the static blocking is empirically tuned as described in [#!lanb08!#]. In parallel, these routines are further sped up by the PCA panel factorization [#!panel10!#] and the threading techniques discussed in [#!thr08!#]. The standard LAPACK routines are statically blocked, and typically run slower than the recursively blocked versions across all problem sizes.
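
To make the idea of recursive blocking concrete, the following is a minimal sketch of a recursive LU factorization with partial pivoting on a column-major matrix. It is an illustration only, not ATLAS's implementation: the names rec_lu, swap_rows and AT are hypothetical, plain loops stand in for the tuned BLAS calls (triangular solve and matrix multiply) that a real library would use, and, as a simplifying assumption, the routine returns as soon as it meets an exactly zero pivot.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major access: entry (i,j) of a matrix with leading dimension lda. */
#define AT(A, lda, i, j) ((A)[(size_t)(j) * (size_t)(lda) + (size_t)(i)])

/* Swap rows r1 and r2 of A across ncols columns. */
static void swap_rows(int ncols, double *A, int lda, int r1, int r2)
{
    if (r1 == r2) return;
    for (int j = 0; j < ncols; j++) {
        double t = AT(A, lda, r1, j);
        AT(A, lda, r1, j) = AT(A, lda, r2, j);
        AT(A, lda, r2, j) = t;
    }
}

/*
 * Hypothetical sketch: factor the m-by-n panel A (m >= n) in place as
 * P*A = L*U, with L unit lower triangular and U upper triangular.
 * ipiv[k] is the 0-based row (relative to this panel) swapped with row k.
 * Returns 0 on success, or k+1 if pivot k is exactly zero, in which case
 * the contents of A and ipiv are only partially computed.
 */
int rec_lu(int m, int n, double *A, int lda, int *ipiv)
{
    assert(m >= n && n >= 1 && lda >= m);
    if (n == 1) {
        /* Base case: one column. Pick the largest entry and scale by it. */
        int p = 0;
        double best = A[0] < 0 ? -A[0] : A[0];
        for (int i = 1; i < m; i++) {
            double v = A[i] < 0 ? -A[i] : A[i];
            if (v > best) { best = v; p = i; }
        }
        ipiv[0] = p;
        if (best == 0.0) return 1;
        swap_rows(1, A, lda, 0, p);
        for (int i = 1; i < m; i++) A[i] /= A[0];
        return 0;
    }

    /* Split the columns in half: A = [A11 A12; A21 A22]. */
    int n1 = n / 2, n2 = n - n1;
    double *A12 = &AT(A, lda, 0, n1);
    double *A21 = &AT(A, lda, n1, 0);
    double *A22 = &AT(A, lda, n1, n1);

    /* Factor the left m-by-n1 panel recursively. */
    int info = rec_lu(m, n1, A, lda, ipiv);
    if (info) return info;

    /* Apply the left panel's row swaps to the right-hand columns. */
    for (int k = 0; k < n1; k++) swap_rows(n2, A12, lda, k, ipiv[k]);

    /* A12 := inv(L11) * A12, with L11 unit lower triangular. */
    for (int j = 0; j < n2; j++)
        for (int k = 0; k < n1; k++)
            for (int i = k + 1; i < n1; i++)
                AT(A12, lda, i, j) -= AT(A, lda, i, k) * AT(A12, lda, k, j);

    /* A22 := A22 - A21 * A12 (the matrix-multiply update). */
    for (int j = 0; j < n2; j++)
        for (int k = 0; k < n1; k++)
            for (int i = 0; i < m - n1; i++)
                AT(A22, lda, i, j) -= AT(A21, lda, i, k) * AT(A12, lda, k, j);

    /* Factor the updated trailing block recursively. */
    info = rec_lu(m - n1, n2, A22, lda, ipiv + n1);
    if (info) return info + n1;

    /* Apply the trailing swaps to A21 and make the pivots panel-relative. */
    for (int k = 0; k < n2; k++) {
        swap_rows(n1, A21, lda, k, ipiv[n1 + k]);
        ipiv[n1 + k] += n1;
    }
    return 0;
}
```

The recursion halves the column count at every level, so the block sizes adapt to the problem on their own. A statically blocked routine instead works through the matrix in panels of one fixed width.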

In addition to providing the standard FORTRAN77 interface to LAPACK, ATLAS also provides its own C interface, modeled after the official C interface to the BLAS [#!Blast!#,#!blast-toms!#], which includes support for row-major storage in addition to the standard column-major implementations. The netlib LAPACK has recently begun supporting Intel's proprietary C interface, which is incompatible with both the C BLAS and ATLAS's C interface, and which performs a host of unnecessary matrix transpositions. Note that there is no official C interface to LAPACK, and so there is no general C API that allows users to easily substitute one C-interface LAPACK for another, as there is when one uses the standard FORTRAN77 API. For a list of the LAPACK routines that ATLAS natively supplies, see the FORTRAN77 and C API quick reference guide available in the ATLAS tarfile at:

   ATLAS/doc/lapackqref.pdf
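
The difference between the row-major and column-major storage mentioned above can be sketched as follows. This is a self-contained illustration of the two conventions only; the names storage_order, elem_offset and get_elem are hypothetical and are not part of the ATLAS, CBLAS or LAPACK APIs.

```c
#include <stddef.h>

/* Hypothetical tag for the two storage conventions. */
enum storage_order { ROW_MAJOR, COL_MAJOR };

/*
 * Offset of element (i,j) in a flat array with leading dimension ld.
 * Row-major keeps each row contiguous; column-major keeps each column
 * contiguous (the FORTRAN convention).
 */
size_t elem_offset(enum storage_order order, size_t i, size_t j, size_t ld)
{
    return order == ROW_MAJOR ? i * ld + j : j * ld + i;
}

/* Read element (i,j) of A under the given storage convention. */
double get_elem(enum storage_order order, const double *A, size_t ld,
                size_t i, size_t j)
{
    return A[elem_offset(order, i, j, ld)];
}
```

With the same leading dimension, element (i,j) under one convention sits at the same offset as element (j,i) under the other. A buffer holding a row-major M-by-N matrix can therefore be read as a column-major N-by-M transpose without moving any data.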

Note that although ATLAS provides only a handful of LAPACK routines, it is designed so that it can easily be combined with netlib LAPACK in order to provide the complete library. See Section [*] for details.

R. Clint Whaley 2016-07-28