INSTALL QUESTIONS:
POST-INSTALL QUESTIONS:
Who uses ATLAS?
ATLAS can be used by anyone needing fast linear algebra routines. ATLAS
is used directly by a great many research scientists. Because of the
open nature of ATLAS, we have no way of knowing how many users of ATLAS
there are. In the following paragraphs, we indicate some of the users
that we know about, but this is far from a complete list.
ATLAS is used, or is planned to be used, in the following problem-solving environments (PSEs):
Additionally, ATLAS is included in some way by the following OS distributions:
What are the academic references for ATLAS?
The academic references for ATLAS are given in bibtex format below. If
you want to reference one paper only, probably the newest (first shown)
is the best, as it references the others. The first two papers
contain the bulk of the needed information. Referencing the homepage can
help other researchers find the software.
Note that there have been quite a few subsequent papers that discuss ATLAS (with varying degrees of accuracy and detail) written by people not directly involved in ATLAS's production and design. While these papers may be about ATLAS, they are not, obviously, primary sources, and should not be cited as such. If the paper is not authored by Whaley or Petitet, it is not a primary-source ATLAS paper.
@ARTICLE{whaley04,
  AUTHOR  = "R. Clint Whaley and Antoine Petitet",
  TITLE   = "Minimizing development and maintenance costs in supporting persistently optimized {BLAS}",
  JOURNAL = "Software: Practice and Experience",
  VOLUME  = "35",
  NUMBER  = "2",
  PAGES   = "101-121",
  MONTH   = "February",
  YEAR    = "2005",
  NOTE    = {\verb+http://www.cs.utsa.edu/~whaley/papers/spercw04.ps+}
}
@ARTICLE{WN147,
  AUTHOR  = "R. Clint Whaley and Antoine Petitet and Jack J. Dongarra",
  TITLE   = "Automated Empirical Optimization of Software and the {ATLAS} Project",
  JOURNAL = "Parallel Computing",
  VOLUME  = "27",
  NUMBER  = "1--2",
  PAGES   = "3--35",
  YEAR    = 2001,
  NOTE    = "Also available as University of Tennessee LAPACK Working Note \#147, UT-CS-00-448, 2000 ({\tt www.netlib.org/lapack/lawns/lawn147.ps})"
}
@INPROCEEDINGS{atlas_siam,
  AUTHOR    = {R. Clint Whaley and Jack Dongarra},
  TITLE     = "{Automatically Tuned Linear Algebra Software}",
  BOOKTITLE = "Ninth SIAM Conference on Parallel Processing for Scientific Computing",
  NOTE      = "CD-ROM Proceedings",
  YEAR      = 1999
}
@INPROCEEDINGS{atlas_sc98,
  AUTHOR    = "R. Clint Whaley and Jack Dongarra",
  TITLE     = "Automatically Tuned Linear Algebra Software",
  BOOKTITLE = "SuperComputing 1998: High Performance Networking and Computing",
  YEAR      = "1998",
  NOTE      = "CD-ROM Proceedings. {\bf Winner, best paper in the systems category.}\\ URL: \verb+http://www.cs.utsa.edu/~whaley/papers/atlas_sc98.ps+"
}
@TECHREPORT{atlas_wn97,
  AUTHOR      = {R. Clint Whaley and Jack Dongarra},
  TITLE       = "{Automatically Tuned Linear Algebra Software}",
  INSTITUTION = "University of Tennessee",
  YEAR        = "1997",
  MONTH       = "December",
  NUMBER      = "UT-CS-97-366",
  NOTE        = "URL: \verb+http://www.netlib.org/lapack/lawns/lawn131.ps+"
}
@UNPUBLISHED{atlas-hp,
  TITLE  = "ATLAS homepage",
  AUTHOR = "{See homepage for details}",
  NOTE   = "http://math-atlas.sourceforge.net/"
}
Does ATLAS run on my platform (OS/hardware)?
ATLAS should produce optimized libraries on almost any platform
possessing an ANSI/ISO C compiler, and some Unix-like command-line tools
(eg., make, cp, etc). ATLAS runs on pretty much all Unix variants
(including embedded systems), as well as Windows (Windows users must install
the free Cygwin tools).
What software license does ATLAS use
(AKA: in what ways and for what purposes am I allowed to use ATLAS)?
ATLAS uses a BSD-style license, without the advertising clause. ATLAS's
license is taken almost verbatim from the example given at
opensource.org. Here is the relevant portion of the license,
as taken from an ATLAS source file:
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *   1. Redistributions of source code must retain the above copyright
 *      notice, this list of conditions and the following disclaimer.
 *   2. Redistributions in binary form must reproduce the above copyright
 *      notice, this list of conditions, and the following disclaimer in the
 *      documentation and/or other materials provided with the distribution.
 *   3. The name of the ATLAS group or the names of its contributers may
 *      not be used to endorse or promote products derived from this
 *      software without specific written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
 * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE ATLAS GROUP OR ITS CONTRIBUTORS
 * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
To see the exact license, simply edit almost any source file in the ATLAS tarfile (eg., ATLAS/src/auxil/ATL_lcm.c).
How do I get help/technical support with ATLAS?
Your first resource should always be
the ATLAS errata file. This file keeps track
of all discovered errors in ATLAS and their workarounds or fixes. It
also contains workarounds for common system problems (eg., compiler errors,
non-standard commands, etc.), as well as advice necessary to get
the best performance on various machines.
If you have downloaded the ATLAS source, your ATLAS/doc directory contains some useful documentation, though it is often more dated than the info in the errata and online.
If (and only if) neither of these sources provides the information you need, you can submit a support request to:
Do not, under any circumstances, post your support request to the "bug" tracker. As documented on the tracker itself, it is for developer-confirmed bugs only. All users should use the support or feature request trackers. Things that turn out to be bugs will later be escalated to the bug tracker by the confirming developer. In addition, please understand that the tone of your support request is important, as described here.
You do not need to create a SourceForge account in order to use the tracker (that persistent plea to "please log in" can be ignored), though it makes things easier if you do. In particular, if you don't log in, you won't be able to attach extra files later (you can attach a file in your initial report, but afterwards there is no way to verify you are the original poster, so the tracker won't allow it). So, if you think you may need to do this kind of thing relatively often, it may be worth creating an account.
Note that you should upload the error_[ARCH].tgz file as well. If the error killed the ATLAS install before it successfully created the error tarfile, create it yourself by issuing the following command from your BLDdir subdirectory:
make error_report
Note that the [ARCH] of the above directions should be replaced by your architecture string that ATLAS is using (eg., Linux_P4SSE1 or SunOS_SunUS4, etc).
What documentation is available (usage info)?
ATLAS's main job is to provide optimized libraries, so most of the
documentation is on the appropriate APIs. ATLAS does provide some
executables, but these are merely testers and timers for the provided
libraries. A very rough description of the operation of these executables
is given in ATLAS/doc/TestTime.txt in your ATLAS source directory.
Here are some pointers to ATLAS documentation:
What mailing lists, archives, and so on does ATLAS have?
ATLAS has the following tracker lists:
ATLAS also has various mailing lists and archives. Anyone can sign up for or post to these lists. They are:
Can I download a prebuilt binary instead of installing
from source?
Unfortunately, we lack the manpower to provide prebuilt binaries.
Can I get ATLAS in rpm or .deb or some other format?
Our only supported format is a compressed tarfile. If you really feel the
need for .rpm or .deb versions, other parties (eg., Debian, SuSE) provide
them (note that we can't answer questions on ATLAS installed in this way,
however, since we don't know much about those packages). ATLAS provided by
third parties may not be as up-to-date, and may run slower than an ATLAS
you compile yourself (eg., some companies compile only a couple of x86
libraries, so that they use the same library for the P4 and P4E chips, even
though ATLAS should tune itself separately to all the x86 variants for
maximal performance).
What does the version number of ATLAS mean?
ATLAS version numbers look like:
<major number>.<minor number>.<update number>.
The meaning of these terms is: the major number changes only for substantial rewrites of the package; an even minor number denotes a stable release, while an odd minor number denotes a developer release; and the update number counts the groups of fixes (for stable releases) or updates (for developer releases) applied so far.
So, 3.2.1 would be a stable release, with one group of fixes already applied. 3.3.12 would be the 12th update (13th release) of the associated developer release.
How can I tell what version of ATLAS I have?
For ATLAS version 3.3.6 or newer, you can find out version and build
information via the routine
ATL_buildinfo. The following complete program will give build
information (including version number) when linked against version 3.3.6
or later libatlas.a's:
#include <stdlib.h>
/*
 * Compile, link and run with something like:
 *    gcc -o xprint_buildinfo [this file].c -L[ATLAS lib dir] -latlas
 *    ./xprint_buildinfo
 * If the link fails, you are using an ATLAS version older than 3.3.6.
 */
int main(void)
{
   void ATL_buildinfo(void);
   ATL_buildinfo();
   exit(0);
}
If you are using an ATLAS version prior to 3.3.6, there is no easy way to find the version information without looking at the source. If you have the source tree around, the easiest approach is to examine pretty much any source file (eg., ATLAS/src/auxil/ATL_lcm.c); the major and minor version number will be given in the copyright notice at the top. To find out the update number, you'd have to consult the actual routines updated by the particular update, as given in the ATLAS errata file.
If you still have the directory where you built ATLAS around, you can find this version information w/o writing the above routine by:
cd BLDdir/bin
make xprint_buildinfo
./xprint_buildinfo
Developer releases, on the other hand, are meant to be used, as the name suggests, by ATLAS developers, contributors, and people happy to live on the bleeding edge. Developer releases are meant to allow access to the newest ATLAS sources, and may represent a simple snapshot of the internal developer tree. As such, they are essentially untested, and may not build, much less run, correctly. So, while they may possess features not available in the current ATLAS release, only the most experienced of users should consider utilizing them.
Developer releases are available from the
developer site,
while stable releases are available from the
ATLAS main page. Stable
and developer releases are also distinguished by their version numbers,
as explained here.
What LAPACK routines does ATLAS provide?
The only way to be sure you have the most up-to-date list is to examine
the source in ATLAS/interfaces/lapack/F77/src/. It is pretty
much a foregone conclusion that any documentation, this page included,
will eventually become out of date. ATLAS 3.6 and 3.8 provide C and Fortran77
interfaces to these routines:
Since LAPACK has no official C API, ATLAS provides its own in ATLAS/interfaces/lapack/C/src/.
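As a hedged illustration of what a call to one of these clapack_ routines looks like, here is a minimal sketch; the exact prototypes live in ATLAS/include/clapack.h (treat that header as the authoritative reference), and the link line in the comment assumes a typical static install, so adjust paths and library names to your own setup:
#include <clapack.h>   /* ATLAS/include/clapack.h */

/*
 * Hypothetical compile/link line (library list is an assumption for a
 * typical static ATLAS install; check your own lib directory):
 *    gcc -o xpotrf_ex [this file].c -I[ATLAS include dir] \
 *        -L[ATLAS lib dir] -llapack -lcblas -latlas -lm
 */
int main(void)
{
   /* 2x2 symmetric positive definite matrix, column-major storage */
   double A[4] = {4.0, 2.0, 2.0, 3.0};
   int info;

   /* Cholesky factorization of the lower triangle: A = L * L^T */
   info = clapack_dpotrf(CblasColMajor, CblasLower, 2, A, 2);
   return (info == 0) ? 0 : 1;
}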
What header files does ATLAS provide?
The official header file for the C interface to the BLAS is available
as ATLAS/include/cblas.h. The header file for the
C interface to LAPACK is ATLAS/include/clapack.h.
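As a quick, hedged illustration of using the CBLAS header, the sketch below calls cblas_dgemm on small row-major matrices; the compile line in the comment assumes a typical static ATLAS install, so adjust the include/lib paths and library names to your own system:
#include <stdio.h>
#include <cblas.h>   /* ATLAS/include/cblas.h */

/*
 * Hypothetical compile/link line (paths and library names are
 * assumptions for a typical static ATLAS install):
 *    gcc -o xdgemm_ex [this file].c -I[ATLAS include dir] \
 *        -L[ATLAS lib dir] -lcblas -latlas -lm
 */
int main(void)
{
   /* C = 1.0*A*B + 0.0*C for 2x2 row-major matrices */
   double A[4] = {1.0, 2.0, 3.0, 4.0};
   double B[4] = {5.0, 6.0, 7.0, 8.0};
   double C[4] = {0.0, 0.0, 0.0, 0.0};

   cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 2, 2,
               1.0, A, 2, B, 2, 0.0, C, 2);
   printf("C = [ %g %g ; %g %g ]\n", C[0], C[1], C[2], C[3]);
   return 0;
}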
How can I get dynamic (.so) libraries rather than ATLAS's
default static libraries (.a)?
ATLAS 3.8.0 has prototype support for building dynamic libraries,
as described here.
What's the best hardware for running ATLAS/what machine
do you recommend I buy for this kind of work?
This is another question that is pretty much impossible to answer
generally or keep up to date. I need to update this entry once I
get my hands on the new Opteron!
Can I use ATLAS with CLAPACK?
Yes. CLAPACK gives you the option to compile CLAPACK to use the standard
C interface to the BLAS, which ATLAS provides. If you run CLAPACK's included
BLAS tester, be sure to turn off error-exit tests, since it can't properly
test the error exits returned by the CBLAS. ATLAS provides essentially the
same testers in ATLAS/interfaces/blas/C/testing, which do
correctly test the error exits, if that's important to you.
How well optimized are the various routines in
ATLAS?
All of the routines in ATLAS tend to be competitive with the machine-specific
versions for most known architectures. However, ATLAS is not just about
working well on known architectures, but also tries to be optimal for
unknown machines. When it comes to the generality of the optimizations
ATLAS uses, there is a definite hierarchy:
I need routine/architecture X optimized, can you do
it?
If you have a particular operation and/or architecture you really need
optimized, you may want to post a mention of that to
the ATLAS feature request tracker.
We don't do optimization on request, but when we have to choose the next
set of operations to support, user input can certainly influence things.
To maximize your chance of swaying us, you'll want to include what percentage
of your application time is spent in the particular operation, etc.
A quicker way to get action is to do it yourself. ATLAS is open source,
and the developer homepage
explains how you can use ATLAS to optimize various operations.
How is ATLAS funded?
ATLAS is presently funded by NSF CAREER OCI-1149303.
It was supported in the past by NSF CRI CNS-0551504 and the NSF EPSCoR
Cooperative Agreement No. EPS-1003897, with additional support from
the Louisiana Board of Regents. For more details, see
here.
I originally started ATLAS development when I worked at the Innovative Computer Laboratory at the University of Tennessee. I got enough of it working to convince Jack (Dongarra, of LAPACK and BLAS fame) to give the development go-ahead on my own time. After that, ATLAS was written into a variety of grants, but was never funded (to my knowledge) solely on its own grant at ICL. I believe some of its later development took place under the NSF grant "Linear Algebra Algorithms and Tools for Emerging Computing Environments and User Communities", Grant Number ACI-9813362.
Both Antoine and myself (the two full-time ATLAS researchers and developers) left ICL in 2001. After this date, ATLAS work was pretty much entirely unsupported, which slowed development in a massive way. However, in 2003, Advanced Micro Devices funded a year of my graduate studies in return for some Opteron tuning. This allowed me to spend quite a bit more time on ATLAS than previously, and resulted in the release of ATLAS 3.6.0 in that year.
After this, I found very little time to work on ATLAS due to faculty duties until 2006, when I got funding for both research and maintenance, thanks to both NSF and DoD. Details can be found here. ATLAS development has therefore picked up again, and ATLAS 3.8.0 was released in October of 2007.
At present, the main support for ATLAS comes from my CAREER award. The government contract was not transferred to LSU, but Tony Castaldo is supported on it until mid-2014.
Who provides infrastructure support?
Obviously, Sourceforge provides
the ATLAS main page,
including CVS services, tracker, etc. Also
netlib provides
access for a large part of the mathematical community through
ATLAS's original homepage.
As far as machine access for tuning:
Who wrote/contributed to ATLAS?
Note that this question addresses package design and code contribution,
not money,
infrastructure or
testing.
R. Clint Whaley founded the
ATLAS project. After the initial release, he was joined on the project
by Antoine Petitet. Between
them, these two individuals are responsible for 95% of the code in ATLAS,
along with pretty much all of the design. That is not to say that others
have not made substantial contributions, however.
In particular, ATLAS has been designed to allow for outside contribution such that a user can provide only a very small kernel, and thus speed up large portions of the library. Many people have contributed in this manner, and this has resulted in extremely large performance improvements for ATLAS on certain architectures. These contributors (in alphabetical order), and a rough sketch of what they have done, are:
For ATLAS 3.8.0, I did all testing myself, though several developers provided
me with machine access.
There are many places in the search where I could prune things back and
have no effect on performance on any known architecture, but since the
speed is adequate, additional search options are left on in case an unexpected
architectural change is found. I could also utilize more sophisticated
sampling techniques, but these would then need to be validated to work
on the vast array of machines (the present search having been tested
for over seven years, and on innumerable architectures). All this is to say
that speeding up the search is not a bad thing, it just is not that helpful
to the core usage of ATLAS, and so it is not worth the cost/risk of change
at this time. If additional tuning capabilities are added, so that the
search time becomes more critical, then of course the search will be
updated.
For most systems, you can at least
tell ATLAS's configure to use the cycle-accurate
wall timers, which will make all timings much more accurate if you
are on an unloaded machine.
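If I remember the configure options correctly (this is from memory, so treat it as an assumption and check ATLAS's install documentation for the authoritative syntax), this is done by passing timer flags to configure, eg.:
./configure -D c -DPentiumCPS=[your CPU MHz]    (cycle-accurate timer on x86)
./configure -D c -DWALL                         (gettimeofday-based wall timer)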
Ultimately, a search with some real statistics should be built, which
would determine if two timings are statistically different or not, and
also use some statistics to see how many probes are required to get
reliable results. This is the area in which I am most likely to
improve the search, if I ever have time.
Empirical searches, when run on real machines experiencing unrelated load,
are almost never strictly repeatable, even in the best of cases. The default
ATLAS search is far from the best case: the sampling and timing mechanisms
are crude, made to work on the lowest-common-denominator setups, up to and
including embedded systems. So, when run in this mode, the search is designed
to give you a library that isn't bad, but is often far from the best.
To get better results (which are then saved as architectural defaults),
I usually run the search multiple times, and if necessary, intervene by
hand to probe promising transformations. Thus, the architectural
defaults can be thought of as the saved result of several installs plus some
user intervention. Also, the architectural defaults are synergistic with the
default compiler flags, so you want to leave both alone for best results.
The first thing to check is that your library runs about the same as those
that you would get using the architectural defaults. You can determine
this by running make time,
as described here.
If you choose not to use the recommended compiler and architectural defaults,
be sure to follow these directions.
If you suspect your performance
is suboptimal, open up a support request and ask.
Play with different compilers and flags to find combinations that match or
beat the defaults. Be sure to do all the normal
post-install tuning, including tuning
CacheEdge.
Finally, if your install is indeed faster than the arch defaults,
report it.
There are three main reasons why even a true asymptotic speedup from large
blocking factors is a bad idea:
Note that points (2) & (3) are very important: GEMM is one of the most
studied performance kernels in the world not for its own sake, but due
to the wide variety of applications whose performance can be improved by
speeding it up. Thus, speeding up GEMM at the expense of application
performance is something that only someone interested in benchmarking
GEMM (as opposed to building a usable library) would want to do.
Therefore, the ATLAS search limits NB to 80. We occasionally relax this
limit (manually, never blindly in the search) when it is absolutely necessary.
For instance, on SPARCs, a large NB has proven necessary for decent performance,
and on the Pentium 4 (not P4E), the floating point unit does not make use of
the L1 cache, and so we block for the L2. However, in these cases we first
verified that the win is true and substantial, and we then hand-tuned the
cleanup to ameliorate the effects of large NB as best we could. Even so,
these systems can display very bad performance due to point (3) above, and
we actually do not use the best NB for GEMM performance even so, as we
increase it only enough to get adequate asymptotic performance.
Without examining these tradeoffs, you should never increase NB, unless you
are tuning for a large GEMM benchmark.
I believe most of the vendors provide some C interface to LAPACK; I think
most of them just use the same name (eg, F77's DGESV becomes
dgesv), and the scalars become pass-by-value. ATLAS provides
a C interface to the LAPACK routines natively provided by ATLAS, which
is based on the standardized C interface to the BLAS. These routines
are prefixed with clapack_, and their prototypes can be found
in ATLAS/include/clapack.h. ATLAS natively provides only a
handful of LAPACK routines, so if you want to call something that is
not provided here, your best bet is to call the Fortran77 interface.
The good news about this is that, being a standard interface, all LAPACK
libraries should support it in the same way.
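As a hedged sketch of what calling the Fortran77 interface from C looks like: most Unix Fortran compilers pass all arguments by reference and append a trailing underscore to the symbol name, but this mangling is compiler-dependent, so treat the prototype and link line below as assumptions to be checked against your own toolchain:
/*
 * Hedged sketch: calling LAPACK's Fortran77 DGESV from C.  All arguments
 * are passed by reference, and with most Unix Fortran compilers the
 * symbol name gets a trailing underscore (compiler-dependent).
 * Hypothetical link line (library names are assumptions):
 *    gcc -o xf77gesv [this file].c -L[lib dirs] -llapack -lf77blas -latlas -lm
 */
void dgesv_(const int *N, const int *NRHS, double *A, const int *lda,
            int *ipiv, double *B, const int *ldb, int *info);

int main(void)
{
   /* Column-major 2x2 system A*x = b */
   double A[4] = {3.0, 1.0, 1.0, 2.0};
   double B[2] = {9.0, 8.0};
   int ipiv[2], info;
   int N = 2, NRHS = 1, lda = 2, ldb = 2;

   dgesv_(&N, &NRHS, A, &lda, ipiv, B, &ldb, &info);
   return (info == 0) ? 0 : 1;
}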
I understand why. A user has a problem using/installing/understanding
the software, and is understandably frustrated. Only when the frustration
has built up to a great degree is he/she motivated to write to the author.
At that point, the user emotionally feels that the author has done it to
him on purpose, and so, usually without realizing it, attacks the author.
What users should consider is that for an open source developer, support
requests constitute 90% of his contact with his users. If all of this
contact is negative, it does not lead to a desire on the author's part
to provide a lot more support, and in extreme cases, probably causes
developers to quit the project.
My guess is I get probably 2 or 3 messages per year that have something
positive to say about the software that I spend enormous amounts of
my life developing, maintaining and supporting, and provide to the
user for free. For every positive message, I get many many insulting
or denigrating replies. It always adds to the anger quotient to feel
that the user thinks so little of you that he thinks nothing of insulting
you while using your software and asking for your help, even though
you don't know him, he's not intending to do anything for you, and you
are providing this stuff for free.
So, I have added this discourse to the FAQ in the hopes of stimulating
users to consider the author's side of these exchanges.
Another thing to keep in mind when you are writing your mail is that it
is entirely possible the question you are asking is answered in the
documentation. If I get several questions about the same topic, I usually
wind up creating a FAQ or errata entry about it. In a perfect world, this
would mean you would read the docs and not have to ask it. Even assiduously
trying to do so, it is easy to miss the relevant doc. So, it is a good
idea to couch your language so that the author is not tempted to reply:
RTFM, @ASD@!. If a user sends in a mail
Hey, I can't get these files to link, any idea what's wrong?,
I don't mind giving him the link to the errata entry that's been there
since version 0.0.1 of the software. However, the guy with the more
common Your libraries do not work style of message simultaneously
demotivates me for work on the project, pisses me off, and generates a great
desire to return the anger to him with a good old RTFM diatribe.
Even if you are not going to read through all the docs, if you are submitting
a support request, at least take the time to read the
FAQ entry on how
to submit a support request. Almost half my users send in a message that
translates to I haven't read the docs, and they post it to the "bugs"
list, which is reserved for developer-verified bugs.
Let me show the absolute best kind of support request I get:
As I said, I get maybe one or two messages of this type a year. I get
quite a few absolutely neutral messages, which are OK as well. They don't
imply that the problem is necessarily in my software or my brain, but
rather just report that a problem has been encountered. They go more like:
Another thing that users do that drives me crazy is to lecture the author on how he should have done things.
Keep in mind that the author of a package probably considers himself more
knowledgeable than the majority of his users on issues closely related to
his project, and so it may grate upon him
to have users tell him that he has done things the wrong way, or doesn't
understand how "modern" libraries work, etc. So, I would avoid phrasings like
can you fix the insane way config works? or
how about doing this in the correct way (yes, I really get messages
like this).
Another important thing is to understand the context the author is working in.
If you see a piece of code that is truly horribly written, or works in
a particularly awkward way, you may feel justified in arguing that the
author has simply done it all wrong, should call it a bug, and fix it!
However, the author is not responsible only for the couple hundred lines
of code you are examining. In my case, I am responsible for roughly half
a million lines of code, as well as continuing development. Therefore, I may
actually agree that a particular way of doing things is sub-optimal, but if
it is well-tested to work on the enormous number of platforms ATLAS runs
under, I will often decide to leave it alone, in order to concentrate on
more important concerns. Even in the case where the author agrees with you
that the code is a complete POS, a little tact on your part will go a long
way. Understanding that the author is not always free to rewrite the section
of code of greatest interest to you will go even further.
Well, I am not confident that users will read this, but I am confident that
I will use its URL in replying to a lot of future user requests, so I think
this time away from development is well spent!
Cheers,
Is ATLAS thread safe?
It should be completely safe to call any ATLAS routine from a threaded code.
There are no global variables, or other shared information between routines.
Probably the best idea is to say "yes" to threading in config, even if you wish
to do the threading yourself. That way, the ATLAS lib will be compiled with
the threading flags. Then, simply link to the serial interface so that ATLAS
doesn't do the threading. If you want ATLAS to do the threading as well,
simply link to the threaded interface.Can I vary the number of threads ATLAS uses dynamically?
Can I vary the number of threads ATLAS uses dynamically?
No. The maximum number of threads to use is determined at compile time. ATLAS will never use more than this, but may use fewer if the problem sizes are too small to get speedup from the additional parallelism.
What's the deal with the RHS in the row-major factorization/solves?
Most users are confused by the row major factorization and related solves.
The right-hand side vectors are probably the biggest source of confusion.
The RHS array does not represent a matrix in the mathematical sense, it is
instead a pasting together of the various RHS into one array for calling
convenience. As such, RHS vectors are always stored contiguously, regardless
of the row/col major that is chosen. This means that ldb/ldx is always
independent of NRHS, and dependent on N, regardless of the row/col major
setting. A hedged illustration of this layout is sketched below.
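Here is a minimal sketch, assuming the clapack_dgesv prototype from ATLAS/include/clapack.h (treat that header as the authoritative reference for the exact prototype), showing why ldb follows N rather than NRHS even in the row-major case:
#include <clapack.h>   /* ATLAS/include/clapack.h */

int main(void)
{
   /* Row-major 3x3 system A*X = B with NRHS=2 right-hand sides */
   double A[9] = {4.0, 1.0, 0.0,
                  1.0, 4.0, 1.0,
                  0.0, 1.0, 4.0};
   /*
    * Each RHS vector is stored contiguously and the vectors are simply
    * pasted together: B = {RHS0(0..2), RHS1(0..2)}.  So ldb follows N
    * (ldb = 3 here), not NRHS, even though A itself is row-major.
    */
   double B[6] = {1.0, 2.0, 3.0,    /* RHS 0 */
                  4.0, 5.0, 6.0};   /* RHS 1 */
   int ipiv[3], info;

   info = clapack_dgesv(CblasRowMajor, 3, 2, A, 3, ipiv, B, 3);
   return (info == 0) ? 0 : 1;
}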
Why don't you speedup/improve ATLAS's search?
There are several questions here, handled in their own sub-questions:
As you will see by reading each, only the last of these would actually be
helpful for ATLAS's main use, and it has stayed on the backburner for quite
some time because it's almost always more useful to expand ATLAS's other
capabilities. Note that the majority of users should use the provided
architectural defaults, thus avoiding the search altogether. The search
is there only for exploration by the expert user (in a user-controlled
fashion), or to enable a naive user to get an adequate library on a
truly new architecture (in its fully automatic mode).
Why don't you improve the type of ATLAS's search?
There has been quite a bit of research on fast search techniques. ATLAS
uses a relaxed 1-D line search, where the `relaxed' comes from the fact
that interacting transforms are usually handled by restricted 2/3-D searches.
This is a very basic search technique, and many people wonder why a more
advanced algorithm, such as hill climbing, simulated annealing, or a
genetic algorithm, isn't used. The real answer is that it is overkill.
Because I understand the transformations ATLAS attempts, and how they
interact, I am able to target the relaxed line search appropriately.
More advanced techniques are more appropriate when you do not understand
good start values for the transforms, and know less about the
interactions between optimizations and how to resolve them. The modified line
search has some nice properties: it is easily guided by hand by the
expert user in order to explore spaces more fully, and it is easy to
understand and maintain.
Why don't you improve the speed of ATLAS's search?
I occasionally get suggestions on how to speed up ATLAS's empirical search.
I know of a multitude of ways that I could do this. In my view, however, they
are not worth the effort/risk at the present time. Most users should use the
architectural defaults, skipping the search altogether. The only speed
criterion that went into the search design was that it needed to be tolerable.
The main purpose of ATLAS is to provide an optimized library, and once the
search could produce that in on the order of a day, that seemed good enough.
Many architectures are much faster than that, of course.
Why don't you improve the accuracy of ATLAS's search?
This is the search problem that I am most tempted to fix. The present
search is mainly designed to be usable by an installer with no system
privileges, who must install on stock systems that are experiencing
unrelated load during the installation. Thus, by default ATLAS uses
CPU time for all non-threaded installation decisions, which is extremely
inaccurate. This often leads to the search going awry (i.e., failing
to find a more optimal kernel), which is why the architectural defaults
are so important.
What's the deal with the architectural defaults?
I split this into several separate questions:
Is using the architectural defaults important, rather
than doing my own search?
The short answer is definitely. As described elsewhere,
the search is designed to be used only when architectural defaults are
unavailable or have become non-optimal due to a compiler change. To understand
this, you need to understand the nature of empirical searches in general.
When should I not use architectural defaults?
As previously mentioned, architectural defaults are usually the result
of several guided installations, and thus represent best of breed installs.
They can become a barrier to performance occasionally, particularly when
a compiler goes through a major release. For instance, if ATLAS 3.8.0's
architectural defaults are for gcc 4.2 and you are presently using 5.1, it is
possible that things have changed enough to require new defaults; and
if you are using a bad compiler like gcc 4.1 or an old one like gcc 3,
you will almost certainly not want to use the architectural defaults.
If I don't use architectural defaults, how can I get better performance?
First, make sure your defaults are better than the architectural defaults
by comparing the timings of a default install against your search install,
as described here.
Why does the search limit NB to 80?
The default ATLAS search limits GEMM's blocking factor to at most 80.
On systems where larger NB actually blocks for the L2, blocking for the
L2 prevents ATLAS from using its multilevel blocking parameter,
CacheEdge.
In this case, larger blockings may result in superior kernel timings (which
do no L2 blocking), but if an L1-contained NB is used, similar or superior
performance may be obtained in full GEMM with a tuned CacheEdge. In this
case, the GEMM speedup is illusory, but the application and small-case gemm
slowdown (discussed below) is quite real. On machines with large L1, or
very fast L2, GEMM may indeed get an asymptotic speedup from larger blocking
factors, but it is still almost always a bad idea, as outlined below.
Does ATLAS provide a standard C interface to LAPACK?
The short answer is no, and neither does anyone else. As far as I know, there
has been no official standardization of a C interface to LAPACK. In the
absence of a standard, each library is free to do things differently. For
instance, netlib provides something called clapack, which is the result
of running LAPACK through f2c on a particular platform. This means all
parameters must be passed by reference, names have an underscore appended,
etc. ATLAS does not support this ad hoc interface, though you can use
ATLAS to provide the BLAS for netlib clapack,
as mentioned
here.
Why are you such a jerk when answering user questions?
It is one of the unfortunate realities of open source development
that one of the few rewards that it should supply turns out in practice to
be a string of disparagement. I am talking, of course, about corresponding
with the people who are using the software you have produced and supported,
free of charge to them.
AKA: how can I help you feel good about providing me with support?
Subject: Problems with 3.5.8
Hi,
I've been using it in my chemistry research in order to do XXX for a couple
of years now, and ATLAS is a great piece of software! However, I'm
now having a problem getting the newest release to work.
I'm already aching to help this guy. Not only has he indicated he appreciates
what I've produced, he's given me an idea of what he is using it for (something
I am always interested in).
He's also not prejudged that the software or I am wrong. He's having
a problem, which he later describes in detail, including the error report.
If it's a bug, I'll tell him so and post a fix. If it's a user error, I
won't mind letting him know the fix, and will feel more like helping him
again sometime.
Subject : matvec problems
Whenever I call matvec with N=200, my install seems to get worse performance
than the reference BLAS. Any idea why? I include my timer below.
Thanks,
My Name
This is a simple request for help, that does allow for the interpretation
that there might be a problem with the installation, or perhaps a timer error
on the user's part, as well as the idea that ATLAS is screwed up. This is
good, because the majority of user requests turn out not to be errors in
ATLAS. Nonetheless, here is a more typical phrasing:
Subject : error in matvec
Your matrix-vector product has an error. For N=200, it is slower than the
reference BLAS! Please fix this.
Please keep in mind that user error is more common than package error, so
keep your message open to this interpretation. Do not use the phrase
"there is a bug in your software" (which half my support requests
use), unless you are absolutely confident it is a bug, and have verified it
by finding the problem in the actual code. Otherwise, the chance is too
great that the error is in user understanding, and you have just implied
the author screwed something up.
Clint