Building Generic x86 libraries

Many users ask how ATLAS can be used to build libraries that will run on all x86 platforms. In general, this is a bad idea: ATLAS gets its speed by specializing for particular platforms, so the more generic a library is, the less performance it will achieve! Note that libraries like MKL can do well across many platforms by using fat binaries, in which each kernel routine has been separately tuned for many different platforms, and the library queries something like CPUID at run time to determine which sublibrary to call. ATLAS does not have the ability to build fat libraries.
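To make the fat-binary idea concrete, here is a minimal sketch of run-time kernel selection. It is an illustration only, not how MKL is actually implemented: the kernel names, the `KERNELS` table, and `select_kernel` are hypothetical, and the set of CPU features is passed in by hand where a real library would obtain it from something like CPUID.

```python
# Hypothetical kernels. Each stands in for a routine that would have been
# tuned separately for one ISA level; here they all compute the same dot
# product so that the sketch stays runnable.
def dot_x87(x, y):
    return sum(a * b for a, b in zip(x, y))

def dot_sse(x, y):
    return sum(a * b for a, b in zip(x, y))

def dot_avx(x, y):
    return sum(a * b for a, b in zip(x, y))

# Ordered from most to least specialized; x87 is the portable fallback.
KERNELS = [("avx", dot_avx), ("sse", dot_sse), ("x87", dot_x87)]

def select_kernel(cpu_features):
    """Return the most specialized kernel whose ISA appears in cpu_features."""
    for isa, kernel in KERNELS:
        if isa in cpu_features:
            return kernel
    raise RuntimeError("no usable kernel for this CPU")

# Assumption: a real library would fill this set from a CPUID-style query.
dot = select_kernel({"x87", "sse"})
print(dot.__name__, dot([1.0, 2.0], [3.0, 4.0]))
```

An ATLAS build contains only one tuned variant of each kernel, so no such selection happens at run time.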

So users who want generic x86 libraries will definitely lose performance with ATLAS. Many system administrators have asked for this feature, however, so I have added it. The idea is to give you libraries that run on a variety of platforms and outperform the reference BLAS, even though their percentage of peak may be woefully low. You do this by overriding ATLAS's architecture detection and manually telling configure to use some generic architectural defaults that have been created, as described in the following paragraphs.

Never use these libraries unless this portability is absolutely required. They must use portable settings for blocking, for instance, which means that on many platforms they will use only a fraction of the actual cache, causing large performance drops. Even worse, peak performance may be reduced by as much as a factor of 8 because the proper ISA extension is not used. The most portable ISA uses only the x87 unit, which has a much lower peak rate on most modern machines (e.g., an Intel Sandy Bridge can do 16 flops/cycle using AVX in single precision, but only 8 flops/cycle using SSE, and only 2 flops/cycle using the x87 unit).
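The factor of 8 follows directly from the Sandy Bridge figures quoted above. The short sketch below works through that arithmetic; the flops/cycle values come from the text, while the function names are made up for illustration.

```python
# Single-precision peak rates (flops/cycle) quoted in the text for an
# Intel Sandy Bridge core.
PEAK_FLOPS_PER_CYCLE = {"AVX": 16, "SSE": 8, "x87": 2}

def slowdown_vs_best(isa, peaks=PEAK_FLOPS_PER_CYCLE):
    """How many times lower the peak of `isa` is than the best available ISA."""
    return max(peaks.values()) / peaks[isa]

def fraction_of_best(isa, peaks=PEAK_FLOPS_PER_CYCLE):
    """Peak of `isa` as a fraction of the best available ISA's peak."""
    return peaks[isa] / max(peaks.values())

for isa in PEAK_FLOPS_PER_CYCLE:
    print(f"{isa}: {fraction_of_best(isa):.1%} of the best peak, "
          f"{slowdown_vs_best(isa):.0f}x lower")
```

These are upper bounds on achievable speed. The portable blocking settings reduce performance further, so a generic x87 library will normally fall well short of even its own lower peak.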



R. Clint Whaley 2016-07-28