Rambling on about architectural defaults

A frequently asked question about architectural defaults is why any timings are still necessary when using them. The standard architectural defaults rarely describe everything a search discovers; instead they record only those values that we feel sure will not vary a great deal. For instance, for many machines the kernels to use, etc., are fully specified, but CacheEdge is not. CacheEdge depends on your L2 cache size, which varies between revisions of an architecture, so it is left unspecified. This allows the install to tune this variable parameter itself while still skipping the search over less variable things (e.g., if the L1 cache or the FPUs change, this is usually a new architecture, not a revision of an old one).
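The idea of partially specified defaults can be illustrated with a minimal sketch. This is not ATLAS code: the function `resolve_parameters`, the dictionary layout, and the placeholder values are all hypothetical, chosen only to show that a value present in the defaults is reused as-is, while a missing one (here CacheEdge) still has to be found by a timed search.

```python
def resolve_parameters(defaults, searches):
    """Combine stored defaults with searches for anything left unspecified.

    defaults: dict mapping parameter name -> stored value (hypothetical layout)
    searches: dict mapping parameter name -> zero-argument function that
              runs the (expensive, timed) search and returns the tuned value
    Returns (params, searched), where `searched` lists the names that
    actually had to be tuned.
    """
    params = {}
    searched = []
    for name, run_search in searches.items():
        if name in defaults:
            # Specified by the architectural defaults: skip the search.
            params[name] = defaults[name]
        else:
            # Left unspecified on purpose: tune it on this machine.
            params[name] = run_search()
            searched.append(name)
    return params, searched


# Hypothetical example: the kernel choice is fixed by the defaults,
# while CacheEdge is deliberately absent so that it gets tuned.
example_defaults = {"kernel": "stored-kernel-choice"}
example_searches = {
    "kernel": lambda: "searched-kernel-choice",
    "CacheEdge": lambda: 262144,  # placeholder result of a timed search
}
example_params, example_searched = resolve_parameters(
    example_defaults, example_searches
)
```

In this sketch only `CacheEdge` ends up in `example_searched`, which mirrors why an install that uses architectural defaults still performs some timings.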

That is the theoretical reason why the defaults should not cover every discovered item. In practice, however, ATLAS presently also times the kernels so that it can produce a comprehensive SUMMARY.LOG; these timings could be skipped if support for doing so were added to the ATLAS install process.

Architectural defaults have some weaknesses. One of the main ones is that they can go out of date and cause a slowdown. A major way this happens is through compiler changes. For instance, gcc 3.0 produces completely different (and inferior) x86 code than the 2.x series, and gcc 4.0 was similarly worse than the later gcc 3 releases. Almost all architectural defaults in ATLAS 3.10 were compiled with gcc 4.7.0.

Any time a different compiler is used, the architectural defaults become suspect. With a truly inferior compiler (like gcc 3.0, 4.0, or 4.1), there is no way to get good performance. Even so, at least some problems can be worked around by letting ATLAS adapt itself to the new compiler, and using architectural defaults prevents this adaptation.
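The rule of thumb above can be written down as a small check. This is only a sketch under stated assumptions: ATLAS is not claimed to perform such a check, and the function names and the (name, version) pair layout are hypothetical. It flags the defaults as suspect whenever the compiler used for the install differs in any way from the one the defaults were built with.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.7.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def defaults_are_suspect(built_with, installing_with):
    """Return True if the two compilers differ in name or version.

    Each argument is a hypothetical (compiler_name, version_string) pair,
    for example ("gcc", "4.7.0").
    """
    built_name, built_version = built_with
    new_name, new_version = installing_with
    return (
        built_name != new_name
        or parse_version(built_version) != parse_version(new_version)
    )


# Defaults built with gcc 4.7.0, install attempted with gcc 4.1.2
# (the second version number is an illustrative value only).
suspect = defaults_are_suspect(("gcc", "4.7.0"), ("gcc", "4.1.2"))
```

A mismatch does not prove the defaults are slow; it only signals that letting the search run might be worth the extra install time.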

Clint Whaley 2012-07-10