A guide to the tools (to build your own)

I have written a set of generic tools for manipulating the output of ATLAS's timers; you can use and extend them if you want to autotime fancier or different things. Every tool prints usage information if you pass -help on the command line. By default, each tool reads from stdin and writes to stdout, so the tools can be piped into each other. Each tool does one very simple thing, and the idea is that you build a pipeline of them to do useful work.

To build your own tools, examine SRCdir/include/atlas_tvec.h, which contains a host of prewritten routines and data structures meant to make tool building easy.

All the tools I have written allow you to keep only certain vectors of data (each vector corresponds to a column of the timer output). As an example, say we ran the following line:

   c2d>./xdmmtst_atl -F 120 -N 10 100 10 -T 0 -# 3 > timer.out

This will use gemmtst.c to time all square problems between 10 and 100 in steps of 10, without doing any testing, forcing at least 120MFLOPS of computation for timing accuracy, with three repetitions.
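
The arithmetic behind that command line can be sketched in a few lines of C. This is only an illustration, not code from the timer: the function names are hypothetical, the 120 figure is read as a minimum amount of floating-point work (in millions of flops) per timing, and a real square GEMM of order $N$ is counted as $2N^3$ flops.

```c
#include <assert.h>
#include <stddef.h>

/* Number of problem sizes visited by a (first, last, step) sweep,
 * e.g. 10, 20, ..., 100 for the arguments 10 100 10. */
size_t sweep_count(int first, int last, int step)
{
   assert(step > 0 && first <= last);
   return (size_t)((last - first) / step) + 1;
}

/* Flop count of one real square matrix multiply of order n. */
double gemm_flops(int n)
{
   return 2.0 * n * n * n;
}

/* ASSUMPTION: the flop floor is met by repeating the call; this returns
 * the smallest repeat count whose combined work reaches min_mflop. */
long calls_needed(int n, double min_mflop)
{
   double target = min_mflop * 1.0e6;
   double per_call = gemm_flops(n);
   long calls = (long)(target / per_call);

   if ((double)calls * per_call < target)
      calls++;
   return calls < 1 ? 1 : calls;
}
```

Under these assumptions the sweep visits `sweep_count(10, 100, 10)` problem sizes, and with three repetitions each the output holds three times that many timings. Small problems need many calls to reach the flop floor (`calls_needed(10, 120.0)`), while large ones need few (`calls_needed(100, 120.0)`).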

Here are the tools I have written so far:

: Reads a timer output file and produces a standard timing vectors (tvec) file that can be read by the routines provided in atlas_tvec.h and by all downstream tools. Example usage:
   c2d>./xatl2tvec -# 3
: Takes a tvec file with repetition timings and reduces them to single timings, adding simple statistics such as min, max, and average.
: Takes multiple vector files and combines them into one file for later comparison. Renames vectors as necessary by appending _# to repeated names coming from later files. You can specify that some vectors get this statistical treatment, while for other vectors only the first one found is used.
: Takes a standard tvec file and produces a standard ploticus data file from it.
: Takes two standard tvecs that contain separate runs of the same data over non-overlapping ranges, and combines them into one vector. E.g., you do one run with $N=100, 200, 300$ and a second with $N=1000, 5000, 8000$. This routine lets you combine these $N$ ranges into one, so that all results can be charted in one graph. This can be done repeatedly to merge any number of runs together.
: Recasts named tvecs as a percentage of a baseline. Can also be used to compute speedup rather than percentage by adding the -m 1.0 flag.
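
To make two of the operations above concrete, here is a minimal C sketch of reducing repeated timings to min/max/average and of recasting a vector against a baseline. It does not use atlas_tvec.h and is not the code of the actual tools: the type and function names are hypothetical, a vector is treated as a plain array of doubles, and the -m value is assumed to be a multiplier applied to the ratio (100.0 for a percentage, 1.0 for speedup).

```c
#include <assert.h>
#include <stddef.h>

/* Summary of one group of repeated timings (hypothetical type). */
typedef struct {
   double min, max, avg;
} rep_stats;

/* Reduce nrep repeated timings of the same problem to simple statistics. */
rep_stats reduce_reps(const double *t, size_t nrep)
{
   rep_stats s;
   double sum;
   size_t i;

   assert(t != NULL && nrep > 0);
   s.min = s.max = sum = t[0];
   for (i = 1; i < nrep; i++) {
      if (t[i] < s.min) s.min = t[i];
      if (t[i] > s.max) s.max = t[i];
      sum += t[i];
   }
   s.avg = sum / (double)nrep;
   return s;
}

/* Recast v against base elementwise: out[i] = mul * v[i] / base[i].
 * ASSUMPTION: mul = 100.0 gives percent of baseline, mul = 1.0 speedup. */
void scale_to_baseline(const double *v, const double *base, size_t n,
                       double mul, double *out)
{
   size_t i;

   assert(v != NULL && base != NULL && out != NULL);
   for (i = 0; i < n; i++) {
      assert(base[i] != 0.0);   /* a zero baseline has no meaningful ratio */
      out[i] = mul * v[i] / base[i];
   }
}
```

With a vector of MFLOPS results `v` and a baseline `base`, calling `scale_to_baseline(v, base, n, 100.0, out)` would give each entry as a percentage of the baseline, and passing 1.0 instead would give the plain ratio.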

To see how these tools can be used together, you can trace the dependence chain of any of the charts that are autobuilt, as explained in §[*].

R. Clint Whaley 2016-07-28