spike(1)

NAME
spike - Performs code optimization after linking a program
SYNOPSIS
spike binary_file [options...]
OPTIONS
Determines the frequency cutoff point for the alignment of profile-based
basic blocks. Valid threshold values are floating-point numbers between
0 and 1, inclusive. A higher number means more basic blocks will be
quadword aligned. [Default: 0.95]

-arch option
Specifies the version of the Alpha architecture for which to generate
instructions. See cc(1) for information about the possible values of
option and for a comparison of -arch and -tune. The default option is
ev4. Spike will accept binaries that contain instructions that require
an architectural extension not present in the processor specified by
-arch. However, Spike will assume that the instructions are guarded by
code that prevents their execution on some systems and will restrict
some optimizations. For best results, use an appropriate -arch option.

Specifies new data segment starting address, where n is a 64-bit
hexadecimal number without the leading “0x”.

-dcpi_prefetch
Enables DCPI profile-based prefetching (only works with feedback
profile).

Controls DCPI profile-based prefetching. The number (n) describes the
minimum latency (in cycles) that a load must have to become a candidate
for prefetching. [Default: 50]

Controls DCPI profile-based prefetching. Valid threshold values are
floating-point numbers between 0 and 1, inclusive. The number is used to
determine the frequency cutoff for loads to be prefetched; a higher
number means more loads will be prefetched. [Default: 0.75]

-feedback file
Causes Spike to use the feedback database stored in file, where file is
the name of the input executable. This database is created by first
compiling the program with the -feedback option (for example, cc
-feedback prog) and then instrumenting and running the program with the
pixie -update or prof -pixie -update command (see cc(1), pixie(1), and
prof(1)).

-fb file
Causes Spike to use file.Addrs (basic block addresses file) and
file.Counts (basic block counts file) for profile-based optimization.
These files are produced by the pixie tool (see pixie(1) and prof(1)).

Prints a short help screen.

Use this option when applying Spike to the UNIX kernel (vmunix). Spike
can be applied only to V5.1 or later kernels.

Reduces the number of padding nops inserted into the code to align
instructions. The alignment usually makes the code run faster, but makes
the code larger, which can cause more instruction cache misses.

Disables basic block chaining, which arranges code so that the fall
through path is the commonly taken path.

Invokes the original Spike and not the DTK version.

Disables procedure ordering.

Disables code layout optimization that splits procedures into multiple
parts.

Specifies Spike's optimization level. These flags are provided for
compatibility with other compilation tools and currently have no effect.

-o file
Names the optimized binary output file. The default file name is a.out.

-obsolete_linkerdefs
Specifies that obsolete linker-defined symbols are to be ignored. (See
RESTRICTIONS.)

Determines the frequency cutoff point for profile-based optimizations.
Valid threshold values are floating-point numbers between 0 and 1,
inclusive. A higher number means more routines will be optimized.
[Default: 0.99]

-splitThresh threshold
Determines the frequency cutoff point for profile-based routine
splitting. Valid threshold values are floating-point numbers between 0
and 1, inclusive. A higher number means more code is considered hot and
less code is considered cold. [Default: 0.95]

-stride_prefetch
Enables stride prefetching based on a profile of data-address strides
collected by using the Pixie tool. This optimization mainly targets
programs in which many data cache misses occur inside loops.

Keeps unreachable routines from being deleted by Spike if they have an
entry in the symbol table.

Specifies new text segment starting address, where n is a 64-bit
hexadecimal number without the leading “0x”.

-tune option
Instructs the optimizer to tune the application for a specific version
of the Alpha architecture. See cc(1) for information about the possible
values of option and for a comparison of -tune and -arch. The default
option is ev6.

Displays the version number of Spike.

Enables extra warning messages.
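The threshold options above all take a frequency cutoff between 0 and 1, but this page does not say how the cutoff is applied. One plausible reading, shown here purely as an illustration, is that items are ranked by execution count and the hottest ones are selected until they account for the given fraction of all executions. The function name select_hot and the sample counts are hypothetical, and the sketch is not a description of Spike's actual implementation.

```python
def select_hot(counts, threshold):
    """Return the hottest items that together cover at least `threshold`
    of the total execution count (an assumed interpretation of the cutoff)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1, inclusive")
    total = sum(counts.values())
    hot, covered = [], 0
    # Visit items from most to least frequently executed.
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        if total == 0 or covered >= threshold * total:
            break
        hot.append(name)
        covered += count
    return hot


# Hypothetical per-routine execution counts.
counts = {"main": 10, "parse": 700, "emit": 250, "usage": 40}
print(select_hot(counts, 0.95))  # ['parse', 'emit']
print(select_hot(counts, 0.99))  # ['parse', 'emit', 'usage']
```

Under this reading, raising the threshold from 0.95 to 0.99 pulls one more routine into the hot set, which matches the statement that a higher number means more routines are optimized.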
OPERANDS
Name of the binary file to which Spike is to be applied.
DESCRIPTION
Spike is a tool for performing code optimization after linking. It is a
replacement for om and does similar optimizations. Because it can
operate on an entire program, Spike is able to do optimizations that
cannot be done by the compiler.
Some of the optimizations that Spike performs are code layout, deleting
unreachable code, and optimization of address computations. Spike is
most effective when it uses profile information to guide optimization.
Spike can process binaries linked on Tru64 UNIX (formerly Digital UNIX)
Version 4.0 or later systems. Binaries that are linked on Version 5.1
or later systems contain information that allows Spike to do additional
optimization.
You can use Spike in two ways:

By applying the spike command to a binary file after compilation.

As part of the compilation process, by specifying the -spike option with
the cc command (or the cxx, f77, or f90 command, if the associated
compiler is installed).

The -spike option is more convenient when you are not using profile
information (Example 2), or you are using profile information in the
compiler, too (Example 3). The spike command is more convenient if you
do not want to relink the executable (Example 1) or you are using
profile information after compilation (Examples 4 and 5).
All spike command options can be passed directly to the cc command's
-spike option by using the cc command's -WS option. Example 6 shows the
syntax.
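Judging from Example 6, the -WS argument is formed by joining the spike options and their values with commas after the -WS prefix. A minimal sketch of that joining step follows. The helper name ws_option is hypothetical, and the format is inferred from that single example rather than from a full specification in cc(1).

```python
def ws_option(spike_args):
    """Join spike command options into one cc -WS argument, comma separated."""
    for arg in spike_args:
        # A comma or space inside an argument would be ambiguous in this form.
        if "," in arg or " " in arg:
            raise ValueError(f"cannot pass {arg!r} through -WS")
    return ",".join(["-WS", *spike_args])


# Reproduces the -WS argument used in Example 6.
print(ws_option(["-splitThresh", ".999", "-noaggressiveAlign"]))
# prints: -WS,-splitThresh,.999,-noaggressiveAlign
```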
RESTRICTIONS
Spike cannot process the following images:

Images that have been stripped.

Images that contain certain obsolete linker-defined symbols and
structures such as RPDR tables (see Section 2.3.7, Special Symbols, of
the Object File/Symbol Table Format Specification). This restriction can
be overridden by using the -obsolete_linkerdefs option, but the
resulting binary files may be incorrect, so use it with caution.

Images that modify the text section at run time.

Using cord, atom, pixie, hiprof, or third on an image that has been
processed with Spike is unsupported.
NOTES
Spike tries to update the symbol table in the binary so that the
optimized binary can be debugged. As with other compiler optimizations,
there may be some situations where the debugger may not be able to
properly report the current location in the program or display the
values of variables.
If Spike divides a procedure into multiple disjoint parts, the main body
will keep the original procedure name, but the other parts will have
names that are the original name with _cold_n (where n is a unique
number) appended to the end.
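When post-processing symbol names from an optimized binary (for example, in a profile report), it can help to map the split-off parts back to their original procedure. The sketch below assumes only the _cold_n naming rule stated above. The function name original_procedure and the sample symbols are hypothetical.

```python
import re

# Matches names of the form <original>_cold_<n>, as described in NOTES.
_COLD_PART = re.compile(r"^(?P<base>.+)_cold_(?P<n>\d+)$")


def original_procedure(symbol):
    """Return (original_name, part_number); part_number is None for the main body."""
    match = _COLD_PART.match(symbol)
    if match is None:
        return symbol, None
    return match.group("base"), int(match.group("n"))


print(original_procedure("parse_input_cold_2"))  # ('parse_input', 2)
print(original_procedure("parse_input"))         # ('parse_input', None)
```

A procedure whose source-level name already ends in _cold_ followed by digits would be misattributed by this pattern, so treat the result as a heuristic.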
EXAMPLES
1. In the following example, Spike is applied to the binary my_prog,
producing the optimized output file prog1.opt:

    % spike my_prog -o prog1.opt

2. In the following example, Spike is applied during compilation with
the cc command's -spike option:

    % cc -c file1.c
    % cc -o prog3 file1.

The first command line creates the object file file1.o. The second
command line links file1.o into an executable image and uses Spike to
optimize the executable image.

3. The following example shows how to optimize a program, prog, by first
compiling it with the -feedback option, then merging profiling
statistics from two instrumented runs of the program, and then compiling
it with the -spike and -feedback options so that the feedback
information stored in the executable image is used by the compiler and
Spike:

    % cc -feedback prog -o prog *.c
    % pixie -pids prog
    % prog.pixie (input set 1)
    % prog.pixie (input set 2)
    % prof -pixie -update prog prog.Counts.*
    % cc -spike -feedback prog -o prog *.c

The first compilation produces an augmented executable image that will
later accept feedback information.

The pixie command creates an instrumented program (prog.pixie), which is
then run twice. The -pids option adds the process ID of each test run to
the name of the profiling data file produced (for example,
prog.Counts.371 and prog.Counts.422).

The prof -pixie command merges the two data files. The -update option
updates the executable image, prog, with the combined information.

The program is compiled with the -spike and -feedback options so the
feedback information stored in the executable image is used by the
compiler and Spike.

4. The following example shows how to optimize a program, prog, by first
compiling it with the -feedback option, then merging profiling
statistics from two instrumented runs of the program, and then applying
the spike -feedback command to use the feedback information stored in
the executable image:

    % cc -feedback prog -o prog *.c
    % pixie -pids prog
    % prog.pixie (input set 1)
    % prog.pixie (input set 2)
    % prof -pixie -update prog prog.Counts.*
    % spike prog -feedback prog -o prog.opt

As in the previous example, the first compilation produces an augmented
executable image. The instrumented program is run twice, producing a
uniquely named data file each time. The prof -pixie -update command
merges the two data files and updates the executable image with the
combined information.

The spike -feedback command uses the combined profiling information to
produce the optimized output file prog.opt.

5. The following example shows how to optimize a program, prog, by
merging profiling statistics from two instrumented runs of the program,
then applying the spike -fb command to use the feedback information in
the basic block addresses and counts files:

    % cc prog -o prog *.c
    % pixie -pids prog
    % prog.pixie (input set 1)
    % prog.pixie (input set 2)
    % prof -pixie -merge prog.Counts prog prog.Addrs prog.Counts.*
    % spike prog -fb prog -o prog.opt

The first compilation produces a normal executable image. As in the
previous example, the instrumented program is run twice, producing a
uniquely named data file each time.

The prof -pixie -merge command merges the two data files into one
combined prog.Counts file.

The spike -fb command uses the information in prog.Addrs and prog.Counts
to produce the optimized output file prog.opt.

The method in Example 4 is preferred. You should use the method in
Example 5 only if you cannot compile with the -feedback option, which
uses feedback information stored in the executable image.

6. The following example shows the syntax for passing spike command
options to the cc command's -spike option by using the cc command's -WS
option:

    % cc -spike -feedback prog -o prog *.c \
          -WS,-splitThresh,.999,-noaggressiveAlign

7. The following example shows how to optimize a program, prog, using
profiles obtained by using the DCPI profiler:

    % mkdir db                    # create profile directory
    % dcpid db                    # start dcpi daemon
    % ./prog                      # run your program
    % dcpiquit                    # stop dcpi daemon
    % dcpi2bb -make_bbdb -counts -pm all -conf_low -db db prog
                                  # store feedback information in the binary
    % spike prog -feedback prog   # spike your program utilizing feedback

8. The following example is similar to the previous one, but it contains
three modifications for DCPI-based prefetching:

    % mkdir db                    # create profile directory
    % dcpid -vtrace /usr/lib/dcpi/vp-ldlatency.so db
                                  # start dcpi daemon
    % ./prog                      # run your program
    % dcpiquit                    # stop dcpi daemon
    % dcpi2bb -make_bbdb -counts -pm all -conf_low -load_lat -db db prog
                                  # store feedback information in the binary
    % spike prog -dcpi_prefetch -feedback prog
                                  # spike your program utilizing feedback

9. The following example demonstrates how to perform
stride prefetching:

First, instrument an executable image (prog) for profiling address
strides by using the following command:

    % pixie -stats dstride prog   # Step (a): instrumentation

This command creates an instrumented program (prog.pixie).

Second, run the instrumented program with the input intended for
training purposes:

    % prog.pixie input            # Step (b): stride profiling

This command generates a profile of address strides, which is stored in
the file prog.Counts.

Finally, invoke Spike to insert stride prefetches:

    % spike prog -fb prog -stride_prefetch -o prog.pf
                                  # Step (c): prefetch insertion

The output (prog.pf) is a version of the program with stride prefetches
inserted.

Note that it is possible to perform both stride prefetching and other
feedback-directed optimizations at the same time. To do this, first
collect the feedback information for the other optimizations and store
it in the executable image by using the following sequence:

    % cc -feedback prog -o prog *.c
    % pixie prog
    % prog.pixie input
    % prof -pixie -update prog prog.Counts

Then repeat Steps (a) to (c) for stride prefetching, except that you
turn on both stride prefetching and other feedback-directed
optimizations in a single spike command:

    % pixie -stats dstride prog   # same as Step (a)
    % prog.pixie input            # same as Step (b)
    % spike prog -feedback prog -fb prog -stride_prefetch -o prog.opt_pf
                                  # Step (c) plus other feedback-directed
                                  # optimizations

The output (prog.opt_pf) is a version of the program with both stride
prefetching and other feedback-directed optimizations.
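The profile-feedback sequence of Example 4 is a fixed series of commands parameterized only by the program name, its sources, and the training inputs, so it lends itself to scripting. The sketch below only generates the command lines for review and does not run them. The function name feedback_commands and the input names input1 and input2 are hypothetical stand-ins for the "(input set N)" placeholders in the example.

```python
def feedback_commands(prog, sources, input_sets):
    """List the shell command lines of Example 4 for the given program."""
    cmds = [
        f"cc -feedback {prog} -o {prog} {' '.join(sources)}",
        f"pixie -pids {prog}",
    ]
    # One instrumented run per input set; with -pids each run writes its
    # own prog.Counts.<pid> file.
    for args in input_sets:
        cmds.append(f"{prog}.pixie {args}".rstrip())
    # The shell expands the trailing pattern to all per-run Counts files.
    cmds.append(f"prof -pixie -update {prog} {prog}.Counts.*")
    cmds.append(f"spike {prog} -feedback {prog} -o {prog}.opt")
    return cmds


for line in feedback_commands("prog", ["*.c"], ["input1", "input2"]):
    print("%", line)
```

Generating the lines as text rather than executing them keeps the shell's wildcard expansion of *.c and prog.Counts.* intact and lets the sequence be checked before it is run on a Tru64 system.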
RETURN STATUS
Spike returns the following status values:
0: Success
Nonzero: Error
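Because the only documented distinction is zero versus nonzero, a build script should test for zero rather than for a particular error code. A minimal sketch follows, using the Python interpreter itself as a stand-in command so that it runs on systems without Spike. The helper name run_ok is hypothetical.

```python
import subprocess
import sys


def run_ok(argv):
    """Run a command and report success using the 0 / nonzero convention."""
    completed = subprocess.run(argv)
    return completed.returncode == 0


# Stand-in commands so the sketch runs without Spike installed; on a real
# system argv would be something like ["spike", "prog", "-o", "prog.opt"].
print(run_ok([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(run_ok([sys.executable, "-c", "raise SystemExit(1)"]))  # False
```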
SEE ALSO
cc(1), pixie(1), prof(1)
Programmer's Guide
The spike web page at http://www.tru64unix.compaq.com/spike/
The DCPI web page at http://www.tru64unix.compaq.com/dcpi/