Vectorizing C Compilers: How Good Are They?
Lauren L. Smith
Visiting Member of the Research Staff
Supercomputing Research Center
17100 Science Drive
Bowie, MD., 20715-4300
Abstract
The programming language C is becoming more and
more popular among users of high performance vector
computer architectures. With this popularity of C, it becomes
more critical to have a good optimizing/vectorizing C
compiler. This paper describes a study of four such vectorizing
C compilers, with emphasis on the automatic vectorization
ability of each compiler. This study is similar to the Fortran
study that was described in [CDL88] and in fact, one facet
of this study is a C version of the same kernels. Three suites
of C loop kernels have been developed to determine the
strengths and weaknesses of vectorizing compilers. The
Convex cc compiler, the Convex Application Compiler, the
Cray 2 scc compiler, and the Cray Y-MP scc compiler have
been tested against these suites. The paper gives the results
for each suite, with identification of problem areas for each
compiler.
1. Introduction
Programmers everywhere are obtaining easy access
to high-performance workstations that are running
UNIX. With this trend towards UNIX, more program-
mers are using C for scientific codes on vector super-
computers. Therefore, it becomes necessary to look at
the capabilities and performance of C compilers for vec-
tor architectures.
Many of the current vectorizing C compilers use the
same vectorizing techniques that were developed for
Fortran. Are these techniques adequate? This paper
will try to investigate the capabilities of current
vectorizing C compilers and determine if additional
techniques are needed. Key features of C, such as pointers
and dynamically allocated memory objects, will be
examined with respect to vectorization on current vector
architectures.
A multi-faceted approach has been undertaken to
try to understand the capabilities of two vendors’ C
compilers. The vectorizing C compilers used for this
study are Version 4.1 of the Convex C2 Vectorizing C
Compiler (cc) [Con91a] [Con91b], Version 1.0 of the
Convex Application C Compiler (at) [Con91c], and
Release 3.0.0 of the Cray Standard C Compiler (SCC) on the
Cray 2 and Cray Y-MP [Cra90]. Since this study is
testing the compilers’ vectorization abilities, no user
directives or special compilation flags are used. Many of
these kernels can be vectorized if the user supplies such
directives or flags, but this requires the user to analyze the
code, which violates the spirit of testing automatically
vectorizing C compilers.
© 1991 ACM 0-89791-459-7/91/0544 $01.50
Section 2 discusses the C version of the Argonne test
suite for vectorizing compilers [CDL88]. The ability of
the Convex and Cray C compilers to vectorize the suite
is contrasted with the Fortran compilers and with each
other.
Section 3 describes a continuation of the Argonne
test suite study, but with features and constructs that are
unique to C. Comparisons are made between the
vectorizing capabilities of the Cray and Convex C compilers.
Some observations are also made on certain C language
features that impact the vectorizing capability of a com-
piler.
Section 4 discusses the results of looking at a suite of
C kernels abstracted from scientific C applications.
Again, the ability of the Cray and Convex C compilers
to vectorize these application kernels is contrasted and
some comments are made on the actual use of certain
C features.
It should be mentioned that some compiler
terminology will be used to describe the abilities of the
compilers. The reader might wish to look at [ASU86],
[Ban88], [Pol88] or [Wol82] for definitions and a better
understanding of some of the terms.
2. C version of the Argonne test suite
A suite of Fortran loop kernels was collected at Ar-
gonne National Laboratory to test the effectiveness of
automatic vectorizing Fortran compilers [CDL88]. The
loops were written by writers of vectorizing compilers,
and test for specific vectorization features. Some results
for several vector architectures have been reported in
[CDL88] and [Nob89].
The Argonne test suite was translated from Fortran
to C, adhering to a Fortran style. Some of the kernels
explicitly test certain Fortran constructs and were not
translated, so a total of 91 out of 100 loops were
successfully translated. This suite of 91 kernels was then
compiled using the vectorizing options of both the Convex
and Cray compilers. As a point of comparison, the suite
of 91 kernels was also compiled using the vectorizing
Fortran compilers on both architectures. The Fortran
compiler used on the Cray 2 was Version 4.0.3 of cft77
along with Version 3.0 of fpp. The Fortran compiler
used on the Convex C2 was Version 6.1 of fc.
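The paper does not spell out what "Fortran style" means for the translated kernels. As a minimal sketch, assuming it means explicit array subscripts and a counted loop (with Fortran's 1-based DO loop shifted to C's 0-based indexing), the contrast with pointer-walking C might look like the following. The function names add_indexed, add_pointers and styles_agree are hypothetical and are not taken from the Argonne suite.

```c
#include <assert.h>

#define LEN 8

/* Assumed "Fortran style": explicit subscripts and a counted loop,
 * mirroring a Fortran loop such as  DO i = 1, n: a(i) = b(i) + c(i). */
void add_indexed(float a[], const float b[], const float c[], int n)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Pointer-walking C: the same computation, with no visible subscript
 * or trip count for a compiler to analyze directly. */
void add_pointers(float *a, const float *b, const float *c, int n)
{
    float *end;
    assert(n >= 0);
    end = a + n;
    while (a < end)
        *a++ = *b++ + *c++;
}

/* Returns 1 when both styles produce identical results. */
int styles_agree(void)
{
    float b[LEN], c[LEN], x[LEN], y[LEN];
    int i;
    for (i = 0; i < LEN; i++) {
        b[i] = (float)i;
        c[i] = 2.0f * (float)i;
    }
    add_indexed(x, b, c, LEN);
    add_pointers(y, b, c, LEN);
    for (i = 0; i < LEN; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}
```

Both functions compute the same values; the difference is only in how much of the loop structure is visible in the source, which is the kind of information vectorizers developed for Fortran rely on.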
The first thing discovered was that the Cray scc and
Convex cc compilers would not fully vectorize loops where
the arrays are passed in as arguments to the kernel.
When an array is passed to a function in C, only a pointer
to its first element is passed, so within the kernel the
compiler cannot tell whether two array arguments overlap.
The only compiler that does the necessary interprocedural
analysis for this problem is Convex’s Application Compiler.
Figure 1 demonstrates a kernel where this problem arises.
Figure 2 shows how changing the arrays to global variables
and not passing them as parameters allows vectorization.
s171(a,b,n)
float a[],b[];
int n;
{ register int i;
for(i=0; i<n; i++)
a[i*n] = a[i*n] + b[i]; }
/* Call from another routine */
main()
{ float a[10000], b[10000];
int n = 72;
…
s171(a,b,n);
…
}
Figure 1: A function with arrays as parameters
is NOT vectorized
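One way to see why a compiler has to be conservative with a kernel like Figure 1 is to call a similar loop with overlapping arguments. The sketch below is illustrative only: it uses ANSI prototypes rather than the K&R style of the figure, passes the stride separately, and models vector execution simply as "load every operand, then store every result". The names update_scalar, update_vector_model and results_match are hypothetical, and the model is an assumption, not a description of how the Convex or Cray hardware executes.

```c
#include <assert.h>

#define MAXN 16

/* Scalar semantics: one iteration completes before the next begins. */
void update_scalar(float *a, const float *b, int n, int stride)
{
    int i;
    for (i = 0; i < n; i++)
        a[i * stride] = a[i * stride] + b[i];
}

/* Assumed model of vector execution: every operand is loaded before
 * any result is stored. */
void update_vector_model(float *a, const float *b, int n, int stride)
{
    float ta[MAXN], tb[MAXN];
    int i;
    assert(n >= 0 && n <= MAXN);
    for (i = 0; i < n; i++) {       /* "vector loads" */
        ta[i] = a[i * stride];
        tb[i] = b[i];
    }
    for (i = 0; i < n; i++)         /* "vector add and store" */
        a[i * stride] = ta[i] + tb[i];
}

/* Returns 1 if the scalar and vector-model results agree.
 * overlap != 0 makes a and b point into the same buffer. */
int results_match(int overlap)
{
    float s[16], v[16], b[4];
    int i;
    for (i = 0; i < 16; i++)
        s[i] = v[i] = 1.0f;
    for (i = 0; i < 4; i++)
        b[i] = 1.0f;
    if (overlap) {
        update_scalar(s + 1, s, 4, 2);        /* b[3] is written via a[2] */
        update_vector_model(v + 1, v, 4, 2);
    } else {
        update_scalar(s, b, 4, 2);
        update_vector_model(v, b, 4, 2);
    }
    for (i = 0; i < 16; i++)
        if (s[i] != v[i])
            return 0;
    return 1;
}
```

With disjoint arrays the two versions agree. With overlapping arguments the scalar loop reads a value written by an earlier iteration, so the results differ. Inside the kernel the compiler sees only two pointers and cannot rule out the second case without interprocedural information.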
The Argonne test suite attempts to test the
effectiveness of compiler optimizations on local loop constructs,
as opposed to interprocedural constructs and problems.
Therefore, the parameter passing of arrays was elimi-
nated by changing to global arrays in the C version with
the Fortran version left unchanged.
float a[10000], b[10000];
int n = 72;
s171()
{ register int i;
for(i=0; i