Vectorizing C Compilers: How Good Are They?

Lauren L. Smith

Visiting Member of the Research Staff

Supercomputing Research Center

17100 Science Drive

Bowie, MD., 20715-4300

Abstract

The programming language C is becoming more and more popular among users of high performance vector computer architectures. With this popularity of C, it becomes more critical to have a good optimizing/vectorizing C compiler. This paper describes a study of four such vectorizing C compilers, with emphasis on the automatic vectorization ability of each compiler. This study is similar to the Fortran study that was described in [CDL88] and in fact, one facet of this study is a C version of the same kernels. Three suites of C loop kernels have been developed to determine the strengths and weaknesses of vectorizing compilers. The Convex cc compiler, the Convex Application Compiler, the Cray 2 scc compiler, and the Cray YMP scc compiler have been tested against these suites. The paper gives the results for each suite, with identification of problem areas for each compiler.

1. Introduction

Programmers everywhere are obtaining easy access to high-performance workstations that are running UNIX. With this trend towards UNIX, more programmers are using C for scientific codes on vector supercomputers. Therefore, it becomes necessary to look at the capabilities and performance of C compilers for vector architectures.

Many of the current vectorizing C compilers use the same vectorizing techniques that were developed for Fortran. Are these techniques adequate? This paper will try to investigate the capabilities of current vectorizing C compilers and determine if additional techniques are needed. Key features of C, such as pointers and dynamically allocated memory objects, will be examined with respect to vectorization on current vector architectures.

A multi-faceted approach has been undertaken to try to understand the capabilities of two vendors’ C compilers. The vectorizing C compilers used for this study are Version 4.1 of the Convex C2 Vectorizing C Compiler (cc) [Con91a] [Con91b], Version 1.0 of the Convex Application C Compiler (at) [Con91c], and Release 3.0.0 of the Cray Standard C Compiler (SCC) on the Cray 2 and Cray Y-MP [Cra90]. Since this study is testing the compilers’ vectorization abilities, no user directives or special compilation flags are used. Many of these kernels can be vectorized if the user supplies such directives or flags, but this requires the user to analyze the code, which violates the spirit of testing automatically vectorizing C compilers.

© 1991 ACM 0-89791-459-7/91/0544 $01.50

Section 2 discusses the C version of the Argonne test suite for vectorizing compilers [CDL88]. The ability of the Convex and Cray C compilers to vectorize the suite is contrasted with that of the Fortran compilers and with each other.

Section 3 describes a continuation of the Argonne test suite study, but with features and constructs that are unique to C. Comparisons are made between the vectorizing capabilities of the Cray and Convex C compilers. Some observations are also made on certain C language features that impact the vectorizing capability of a compiler.

Section 4 discusses the results of looking at a suite of C kernels abstracted from scientific C applications. Again, the ability of the Cray and Convex C compilers to vectorize these application kernels is contrasted and some comments are made on the actual use of certain C features.

It should be mentioned that some compiler terminology will be used to describe the abilities of the compilers. The reader might wish to look at [ASU86], [Ban88], [Pol88] or [Wol82] for definitions and a better understanding of some of the terms.

2. C version of the Argonne test suite

A suite of Fortran loop kernels was collected at Argonne National Laboratory to test the effectiveness of automatic vectorizing Fortran compilers [CDL88]. The loops were written by writers of vectorizing compilers, and test for specific vectorization features. Some results for several vector architectures have been reported in [CDL88] and [Nob89].

The Argonne test suite was translated from Fortran to C adhering to a Fortran style. Some of the kernels explicitly test certain Fortran constructs, and were not translated, so a total of 91 out of 100 loops were successfully translated. This suite of 91 kernels was then compiled using the vectorizing options of both the Convex and Cray compilers. As a point of comparison, the suite of 91 kernels was also compiled using the vectorizing Fortran compilers on both architectures. The Fortran compiler used on the Cray 2 was Version 4.0.3 of cft77 along with Version 3.0 of fpp. The Fortran compiler used on the Convex C2 was Version 6.1 of fc.
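
As an illustration of what a translation "adhering to a Fortran style" might look like, the following sketch shows a hypothetical kernel that is not taken from the Argonne suite. It keeps a counted loop, an explicit integer index, and array subscripts rather than pointer arithmetic. The names vadd, N, a, b and c are assumptions made only for this example.

```c
#include <assert.h>

#define N 1000            /* hypothetical array length */

/* Fixed-size arrays, chosen for this sketch only. */
float a[N], b[N], c[N];

/*
 * Hypothetical Fortran original:
 *       do 10 i = 1, n
 *         a(i) = b(i) + c(i)
 *  10   continue
 *
 * The C version keeps the counted loop and the subscript form; only
 * the index range shifts from 1..n to 0..n-1.
 */
void vadd(int n)
{
    int i;

    assert(n >= 0 && n <= N);   /* stay inside the arrays */
    for (i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}
```

Calling vadd(n) after filling b and c leaves the elementwise sums in the first n elements of a and does not touch the rest.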

The first thing discovered was that the Cray and Convex cc compilers would not fully vectorize loops where the arrays are passed in as arguments to the kernel. When an array is passed to a function in C, only a pointer to the array is passed, so within the function the compiler cannot tell whether two array parameters refer to overlapping storage. The only compiler that does the necessary interprocedural analysis for this problem is Convex’s Application Compiler. Figure 1 demonstrates a kernel where this problem arises. Figure 2 shows how changing the arrays to global variables and not passing them as parameters allows vectorization.

s171(a,b,n)
float a[],b[];
int n;
{ register int i;
  for(i=0; i<n; i++)
    a[i*n] = a[i*n] + b[i]; }

/* Call from another routine */
main()
{ float a[10000], b[10000];
  int n=72;
  s171(a,b,n);
}

Figure 1: A function with arrays as parameters is NOT vectorized
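
One way to see why a compiler has to be conservative with pointer parameters is that nothing inside the function rules out overlapping arguments. The sketch below uses a hypothetical helper, shift_add, and two hypothetical drivers, run_disjoint and run_overlapping; none of them is part of the test suite. It shows the same loop producing different results depending on whether the two arguments overlap, which is the case a compiler must allow for when it has no interprocedural information.

```c
#include <assert.h>

/* Hypothetical kernel: dst and src arrive as bare pointers, so the
 * function body carries no information about whether they overlap. */
void shift_add(float *dst, const float *src, int n)
{
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        dst[i] = dst[i] + src[i];
}

/* Disjoint arrays: the iterations are independent of each other. */
float run_disjoint(void)
{
    float x[4] = {1.0f, 1.0f, 1.0f, 1.0f};
    float y[4] = {1.0f, 1.0f, 1.0f, 1.0f};

    shift_add(x, y, 4);
    return x[3];
}

/* Overlapping arguments: dst[i] is the same object as src[i + 1], so
 * each iteration reads the value written by the previous one. */
float run_overlapping(void)
{
    float z[5] = {1.0f, 1.0f, 1.0f, 1.0f, 1.0f};

    shift_add(z + 1, z, 4);
    return z[4];
}
```

With disjoint arrays the last element becomes 2.0. In the overlapping call the scalar loop accumulates 2.0, 3.0, 4.0 and 5.0 into successive elements, so the last element is 5.0. A version that loaded all of src as one vector before storing would instead leave 2.0 there, which is why the loop cannot safely be vectorized unless the overlap is ruled out.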

The Argonne test suite attempts to test the effectiveness of compiler optimizations on local loop constructs, as opposed to interprocedural constructs and problems. Therefore, the parameter passing of arrays was eliminated by changing to global arrays in the C version, with the Fortran version left unchanged.

float a[10000], b[10000];
int n=72;

s171()
{ register int i;

for(i=0; i