CS代写 APR2001 02APR2001 03APR2001 04APR2001 05APR2001 06APR2001 07APR2001 08APR20

SAS BASE PROGRAMMING
– Lecture 11 –

Objectives

Copyright By PowCoder代写 加微信 powcoder

 OUTPUT Statement
 Create Multiple Records from a Single Observation
 Write to Multiple SAS Data Sets
 Create a Single Record from Multiple Observations
 Create Running Totals  SUM Statement
 First./Last. variables
 With Single and Multiple BY variables
 DROP= and KEEP=
 Limit the variables written to a SAS data set
 FIRSTOBS= and OBS=
 Limit the number of observations read from a data set

Default Output
 By default, SAS writes a data record to the output data set when it reaches the RUN statement at the end of the DATA step.
data forecast;
set data2.growth; ;
2. Automatic Return
1. Automatic Output

The OUTPUT Statement
 If you use an OUTPUT statement in the DATA step
 SAS writes a record immediately
 SAS DOES NOT write automatically at the end of the DATA step
 With an OUTPUT statement, you can
 Create multiple records from one observation
 Write to multiple output data sets
 Combine information from multiple observations into a single record (when combined with RETAIN statement)

The OUTPUT Statement
 General form
OUTPUT ;

Create Multiple Obs. Example
 Take each patient’s current weight and forecast
forward for the next five years
 Assume 5% weight gain per year
 Example: currently weighs 175 lbs. We project that he will weigh 5% more each year:
Projected Weight
 Use the OUTPUT statement to write forecasted weight to the output data set for each of the next five years

Class Exercise 1
 Use the admit data set in the data1 folder to create
weight projections for the next 5 years.
 Create a new data set called work.weightproj
 Assume a 5% weight gain per year
 Create the variable year
 Calculate the variable projweight based on the previous weight and percent increase
 Create a report from the work.weightproj data set  Suppress observation numbers
 Display name, year, projweight
 Display ‘Patient Name’, ‘Year’, and ‘Projected Weight’ as column headers

Desired Output
Name Year
Murray, W 1 Murray, W 2 Murray, W 3 Murray, W 4 Murray, W 5 Almers, C 1 Almers, C 2 Almers, C 3 Almers, C 4 Almers, C 5 Bonaventure, T 1 Bonaventure, T 2 Bonaventure, T 3 Bonaventure, T 4 Bonaventure, T 5
Projected Weight
176.400 185.220 194.481 204.205 214.415 159.600 167.580 175.959 184.757 193.995 129.150 135.608 142.388 149.507 156.983

Correct Log Output
NOTE: There were 21 observations read from the data set DATA1.ADMIT.
NOTE: The data set WORK.WEIGHTPROJ has 105 observations and 12 variables.

Write to Multiple SAS Data Sets
 How can we use the data1.admit data set to create three separate output data sets?
 Patients with Low activity level
 Patients with Moderate activity level  Patients with High activity level

The DATA Statement
 General Form
DATA ;
Name of the FIRST data set being created by this data step
Name of the nth data set being created by this data step

data1.admit
Name Murray, W
Bonaventure, T Johnson, R LaMance, K Jones, M Reberson, P King, E
Pitts, D Eberhardt, S Nunnelly, A Oberon, M Peterson, V Quigley, M Cameron, L Underwood, K Takahashi, Y
Level Sex HIGH M
LOW F MOD F LOW M HIGH M MOD F MOD M LOW M LOW F HIGH F LOW F MOD M HIGH F MOD M LOW M MOD F
Age Height Weight 27 72 168
31 61 123 43 63 137 51 71 158 29 76 193 32 67 151 35 70 173 34 73 154 49 64 172 44 66 140 28 62 118 30 69 147 40 69 163 47 72 173 60 71 191 43 65 123

Class Exercise 2
 Use data1.admit
 Create the data sets: work.low, work.mod, work.high
 Output each record to the appropriate data set
 Create a report for work.low displaying Name, Level, Sex,
Height, and Weight. Suppress the observation numbers.
 Desired output: (work.low)
Bonaventure, T LaMance, K Pitts, D Eberhardt, S Oberon, M Underwood, K Ivan, H
Level Sex Age
Height Weight
61 123 71 158 73 154 64 172 62 118 71 191 63 139
LOW F LOW M LOW M LOW F LOW F LOW M LOW F
31 51 34 49 28 60 22

RETAIN Statement
 By default, SAS sets the values of all variables to be missing at the start of each iteration of the DATA step.
 Use the RETAIN statement to keep the value of a calculated variable available for use in the next iteration.
 General Form:
RETAIN variable-1 < . . . variable-n>;

 Patients sign up for a six-month weight loss program. They are weighed at the start of the program, and then at their Month 3 and Month 6 visits. The data set contains the weight and the date for each of the visits.
 We are interested in calculating how much weight each patient has lost since the last visit.

Class Exercise 3
 Create a data set called work.visits with the following data:
name visit weight
john 01/01/08 210 john 04/04/08 199 john 07/03/08 183 mary 02/14/08 175 mary 05/16/08 167 mary 08/04/08 153 dave 09/23/08 223 dave 12/28/08 215 dave 03/24/09 206

Class Exercise 3 – continued
 Create a data set called work.weightloss that reads in
the visits data set
 Calculate the weightchange using the last weight and current weight
 Desired output (v. 1):
Patient Visit
john 01/01/08
john 04/04/08 john 07/03/08 mary 02/14/08 mary 05/16/08 mary 08/04/08 dave 09/23/08 dave 12/28/08 dave 03/24/09
Visit in Last
Weight Weight
199 11 199 183 16 183 175 8 175 167 8 167 153 14 153 223 -70 223 215 8 215 206 9 206

First.BY-Variable & Last.BY-Variable
 When a SORTED data set is read into DATA step
using the SET and BY statements
 Two variables are created that can be used to identify the First record in the BY group and the Last record in the BY group.
 These variables are only available for the duration of the DATA step.
 General Form
First.BY-variable Last.BY-variable

Class Exercise 4
 Now modify your data step for weightloss such that  It creates a new baseweight for each person
 It calculates the difference between baseweight and weight
 Desired Output (v. 2):
Subject Visit
dave 09/23/08
dave 12/28/08
dave 03/24/09 john 01/01/08 john 04/04/08 john 07/03/08 mary 02/14/08 mary 05/16/08 mary 08/04/08
Visit Baseline in
Weight Weight
206 223 17 210 210 0 199 210 11 183 210 27 175 175 0 167 175 8 153 175 22
First.Name Last.Name

One Record from Multiple Obs.
 Can use the RETAIN and OUTPUT statements, and First./Last. variables, to combine information from multiple data records, and write only one “summary” record

Class Exercise 5
 Modify your data step for weightloss such that it only displays the total weight change for each person
 Desired Output (v. 3):
Visit Baseline in
Weight Weight
206 223 17
183 210 27 153 175 22

Create a Running Total
 There are a few methods for creating a running total.
 Create a running total (SaleRT) for the variable
SaleAmt located in data2.daysales
 First, create a data set named SaleTotal1  Use the addition (+) operator
 Next, create a data set named SaleTotal2  Use the sum() function

Running Total Example
data SaleTotal1;
set data2.daysales;
if _n_=1 then SaleRT = 0; SaleRT = SaleRT + SaleAmt; retain;
data SaleTotal2;
set data2.daysales;
SaleRT = sum(SaleRT,SaleAmt); retain;
Successfully creates a running total
Successfully creates a running total

The SUM Statement
 The SUM statement (General Form)
 Creates the variable on the left side of the plus sign, if it does
not already exist
 Initializes the variable to zero before the first iteration of the DATA step
 Adds the value of the expression to the variable at execution
 Automatically retains the variable
 Ignores missing values
Variable + Expression;

Running Total Example
data SaleTotal3;
set data2.daysales;
SaleRT+SaleAmt;
SaleDate SaleAmt
498.49 1444.99 2439.96 3004.55 3787.56 4016.38 4946.95 5158.42 5314.65
01APR2001 02APR2001 03APR2001 04APR2001 05APR2001 06APR2001 07APR2001 08APR2001 09APR2001
498.49 946.50 994.97 564.59 783.01 228.82 930.57 211.47 156.23
Accumulating Variable

Accumulate a Total for BY Groups
 We can accumulate a total for each group of values. Note: The input data set must be pre-sorted on the BY
variable(s).
1. Set the Accumulator Variable to zero at the start of each BY group.
2. Increment the accumulating variable with a sum statement.
3. Output only the last observation of each BY group.

Accumulated Totals Example
 Use the regsals data set in the data2 folder
 Create a data set named divsalaries that only contains the
variables div and divsal
 Calculate divsal to be the accumulated total for each division

Accumulated Total Example
proc sort data=data2.regsals out=divsort; by Div;
data divsalaries;
set divsort;
if First.Div then DivSal=0; DivSal+Salary;
if Last.Div;
keep Div DivSal;

More First./Last. Variables
 What if we are now interested in grouping by region as well as division?
 We can create multiple First./Last. variables by specifying multiple BY-variables in our DATA step.

More First./Last. Variables
proc sort data=data2.regsals out=regsort; by Region Div;
data regtotal;
set regsort;
by Region Div; run;

Control Variable Output
 In the DATA step:
 By default, SAS writes all the variables from the input data
set(s) to every output data set.
 The DROP and KEEP statements can be used to control which variables are written to every output data sets.

The DROP and KEEP Statements

DROP= & KEEP= Data Set Options
SAS-data-set(DROP=variable-1 variable-2
… variable-n)
SAS-data-set(KEEP=variable-1
variable-2
… variable-n)

Controlling Variable Input and Output

Control Variable Input and Output
data army(keep=Code Airport);
set data2.military(drop=City State Country); if Type eq ‘Army’ then output;

Control Variable Input

Control Variable Output

Control Which Obs. are Read
 The OBS= data set option specifies the last observation to be read from the input data set
 General Form:  Example:
SAS-data-set(OBS=n)
data army;
set data2.military(obs=25); if Type eq ‘Army’ then output;

Control Which Obs. are Read

Control Which Obs. are Read
 The FIRSTOBS= data set option allows you to skip records at the top of the input file.
 General Form:
 Example (starts reading at 11th record):
SAS-data-set(FIRSTOBS=n)
data military;
set data2.military(firstobs=11 obs=25); if Type eq ‘Army’ then output;

Control Which Obs. are Read

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com