SAS BASE PROGRAMMING
– Lecture 11 –
Objectives
Copyright By PowCoder代写 加微信 powcoder
OUTPUT Statement
Create Multiple Records from a Single Observation
Write to Multiple SAS Data Sets
Create a Single Record from Multiple Observations
Create Running Totals SUM Statement
First./Last. variables
With Single and Multiple BY variables
DROP= and KEEP=
Limit the variables written to a SAS data set
FIRSTOBS= and OBS=
Limit the number of observations read from a data set
Default Output
By default, SAS writes a data record to the output data set when it reaches the RUN statement at the end of the DATA step.
data forecast;
set data2.growth;
2. Automatic Return
1. Automatic Output
The OUTPUT Statement
If you use an OUTPUT statement in the DATA step
SAS writes a record immediately
SAS DOES NOT write automatically at the end of the DATA step
With an OUTPUT statement, you can
Create multiple records from one observation
Write to multiple output data sets
Combine information from multiple observations into a single record (when combined with RETAIN statement)
The OUTPUT Statement
General form
OUTPUT
Create Multiple Obs. Example
Take each patient’s current weight and forecast
forward for the next five years
Assume 5% weight gain per year
Example: currently weighs 175 lbs. We project that he will weigh 5% more each year:
Projected Weight
Use the OUTPUT statement to write forecasted weight to the output data set for each of the next five years
Class Exercise 1
Use the admit data set in the data1 folder to create
weight projections for the next 5 years.
Create a new data set called work.weightproj
Assume a 5% weight gain per year
Create the variable year
Calculate the variable projweight based on the previous weight and percent increase
Create a report from the work.weightproj data set Suppress observation numbers
Display name, year, projweight
Display ‘Patient Name’, ‘Year’, and ‘Projected Weight’ as column headers
Desired Output
Name Year
Murray, W 1 Murray, W 2 Murray, W 3 Murray, W 4 Murray, W 5 Almers, C 1 Almers, C 2 Almers, C 3 Almers, C 4 Almers, C 5 Bonaventure, T 1 Bonaventure, T 2 Bonaventure, T 3 Bonaventure, T 4 Bonaventure, T 5
Projected Weight
176.400 185.220 194.481 204.205 214.415 159.600 167.580 175.959 184.757 193.995 129.150 135.608 142.388 149.507 156.983
Correct Log Output
NOTE: There were 21 observations read from the data set DATA1.ADMIT.
NOTE: The data set WORK.WEIGHTPROJ has 105 observations and 12 variables.
Write to Multiple SAS Data Sets
How can we use the data1.admit data set to create three separate output data sets?
Patients with Low activity level
Patients with Moderate activity level Patients with High activity level
The DATA Statement
General Form
DATA
Name of the FIRST data set being created by this data step
Name of the nth data set being created by this data step
data1.admit
Name Murray, W
Bonaventure, T Johnson, R LaMance, K Jones, M Reberson, P King, E
Pitts, D Eberhardt, S Nunnelly, A Oberon, M Peterson, V Quigley, M Cameron, L Underwood, K Takahashi, Y
Level Sex HIGH M
LOW F MOD F LOW M HIGH M MOD F MOD M LOW M LOW F HIGH F LOW F MOD M HIGH F MOD M LOW M MOD F
Age Height Weight 27 72 168
31 61 123 43 63 137 51 71 158 29 76 193 32 67 151 35 70 173 34 73 154 49 64 172 44 66 140 28 62 118 30 69 147 40 69 163 47 72 173 60 71 191 43 65 123
Class Exercise 2
Use data1.admit
Create the data sets: work.low, work.mod, work.high
Output each record to the appropriate data set
Create a report for work.low displaying Name, Level, Sex,
Height, and Weight. Suppress the observation numbers.
Desired output: (work.low)
Bonaventure, T LaMance, K Pitts, D Eberhardt, S Oberon, M Underwood, K Ivan, H
Level Sex Age
Height Weight
61 123 71 158 73 154 64 172 62 118 71 191 63 139
LOW F LOW M LOW M LOW F LOW F LOW M LOW F
31 51 34 49 28 60 22
RETAIN Statement
By default, SAS sets the values of all variables to be missing at the start of each iteration of the DATA step.
Use the RETAIN statement to keep the value of a calculated variable available for use in the next iteration.
General Form:
RETAIN variable-1 < . . . variable-n>;
Patients sign up for a six-month weight loss program. They are weighed at the start of the program, and then at their Month 3 and Month 6 visits. The data set contains the weight and the date for each of the visits.
We are interested in calculating how much weight each patient has lost since the last visit.
Class Exercise 3
Create a data set called work.visits with the following data:
name visit weight
john 01/01/08 210 john 04/04/08 199 john 07/03/08 183 mary 02/14/08 175 mary 05/16/08 167 mary 08/04/08 153 dave 09/23/08 223 dave 12/28/08 215 dave 03/24/09 206
Class Exercise 3 – continued
Create a data set called work.weightloss that reads in
the visits data set
Calculate the weightchange using the last weight and current weight
Desired output (v. 1):
Patient Visit
john 01/01/08
john 04/04/08 john 07/03/08 mary 02/14/08 mary 05/16/08 mary 08/04/08 dave 09/23/08 dave 12/28/08 dave 03/24/09
Visit in Last
Weight Weight
199 11 199 183 16 183 175 8 175 167 8 167 153 14 153 223 -70 223 215 8 215 206 9 206
First.BY-Variable & Last.BY-Variable
When a SORTED data set is read into DATA step
using the SET and BY statements
Two variables are created that can be used to identify the First record in the BY group and the Last record in the BY group.
These variables are only available for the duration of the DATA step.
General Form
First.BY-variable Last.BY-variable
Class Exercise 4
Now modify your data step for weightloss such that It creates a new baseweight for each person
It calculates the difference between baseweight and weight
Desired Output (v. 2):
Subject Visit
dave 09/23/08
dave 12/28/08
dave 03/24/09 john 01/01/08 john 04/04/08 john 07/03/08 mary 02/14/08 mary 05/16/08 mary 08/04/08
Visit Baseline in
Weight Weight
206 223 17 210 210 0 199 210 11 183 210 27 175 175 0 167 175 8 153 175 22
First.Name Last.Name
One Record from Multiple Obs.
Can use the RETAIN and OUTPUT statements, and First./Last. variables, to combine information from multiple data records, and write only one “summary” record
Class Exercise 5
Modify your data step for weightloss such that it only displays the total weight change for each person
Desired Output (v. 3):
Visit Baseline in
Weight Weight
206 223 17
183 210 27 153 175 22
Create a Running Total
There are a few methods for creating a running total.
Create a running total (SaleRT) for the variable
SaleAmt located in data2.daysales
First, create a data set named SaleTotal1 Use the addition (+) operator
Next, create a data set named SaleTotal2 Use the sum() function
Running Total Example
data SaleTotal1;
set data2.daysales;
if _n_=1 then SaleRT = 0; SaleRT = SaleRT + SaleAmt; retain;
data SaleTotal2;
set data2.daysales;
SaleRT = sum(SaleRT,SaleAmt); retain;
Successfully creates a running total
Successfully creates a running total
The SUM Statement
The SUM statement (General Form)
Creates the variable on the left side of the plus sign, if it does
not already exist
Initializes the variable to zero before the first iteration of the DATA step
Adds the value of the expression to the variable at execution
Automatically retains the variable
Ignores missing values
Variable + Expression;
Running Total Example
data SaleTotal3;
set data2.daysales;
SaleRT+SaleAmt;
SaleDate SaleAmt
498.49 1444.99 2439.96 3004.55 3787.56 4016.38 4946.95 5158.42 5314.65
01APR2001 02APR2001 03APR2001 04APR2001 05APR2001 06APR2001 07APR2001 08APR2001 09APR2001
498.49 946.50 994.97 564.59 783.01 228.82 930.57 211.47 156.23
Accumulating Variable
Accumulate a Total for BY Groups
We can accumulate a total for each group of values. Note: The input data set must be pre-sorted on the BY
variable(s).
1. Set the Accumulator Variable to zero at the start of each BY group.
2. Increment the accumulating variable with a sum statement.
3. Output only the last observation of each BY group.
Accumulated Totals Example
Use the regsals data set in the data2 folder
Create a data set named divsalaries that only contains the
variables div and divsal
Calculate divsal to be the accumulated total for each division
Accumulated Total Example
proc sort data=data2.regsals out=divsort; by Div;
data divsalaries;
set divsort;
if First.Div then DivSal=0; DivSal+Salary;
if Last.Div;
keep Div DivSal;
More First./Last. Variables
What if we are now interested in grouping by region as well as division?
We can create multiple First./Last. variables by specifying multiple BY-variables in our DATA step.
More First./Last. Variables
proc sort data=data2.regsals out=regsort; by Region Div;
data regtotal;
set regsort;
by Region Div; run;
Control Variable Output
In the DATA step:
By default, SAS writes all the variables from the input data
set(s) to every output data set.
The DROP and KEEP statements can be used to control which variables are written to every output data sets.
The DROP and KEEP Statements
DROP= & KEEP= Data Set Options
SAS-data-set(DROP=variable-1 variable-2
… variable-n)
SAS-data-set(KEEP=variable-1
variable-2
… variable-n)
Controlling Variable Input and Output
Control Variable Input and Output
data army(keep=Code Airport);
set data2.military(drop=City State Country); if Type eq ‘Army’ then output;
Control Variable Input
Control Variable Output
Control Which Obs. are Read
The OBS= data set option specifies the last observation to be read from the input data set
General Form: Example:
SAS-data-set(OBS=n)
data army;
set data2.military(obs=25); if Type eq ‘Army’ then output;
Control Which Obs. are Read
Control Which Obs. are Read
The FIRSTOBS= data set option allows you to skip records at the top of the input file.
General Form:
Example (starts reading at 11th record):
SAS-data-set(FIRSTOBS=n)
data military;
set data2.military(firstobs=11 obs=25); if Type eq ‘Army’ then output;
Control Which Obs. are Read
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com