BIS679A
HW#1 Lab
Formats
Formats can be embedded in the working program or kept in an external file; in this lab they will be external
Create the format file (format.sas or another name)
Create the main program and "call" the format file from it (with %include) so the formats are applied
Demographics data set
ID V1 V2 V3 V4 V5 V6 V7
1 M 0 2 0 0 0 1
2 M 0 1 0 1 0 1
3 M 1 0 0 2 1 1
4 F 1 0 0 2 0 1
5 M 1 0 0 0 0 1
6 M 1 0 0 0 2 1
7 F 1 0 0 1 0 1
8 F 1 2 0 2 0 1
9 F 1 0 0 1 1 0
Formats.SAS
*The library must already have been created in the main program. Otherwise this will not work, or you would need to assign the library (LIBNAME) at the start of this file;
proc format library = work.demo;
value $gender "M"="Male" "F"="Female";
value employ 1="Employed" 0="Unemployed";
value insurance 1="Uninsured" 2="Public" 0="Private";
value ed 1="< high school" 0=">= high school";
value marriage 0="Married/Relationship" 1="Never Married" 2="Divorced/Widowed/Separated";
value alcohol 0="<= 2 drinks/week" 1="> 2 drinks/week";
value smoking 0="Never" 1="Former" 2="Current";
run;
Main_Program.sas
*Call the external format file;
*Include the format file for the data set;
%include "&pathname.Formats.sas";
*Then ask sas to locate the formats (called work.demo) created by the Formats file;
options fmtsearch=(work.demo);
*Would use a.demo rather than work.demo if permanent file;
Then import the data and apply the formats to it. Remember that PROC IMPORT does not let you use a FORMAT statement, so the formats have to be applied in a separate data step!
proc import datafile="&pathname.Demographics.xlsx" out=demo2 dbms=xlsx replace;
run;
*PROC IMPORT accepts neither LABEL nor FORMAT statements, so both go in the data step;
data demo2; set demo2;
label V1="Gender"
      V2="Employment Status"
      V3="Insurance Status"
      V4="Education Level"
      V5="Marital Status"
      V6="Smoking Status"
      V7="Alcohol Consumption";
format v1 $gender. v2 employ. v3 insurance. v4 ed. v5 marriage. v6 smoking. v7 alcohol.;
run;
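A minimal sketch of the permanent-catalog variant mentioned in the main program comments. It assumes a library reference named a (taken from that comment) pointing to a hypothetical folder; the path is an illustration only and must be replaced with a directory that exists on your machine. Only one format is repeated here to keep the sketch short.

```sas
*Hypothetical folder, shown only for illustration;
libname a "C:\BIS679A\formats";

*Store the format in the permanent catalog a.demo instead of work.demo;
proc format library = a.demo;
value employ 1="Employed" 0="Unemployed";
run;

*Tell SAS to search the permanent catalog when a format is used;
options fmtsearch=(a.demo);
```

Because the catalog is stored on disk, a later SAS session should only need the LIBNAME and OPTIONS statements to reuse the formats, without rerunning PROC FORMAT.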
MISSING
SAS represents missing data in several ways. The basic rule is that a missing character value is represented by a blank (' ') or a null string ('') and a missing numeric value is represented by a single period (.).
Use IF-THEN statements to remove or change missing values
See how missing values appear when the data are read in with PROC IMPORT
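The two uses of IF-THEN listed above (change a missing value, or remove the observation) can be sketched as follows. The data set names have and fixed, the variables, and the replacement code 'U' are hypothetical and chosen only for illustration; they are not part of the lab data.

```sas
*Small example data set: with list input, a period marks a missing value for both character and numeric variables;
data have;
input id sex $ age;
datalines;
1 M 66
2 . 55
3 F .
;
run;

data fixed;
set have;
*Change: recode a missing character value to a hypothetical code U;
if sex = ' ' then sex = 'U';
*Remove: drop observations where the numeric variable is missing;
if age = . then delete;
run;

proc print data=fixed; title 'After IF-THEN handling of missing values'; run;
```

With this input, observation 2 is kept with sex recoded to U and observation 3 is deleted. The MISSING function, as in if missing(age) then delete, works for both character and numeric variables and could replace either comparison.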
Replace missing
Many ways to do things
Can replace a missing value with the mean of the non-missing values
For the lab, do this only for continuous variables, and think about which PROC to use to get the mean
PROC STDIZE is a nice option!
data lab;
input age ht wt; cards;
66 60 140
55 60 150
72 . 145
44 61 .
;
proc print data=lab; title 'With missing'; run;
proc stdize data=lab reponly method=mean out=lab1;
var age ht wt;
run;
proc print data=lab1; title 'Mean replaces missing'; run;
With missing
Obs  age  ht  wt
1    66   60  140
2    55   60  150
3    72   .   145
4    44   61  .

Mean replaces missing
Obs  age  ht       wt
1    66   60.0000  140
2    55   60.0000  150
3    72   60.3333  145
4    44   61.0000  145
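As a check on the output above, the replacement values can be reproduced with PROC MEANS, which is one possible choice (not necessarily the intended one) for getting the mean of the non-missing values. This sketch re-creates the lab data set so that it runs on its own.

```sas
data lab;
input age ht wt;
datalines;
66 60 140
55 60 150
72 . 145
44 61 .
;
run;

*N counts non-missing values, NMISS counts missing values, MEAN ignores missing values;
proc means data=lab n nmiss mean maxdec=4;
var age ht wt;
title 'Means of non-missing values';
run;

*By hand: mean ht = (60 + 60 + 61) / 3 = 60.3333 and mean wt = (140 + 150 + 145) / 3 = 145;
```

These means match the values that PROC STDIZE placed in observation 3 (ht) and observation 4 (wt). The variable age has no missing values, so it is unchanged.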