/* INTRODUCTION TO SAS MACROS */

/* WHY USE MACROS?
– If you find yourself writing similar code over and over again.
– Make code more efficient

Macros and macro variables can help in several ways:

1) You can make a small change in your program and
have SAS echo that change throughout your program
(easy to maintain a program) [e.g. pathname]

2) Macros allow you to write a piece of code and use it over
and over again in the same or different programs.

3) Can make your programs data driven, letting SAS decide what
to do based on the actual data values.

[From SAS Macro Programming for Beginners]
*/

/* MACROS VS. MACRO VARIABLES

References to macro variables start with an ampersand (&)
- Like a standard data variable, except that it does not belong to a data set
and has only a single value, which is always character.
- The value of a macro variable could be a variable name,
a numeral, or any text you want substituted in your program.

Calls to macros start with a percent sign (%)
- A larger piece of a program that can contain complex logic.
- Often contains macro variables.
*/

/* MACRO LANGUAGE: statement and syntax structure that is used
by the macro facility and has its own terminology

SAS programs are executed in a series of DATA and PROC steps, one step at a time.
Global statements (e.g. TITLE, FOOTNOTE) and macro statements such as %LET can exist outside of these steps
and are executed immediately when encountered.
For each step, SAS first checks to see if macro references exist.
If they exist, these are resolved first and become part of
the SAS code passed to the DATA or PROC step processor.

GLOBAL macro variable: has a single value available to all macros
within the program. Macro variables defined outside of any macro
will be global (e.g. pathname).

LOCAL macro variables have values that are available only within
the macro in which they are defined. Therefore, macro variables
defined in one macro may be undefined within another macro.

Often times you will need both global and local macro variables

SAS macros allow for the creation of flexible, reusable code
that can save you time and effort.

It is important to understand that when you write macro code,
you are writing a program that writes a program. When you
write macro code, there is an extra step.

One suggestion to avoid a programming "headache" is to develop the program
in a piecewise fashion. Write the program in standard SAS code first,
and then once you have made sure it is bug-free, convert it to macro logic one
feature at a time. It is much easier to debug a program when it is
not in a macro and does not reference macro variables.

[From Carpenter's Complete Guide to the SAS Macro Language; Little SAS Book]
*/
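
/* A minimal sketch of the global/local distinction described above. The names
scope_demo, inner_only, and show_scope are hypothetical and used only for
illustration. %SYMEXIST returns 1 if a macro variable exists and 0 otherwise. */
%let scope_demo=outer; /* created in open code, so it is global */

%macro show_scope;
%local inner_only; /* exists only while show_scope is running */
%let inner_only=inside;
%put Inside the macro: scope_demo=&scope_demo inner_only=&inner_only;
%mend show_scope;

%show_scope
%put Outside the macro: scope_demo=&scope_demo;
%put Does inner_only still exist (1=yes, 0=no): %symexist(inner_only);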

/* CREATING AND REFERENCING MACRO VARIABLES
Two types of macro variables
1) Automatic – provided by SAS
2) User defined.
*/
/*
The %LET statement defines a macro variable. When it is used in open code
(outside any macro), the variable is global and can be used throughout the program.

SUBSTITUTING TEXT WITH %LET - %LET simply assigns a value to a
macro variable, and the value is stored as text.
Note: mathematical expressions are not evaluated. */
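
/* A short sketch of the note above: %LET stores the text exactly as typed. The
names total_text, total_int, and total_dec are hypothetical. %EVAL does integer
arithmetic and %SYSEVALF handles non-integer values. */
%let total_text=1+2;
%put total_text resolves to &total_text; /* writes 1+2, not 3 */
%let total_int=%eval(1+2);
%put total_int resolves to &total_int; /* writes 3 */
%let total_dec=%sysevalf(1.5+2);
%put total_dec resolves to &total_dec; /* writes 3.5 */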

%let pathname=c:\BIS679A\Lecture5and6\;

*Read in grocery data;
data grocery;
infile "&pathname.grocery sales.txt" firstobs=2 missover;
input YEAR JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC;
run;

*MISSOVER is an option on the INFILE statement. It tells SAS that when an INPUT statement
reads past the end of the line, the remaining variables are set to missing. The default behavior
is FLOWOVER, in which case SAS moves on to the next line to look for enough values to satisfy the INPUT statement;

*Read in the retail sales data;
data retail;
infile "&pathname.retail sales.txt" firstobs=2 missover;
input YEAR JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC;
run;

/* When else might %LET come in handy? */

/* Example where we want to print monthly reports of sales.
You could change the program every time you wanted a different data set
or a different month, or you could use the %LET statement and only need to
modify the initial value. */

/* No quotation marks are needed around the value, even when it contains
characters. If you include quotation marks, they become part of the text. */

/* Everything between the equals sign and the semicolon becomes the value of the
macro variable, except for leading and trailing blanks, which are trimmed.
For example, in the %LET statement for pathname above, the value is
c:\BIS679A\Lecture5and6\ */

/* These are global variables because they are created
outside a macro in open code.
General form: %LET macro-variable-name = value; */
%Let month=Oct;
%Let data=Grocery;
/* The value keeps the case in which it was typed in the %LET statement.
Case matters only if it affects how the generated code is executed: if the value
is used as a variable name it will not matter, but if it is used as a
variable value it will matter.
Block comments are used here because macro triggers such as %LET are still
processed inside asterisk-style comments in open code. */

*The SYMBOLGEN option prints in your log how each macro variable is resolved.
Without it, it is harder to debug your code. When
you have completely debugged your program, you can turn it off with OPTIONS NOSYMBOLGEN;
options symbolgen;
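
/* Besides SYMBOLGEN, the %PUT statement can write macro variable values to the
log on demand. A minimal sketch, assuming month and data were created by the
%LET statements above. The &= form needs SAS 9.3 or later. */
%put The current month is &month and the data set is &data;
%put &=month &=data;
%put _user_; /* lists all user-defined macro variables */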

*Plot the sales over time for a given year;
proc sgplot data=&data;
xaxis type=discrete;
series x=year y=&month;
title "Annual &data Sales for the Month of &month";
*When using a macro variable inside quotes, need to use double quotes;
yaxis label="Sales for the Month of &month";
run;

/* Reference notes copied from the PROC SGPLOT documentation (these options are not used in the plot above):
TYPE= SERIES | STEP (applies to band plots)
  specifies how the data points for the lower and upper band boundaries are connected.
  SERIES: the data points are connected directly using line segments, as in a series plot.
  STEP: the data points are connected using a step function, as in a step plot.
  Default: SERIES
X2AXIS
  assigns the variable that is assigned to the primary (bottom) horizontal axis to the secondary (top) horizontal axis.
Y2AXIS
  assigns the variable that is assigned to the primary (left) vertical axis to the secondary (right) vertical axis.
*/

*Look at the log – by using the symbolgen option –
you see how SAS resolves the macro variables included in the above code;

/* Using the %UPCASE function: this is one of many macro functions that can
take a macro variable as an argument, and it can
be used in statements in which DATA step functions cannot be used (such as TITLE).
%UPCASE is a macro function, not a macro. */

proc sgplot data=&data;
xaxis type=discrete;
series x=year y=&month;
title "Annual %Upcase(&data) Sales for the Month of %Upcase(&month)";
yaxis label="Sales for the Month of %Upcase(&month)";
run;

*Suppose that our program was set up as above.
Now if we want the same plot for a different month
and data set, we only need to change the macro variable assignments
and rerun the code instead of modifying the code itself;
%let month=Nov;
%let data=Retail;

*NOTE: The example above is short, but imagine if you had a long
program with many occurrences of the macro variable
and wanted to produce a lot of tables and figures for the data
set for a given month;
*Where might you encounter something like this?;

*Another example;
*Read in the Berkeley Guidance data;
proc import datafile="&pathname.Berkeley Guidance Data.txt" DBMS=CSV out=bgd replace;
run;

*LABEL statements are not valid inside PROC IMPORT, so the labels are added in a DATA step;
data bgd; set bgd;
label wt2="Weight at age 2 (kg)";
label ht2="Height at age 2 (cm)";
label wt9="Weight at age 9";
label ht9="Height at age 9";
label lg9="Leg circumference at age 9 (cm)";
label st9="A composite measure of strength at age 9 (high values=stronger)";
label wt18="Weight at age 18";
label ht18="Height at age 18";
label lg18="Leg circumference at age 18";
label SOMA="Somatotype, seven point scale, 1=slender, 7=obese";
run;

proc format;
value gender 0="Male"
1="Female";
run;

data bgd; set bgd;
format sex gender.;
run;

*We want to subset the data by gender;
%Let gender=0;
%Let age=2;

*Concatenating Macro Variables with Other text;
*What do you notice about the title and how gender is displayed?;
title "Univariate distribution of Height and Weight for AGE &age for Gender=&gender";
proc means data=bgd;
where sex=&gender;
var wt&age ht&age; *Since we have many variable names with the same prefix;
run;

/* Use %SYSFUNC to make sure that the formatted value is printed
instead of the raw value, i.e. instead of seeing Gender=0 we see Gender=Male. */

*http://support.sas.com/documentation/cdl/en/mcrolref/62978/HTML/default/viewer.htm#p1o13d7wb2zfcnn19s5ssl2zdxvi.htm;
*http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212564.htm;

title "Univariate distribution of Height and Weight for AGE &age for Gender=%Sysfunc(putn(&gender, gender.))";
proc means data=bgd;
where sex=&gender;
var wt&age ht&age;
run;

/*Note that Syntax is
formatted-val=PUTC(char-val,format);
formatted-val=PUTN(num-val,format); */

/* Another reference about %Sysfunc: http://www2.sas.com/proceedings/sugi23/Advtutor/p44.pdf */
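
/* A small sketch of %SYSFUNC on its own, written to the log with %PUT. The first
argument is a DATA step function call and the optional second argument is a
format for the result. It assumes the gender. format created above exists. */
%put Today is %sysfunc(today(), worddate.);
%put Gender code 1 is displayed as %sysfunc(putn(1, gender.));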

*Other ways to concatenate macro variables
1) Putting the macro variable first;
%Let variable=wt;
title "Means of weight";
proc means data=bgd;
var &variable.2 &variable.9 &variable.18;
*When putting the macro variable first, you
need a period to tell SAS where the macro variable name ends (like with pathname). Note how this is different
from above, where the macro variable was placed second;
run;

*2) Can put two macro variables back to back;
title "Mean weight of age &age";
proc means data=bgd;
var &variable&age;
run;

*3) If the text that follows the macro variable starts with a period, then you need to include two periods.
Where this may be important is in file naming;

*Exporting the bgd data set;
proc export data=bgd outfile="&pathname&variable..csv" dbms=csv replace;
run;
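
/* To check how the period delimiter resolves without running a procedure, the
references can be written to the log. A minimal sketch, assuming variable=wt
and age=2 as set above. */
%put &variable.2; /* one period ends the name: wt2 */
%put &variable&age; /* two references back to back: wt2 */
%put &variable..csv; /* first period ends the name, second is kept: wt.csv */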

*Using a list of variables;
%let weight=wt2 wt9 wt18;
title "Correlation between weight variables";
proc corr data=bgd;
var &weight;
run;

*------------------------------------------;

*Create a categorical variable within the bgd data set
(pretend this already existed in the data set);
proc format;
value $bt "S"="Slender"
"F"="Obese"
"O"="Other";
run;

data bgd; set bgd;
if soma=1 then type="S";
else if soma=7 then type="F";
else type="O";
/*if soma =. then type=.;*/
format type $bt.;
run;

%Let bodytype=S;

*Subset by Body Type category - since type is a character variable,
we need to put the value of bodytype in quotes;
title "Mean Weights for Body Type=%Sysfunc(putc(&bodytype, $bt.))";
proc means data=bgd;
where type="&bodytype";
var &weight;
run;

/* Footnote the output with the date that the output
was created using the SAS automatic macro variables
&SYSDATE (or &SYSDATE9) and &SYSDAY. These hold the date and day
on which the SAS session started. */
title "Mean Weights for Body type=%Sysfunc(putc(&bodytype, $bt.))";
proc means data=bgd;
where type="&bodytype";
var &weight;
footnote "This output was created &SYSDAY, &SYSDATE9";
run;

title "Mean Weights for Body type=%Sysfunc(putc(&bodytype, $bt.))";
proc means data=bgd;
where type="&bodytype";
var &weight;
footnote "This output was created &SYSDAY, %sysfunc(putn('&SYSDATE'd, mmddyy10.))";
*What if you wanted to change the format of the date?;
run;

/* How did I figure that out?

Need to navigate the help:
http://support.sas.com/documentation/cdl/en/mcrolref/61885/HTML/default/viewer.htm#a000489463.htm
*/

*--------------------------------------------;

/* DATA-DRIVEN PROGRAMS

You cannot use the %LET statement to store the value of a data set variable in a
macro variable, so one option is the CALL SYMPUT routine.

Reference: https://support.sas.com/resources/papers/proceedings/proceedings/sugi29/052-29.pdf

CALL SYMPUT is a SAS language routine that assigns a value produced in a DATA step to a macro variable. It is
one of the DATA step interface tools that provides a dynamic link for communication between the SAS language and
the macro facility.

CALL SYMPUT("macro-variable-name", value): the macro variable name needs to be in quotes,
and value is what you want to assign to that macro
variable. Value can be the name of a variable whose value SAS will use,
or it can be a constant value enclosed in quotation marks.

Often used in IF-THEN statements.

NOTE: You cannot create a macro variable with CALL SYMPUT
and reference it with & in the same DATA step. Macro references are resolved
before the step runs, but SAS does not assign a value to
the macro variable until the DATA step executes.
*/

*We are interested in calculating the 10th percentile of weight at age 2 for
participants in the Berkeley data set
and printing out those who are below the 10th percentile;

*Here we create macro variables within our code;

*First we want to calculate the 10th percentile;
proc means data=bgd;
var wt2;
output out=stats p10=p10cutoff;
run;
*FYI – SAS automatically generates percentiles
p1= p5=p10= p25=p50= p75=p90= p95=p99= in proc means;

*Second, we want to use a data step, but do not create a new data set.
_NULL_- useful when you only want to create the macro variable;

proc print data=stats; run;

data _NULL_;
set stats;
date=put(today(), mmddyy10.);
*The first argument is the name of the macro variable and the second argument is the value
to store in it.
The macro variable name is a constant, so it is enclosed in quotes.
The value to be stored is a data set variable, so it is not enclosed in quotes;
call symput('date', date);
call symput('P10', put(p10cutoff, 4.1));
run;
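
/* A minimal sketch of the timing rule for CALL SYMPUT, using the hypothetical
macro variable demo_value. CALL SYMPUTX works like CALL SYMPUT but also removes
leading and trailing blanks from the value. The macro variable is referenced
only after the DATA step has finished. */
data _NULL_;
call symputx('demo_value', 12.5);
run;
%put demo_value is &demo_value; /* available once the step has run */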

*Third, print out the individuals whose weight is below the 10th percentile;
title "Those individuals who have a weight at age 2 below &p10";

/* Note that instead of a %LET statement, CALL SYMPUT was used to create the macro variable p10 (referenced as &p10). */

footnote "This output was created on &date";
proc print data=bgd;
where wt2 < &P10;
run;

/* Can clear the footnote */
footnote; *this statement cancels all existing footnotes;

*-------------------;

*It is possible to make the code more efficient and dynamic by using additional
macro variables beyond those created within our code;
*This is especially important if you want to run the code over different data sets,
variables, and statistics;

%let library=work; /* Defaulting to the WORK library - but you could assign a libref
to your path with a LIBNAME statement and use that libref here */
%let dataset=bgd;
%let var=wt2;
%let statistic=mean;

*remember when putting the macro variable first, need a period to tell SAS where
the variable name ends (like with pathname);
proc means data=&library..&dataset;
var &var;
output out=stats &statistic=stat;
run;

*this gives you
proc means data=work.bgd;
*var wt2;
*output out=stats mean=stat;
*run;

proc print data=stats;
run;

data _NULL_;
set stats;
call symput('date', put(today(), worddate.));
*Combined two of the statements above into one statement, i.e. date=put(today(), mmddyy10.) and call symput('date', date);
call symput('stat', put(stat, 4.1));
run;

title "Those individuals who have weight at age 2 below the &statistic &var cutoff of &stat";
footnote "This output was created on &date";
proc print data=&library..&dataset;
where &var<&stat;
run;

/* Could also have used %SYSFUNC instead of CALL SYMPUT for the date.
Remember %SYSFUNC is a macro function that lets us call a DATA step function
and apply a format without having to resort to a DATA step. */
footnote "Created on %sysfunc(today(), mmddyy8.)";
proc print data=&library..&dataset;
where &var<&stat;
run;

*---------------------------;

*Or could make the code into a macro definition (similar to a function) - even more dynamic and reusable;

/* MODULAR CODE WITH MACROS - Any time you are repeating the same program statements over and over, consider a macro.
Any time you want a group of statements to run in your program, you use the macro name instead of re-typing all of the statements.

%MACRO: marks the beginning of a macro
%MEND: marks the end of a macro

Can invoke the macro by %macro-name (and does not need a semicolon to call) */

*This macro will not take any parameters/arguments;
*Thus it requires that all macro variables it uses be defined outside of the macro before it is called;

%macro statistic; *beginning of the macro;
proc means data=&library..&dataset noprint;
var &var;
output out=stats &statistic=stat;
run;

data _NULL_;
set stats;
call symput('date', put(today(), mmddyy10.));
call symput('stat', put(stat, 10.2));
run;

title "Those individuals who are below the &statistic &var cutoff of &stat";
footnote "This output was created on &date";
proc print data=&library..&dataset;
where &var<&stat;
run;
%mend statistic; *end of the macro;

*Call the statistic macro - need to define all of the macro variables first;
%let library=work;
%let dataset=bgd;
%let var=wt18;
%let statistic=p5;

*MUST MAKE SURE TO RUN THE MACRO DEFINITION BEFORE YOU CALL THE MACRO;
%statistic
*do not need to use a semicolon to use the macro definition;

*------------------------------------------------;

/* ADDING PARAMETERS

Parameters allow you to repeat the same statements, but for a different data set or product
(without having to redefine global macro variables).
Parameters are macro variables whose value you set when you invoke the macro.
(within the call - these are local macro variables - only exist for the macro) */

*Adding parameters to the statistic macro code;
*These are positional parameters (need to be entered in the order in which they appear)
and all are required for the macro to run (no default options);

%macro statistic(library, dataset, var, statistic);
proc means data=&library..&dataset noprint;
var &var;
output out=stats &statistic=stat;
run;

data _NULL_;
set stats;
call symput('date', put(today(), mmddyy10.));
call symput('stat', put(stat, 10.2));
run;

title "Those individuals who are below the &statistic &var cutoff of &stat";
footnote "This output was created on &date";
proc print data=&library..&dataset;
where &var<=&stat;
run;
title;
footnote;
%mend statistic;

*Must always run the macro code first before can invoke its use;
*Use of positional parameters - need to specify values in the correct order;
%statistic(work, bgd, ht2, median)

*You can provide default values for the macro variables;
*NOTE: If defaults are provided, then these become optional parameters.
WHY?;
%macro statistic(library=work, dataset=bgd, var=wt2, statistic=mean); *list keyword and then the default value;
proc means data=&library..&dataset;
var &var;
output out=stats &statistic=stat;
run;

data _NULL_;
set stats;
call symput('date', put(today(), mmddyy10.));
call symput('stat', put(stat, 10.2));
run;

title "Those individuals who are below the &statistic &var cutoff of &stat";
footnote "This output was created on &date";
proc print data=&library..&dataset;
where &var<=&stat;
run;
title;
footnote;
%mend statistic;

*If you want to use the default value;
%statistic()

*If you want to change one of the parameters;
%statistic(statistic=p20)
*Default is used for all those keywords not specified;

*Do not need to specify the values in order when using keywords;
%statistic(statistic=median, var=st18)

*--------------------------------------------------------------------------------;

/* %DO statement */

proc import datafile="&pathname.TreatOnly3QOL.csv" out=qol DBMS=CSV replace;
run;

/*****************************************************************************************************
Macro name: qol

Purpose: This macro will do one of two things depending on whether you specify
a value of 0 for indicator. It will conduct a Kaplan-Meier analysis
for continuous survival data by a treatment variable or it will do a
vertical bar graph for grouped survival data by treatment (does not incorporate censoring).

ROADTRIP HERE TO STAT/SAS review!! (did on 9/27/2020)
proc lifetest; proc phreg; proc sgplot - what do these procs do? when to use?

Author: Susie Q
Creation Date: October 8, 2015
Revision Date: September 30, 2020

SAS version: 9.4

Required Parameters:
data = Name of the data set
var = Name of the survival variable (e.g.
time)
treat = treatment variable

Optional Parameters:
censor = censoring variable (when continuous survival)
indicator = specify as 0 when want K-M plot, otherwise don't need to include or can include another number
cens_ind = value that indicates censoring, default is 0

Sub-macros called:

Example: %qol(data=qol, var=surv, censor=ind, indicator=0, treat=treat)
*****************************************************************************************************/

%macro qol(data=, var=, censor=, treat=, indicator=, cens_ind=0);
%if &indicator=0 %then %do;
title " output of &var by &treat";
proc lifetest data=&data;
time &var.*&censor(&cens_ind);
strata &treat;
run;
%end;
%else %do;
title "Vertical Bar Graph of &var by &treat";
proc sgplot data=&data;
vbar &var/group=&treat groupdisplay=cluster;
run;
%end;
title;
%mend qol;

%qol(data=qol, var=surv, censor=ind, indicator=0, treat=treat)
%qol(data=qol, var=ef, treat=treat)

*---------------------------------------------------------------------------------------------;
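
/* The qol macro uses %IF-%THEN with %DO groups. An iterative %DO loop is a
related tool that repeats code for each item in a list. A minimal sketch,
assuming the bgd data set from earlier. The macro name means_by_age and its
parameters are hypothetical and not part of the original program. */
%macro means_by_age(data=bgd, prefix=wt, ages=2 9 18);
%local i age;
%do i=1 %to %sysfunc(countw(&ages, %str( )));
%let age=%scan(&ages, &i, %str( ));
title "Mean of &prefix&age";
proc means data=&data;
var &prefix&age;
run;
%end;
title;
%mend means_by_age;

%means_by_age()
%means_by_age(prefix=ht)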