SAS BASE PROGRAMMING
– Lecture 10 –

Objectives

 PROC GPLOT
   SYMBOL statement
   PLOT statement
 Output Delivery System
   HTML
 More SAS Functions
   Parse text data
   Truncate numeric data

PROC GPLOT
 Use the GPLOT procedure to produce scatterplots and line graphs
 General form
PROC GPLOT DATA=SAS-data-set;
PLOT vertical-variable*horizontal-variable ;
RUN; QUIT;
 Example
proc gplot data=data1.admit;
plot weight*height;
run; quit;

GPLOT Example Output

GPLOT Example
 Produce a plot of the number of passengers by date for flight number 114 over a one-week period.
proc gplot data=data1.flight114;
*this selects one week of flights;
where date between '02mar2001'd and '08mar2001'd;
plot Boarded*Date;
run; quit;

GPLOT Example Output

SYMBOL Statement
 You can use the SYMBOL statement to do the following:
 Define plotting symbols
 Draw lines through the data points
 Specify the width and color of the plotting symbols and lines
 General Form
SYMBOLn options;
 n = 1 – 255

SYMBOL Statement Options
 Options for the shape of the symbol
VALUE=symbol | V=symbol
 Selected symbol values include the following: circle, square, star, diamond, dot, none

SYMBOL Statement Options
 Interpolation
I=interpolation
 Selected interpolation values include the following: join, rl, rlclm
 Note: Combining symbol value=none with interpolation=join produces a line-only plot

SYMBOL Statement Options
proc gplot data=data1.flight114;
where date between '02mar2001'd and '08mar2001'd;
plot Boarded*Date;
symbol value=circle i=join color=red width=2;
run; quit;

SYMBOL Example Output

SYMBOL Statement Example
 Create one plot by modifying the previous code to use a red square as the plotting symbol, set the line width to 1, and join the symbols with straight lines.
 Create a second plot by modifying this code to use a blue star as the plotting symbol.
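*first plot: red squares joined by straight lines;
proc gplot data=data1.flight114;
*this selects one week of flights;
where date between '02mar2001'd and '08mar2001'd;
plot Boarded*Date;
symbol c=red v=square w=1 i=join;
run; quit;
*second plot: blue stars joined by straight lines;
proc gplot data=data1.flight114;
where date between '02mar2001'd and '08mar2001'd;
plot Boarded*Date;
symbol c=blue v=star w=1 i=join;
run; quit;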

SYMBOL Example Output

SYMBOL Statement
 SYMBOL statements have the following characteristics:
After they are defined, they remain in effect until changed or until the end of the SAS session.
Specifying the value of one option does not affect the values of other options.

Modify SYMBOL Options
 Set the attributes for SYMBOL1:
symbol1 c=blue v=diamond;
 Modify only the color of SYMBOL1, and not the plotting symbol:
symbol1 c=green;
 To cancel SYMBOL statements:
goptions reset=symbol;

Control the Appearance of the Axis
 You can modify the appearance of the axes that PROC GPLOT produces with the following
 PLOT statement options
 The LABEL statement
 The FORMAT statement
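
The FORMAT statement is listed above without an example on these slides. As a minimal sketch, assuming Date in data1.flight114 holds SAS date values (the WHERE clauses in this lecture compare it to date literals), the date9. format can be applied so that the horizontal axis shows readable dates:

```sas
proc gplot data=data1.flight114;
   where date between '02mar2001'd and '08mar2001'd;
   plot Boarded*Date;
   /* FORMAT changes how the tick-mark values are displayed */
   /* assumption: Date is a numeric SAS date variable */
   format Date date9.;
run; quit;
```

The format affects only the displayed axis values; the stored data are unchanged.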

Modify Axis Scale
 Example: Define the scale on the vertical axis and display the axis text in blue.
proc gplot data=data1.flight114;
where date between '02mar2001'd and '08mar2001'd;
plot Boarded*Date / vaxis=100 to 200 by 25
ctext=blue;
symbol value=square i=join;
run; quit;

Modify Axis Scale Output

Modify Axis Labels
 Example: Display 'Passengers Boarded' for the variable Boarded, and 'Departure Date' for the variable Date.
proc gplot data=data1.flight114;
where date between '02mar2001'd and '08mar2001'd;
plot Boarded*Date / vaxis=100 to 200 by 25
ctext=blue;
symbol value=square i=join;
label Boarded='Passengers Boarded' Date='Departure Date';
run; quit;

Modify Axis Labels Output

Produce a Scatterplot
 A scatterplot typically plots two continuous variables
 Example
proc gplot data=data1.admit;
plot weight*height;
run; quit;

Add Options
 You can modify the symbol, axis labels and axis tick marks
 You usually do not connect the dots in a scatterplot
 Example
proc gplot data=data1.admit;
plot weight*height;
symbol v=dot color=blue;
run; quit;

Scatterplot with Regression
 You can also add a
   Regression equation
   Regression line
   Regression line and prediction confidence interval

Regression Options
 Regression equation
   Use regeqn as an option to the plot statement
 Regression line (linear)
   Use an interpolation method of i=rl
 Regression line (linear) + Confidence limits
   Use an interpolation method of i=rlclm__
   Set the confidence level by writing it at the end of the interpolation value, for example 90% CL: i=rlclm90

Scatterplot with Regression
proc gplot data=data1.admit;
plot weight*height / regeqn;
symbol v=dot i=rlclm95;
run; quit;

Scatterplot with Regression Output
I = rlclm95

The Output Delivery System

Generate a LST File
 The ODS LISTING statement opens, closes, and manages the LST file destination.
 General form of the ODS LISTING statement:
ODS LISTING FILE='LST-file-specification';
SAS code that generates output
ODS LISTING CLOSE;
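
As a sketch of the general form above, the following sends a PROC PRINT report to a listing file. The file name admit_report.lst and the title text are hypothetical; data1.admit is the data set used earlier in this lecture.

```sas
/* hypothetical output file name; adjust the path for your system */
ods listing file='admit_report.lst';

title 'Admit Data';
proc print data=data1.admit;
run;

/* stop writing to the LST file */
ods listing close;
title;
```

Everything produced between the two ODS LISTING statements goes to the named file.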

Generate a HTML File
 The ODS HTML statement opens, closes, and manages the HTML destination.
 General form of the ODS HTML statement:
ODS HTML FILE='HTML-file-specification';
SAS code that generates output
ODS HTML CLOSE;

Generate a LST or a HTML File
 Output is directed to the specified LST or HTML file until you
   Close the LST or HTML destination
   Specify another destination file
ods html file='…';
proc print…
proc means…
proc freq…
ods html close;

Apply ODS Styles
 ODS Styles are pre-defined formats for output.
 Example:
ods html file='output.html' style=analysis;
 Complete list of styles:
proc template; list styles; run;

ODS Style Examples

ODS File Formats
 With ODS you can create file formats:
 HTML: HyperText Markup Language – for web pages
 LST: Listing Reports
 RTF: Rich Text Format – for Word
 PDF: Portable Document Format – for Adobe
 PS: Post-Script – for printers
 CSV: Comma Separated Values
 and many others
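
The other destinations follow the same open/close pattern as ODS HTML. A minimal sketch for PDF, assuming a hypothetical output file name and using the Height and Weight variables of data1.admit:

```sas
/* hypothetical output file name */
ods pdf file='admit_summary.pdf';

proc means data=data1.admit;
   var Height Weight;
run;

/* close the destination so the PDF file is completed */
ods pdf close;
```

Replacing `pdf` with `rtf` in both ODS statements (and changing the file extension) would produce a Rich Text Format file instead.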

Write a Comma-Separated File
 Many programs can read in a comma-separated values (CSV) file, including Excel.
 Use the ODS CSVALL statement to convert a SAS data set to a CSV file
ods csvall file='/home/user/admit.csv';
title;
proc print data=data1.admit noobs;
run;
ods csvall close;
 You can use titles, footnotes, labels, and formats to change the appearance of the data.
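
To illustrate the last point, here is a sketch that adds a title, a label, and a format before writing the CSV file. The file name, title text, and label text are illustrative assumptions; ActLevel and Fee are columns of data1.admit.

```sas
/* hypothetical output path */
ods csvall file='/home/user/admit_labeled.csv';

title 'Admit Data';
/* the LABEL option tells PROC PRINT to use labels as column headers */
proc print data=data1.admit noobs label;
   label ActLevel='Activity Level';
   format Fee dollar8.2;
run;

ods csvall close;
title;
```

The title becomes part of the file, and the Fee values are written as formatted by dollar8.2.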

CSVALL Output
"ID","Name","Sex","Age","Date","Height","Weight","ActLevel","Fee"
2458,"Murray, W","M",27,1,72,168,"HIGH",85.20
2462,"Almers, C","F",34,3,66,152,"HIGH",124.80
2501,"Bonaventure, T","F",31,17,61,123,"LOW",149.75
2523,"Johnson, R","F",43,31,63,137,"MOD",149.75
2539,"LaMance, K","M",51,4,71,158,"LOW",124.80
2544,"Jones, M","M",29,6,76,193,"HIGH",124.80
2552,"Reberson, P","F",32,9,67,151,"MOD",149.75
2555,"King, E","M",35,13,70,173,"MOD",149.75
2563,"Pitts, D","M",34,22,73,154,"LOW",124.80
2568,"Eberhardt, S","F",49,27,64,172,"LOW",124.80
2571,"Nunnelly, A","F",44,19,66,140,"HIGH",149.75
2572,"Oberon, M","F",28,17,62,118,"LOW",85.20
2574,"Peterson, V","M",30,6,69,147,"MOD",149.75
2575,"Quigley, M","F",40,8,69,163,"HIGH",124.80
2578,"Cameron, L","M",47,5,72,173,"MOD",124.80
2579,"Underwood, K","M",60,22,71,191,"LOW",149.75
2584,"Takahashi, Y","F",43,29,65,123,"MOD",124.80
2586,"Derber, B","M",25,23,75,188,"HIGH",85.20
2588,"Ivan, H","F",22,20,63,139,"LOW",85.20
2589,"Wilcox, E","F",41,16,67,141,"HIGH",149.75
2595,"Warren, C","M",54,7,71,183,"MOD",149.75

SAS Functions
A SAS function is often categorized by the type of data manipulation performed:
 truncation
 character
 date and time
 mathematical
 trigonometric
 sample statistics
 arithmetic
 financial
 random number
 state and ZIP code

Example: Mailing Labels
 The data2.freqflyers data set contains information about frequent flyers.
 How do we use this data set to create another data set suitable for mailing labels?

The LENGTH Function
 The LENGTH function returns the number of characters in a string
NewVar = LENGTH(string);
 Example: LENGTH('SMITH, JOHN') = 11

The INDEX Function
 Recall that the INDEX function returns the position of a specific character (or characters) within a string
NewVar = INDEX(string,target);
 Example: INDEX('SMITH-JOHN', '-') = 6
 Returns ZERO (0) if the target isn’t in the string
 Recall that we previously used this function to mimic the CONTAINS special operator
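
As a reminder of that technique, the sketch below keeps only the rows whose name contains the text 'son'. The data set names `names` and `with_son` are hypothetical, and the three sample names are the ones used in the mailing-label example later in this lecture.

```sas
data names;
   input name $25.;
   datalines;
Smith, John
Johnson, Davy
Quincy, Elizabeth
;
run;

data with_son;
   set names;
   /* INDEX returns 0 when the target is absent, so > 0 means "contains" */
   /* note: INDEX is case-sensitive */
   if index(name, 'son') > 0;
run;

proc print data=with_son;
run;
```

Only "Johnson, Davy" passes the subsetting IF, which matches what `where name contains 'son'` would select.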

The SUBSTR Function
 The SUBSTR function extracts a portion of a character variable:
NewVar=SUBSTR(string,start<,length>);
 Example:
   SUBSTR('PSTAT130 M20',6) = '130 M20'
   SUBSTR('PSTAT130 M20',6,3) = '130'

Parse a Text String
 How can we turn 'SMITH, JOHN' into 'JOHN SMITH'?
   Find the location of the comma
   Last Name = text before the comma
   First Name = text after the comma

Put It All Together
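DATA mail_labels;
input name $25.;
name_len = length(name);
comma_pos = index(name,',');
*last name = text before the comma;
last_name = substr(name,1,comma_pos-1);
*first name = text after the comma and the blank that follows it;
first_name = substr(name,comma_pos+2,name_len-comma_pos-1);
datalines;
Smith, John
Johnson, Davy
Quincy, Elizabeth
;
proc print; run;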

name                 name_len  comma_pos  last_name  first_name
Smith, John                11          6  Smith      John
Johnson, Davy              13          8  Johnson    Davy
Quincy, Elizabeth          17          7  Quincy     Elizabeth

The SCAN Function
 The SCAN function “parses” a character string into a set of “words” using a delimiter.
NewVar=SCAN(string,n<,delimiters>);
 Example:
   SCAN('Smith, John', 1) = 'Smith' (first “word”)
   SCAN('Smith, John', 2) = 'John' (second “word”)
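
The optional arguments can be explored with a short DATA step. This is only a sketch: the variable names are made up for illustration, and the results are written to the SAS log with PUT.

```sas
data _null_;
   name = 'Smith, John';

   /* default delimiters: comma and blank both separate words */
   word1 = scan(name, 1);

   /* a negative n counts words from the end of the string */
   last_word = scan(name, -1);

   /* with ',' as the only delimiter, the blank after the comma */
   /* is kept, so the second word is ' John' (leading blank)    */
   after_comma = scan(name, 2, ',');

   put word1= last_word= after_comma=;
run;
```

When the delimiter list is restricted like this, a function such as LEFT can be applied to remove the leading blank.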

The SCAN Function
 When the SCAN function is used
   The default delimiters include
     blank . < ( + | & ! $ * ) ; ¬ - / , % | ¢
   Delimiters before the first “word” have no effect
   Any character or set of characters can serve as delimiters
   Two or more contiguous delimiters are treated as a single delimiter
   A missing value is returned if there are fewer than n words in the string
   If n is negative, SCAN returns the “word” in the string starting from the end (of the string)

Concatenation Operator
 Use the || operator to “concatenate” or join two strings together
 Examples
   'John' || 'Smith' = 'JohnSmith'
   'John' || ' ' || 'Smith' = 'John Smith'

A Better Mailing Label Program
DATA mail_labels2;
input name $25.;
last_name = scan(name,1);
first_name = scan(name,2);
datalines;
Smith, John
Johnson, Davy
Quincy, Elizabeth
;
proc print; run;

name                 last_name  first_name
Smith, John          Smith      John
Johnson, Davy        Johnson    Davy
Quincy, Elizabeth    Quincy     Elizabeth

Truncation Functions
 Selected functions that truncate numeric values include
   ROUND function
   CEIL function
   FLOOR function
   INT function

The ROUND Function
 The ROUND function performs a traditional Round Up/Round Down operation:
NewVar = ROUND(argument<,round-off-unit>);

The CEIL Function
 The CEIL function performs a Round Up operation only
 Note: CEIL(4) = 4
NewVar = CEIL(argument);

The FLOOR Function
 The FLOOR function performs a Round Down operation only
 Note: FLOOR(4) = 4
NewVar = FLOOR(argument);

The INT Function
 The INT function removes any decimals from a number
NewVar = INT(argument);
 Examples:
   INT(3.2) = 3
   INT(-4.8) = -4
 For positive numbers, INT = FLOOR
 For negative numbers, INT = CEIL
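
The four truncation functions can be compared side by side in a short DATA step. This is a sketch; the variable names are chosen only for illustration, and the results are written to the SAS log.

```sas
data _null_;
   do x = 3.2, 3.8, -4.8;
      r = round(x);   /* nearest integer          */
      c = ceil(x);    /* round up                 */
      f = floor(x);   /* round down               */
      i = int(x);     /* drop the decimal portion */
      put x= r= c= f= i=;
   end;
run;
```

For x = 3.2 the results are 3, 4, 3, 3; for x = 3.8 they are 4, 4, 3, 3; for x = -4.8 they are -5, -4, -5, -4. The last row shows INT agreeing with CEIL for a negative number, while the first two show it agreeing with FLOOR for positive numbers.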

Class Exercise 1
 Use the pilots data set in the data1 folder
 Create a scatterplot of salary by age (assume the current date is 1/1/82)
   Use blue square symbols
   Label the axes as ‘Annual Salary’ and ‘Age’
   Display a regression line and 95% confidence limits.
 Create an HTML file (pilots.html) containing the following
   The descriptor of the data set with an appropriate title
   The data portion of the data set with an appropriate title.

Class Exercise 2
 The data2.ffhistory data set contains information about the history of each frequent flyer. This history information consists of
   Each membership level the flyer has attained (bronze, silver, or gold)
   The year the flyer attained each level.
 Create a report that shows all frequent flyers who have attained silver membership status and the year each became a silver member.

Class Exercise 2 – continued
Hint: Think about how you would
 Parse the membership levels?
 Parse the year each level was attained?
 Select flyers that have achieved Silver status?
