STAT 3094: SAS Base Programming Homework #03(10points) 2018
Instructions
Provide the SAS programs to answer the questions below. Complete the exercises in the order in which they appear. Place a one-line comment above the SAS code specifying the exercise number to which it answers. Do not add anything more in the SAS code than what is being asked in the exercises (Note: if you come across something you don’t know, the notes don’t tell you, and you cannot find in the book, feel free to use the Internet and the SAS support online, as well as the SAS help.) Be sure to provide comments as part of the program to show that you know what the program is doing and provide answers to the questions.
Copy the SAS program to a WORD file and upload to canvas.
Below the ‘external drive’ is your flash drive.
Scenario
From canvas/resources/datasets download the permanent SAS datasets called stock onto your external drive. This dataset contains 88 observations with information on the complementary drinks used to feed the passengers on some flights. Here the interest is in the drinks specifically water, soda and juice. The data for these variables is the number of drinks that were handed out for each type. However for management purposes, interest is in the number of boxes of each (the drinks come in boxes of different sizes).
Exercises
1. Start your SAS code by creating a multi-line comment where the first line contains your first and last name, the second line contains the date of the assignment, the third line contains “STAT 3094: SAS Base Programming”, and the last line contains your last name.
2. Create a library called ia that accesses these SAS datasets from your external drive (flash drive).
3. Then, in one PROC statement, output the details of the stock permanent SAS dataset. List the variables in this SAS dataset by the order in which the variables appeared in the SAS dataset. Change the code so the variables are output in alphabetical order. Add comments to indicate what the code does.
4. Using and modifying the permanent dataset ia.stock, create a permanent SAS dataset called ia.usage that completes the following objectives:
• Packets of waters, soda, and juice each come in boxes containing 12, 12, and 6 units (cans
or bottles). Calculate the number of boxes associated with each drink. Use only one function to calculate each one. Store these values in SAS variables called water_boxes, soda_boxes, and juice_boxes. (Hint: water_boxes = ceil(waters/12); code is similar for soda_boxes and juice_boxes; find out in Help about the function ceil() ).
• Boxes of waters cost the airline company $5 each, soda cost $7 each, and juice cost $6 each. Calculate the total drink cost for each flight and store this value in the variable drink_cost. (Hint: from Inclass1, remember how you can create a new variable, say
drink_cost, using some existing variables and arithmetic operations: drink_cost=5.00
*water_boxes + …)
• Keep only the boxes variables (water, soda, juice), as well as the flight number, time of
day, and drink cost variables.
5. Using one PROC PRINT statement on the ia.usage SAS dataset, write the code that produces the same output in the picture below. Note you have to use a title statement, id statement, sum
statement, label and format. The output you get might look a little different as it should be in an HTML format.
International Airlines: Summary of Snack Costs
Shuttle Flights Only (9/4/2018)
Flight Number Water Soda Juice Cases Cases Cases Used Used Used
Flight Drink Cost
$159.00 $162.00 $161.00 $162.00 $139.00 $156.00 $136.00 $165.00 $173.00 $168.00 $169.00 $188.00 $181.00 $182.00 $171.00 $146.00 $183.00 $150.00 $184.00
$3,135.00
IA401 3
IA402 4
IA403 5
IA404 4
IA405 3
IA406 5
IA407 5
IA408 3
IA409 5
IA410 4
IA480 3
IA481 4
IA482 4
IA483 3
IA484 3
IA485 3
IA486 3
IA487 4
IA488 5
12 10 10 12 10 11 10 12 10 9 11 9
9 8 12 11 10 13 10 13 10 14 12 14 11 14 11 15 12 12 11 9 12 14 10 10
9 16
6. Using one PROC FREQ statement, write the code that produces the same output shown in the picture below. Try to make the output nicer by adding labels for time of day and number of
boxes of water.
water Box Usage (by Time of Day)
Frequency Percent Row Pct Col Pct
The FREQ Procedure
Table of timeofday by water_boxes timeofday water_boxes
3 4 A21 16 23.86 18.18 46.67 35.56 56.76 53.33
5 Total
8 45 9.09 51.14
17.78 38.10
4 22 4.55 25.00
18.18 19.05
9 21 10.23 23.86
42.86 42.86
21 88 23.86 100.00
E
M
Total
9 9 10.23 10.23 40.91 40.91 24.32 30.00
7 5 7.95 5.68 33.33 23.81 18.92 16.67
37 30 42.05 34.09
7. The following data set has a mix of street addresses. In applications one might want to have a consistent set of names. Below is an example data set on addresses that we wish to have a consistent format.
data mail;
input address $25.; datalines;
81 CANTOR ave
922 NEW_HAVEN AVE. 22 HARING AVE.
ONE WILLOW AVE
727 RAVEN ave.
35 PL AVE
42 W.26TH AVENUE
15 S. SWAN AVE
129 NORTH AVE
r; un;
Some ways to fix things
a. COMPRESS is a function that will remove certain strings. The syntax is
variable = compress(oldvariable, “what you want removed”) use compress to remove the
periods.
b. TRANWRD is a search and replace function. Use TRANWRD to find AVE and replace it with
A VENUE
c. TRANSLATE is something that is not used much. It is useful for replacing a single character
with one or more characters. For example address = TRANSLATE(address, ‘.’ , ’—‘). One
can also use if then to make changes.
Use these functions (you need not use TRANSLATE) to make changes so that all addresses have AVENUE at the end. CAUTION: if you change ave to AVENUE then change AVE to AVENUE you may run into problems. What should you do here to prevent getting AVENUENUE? Also note that if you use compress to remove all ‘.’ Then you may have to put it back into any other abbreviations (S. for ex).
8. Do Chapter 4+5 Quizzes and provide the answers. Be sure to read the Chapters first and then figure out your answers.