SAS: Input/Reading Data
STAT 342 – Fall 2020 Tutorial – 2
Objectives
• To learn different ways of inputting/reading data with SAS
1. To identify varying formats of raw data
2. To convert raw data into meaningful data
3. To print data in the SAS environment
Data can be read into SAS in two ways
• Enter data manually
• Read data from an external data source (e.g. text/csv files)
Entering data manually
• Works only if we are working with a limited amount of data
• Impractical most of the time
Reading data from an external data source
• Most common way of reading data into SAS
• Need to understand the format of the raw data
How to manually enter data?
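• Provide a NAME for the data set with the DATA command
• Specify the VARIABLE names with the INPUT command
• Start the raw data block with the DATALINES command
• Enter the raw data manually, one observation per line, and close the block with a semicolon on its own line
• End the step with the RUN command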
Sample SAS code
data NAME;
input Var1 $ Var2 $ Var3 $ Var4;
datalines;
Var1_value Var2_value Var3_value Var4_value
;
run;
Printing Data
proc print data = NAME;
run;
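As a concrete illustration of manual entry followed by printing, here is a minimal sketch. The data set name grades, the variable names, and all values are made up for this example.

```sas
/* Hypothetical example: data set name, variables, and values are illustrative only */
data grades;
    input ID $ Assignments Mid Final;   /* $ marks ID as a character (string) variable */
    datalines;
S01 25 22 31
S02 28 27 36
S03 19 24 30
;
run;

/* Print the data set that was just created */
proc print data = grades;
run;
```

Note that the semicolon closing the data block sits on its own line: SAS stops reading raw data at the first line that contains a semicolon, and that line is not treated as data.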
Input data from an External Source
• Manual entry is impractical
• Most of the time data will be imported from external data sources such as text / csv files
• We need to keep an eye on the format of the data
• The input type depends on the format of the raw data
Four ways of importing/inputting data depending on the Format of the Data
1. List Input
2. Column Input
3. Formatted Input
4. Named Input
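For orientation, the last two input types in the list above can be sketched as follows. This is only an illustrative sketch: the data set names, variable names, and values are assumptions, and the data are entered with datalines so the example does not depend on any file.

```sas
/* Formatted input (hypothetical data): an informat after each variable
   says how many columns to read; +1 skips the blank between fields */
data formatted_example;
    input ID $3. +1 Mid 2. +1 Final 2.;
    datalines;
S01 22 31
S02 27 36
;
run;

/* Named input (hypothetical data): each raw value is written as variable=value */
data named_example;
    input ID=$ Mid= Final=;
    datalines;
ID=S01 Mid=22 Final=31
ID=S02 Mid=27 Final=36
;
run;
```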
Input type vs Raw Data Format
Input type | List | Column
Raw data | Separated by at least one space / constant amount of spaces | No spaces or varying amount of spaces; column length should be constant
Missing data indicated by | Period | Blank
Column numbers of variables | Not required | Need to specify
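The difference in how missing data is marked can be seen in a small sketch. The data set names, variables, and values below are hypothetical; in both data sets the Mid mark of the second record is missing.

```sas
/* List input: a missing value is written as a period */
data list_missing;
    input ID $ Mid Final;
    datalines;
S01 22 31
S02 . 36
;
run;

/* Column input: a missing value is left blank in its columns
   (ID in columns 1-3, Mid in 4-5, Final in 6-7) */
data column_missing;
    input ID $ 1-3 Mid 4-5 Final 6-7;
    datalines;
S012231
S02  36
;
run;
```

If the missing Mid value were left blank under list input, SAS would read the next value on the line (36) as Mid instead, which is why the period is needed there.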
Variable names: ID (string), Assignments (out of 30), Mid (out of 30), Final (out of 40), Total (out of 100), Grade (string)
Sample SAS code for List Input
data NAME;
infile 'path to external data source';
input Var1 $ Var2;
run;
NOTE:
When specifying the path, use forward slashes, which work on all operating systems. Enclose the path in straight quotation marks.
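A minimal sketch of list input from a file, using the variable names given above. The file path C:/data/grades_list.txt and the file contents shown in the comment are assumptions made for this example, not part of the tutorial data.

```sas
/* Assumed contents of the hypothetical file C:/data/grades_list.txt
   (values separated by single spaces):
   S01 25 22 31 78 B
   S02 28 27 36 91 A
*/
data grades;
    infile 'C:/data/grades_list.txt';                 /* forward slashes in the path */
    input ID $ Assignments Mid Final Total Grade $;   /* $ follows each string variable */
run;

proc print data = grades;
run;
```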
Sample SAS code for Column Input
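data NAME;
infile 'path to external data source';
input Var1 $ (col index) Var2 (col index);
run;
Replace each (col index) with the column range of that variable, written as start-end (for example 1-5), without parentheses.
NOTE:
When specifying the path, use forward slashes, which work on all operating systems. Enclose the path in straight quotation marks.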
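A minimal sketch of column input from a file, using the variable names given above. The file path, the file contents, and the column positions are assumptions made for this example; it also assumes every record is a full 13 characters wide.

```sas
/* Assumed contents of the hypothetical file C:/data/grades_column.txt
   (no separators; each variable occupies fixed columns):
   S01252231 78B
   S02282736 91A
   Columns: ID 1-3, Assignments 4-5, Mid 6-7, Final 8-9, Total 10-12, Grade 13
*/
data grades;
    infile 'C:/data/grades_column.txt';
    input ID $ 1-3 Assignments 4-5 Mid 6-7 Final 8-9 Total 10-12 Grade $ 13;
run;

proc print data = grades;
run;
```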
Use of $ symbol with the Input Command
data NAME;
infile 'path to external data source';
input Var1 $ (col index) Var2 (col index);
run;
If a variable takes string values, place the $ symbol immediately after the name of that variable in the line that begins with the input command.
The $ symbol must appear before the column specification.
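The placement of the $ symbol can be sketched with the same record read in two ways. The data set names, variables, and values are hypothetical.

```sas
/* List input: $ comes right after the name of the string variable */
data list_style;
    input ID $ Mid;
    datalines;
S01 22
;
run;

/* Column input: $ comes after the name and before the column range */
data column_style;
    input ID $ 1-3 Mid 5-6;
    datalines;
S01 22
;
run;
```

If the $ were left out, SAS would try to read ID as a number, set it to missing, and write an invalid-data note to the log.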