MET MA 603: SAS Programming and Applications
Datasets and Data Types
Datasets
A dataset is a structure for storing information.
[Figure: example of a dataset]
A dataset is organized into rows and columns (i.e., tabular data). Each row is called an observation and each column is called a variable. Each variable must have a name. Observations are identified by number.
Variables can be either of two types: numeric or character. An observation can include variables of different types, but all of the values of a particular variable must be of the same type.
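The row-and-column structure can be sketched outside SAS. The following Python snippet is only an illustration, not SAS code: the variable names `name` and `age`, the sample values, and the helper `column_types` are all hypothetical. It models a dataset as a numbered list of observations and reports which type each variable holds.

```python
# A tiny dataset: each dict is one observation (row), each key is a variable (column).
# Variable names and values are invented for illustration.
observations = [
    {"name": "Ann", "age": 34.0},
    {"name": "Bob", "age": 27.0},
    {"name": "Cy", "age": 41.0},
]

def column_types(rows):
    """Return the set of type labels seen for each variable."""
    types = {}
    for row in rows:
        for variable, value in row.items():
            label = "character" if isinstance(value, str) else "numeric"
            types.setdefault(variable, set()).add(label)
    return types

# Observations are identified by number, starting at 1.
for number, row in enumerate(observations, start=1):
    print(number, row)

# A well-formed dataset has exactly one type label per variable.
print(column_types(observations))
```

Here one observation mixes a character variable and a numeric variable, while each variable keeps a single type across all observations.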
Data Types
Numeric data can only contain valid combinations of the following characters:
0 1 2 3 4 5 6 7 8 9 + - . E
Numeric data has a size limit of 8 bytes, which corresponds to a largest exactly representable integer of around 9×10^15 (i.e., 9E15). Missing data is represented by a dot (.).
There is no restriction on what can be stored as character data. The size limit (32,767 bytes) is almost never an issue in practice. Missing data is represented by a blank (note: not a space). Note that character data is case-sensitive.
Generally, only use the numeric type on data for which mathematical operations may be performed. Otherwise, use the character type, which stores data more efficiently.
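The 8-byte limit on numeric data can be illustrated with a short Python sketch. It assumes the numeric type behaves like an 8-byte IEEE 754 floating-point number, which is what a Python `float` is; the helper `stored` is hypothetical. Under that assumption, integers up to 2^53 (about 9E15) are stored exactly, and larger ones may not be.

```python
import struct

# A Python float packs into 8 bytes, the width given above for numeric data.
print(len(struct.pack("d", 1.0)))  # 8

largest_exact = 2 ** 53  # 9,007,199,254,740,992, roughly 9E15

def stored(n):
    """Round-trip an integer through an 8-byte float."""
    return int(float(n))

# Below the limit the value survives unchanged.
print(stored(largest_exact - 1) == largest_exact - 1)  # True

# Just above the limit the value is silently rounded.
print(stored(largest_exact + 1) == largest_exact + 1)  # False
```

This is one reason identifiers such as long account numbers are usually better kept as character data.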
Dates
Dates are a special case of numeric data.
In the SAS date system, each date is represented by an integer, with 0 corresponding to January 1, 1960.
Examples of the SAS date system:
January 10, 1960 corresponds to 9.
October 31, 1995 corresponds to 13087.
May 15, 1910 corresponds to -18128.
Since dates are numeric data, missing values are represented with a dot.
Date data is usually displayed using a date format so as to be more intuitive to work with.
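The day-count arithmetic behind the SAS date system can be reproduced with a minimal Python sketch. This is not SAS code; the names `SAS_EPOCH`, `to_sas_date`, and `from_sas_date` are hypothetical helpers that simply count days from January 1, 1960.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # day 0 in the SAS date system

def to_sas_date(d):
    """Number of days from January 1, 1960 to d (negative before 1960)."""
    return (d - SAS_EPOCH).days

def from_sas_date(n):
    """Calendar date that lies n days after January 1, 1960."""
    return SAS_EPOCH + timedelta(days=n)

print(to_sas_date(date(1960, 1, 10)))   # 9
print(to_sas_date(date(1995, 10, 31)))  # 13087
print(to_sas_date(date(1959, 12, 31)))  # -1
print(from_sas_date(13087))             # 1995-10-31
```

Because a date is just an integer, subtracting two dates gives the number of days between them, and a date format only changes how the integer is displayed.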
Naming Rules
The following rules apply to naming datasets as well as to naming variables:
Only letters, digits, and underscores may be used.
Names cannot begin with digits.
Names cannot be longer than 32 characters.
Names are not case-sensitive.
Example of a valid name: MyData_2016_10_11
Example of an invalid name: 2016 10 11.MyData
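The naming rules above can be expressed as a single pattern. The Python sketch below is an illustration rather than SAS's own check: `NAME_PATTERN`, `is_valid_name`, and `same_name` are hypothetical names, and "letters" is assumed to mean the ASCII letters A to Z in either case.

```python
import re

# First character: a letter or underscore (not a digit).
# Remaining characters: letters, digits, or underscores, for at most 32 in total.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")

def is_valid_name(name):
    """Check a candidate name against the rules listed above."""
    return NAME_PATTERN.fullmatch(name) is not None

def same_name(a, b):
    """Names are not case-sensitive, so compare them after upper-casing."""
    return a.upper() == b.upper()

print(is_valid_name("MyData_2016_10_11"))  # True
print(is_valid_name("2016 10 11.MyData"))  # False
print(same_name("MyData", "MYDATA"))       # True
```

The invalid example fails on three counts: it begins with a digit, and it contains spaces and a period.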
Dataset Properties
Datasets are listed in the Explorer Window. Double-click a dataset to open it in the Viewtable Window.
Edit Mode can be used to modify datasets in the Viewtable Window. Right-click a dataset to open the Properties Window.
Practice
Of the items listed below, identify the ones that could be stored as numeric data:
36105
3.14159
hello
1.1E4
ten
.
123…4
Practice
Of the eight items below, identify the ones that are valid names in SAS:
Losss-2016
losses
Losses_January_01_2016_to_December_31_2016
_2016_Losses
Losses
Losses!
Losses_2016
2016_Losses
Readings
Textbook sections 1.2, 1.11, 1.12
http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001397898.htm