Using SAS for Epidemiology

Charles DiMaggio, PhD

cjd11@columbia.edu

Tour of the course and materials

Why SAS?

Any computing platform 

Any type of data 

Any statistical procedure.  

Disadvantages 

Why Syntax?

Menu-driven version available, but with syntax …

Alternatives to SAS?

You bet:

1. INTRODUCING SAS

The SAS Environment

The Program Editor

Where you spend most of your time

Two flavors

Writing SAS Statements

SAS statements start with a keyword and end with a semi-colon ;

Two ways to write comments (not run):

Tip:  Use comments liberally.

/* -------------*/  
* --------------------;
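A minimal sketch of both comment styles next to ordinary statements. The data set name demo and the variable x are made up for illustration.

```sas
/* Block comment: may span several lines
   and may sit anywhere in a program. */
* Statement comment: starts with an asterisk and ends at the semi-colon;

data demo;          /* the keyword DATA starts this statement */
   x = 1;           /* every statement, including this one, ends with ; */
run;
```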

 

Submitting Statements

DEMONSTRATION: AGE AT MI

Two Basic Types of SAS Programs

DATA steps create or manipulate SAS data sets

PROC steps conduct analyses
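As a rough illustration of the two step types, the sketch below builds a tiny data set in a DATA step and summarizes it in a PROC step. The data set name mi and the values are invented for the example and are not course data.

```sas
/* DATA step: creates a small SAS data set (values are invented) */
data mi;
   input id age;
   datalines;
1 54
2 61
3 47
;
run;

/* PROC step: analyzes the data set created above */
proc means data=mi n mean min max;
   var age;
run;
```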

The tyranny of the semi-colon

Interfacing With Windows

Changing System Options:
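One way to change system options from syntax is the OPTIONS statement. The particular options chosen here are only an example.

```sas
options nodate nonumber linesize=80;   /* no date or page number, 80-character lines */

proc options option=linesize;          /* writes the current setting to the log */
run;
```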

Help! (with procedures)

Help! (with functions)

More Help…

SAS

Institutional

Books

Google

(of course)

Coming soon

Letting go of the spreadsheet…

“Where’s my data?”

“Why can’t I just work with a spreadsheet like in Excel or SPSS?”

(only) two parts to a SAS dataset

Descriptor portion

Data portion
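A short sketch of how to look at each part. It assumes the sample data set sashelp.class that ships with most SAS installations; substitute any data set you have.

```sas
/* Descriptor portion: variable names, types, lengths, number of observations */
proc contents data=sashelp.class;
run;

/* Data portion: the values themselves (first 5 observations only) */
proc print data=sashelp.class (obs=5);
run;
```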

(only) two kinds of SAS data

Numeric

Character

Data set and variable names

2. SAS LIBRARIES

LIBNAME

LIBNAME statement creates libref

sparcs.mortality

must be re-issued each SAS session
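A minimal sketch of assigning and using a libref. The folder path is hypothetical; point it at wherever your permanent SAS data sets live.

```sas
/* Assign the libref sparcs to a folder (hypothetical path) */
libname sparcs 'C:\mydata\sparcs';

/* Two-level name: libref.dataset */
proc print data=sparcs.mortality (obs=10);
run;

/* Optional: release the libref before the session ends */
libname sparcs clear;
```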

Two special libraries

SASUSER

WORK
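A small sketch of how the temporary WORK library behaves by default. The data set name temp is made up.

```sas
data temp;            /* one-level name: stored as work.temp by default */
   x = 1;
run;

proc print data=work.temp;   /* same data set, written with its two-level name */
run;
/* work.temp is deleted when the SAS session ends */
```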

Browsing SAS libraries

proc contents data=sparcs._all_ nods;
/* note the space between _all_ and nods */
run;

INPUTTING AND READING DATA

SAS files are created in a DATA step using INPUT

Reading data from editor window

space-delimited and column input

INPUT var1 var2 var3 $ ;
INPUT var1 1-5 var2 6-10 var3 $ 16-26 ;
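The two input styles can be sketched with in-stream data as follows. The data set names and values are invented; the column positions match the column-input statement above. Note that data lines must start in column 1, and that column input can read a character value with an embedded blank while simple list input cannot.

```sas
/* Space-delimited (list) input */
data list_in;
   input var1 var2 var3 $;
   datalines;
12 34 Albany
105 7 Kings
;
run;

/* Column input: var1 in columns 1-5, var2 in 6-10, var3 in 16-26 */
data col_in;
   input var1 1-5 var2 6-10 var3 $ 16-26;
   datalines;
   12   34     New York
  105    7     Albany
;
run;
```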

reading data from external files: the INFILE statement

INFILE 'F:\K Data\SPARCS\ADR02NY.TXT' LRECL=450 OBS=100;

formatting input of external data: the INPUT statement

INPUT 
@18 DATE yymmn6.
@44 AGE 3.
@50 COUNTY $CHAR2.
;
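Put together, the INFILE and INPUT statements above sit inside one DATA step, roughly as sketched here. The data set name adr and the FORMAT statement are additions for illustration; the file path and column positions are taken from the statements above.

```sas
data adr;                                   /* hypothetical data set name */
   infile 'F:\K Data\SPARCS\ADR02NY.TXT' lrecl=450 obs=100;
   input
      @18 date   yymmn6.                    /* go to column 18, read yyyymm */
      @44 age    3.                         /* go to column 44, read 3 digits */
      @50 county $char2.                    /* go to column 50, read 2 characters */
   ;
   format date yymmn6.;                     /* display the stored date as yyyymm */
run;

proc print data=adr (obs=10);
run;
```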

informats

<$>informat_namew.<d>

data documentation 

SAS Dates
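SAS stores a date as the number of days since 1 January 1960. A quick sketch that writes the underlying numbers to the log (the variable names are arbitrary):

```sas
data _null_;
   d0 = '01JAN1960'd;     /* date constant for day zero */
   d1 = '01JAN1961'd;     /* one leap year later */
   put d0= d1=;           /* log shows d0=0 d1=366 */
   put d1= date9.;        /* same value shown with a date format */
run;
```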

behind the scenes of a DATA step 

deciphering error messages

_ERROR_=1 _N_=10
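To see these automatic variables in a log, one option is to feed a DATA step a deliberately bad value. In this made-up example the second record has the letter l where a digit belongs, so the log should report invalid data for age and dump that record with _ERROR_=1 and _N_=2.

```sas
data check;
   input id age;
   datalines;
1 54
2 6l
3 47
;
run;
```

The step still finishes and the bad value is set to missing, which is why the log has to be read even when output appears.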

common errors

ALWAYS CHECK YOUR LOG

data input tips

some informats

taming wild data sets

importing Excel spreadsheets
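One common route is PROC IMPORT, sketched below. The file path, sheet name, and output data set name are hypothetical, and DBMS=XLSX assumes your site licenses SAS/ACCESS Interface to PC Files.

```sas
proc import datafile='C:\mydata\cases.xlsx'   /* hypothetical workbook */
            out=work.cases                    /* hypothetical output data set */
            dbms=xlsx
            replace;                          /* overwrite work.cases if it exists */
   sheet='Sheet1';                            /* hypothetical sheet name */
   getnames=yes;                              /* first row holds variable names */
run;

proc contents data=work.cases;                /* check the types SAS assigned */
run;
```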