SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Downloaden Sie, um offline zu lesen
SAS CHEAT SHEET
April 17, 2017
Dr. Ali Ajouz
Data analyst with a
strong math
background and
experience in machine
learning, computational
number theory , and
statistics. Passionate
about explaining data
science to non-technical
business audiences.
● SAS and all other SAS Institute Inc. product or service
names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates
USA registration.
● The following example codes often works. Refer to SAS
docs for all arguments.
Linkedin: https://www.linkedin.com/in/ajouz/ Email: aliajouz@gmail.com
Version: 1.0.2
From Raw Data to Final Analysis
1. You begin with raw data, that is, a collection of data that
has not yet been processed by SAS.
2. You use a set of statements known as a DATA step to get
your data into a SAS data set.
3. Then you can further process your data with additional
DATA step programming or with SAS procedures.
Reading raw data from instream data lines
DATA STEP
INFILE: identifies an external file to read
INPUT: specifies the variables in the new data
LENGTH: the number of bytes for storing
variables
DATALINES: indicates that data lines follow
RUN: execute the DATA step.
data names;
infile datalines delimiter=',';
length first last $25.;
input first $ last $;
datalines;
Ali, Ajouz
Karam, Ghawi
;
run;
Some useful infile options are: DELIMITER,
MISSOVER, DLM, FIRSTOBS
Reading raw data from external file
Using: DATA STEP
using INFILE statement to identify an
external file to read with an
INPUT statement
Using: PROC IMPORT
DATAFILE: the location of file you want
to read.
DBMS : specifies the type of data to
import.
proc import
datafile="C:file.csv"
out=outdata
dbms=csv
replace;
getnames=yes;
data names;
infile "file-path" delimiter=',';
length first last $25.;
input first $ last $;
run;
Reading raw data (3)
USING FORMAT AND INFORMAT
data names;
infile datalines delimiter=',';
informat first $15. last $15. birthday ddmmyy10.;
format birthday date9.;
input first $ last $ birthday;
datalines;
Ali,Ajouz, 04/11/1980
;
INFORMATS
Informats tell SAS how to read the
data.
FORMATS
formats tell SAS how to display the
data.
$w. Reads in
character data of
length w.
w.d Reads in numeric
data of length w
with d decimal
points
MMDDYYw. Reads in date
data in the form
of 04-11-80
POPULAR INFORMATS
VS
Single observations from
multiple records
data multilines;
infile datalines;
length FullName Addresse $36;
input FullName $36.;
input Addresse $36.;
datalines;
Ali Ajouz
Danziger str.7
Karam Ghawi
Berliner str.32
;
Reading raw data
(4)
Multiple observations
from single record
data multilines;
infile datalines delimiter=",";
length FullName Addresse $36;
input FullName$ Addresse $ @@;
datalines;
Ali Ajouz, Danziger str.7, Karam Ghawi,Berliner
str.32
;
run;
We can use multiple INPUT
statements to read group of
records as single observation.
We can use double trailing @@ to read a
record into a group of observations
Selecting columns by name
data selected;
set sashelp.cars;
keep model type;
Deleting columns by name
data selected;
set sashelp.cars;
drop model type;
Change column label
data selected;
set sashelp.cars;
label model="car model"
type="car type";
Selecting columns by position
%let lib= sashelp;
%let mydata= class;
%let to_keep=(1:3); /* range */
proc sql noprint;
select name into :vars separated by ' '
from dictionary.columns
where libname="%upcase(&lib.)" and
memname="%upcase(&mydata)"
and varnum in &to_keep;
quit;
data want;
set &lib..&mydata (keep=&vars.);
%let to_keep=(1,3,5); /*list */
OR
Selecting columns
Ignore the first N rows
data want;
set sashelp.buy (firstobs=5);
Select rows by using IF statement
data want;
set sashelp.buy;
if amount >=-1000;
Limit numbers of rows to be read
data want;
set sashelp.buy (obs=5);
Selecting rows by index
data want;
do Slice=2,1,7,4;
set sashelp.buy point=Slice;
output;
end;
stop;
run;
Random sample
proc surveyselect data=sashelp.cars
method=srs n=10 seed=12345
out=SampleSRS;
run;
Selecting Rows
Peek at the data set contentsData attributes + columns list
proc datasets ;
contents
data=SASHELP.HEART
order=collate;
quit;
Data header (first 5 rows)
proc print data=sashelp.class (obs=5);
Data summary statistics
proc means data=sashelp.iris
N mean std min max q1 median
q3;
Frequency tables
proc freq data=sashelp.class;
tables SEX/ plots=freqplot;
Number of rows
data _NULL_;
if 0 then set sashelp.class nobs=x;
call symputx('nobs', x);
stop;
run;
%put #rows = &nobs;
Sorting a data set
The Sort procedure order SAS
data set observations by the
values of one or more character
or numeric variables .
data Score ;
infile datalines dsd;
input Name$ Score;
datalines;
Ali,66
Karam,82
Roaa,57
Hala,57
;
proc sort data=Score out=sorted;
by descending score;
run;
proc print data=sorted;
var Name Score;
title 'Listed by descending Score';
run;
+ addition
- subtraction
* multiplicatio
n
** exponentiat
ion
/ division
Arithmetic Operators
= EQ Equal to
^= NE Not Equal to
> GT greater than
< LT less than
>= GE greater than
or equal to
<= LE less than or
equal to
IN is one of
Comparison operators
& AND
| OR
^ NOT
Logical operators
A SAS operator is a
symbol that
represents a
comparison,
arithmetic
calculation, or logical
operation; a SAS
function; or grouping
parentheses.
Some SAS
Operators
If statement
SAS evaluates the expression in
an IF-THEN/ELSE statement to
produce a result that is either
nonzero, zero, or missing. A
nonzero and nonmissing result
causes the expression to be true;
a result of zero or missing causes
the expression to be false.
data names;
infile datalines;
input Name$;
datalines;
Ali
Karam
Roaa
Hala
;
data titles;
set names;
if name="Ali" then
do;
title="Papa";
end;
else if name="Karam" then
do;
title="Mama";
end;
else
title="Kid";
Select statement
Select VS If:
When you have a long series of
mutually exclusive conditions,
using a SELECT group is more
efficient than using a series of
IF-THEN statements because
CPU time is reduced.
data names;
infile datalines;
input Name$;
datalines;
Ali
Karam
Roaa
Hala
;
data titles;
set names;
select (name);
when ("Ali") title="Papa";
when ("Karam")
title="Mama";
otherwise title="Kid";
end;
run;
DO Statement
The basic iterative DO
statement in SAS has the syntax
DO value = start TO stop. An END
statement marks the end of the
loop, as shown in the following
examples.
data df;
do x=1 to 3;
y=x**2;
output;
end;
run;
data df;
x=1;
do while (x<4);
y=x**2;
output;
x=x + 1;
end;
run;
data df;
do x=1,2,3;
y=x**2;
output;
end;
run;
data df;
x=1;
do until (x>3);
y=x**2;
output;
x=x + 1;
end;
run;
Transposing data with the data step
data raw;
infile datalines
delimiter=",";
input ID $ Name$ Try
result;
datalines;
1,Ali,1,160
1,Ali,2,140
2,Karam,1,141
2,Karam,3,161
;
proc sort
data=raw;
by id;
run;
data rawflat;
set raw;
by id;
keep id Name Try1-Try3;
retain Try1-Try3;
array myvars {*} Try1-Try3;
if first.id then
do i = 1 to
dim(myvars);
myvars{i} = .;
end;
myvars{Try} = result;
if last.id;
run;
Step 1: Sample data
Step 2: Sorting by ID
Step 3: Transposing the data
Labels and Formats
We can use LABEL Statement to
assigns descriptive labels to
variables. Also, FORMAT Statement
to associates formats with variables.
data Salary;
infile datalines dsd;
input Name$ Salary;
format Salary dollar12.;
Label Name="First Name"
Salary="Annual Salary";
datalines;
Ali,45000
Karam,50000
;
run;
proc print data=Salary label;
run;
Ranking
The RANK procedure computes
ranks for one or more numeric
variables across the
observations of a SAS data set
and writes the ranks to a new
SAS data set.
data Score ;
infile datalines dsd;
input Name$ Score;
datalines;
Ali,66
Karam,82
Roaa,57
Hala,57
;
proc rank data=Score
out=order descending ties=low;
var Score ;
ranks ScoreRank;
run;
proc print data=order;
title "Rankings of Participants'
Scores";
run;
to all variables x1 to xn
function-name(of x1-xn)
Working with numeric values
SUM Sum of nonmissing
arguments
MEAN Average of arguments
MIN Smallest value
MAX Largest value
CEIL Smallest integer
greater than or equal
to argument
FLOOR Greatest integer less
than or equal to
argument
Some useful functions
to all variables begin with same string
function-name(of x:)
to special name list
function-name(of _ALL_)
function-name(of _Character_)
function-name(of _Numeric_)
Apply functions
function-name (var1, var2, ........., varn)
Accumulating variable using (+)
data total;
set sashelp.buy;
TotalAmount + Amount;
Summarizing data by groups
proc sort data=sashelp.class
out=Class;
by sex;
data Class (keep= sex count);
set Class;
by sex;
if first.sex then
do;
count=0;
end;
count+1;
if last.sex;
Summarizing
data
In the DATA step, SAS identifies
the beginning and end of each BY
group by creating two temporary
variables for each BY variable:
FIRST.variable and LAST.variable.
Working with
missing values
A missing value is a valid value in SAS. A missing
character value is displayed as a blank, and a
missing numeric value is displayed as a period.
STDIZE procedure with the REPONLY option can be
used to replace only the missing values. The
METHOD= option enables you to choose different
location measures such as the mean, median, and
midrange. Only numeric input variables should be
used in PROC STDIZE.
Step 1: Setup
/* data set*/
%let mydata
=sashelp.baseball;
/* vars to impute */
%let inputs = salary;
/* missing Indicators*/
%let MissingInd =
MIsalary;
data MissingInd
(drop=i);
set &mydata;
array mi{*}
&MissingInd;
array x{*} &inputs;
do i=1 to dim(mi);
mi{i}=(x{i}=.);
end;
run;
Step 2: Missing
indicators
proc stdize data=MissingInd
reponly
method=median
out=imputed;
var &inputs;
run;
Step 3: Impute missing
values
Last observation carried forward (LOCF)
data raw;
infile datalines
delimiter=",";
input ID $ LastTry;
datalines;
1,160
1,.
1,140
2,141
2,.
;
proc sort
data=raw;
by id;
run;
data rawflat;
set raw;
by id;
retain LOCF;
* set to missing to avoid being carried
to the next ID;
if first.id then
do;
LOCF=.;
end;
* update from last non missing value;
if LastTry not EQ . then
LOCF=LastTry;
run;
Step 1: Sample data
Step 2: Sorting by ID
Step 3: Compute LOCF
LOCF
One method of
handling missing
data is simply to
impute, or fill in,
values based on
existing data.
Working with character values
Function Details Example
SUBSTR(char, position ,n) Extracts a substring
from an argument
phone = "(01573) 3114800" ;
area = substr(phone, 2, 5) ;
SCAN(char, n, 'delimiters') Returns the nth word
from a string.
Name = "Dr. Ali Ajouz" ;
title = scan(Name, 1, ".") ;
INDEX(source,excerpt) Searches a character
for a string
INDEX('Ali Ajouz','Ajo');
COMPRESS Remove characters
from a string
phone="(01573)- 3114800";
phone_c= compress(phone,'(-) ');
Some useful functions
Working with character values (2)
Function Details Example
TRANWRD Replaces or removes
given word within a string
old ="I eat Apple";
new =
TRANWRD(old,"Apple","Pizza");
CATX(sep, str1,str2, .) Removes leading and
trailing blanks, inserts
delimiters, and returns a
concatenated character
string.
First="Ali";
Last = "Ajouz";
UserName=catx(".", First, Last);
STRIP(char) Removes leading and
trailing blanks
STRIP(" aliajouz@gmail.com ");
Also UPCASE for uppercase and LOWCASE for lowercase
Some useful functions
SAS dates and times
SAS DATE VALUE: is a value that
represents the number of days between
January 1, 1960, and a specified date.
December 31, 1959 -1
January 1, 1960 0
January 2, 1960 1
Raw date to SAS date
data Birthday;
infile datalines;
input date ddmmyy10.;
datalines;
04/11/1980
;
Converting a text date to SAS
date USING INFORMATS
Refer to SAS docs for all
available INFORMATS.
INPUT INFORMAT
04/11/1980 mmddyy10.
04/11/80 mmddyy8.
04nov1980 date9.
nov 04, 1980 worddate12
.
Example
data Me (drop=DateTime);
format Date ddmmyy10. Time
time.;
DateTime='04Nov80:12:15'dt;
Date=datepart(DateTime);
Time=timepart(DateTime);
Day=day(Date);
Month=month(Date);
Year=year(Date);
run;
Working with dates, and times
DATEPART(datetime) Extracts the date from a
SAS datetime value.
TIMEPART(datetime) Extracts a time value
from a SAS datetime
value.
DAY(date) Returns the day of the
month from a SAS date
value.
MONTH(date) Returns the month
from a SAS date value.
YEAR(date) Returns the year from a
SAS date value.
DATE / TODAY Returns the current
date
Some useful functions
Concatenating
data sets rows
+ =
data result;
set data1 data2;
run;
We can use SET statement
to combine data sets
vertically.
Concatenating
data sets
columns
+
=
data result;
merge data1 data2;
run;
We can use Merge
statement to combine data
sets horizontally
(One-to-One) .
Merging data sets on a
common column
data result;
merge data1(in=InData1)
data2(in=InData2);
by ID;
/* code here */
if InData1=1 and InData2=0;
+
if InData1=0 and InData2=1; if InData1=1 and InData2=1;
We can use IN option in
the MERGE statement
with IF statement to
control the output.
to be continued ..
I welcome comments,
suggestions and corrections.
aliajouz@gmail.com

Weitere ähnliche Inhalte

Was ist angesagt?

Sas Functions INDEX / INDEXC / INDEXW
Sas Functions INDEX / INDEXC / INDEXWSas Functions INDEX / INDEXC / INDEXW
Sas Functions INDEX / INDEXC / INDEXWTHARUN PORANDLA
 
Introduction To Sas
Introduction To SasIntroduction To Sas
Introduction To Sashalasti
 
Base sas interview questions
Base sas interview questionsBase sas interview questions
Base sas interview questionsDr P Deepak
 
Understanding SAS Data Step Processing
Understanding SAS Data Step ProcessingUnderstanding SAS Data Step Processing
Understanding SAS Data Step Processingguest2160992
 
Where Vs If Statement
Where Vs If StatementWhere Vs If Statement
Where Vs If StatementSunil Gupta
 
Report procedure
Report procedureReport procedure
Report procedureMaanasaS
 
SAS Macros part 2
SAS Macros part 2SAS Macros part 2
SAS Macros part 2venkatam
 
Introduction to SAS
Introduction to SASIntroduction to SAS
Introduction to SASizahn
 
A Step-By-Step Introduction to SAS Report Procedure
A Step-By-Step Introduction to SAS Report ProcedureA Step-By-Step Introduction to SAS Report Procedure
A Step-By-Step Introduction to SAS Report ProcedureYesAnalytics
 

Was ist angesagt? (20)

Sas Functions INDEX / INDEXC / INDEXW
Sas Functions INDEX / INDEXC / INDEXWSas Functions INDEX / INDEXC / INDEXW
Sas Functions INDEX / INDEXC / INDEXW
 
Introduction To Sas
Introduction To SasIntroduction To Sas
Introduction To Sas
 
Base sas interview questions
Base sas interview questionsBase sas interview questions
Base sas interview questions
 
SAS Proc SQL
SAS Proc SQLSAS Proc SQL
SAS Proc SQL
 
Understanding SAS Data Step Processing
Understanding SAS Data Step ProcessingUnderstanding SAS Data Step Processing
Understanding SAS Data Step Processing
 
Where Vs If Statement
Where Vs If StatementWhere Vs If Statement
Where Vs If Statement
 
Sas practice programs
Sas practice programsSas practice programs
Sas practice programs
 
Report procedure
Report procedureReport procedure
Report procedure
 
SAS Macros part 2
SAS Macros part 2SAS Macros part 2
SAS Macros part 2
 
Introduction to SAS
Introduction to SASIntroduction to SAS
Introduction to SAS
 
Sas Plots Graphs
Sas Plots GraphsSas Plots Graphs
Sas Plots Graphs
 
SAS BASICS
SAS BASICSSAS BASICS
SAS BASICS
 
Sas summary guide
Sas summary guideSas summary guide
Sas summary guide
 
SAS Macro
SAS MacroSAS Macro
SAS Macro
 
Sas
SasSas
Sas
 
INTRODUCTION TO SAS
INTRODUCTION TO SASINTRODUCTION TO SAS
INTRODUCTION TO SAS
 
SAS Internal Training
SAS Internal TrainingSAS Internal Training
SAS Internal Training
 
SAS - overview of SAS
SAS - overview of SASSAS - overview of SAS
SAS - overview of SAS
 
A Step-By-Step Introduction to SAS Report Procedure
A Step-By-Step Introduction to SAS Report ProcedureA Step-By-Step Introduction to SAS Report Procedure
A Step-By-Step Introduction to SAS Report Procedure
 
Proc report
Proc reportProc report
Proc report
 

Ähnlich wie SAS cheat sheet

Sas-training-in-mumbai
Sas-training-in-mumbaiSas-training-in-mumbai
Sas-training-in-mumbaiUnmesh Baile
 
Introduction to-sas-1211594349119006-8
Introduction to-sas-1211594349119006-8Introduction to-sas-1211594349119006-8
Introduction to-sas-1211594349119006-8thotakoti
 
98765432345671223Intro-to-PostgreSQL.ppt
98765432345671223Intro-to-PostgreSQL.ppt98765432345671223Intro-to-PostgreSQL.ppt
98765432345671223Intro-to-PostgreSQL.pptHastavaramDineshKuma
 
Introduction to SAS Data Set Options
Introduction to SAS Data Set OptionsIntroduction to SAS Data Set Options
Introduction to SAS Data Set OptionsMark Tabladillo
 
Unit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptxUnit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptxprakashvs7
 
I need help with Applied Statistics and the SAS Programming Language.pdf
I need help with Applied Statistics and the SAS Programming Language.pdfI need help with Applied Statistics and the SAS Programming Language.pdf
I need help with Applied Statistics and the SAS Programming Language.pdfMadansilks
 
Reason - introduction to language and its ecosystem | Łukasz Strączyński
Reason - introduction to language and its ecosystem | Łukasz StrączyńskiReason - introduction to language and its ecosystem | Łukasz Strączyński
Reason - introduction to language and its ecosystem | Łukasz StrączyńskiGrand Parade Poland
 
Bringing OpenClinica Data into SAS
Bringing OpenClinica Data into SASBringing OpenClinica Data into SAS
Bringing OpenClinica Data into SASRick Watts
 
Prog1 chap1 and chap 2
Prog1 chap1 and chap 2Prog1 chap1 and chap 2
Prog1 chap1 and chap 2rowensCap
 
Graph Database Query Languages
Graph Database Query LanguagesGraph Database Query Languages
Graph Database Query LanguagesJay Coskey
 
SAS Mainframe -Program-Tips
SAS Mainframe -Program-TipsSAS Mainframe -Program-Tips
SAS Mainframe -Program-TipsSrinimf-Slides
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottPyData
 
Mysqlppt
MysqlpptMysqlppt
MysqlpptReka
 
Introducción al Software Analítico SAS
Introducción al Software Analítico SASIntroducción al Software Analítico SAS
Introducción al Software Analítico SASJorge Rodríguez M.
 
SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6Mahesh Vallampati
 

Ähnlich wie SAS cheat sheet (20)

Sas classes in mumbai
Sas classes in mumbaiSas classes in mumbai
Sas classes in mumbai
 
Sas-training-in-mumbai
Sas-training-in-mumbaiSas-training-in-mumbai
Sas-training-in-mumbai
 
Introduction to-sas-1211594349119006-8
Introduction to-sas-1211594349119006-8Introduction to-sas-1211594349119006-8
Introduction to-sas-1211594349119006-8
 
Data Management in Python
Data Management in PythonData Management in Python
Data Management in Python
 
98765432345671223Intro-to-PostgreSQL.ppt
98765432345671223Intro-to-PostgreSQL.ppt98765432345671223Intro-to-PostgreSQL.ppt
98765432345671223Intro-to-PostgreSQL.ppt
 
Introduction to SAS Data Set Options
Introduction to SAS Data Set OptionsIntroduction to SAS Data Set Options
Introduction to SAS Data Set Options
 
Unit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptxUnit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptx
 
I need help with Applied Statistics and the SAS Programming Language.pdf
I need help with Applied Statistics and the SAS Programming Language.pdfI need help with Applied Statistics and the SAS Programming Language.pdf
I need help with Applied Statistics and the SAS Programming Language.pdf
 
Reason - introduction to language and its ecosystem | Łukasz Strączyński
Reason - introduction to language and its ecosystem | Łukasz StrączyńskiReason - introduction to language and its ecosystem | Łukasz Strączyński
Reason - introduction to language and its ecosystem | Łukasz Strączyński
 
Bringing OpenClinica Data into SAS
Bringing OpenClinica Data into SASBringing OpenClinica Data into SAS
Bringing OpenClinica Data into SAS
 
Prog1 chap1 and chap 2
Prog1 chap1 and chap 2Prog1 chap1 and chap 2
Prog1 chap1 and chap 2
 
Graph Database Query Languages
Graph Database Query LanguagesGraph Database Query Languages
Graph Database Query Languages
 
SAS Mainframe -Program-Tips
SAS Mainframe -Program-TipsSAS Mainframe -Program-Tips
SAS Mainframe -Program-Tips
 
Pig
PigPig
Pig
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
 
Mysqlppt
MysqlpptMysqlppt
Mysqlppt
 
Aggregate.pptx
Aggregate.pptxAggregate.pptx
Aggregate.pptx
 
Data Management in R
Data Management in RData Management in R
Data Management in R
 
Introducción al Software Analítico SAS
Introducción al Software Analítico SASIntroducción al Software Analítico SAS
Introducción al Software Analítico SAS
 
SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6
 

Kürzlich hochgeladen

The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 

Kürzlich hochgeladen (20)

The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 

SAS cheat sheet

  • 1. SAS CHEAT SHEET April 17, 2017 Dr. Ali Ajouz Data analyst with a strong math background and experience in machine learning, computational number theory , and statistics. Passionate about explaining data science to non-technical business audiences. ● SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. ● The following example codes often works. Refer to SAS docs for all arguments. Linkedin: https://www.linkedin.com/in/ajouz/ Email: aliajouz@gmail.com Version: 1.0.2
  • 2. From Raw Data to Final Analysis 1. You begin with raw data, that is, a collection of data that has not yet been processed by SAS. 2. You use a set of statements known as a DATA step to get your data into a SAS data set. 3. Then you can further process your data with additional DATA step programming or with SAS procedures.
  • 3. Reading raw data from instream data lines DATA STEP INFILE: identifies an external file to read INPUT: specifies the variables in the new data LENGTH: the number of bytes for storing variables DATALINES: indicates that data lines follow RUN: execute the DATA step. data names; infile datalines delimiter=','; length first last $25.; input first $ last $; datalines; Ali, Ajouz Karam, Ghawi ; run; Some useful infile options are: DELIMITER, MISSOVER, DLM, FIRSTOBS
  • 4. Reading raw data from external file Using: DATA STEP using INFILE statement to identify an external file to read with an INPUT statement Using: PROC IMPORT DATAFILE: the location of file you want to read. DBMS : specifies the type of data to import. proc import datafile="C:file.csv" out=outdata dbms=csv replace; getnames=yes; data names; infile "file-path" delimiter=','; length first last $25.; input first $ last $; run;
  • 5. Reading raw data (3) USING FORMAT AND INFORMAT data names; infile datalines delimiter=','; informat first $15. last $15. birthday ddmmyy10.; format birthday date9.; input first $ last $ birthday; datalines; Ali,Ajouz, 04/11/1980 ; INFORMATS Informats tell SAS how to read the data. FORMATS formats tell SAS how to display the data. $w. Reads in character data of length w. w.d Reads in numeric data of length w with d decimal points MMDDYYw. Reads in date data in the form of 04-11-80 POPULAR INFORMATS VS
  • 6. Single observations from multiple records data multilines; infile datalines; length FullName Addresse $36; input FullName $36.; input Addresse $36.; datalines; Ali Ajouz Danziger str.7 Karam Ghawi Berliner str.32 ; Reading raw data (4) Multiple observations from single record data multilines; infile datalines delimiter=","; length FullName Addresse $36; input FullName$ Addresse $ @@; datalines; Ali Ajouz, Danziger str.7, Karam Ghawi,Berliner str.32 ; run; We can use multiple INPUT statements to read group of records as single observation. We can use double trailing @@ to read a record into a group of observations
  • 7. Selecting columns by name data selected; set sashelp.cars; keep model type; Deleting columns by name data selected; set sashelp.cars; drop model type; Change column label data selected; set sashelp.cars; label model="car model" type="car type"; Selecting columns by position %let lib= sashelp; %let mydata= class; %let to_keep=(1:3); /* range */ proc sql noprint; select name into :vars separated by ' ' from dictionary.columns where libname="%upcase(&lib.)" and memname="%upcase(&mydata)" and varnum in &to_keep; quit; data want; set &lib..&mydata (keep=&vars.); %let to_keep=(1,3,5); /*list */ OR Selecting columns
  • 8. Ignore the first N rows data want; set sashelp.buy (firstobs=5); Select rows by using IF statement data want; set sashelp.buy; if amount >=-1000; Limit numbers of rows to be read data want; set sashelp.buy (obs=5); Selecting rows by index data want; do Slice=2,1,7,4; set sashelp.buy point=Slice; output; end; stop; run; Random sample proc surveyselect data=sashelp.cars method=srs n=10 seed=12345 out=SampleSRS; run; Selecting Rows
  • 9. Peek at the data set contentsData attributes + columns list proc datasets ; contents data=SASHELP.HEART order=collate; quit; Data header (first 5 rows) proc print data=sashelp.class (obs=5); Data summary statistics proc means data=sashelp.iris N mean std min max q1 median q3; Frequency tables proc freq data=sashelp.class; tables SEX/ plots=freqplot; Number of rows data _NULL_; if 0 then set sashelp.class nobs=x; call symputx('nobs', x); stop; run; %put #rows = &nobs;
  • 10. Sorting a data set The Sort procedure order SAS data set observations by the values of one or more character or numeric variables . data Score ; infile datalines dsd; input Name$ Score; datalines; Ali,66 Karam,82 Roaa,57 Hala,57 ; proc sort data=Score out=sorted; by descending score; run; proc print data=sorted; var Name Score; title 'Listed by descending Score'; run;
  • 11. + addition - subtraction * multiplicatio n ** exponentiat ion / division Arithmetic Operators = EQ Equal to ^= NE Not Equal to > GT greater than < LT less than >= GE greater than or equal to <= LE less than or equal to IN is one of Comparison operators & AND | OR ^ NOT Logical operators A SAS operator is a symbol that represents a comparison, arithmetic calculation, or logical operation; a SAS function; or grouping parentheses. Some SAS Operators
  • 12. If statement SAS evaluates the expression in an IF-THEN/ELSE statement to produce a result that is either nonzero, zero, or missing. A nonzero and nonmissing result causes the expression to be true; a result of zero or missing causes the expression to be false. data names; infile datalines; input Name$; datalines; Ali Karam Roaa Hala ; data titles; set names; if name="Ali" then do; title="Papa"; end; else if name="Karam" then do; title="Mama"; end; else title="Kid";
  • 13. Select statement Select VS If: When you have a long series of mutually exclusive conditions, using a SELECT group is more efficient than using a series of IF-THEN statements because CPU time is reduced. data names; infile datalines; input Name$; datalines; Ali Karam Roaa Hala ; data titles; set names; select (name); when ("Ali") title="Papa"; when ("Karam") title="Mama"; otherwise title="Kid"; end; run;
  • 14. DO Statement The basic iterative DO statement in SAS has the syntax DO value = start TO stop. An END statement marks the end of the loop, as shown in the following examples. data df; do x=1 to 3; y=x**2; output; end; run; data df; x=1; do while (x<4); y=x**2; output; x=x + 1; end; run; data df; do x=1,2,3; y=x**2; output; end; run; data df; x=1; do until (x>3); y=x**2; output; x=x + 1; end; run;
  • 15. Transposing data with the data step data raw; infile datalines delimiter=","; input ID $ Name$ Try result; datalines; 1,Ali,1,160 1,Ali,2,140 2,Karam,1,141 2,Karam,3,161 ; proc sort data=raw; by id; run; data rawflat; set raw; by id; keep id Name Try1-Try3; retain Try1-Try3; array myvars {*} Try1-Try3; if first.id then do i = 1 to dim(myvars); myvars{i} = .; end; myvars{Try} = result; if last.id; run; Step 1: Sample data Step 2: Sorting by ID Step 3: Transposing the data
  • 16. Labels and Formats We can use LABEL Statement to assigns descriptive labels to variables. Also, FORMAT Statement to associates formats with variables. data Salary; infile datalines dsd; input Name$ Salary; format Salary dollar12.; Label Name="First Name" Salary="Annual Salary"; datalines; Ali,45000 Karam,50000 ; run; proc print data=Salary label; run;
  • 17. Ranking The RANK procedure computes ranks for one or more numeric variables across the observations of a SAS data set and writes the ranks to a new SAS data set. data Score ; infile datalines dsd; input Name$ Score; datalines; Ali,66 Karam,82 Roaa,57 Hala,57 ; proc rank data=Score out=order descending ties=low; var Score ; ranks ScoreRank; run; proc print data=order; title "Rankings of Participants' Scores"; run;
  • 18. to all variables x1 to xn function-name(of x1-xn) Working with numeric values SUM Sum of nonmissing arguments MEAN Average of arguments MIN Smallest value MAX Largest value CEIL Smallest integer greater than or equal to argument FLOOR Greatest integer less than or equal to argument Some useful functions to all variables begin with same string function-name(of x:) to special name list function-name(of _ALL_) function-name(of _Character_) function-name(of _Numeric_) Apply functions function-name (var1, var2, ........., varn)
  • 19. Accumulating variable using (+) data total; set sashelp.buy; TotalAmount + Amount; Summarizing data by groups proc sort data=sashelp.class out=Class; by sex; data Class (keep= sex count); set Class; by sex; if first.sex then do; count=0; end; count+1; if last.sex; Summarizing data In the DATA step, SAS identifies the beginning and end of each BY group by creating two temporary variables for each BY variable: FIRST.variable and LAST.variable.
  • 20. Working with missing values A missing value is a valid value in SAS. A missing character value is displayed as a blank, and a missing numeric value is displayed as a period. STDIZE procedure with the REPONLY option can be used to replace only the missing values. The METHOD= option enables you to choose different location measures such as the mean, median, and midrange. Only numeric input variables should be used in PROC STDIZE. Step 1: Setup /* data set*/ %let mydata =sashelp.baseball; /* vars to impute */ %let inputs = salary; /* missing Indicators*/ %let MissingInd = MIsalary; data MissingInd (drop=i); set &mydata; array mi{*} &MissingInd; array x{*} &inputs; do i=1 to dim(mi); mi{i}=(x{i}=.); end; run; Step 2: Missing indicators proc stdize data=MissingInd reponly method=median out=imputed; var &inputs; run; Step 3: Impute missing values
  • 21. Last observation carried forward (LOCF) data raw; infile datalines delimiter=","; input ID $ LastTry; datalines; 1,160 1,. 1,140 2,141 2,. ; proc sort data=raw; by id; run; data rawflat; set raw; by id; retain LOCF; * set to missing to avoid being carried to the next ID; if first.id then do; LOCF=.; end; * update from last non missing value; if LastTry not EQ . then LOCF=LastTry; run; Step 1: Sample data Step 2: Sorting by ID Step 3: Compute LOCF LOCF One method of handling missing data is simply to impute, or fill in, values based on existing data.
  • 22. Working with character values Function Details Example SUBSTR(char, position ,n) Extracts a substring from an argument phone = "(01573) 3114800" ; area = substr(phone, 2, 5) ; SCAN(char, n, 'delimiters') Returns the nth word from a string. Name = "Dr. Ali Ajouz" ; title = scan(Name, 1, ".") ; INDEX(source,excerpt) Searches a character for a string INDEX('Ali Ajouz','Ajo'); COMPRESS Remove characters from a string phone="(01573)- 3114800"; phone_c= compress(phone,'(-) '); Some useful functions
  • 23. Working with character values (2) Function Details Example TRANWRD Replaces or removes given word within a string old ="I eat Apple"; new = TRANWRD(old,"Apple","Pizza"); CATX(sep, str1,str2, .) Removes leading and trailing blanks, inserts delimiters, and returns a concatenated character string. First="Ali"; Last = "Ajouz"; UserName=catx(".", First, Last); STRIP(char) Removes leading and trailing blanks STRIP(" aliajouz@gmail.com "); Also UPCASE for uppercase and LOWCASE for lowercase Some useful functions
  • 24. SAS dates and times SAS DATE VALUE: is a value that represents the number of days between January 1, 1960, and a specified date. December 31, 1959 -1 January 1, 1960 0 January 2, 1960 1 Raw date to SAS date data Birthday; infile datalines; input date ddmmyy10.; datalines; 04/11/1980 ; Converting a text date to SAS date USING INFORMATS Refer to SAS docs for all available INFORMATS. INPUT INFORMAT 04/11/1980 mmddyy10. 04/11/80 mmddyy8. 04nov1980 date9. nov 04, 1980 worddate12 .
  • 25. Example data Me (drop=DateTime); format Date ddmmyy10. Time time.; DateTime='04Nov80:12:15'dt; Date=datepart(DateTime); Time=timepart(DateTime); Day=day(Date); Month=month(Date); Year=year(Date); run; Working with dates, and times DATEPART(datetime) Extracts the date from a SAS datetime value. TIMEPART(datetime) Extracts a time value from a SAS datetime value. DAY(date) Returns the day of the month from a SAS date value. MONTH(date) Returns the month from a SAS date value. YEAR(date) Returns the year from a SAS date value. DATE / TODAY Returns the current date Some useful functions
  • 26. Concatenating data sets rows + = data result; set data1 data2; run; We can use SET statement to combine data sets vertically.
  • 27. Concatenating data sets columns + = data result; merge data1 data2; run; We can use Merge statement to combine data sets horizontally (One-to-One) .
  • 28. Merging data sets on a common column data result; merge data1(in=InData1) data2(in=InData2); by ID; /* code here */ if InData1=1 and InData2=0; + if InData1=0 and InData2=1; if InData1=1 and InData2=1; We can use IN option in the MERGE statement with IF statement to control the output.
  • 29. to be continued .. I welcome comments, suggestions and corrections. aliajouz@gmail.com