2. www.edureka.co/base-sas
Know your instructor
• Hi, my name is Rakesh, and I am your trainer for tonight.
• I have a Master's in Statistics from IIT Kanpur.
• I am Base SAS certified.
• I have been training professionals and students for the last 6.5 years.
What will you learn today?
• Introduction to Analytics
• Why is there so much hype around SAS?
• Top 5 Features of SAS
• Types of SAS Datasets
• How the SAS Language works
• Reading Data into SAS
• Plotting graphs to understand the data
What is SAS and why all the hype around it?
• SAS (pronounced "sass") once stood for "statistical analysis system."
• It began at North Carolina State University as a project to analyze agricultural research.
• Demand for such software capabilities began to grow, and SAS was founded in 1976 to help
customers in all sorts of industries, from pharmaceutical companies and banks to academic and
governmental entities.
• SAS, both the software and the company, thrived throughout the next few decades.
Development of the software attained new heights in the industry because it could run across all
platforms, using the multivendor architecture for which it is known today.
Jim Goodnight
Why SAS?
• Is a mature development platform with rich product documentation
• Has well-regarded certification programs that carry a lot of weight in the job market
• Has strong product support
• Has been around for a long time and runs many mission-critical production jobs (such as
producing pharma regulatory reports and clinical trial analyses)
• Is often the first choice for fraud detection and for data analytics involving huge amounts of data
in big enterprises
• Is positioned as a market leader in data analytics and modeling (per Gartner and Forrester Research)
SAS Datasets
For SAS to work with data, the data must be in a special form called a SAS dataset.
Variables & Observations: A SAS dataset looks like a spreadsheet, where variables are the columns
and observations are the rows.
Size of SAS datasets: Before SAS 9.1, a SAS dataset could contain at most 32,767 variables; since then, the limit depends only on the resources of your computer. So, relax!
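To see the spreadsheet analogy in practice, a minimal sketch is to print a few rows of a sample dataset. This assumes the SASHELP library that ships with SAS is available in your installation; SASHELP.CLASS is used here only as a convenient example.

```sas
/* Print the first five observations (rows) of a sample dataset. */
/* Each column in the output is a variable; each row is an observation. */
proc print data=sashelp.class(obs=5);
run;
```

The OBS= dataset option simply limits how many observations are printed.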
SAS Datasets (Contd.)
NAMING CONVENTION:
Rules for SAS names of variables and datasets:
Rule 1: Names must be less than or equal to 32 characters in length.
Rule 2: Names must start with a letter or an underscore ( _ ).
Rule 3: Names may contain only letters, numbers, or underscores.
Rule 4: SAS names are case-insensitive.
A SAS dataset contains:
• Data
• Documentation on the data: name, date of creation, information about each variable, etc.
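The naming rules and the documentation stored with a dataset can be sketched as follows. The dataset name and variable names are made up for illustration; PROC CONTENTS is the standard procedure for listing a dataset's documentation.

```sas
/* Dataset and variable names below are hypothetical examples. */
data work._exam_2024;          /* starts with an underscore, under 32 characters */
    Student_ID = 101;          /* letters, digits, and underscores only */
    score1     = 88;
    SCORE2     = score1 + 5;   /* SCORE1 and score1 refer to the same variable */
run;

/* Show the documentation: dataset name, creation date, variable attributes. */
proc contents data=work._exam_2024;
run;
```

Because names are case-insensitive, `work._EXAM_2024` would refer to the same dataset.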
How SAS looks!
SAS has three sub-windows: 1. Editor, 2. Log, and 3. Output.
This slide shows the SAS interface: this is what the main SAS start page looks like.
How SAS looks! - LIBRARIES
Primarily, SAS has two kinds of libraries: 1. Temporary Library and 2. Permanent Library. A library is a place to store datasets.
WHAT IS TEMPORARY & PERMANENT LIBRARY?
Temporary Library: This is the default library of SAS, where data is stored temporarily. Data stored in the temporary
library is erased when we close the SAS session. "Work" is the temporary library.
Permanent Library: These are user-defined libraries in SAS. We create them ourselves, and their data is not erased when
we end the session. Later we will learn how to create a permanent library.
WHY DO WE NEED A LIBRARY?
We need a library to store our datasets. There is a simple analogy with day-to-day life:
normally we have files stored in folders or directories. Similarly, we have datasets stored in libraries in SAS.
The temporary library holds the datasets of the current session in a scratch location that SAS clears when the session ends.
A permanent library is used to store datasets for future use.
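As a rough sketch of the difference, the code below writes the same data once to the temporary WORK library and once to a permanent library. The library name `mylib` and the folder path are assumptions for illustration; point the path at a directory that exists on your machine.

```sas
/* Hypothetical folder: replace with a real directory on your system. */
libname mylib '/path/to/sas/data';

/* Temporary: stored in WORK and erased when the SAS session ends. */
data work.class_copy;
    set sashelp.class;
run;

/* Permanent: stored in the MYLIB folder and available in later sessions. */
data mylib.class_copy;
    set sashelp.class;
run;
```

A one-level name such as `class_copy` is treated as `work.class_copy` by default.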
Reading Data in SAS
How do we import or read data from a text file or CSV file?
We use the INFILE statement to read a CSV file.
What is a text file or CSV file?
A CSV file is a comma-separated values file.
This kind of data is usually created in a plain-text editor such as Notepad.
It is very useful in social media analytics,
for example with data from Twitter or Facebook.
What is DSD?
DSD stands for delimiter-sensitive data. It performs several functions. First, it changes the default delimiter from a blank to a
comma. Next, if there are two delimiters in a row, it assumes there is a missing value between them. Finally, if
character values are placed in quotes (single or double quotes), the quotes are stripped from the value. That's a
lot of mileage for just three letters!
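The three DSD behaviors described above can be tried with a small self-contained sketch. To avoid depending on an external file, it first writes a two-line CSV to a temporary fileref and then reads it back; the fileref `demo`, the dataset name, and the sample values are all made up for illustration.

```sas
/* Temporary fileref so the example does not need an existing CSV file. */
filename demo temp;

/* Write two made-up CSV lines. */
data _null_;
    file demo;
    put 'Asha,82,"Pune, India"';   /* quoted value containing a comma */
    put 'Ravi,,Delhi';             /* two commas in a row: missing score */
run;

/* Read the CSV back using INFILE with the DSD option. */
data work.people;
    infile demo dsd;
    input name :$20. score city :$30.;
run;

proc print data=work.people;
run;
```

In the printed result, Ravi's score should be missing and Asha's city should appear without the surrounding quotes. To read a real file, replace `demo` with the quoted path to your CSV.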
Reading Data in SAS
Why do we use DATALINES in SAS?
Suppose you want to write a short test program in SAS. Instead of having to place your data in an
external file, you can place your lines of data directly in your SAS program by using a DATALINES
statement.
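A minimal sketch of a short test program with in-line data follows. The dataset name, variable names, and values are invented for illustration.

```sas
/* Data lines are placed directly in the program after DATALINES. */
data work.scores;
    input name $ score;    /* $ marks NAME as a character variable */
    datalines;
Asha 82
Ravi 75
Meena 91
;
run;

proc print data=work.scores;
run;
```

Values are separated by blanks here because a blank is the default delimiter for list input.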
Reading Data in SAS
We do not put a semicolon after every data line; a single semicolon on its own line ends the data.
We can also use DATALINES and INFILE together to read data.
• We use DLM= to define a delimiter other than a space (the default delimiter).
• DLM= can define up to 200 delimiter characters.
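Combining INFILE with DATALINES lets INFILE options such as DLM= apply to in-line data. The sketch below uses a pipe character as the delimiter; the dataset name and the sample rows are made up for illustration.

```sas
/* INFILE DATALINES applies INFILE options to the in-line data. */
data work.cities;
    infile datalines dlm='|';
    input city :$20. state :$20. visits;
    datalines;
Pune|Maharashtra|12
New Delhi|Delhi|30
Kanpur|Uttar Pradesh|7
;
run;

proc print data=work.cities;
run;
```

Because the blank is no longer a delimiter, values such as "New Delhi" are read as a single field.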
Plotting graphs to understand the data
/*--SGPLOT proc statement--*/
proc sgplot data=SASHELP.IRIS;
    /*--TITLE and FOOTNOTE--*/
    title 'Iris Graph';
    footnote2 j=l 'Did you like it!!!';
    /*--Scatter plot settings--*/
    scatter x=SepalLength y=SepalWidth / group=Species
        transparency=0.0
        name='Scatter';
    /*--X Axis--*/
    xaxis grid;
    /*--Y Axis--*/
    yaxis grid;
run;
Using Data & Set statement
• When do we use the DATA statement?
When you write a DATA statement such as:
data test;
SAS creates a temporary SAS data set called Test.
• When do we use the SET statement?
The SET statement is used for reading observations from a SAS data set instead of lines from a raw data file.
Difference between reading raw data and reading with SET:
Variables that are read from a raw data file, or created by assignment statements in the DATA step, are reset to a missing value
at the start of each iteration of the DATA step. Variables that are
read from SAS data sets are not set to missing values during each iteration of the DATA step; they are said to be retained.
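As a small sketch of DATA and SET working together, the step below builds a new temporary data set from an existing one. It assumes the SASHELP.CLASS sample data set is available; the output name `teens` and the variable `height_cm` are made up for illustration.

```sas
/* Read observations from an existing SAS data set instead of a raw file. */
data work.teens;                 /* creates the temporary data set WORK.TEENS */
    set sashelp.class;           /* reads one observation per iteration */
    if age >= 13;                /* subsetting IF keeps only matching rows */
    height_cm = height * 2.54;   /* new variable; HEIGHT is in inches */
run;

proc print data=work.teens;
run;
```

The original SASHELP.CLASS data set is left unchanged; only WORK.TEENS is created.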