The Essence of DATA Step Programming
Arthur Li
City of Hope Comprehensive Cancer Center, Department of Information Science
INTRODUCTION
Fundamental of SAS programming: DATA step programming.
Essence of DATA step programming: understanding how SAS processes the data during the compilation and execution phases.
A COMMON BEFUDDLEMENT
DATA STEP PROCESSING OVERVIEW
A DATA step is processed in a two-phase sequence:
Compilation phase: each statement is scanned for syntax errors.
Execution phase: if there are no syntax errors, the DATA step reads and processes the input data.
DATA STEP PROCESSING OVERVIEW
Program1:
    data ex1;
      infile 'C:rthurxample1.txt';
      input name $ 1-7 height 9-10 weight 12-14;
      BMI = 700*weight/(height*height);
      output;
    run;
Example1.txt (the first line is a column ruler, not part of the file):
    12345678901234567890
    Barbara 61 12D
    John    62 175
Column layout:
    Variable  Columns
    Name      1-7
    Height    9-10
    Weight    12-14
Data entry error: the weight on the first record is 12D, which is not a valid number.
COMPILATION PHASE
The input buffer is created to hold one record of the raw data file.
The PDV (program data vector) is created: the memory area where SAS builds its new data set, 1 observation at a time.
Automatic variables:
    _N_ counts the iterations of the DATA step. _N_ = 1: the 1st observation is being processed. _N_ = 2: the 2nd observation is being processed.
    _ERROR_ = 1 signals a data error in the currently processed observation.
A space is added to the PDV for each variable in the INPUT statement (Name, Height, Weight). BMI is added to the PDV when the assignment statement is compiled.
PDV at the end of the compilation phase (D = dropped, K = kept):
    _N_ (D)   _ERROR_ (D)   Name (K)   Height (K)   Weight (K)   BMI (K)
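One way to see what the compilation phase has built is to print the PDV before any record is read. The following is a minimal sketch, not part of the original slides: it assumes the two records of Example1.txt are supplied in-line with DATALINES instead of the external file, and it uses PUTLOG _ALL_ followed by STOP so that the INPUT and assignment statements are compiled but never executed.

```sas
/* Sketch: inspect the PDV created at compile time.                   */
/* Assumption: the raw records are typed in-line (DATALINES) rather   */
/* than read from the external file used on the slides.               */
data _null_;
  putlog 'PDV at the top of the 1st iteration: ' _all_;
  stop;  /* end the step here; statements below are compiled only */
  infile datalines;
  input name $ 1-7 height 9-10 weight 12-14;
  BMI = 700*weight/(height*height);
datalines;
Barbara 61 12D
John    62 175
;
run;
```

The log line should list name, height, weight and BMI with missing values next to _ERROR_=0 and _N_=1, even though no statement that mentions them has run yet.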
EXECUTION PHASE
Start of the 1st iteration: _N_ is set to 1, _ERROR_ is set to 0, and the remaining variables in the PDV are set to missing.
1st iteration:
    1. The INPUT statement copies the first record into the input buffer: Barbara 61 12D
    2. Columns 1-7 are read into the PDV: Name = Barbara.
    3. Columns 9-10 are read into the PDV: Height = 61.
    4. Columns 12-14 contain 12D, which is not a valid number: Weight stays missing and _ERROR_ is set to 1.
    5. BMI = 700*weight/(height*height) is missing because Weight is missing.
    6. The OUTPUT statement writes the variables marked K to the data set Ex1.
PDV at the end of the 1st iteration:
    _N_ (D)   _ERROR_ (D)   Name (K)   Height (K)   Weight (K)   BMI (K)
    1         1             Barbara    61           .            .
Ex1:
    Obs   Name      Height   Weight   BMI
    1     Barbara   61       .        .
At the end of the 1st iteration:
    1. The SAS system returns to the beginning of the DATA step.
    2. The values of the variables in the PDV are reset to missing; _N_ is incremented to 2 and _ERROR_ is reset to 0.
2nd iteration:
    1. The INPUT statement copies the second record into the input buffer: John    62 175
    2. Name = John, Height = 62 and Weight = 175 are read into the PDV.
    3. BMI = 700*175/(62*62) = 31.8678.
    4. The OUTPUT statement writes the second observation to Ex1.
PDV at the end of the 2nd iteration:
    _N_ (D)   _ERROR_ (D)   Name (K)   Height (K)   Weight (K)   BMI (K)
    2         0             John       62           175          31.8678
Ex1:
    Obs   Name      Height   Weight   BMI
    1     Barbara   61       .        .
    2     John      62       175      31.8678
At the end of the 2nd iteration:
    1. The SAS system returns to the beginning of the DATA step.
    2. The values of the variables in the PDV are reset to missing; _N_ is incremented to 3.
Example1.txt has no third record, so the DATA step stops when the INPUT statement reaches the end of the file. The result can be printed with:
    proc print data=ex1;
    run;
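The walk through the execution phase can be reproduced in the SAS log. This is a hedged sketch rather than the author's program: it assumes the records are supplied with DATALINES instead of the external file, and it adds PUTLOG _ALL_ statements (not in Program1) to dump the PDV at three points of every iteration.

```sas
/* Sketch: trace the PDV while Program1 executes.                     */
/* Assumptions: in-line DATALINES replace the external file, and the  */
/* PUTLOG statements are added only for tracing.                      */
data ex1;
  infile datalines;
  putlog 'Top of iteration: ' _all_;
  input name $ 1-7 height 9-10 weight 12-14;
  putlog 'After INPUT:      ' _all_;
  BMI = 700*weight/(height*height);
  putlog 'After assignment: ' _all_;
  output;
datalines;
Barbara 61 12D
John    62 175
;
run;

proc print data=ex1;
run;
```

The "Top of iteration" line is written three times, because the third iteration starts before the INPUT statement finds that no record is left and ends the step.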
THE OUTPUT STATEMENT
    data ex1;
      set example1;
      BMI = 700*weight/(height*height);
    run;
This DATA step contains no OUTPUT statement, yet every observation is still written to Ex1: SAS executes an implicit OUTPUT at the end of each iteration, as if the program were written as
    data ex1;
      set example1;
      BMI = 700*weight/(height*height);
      output;
    run;
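The position of an explicit OUTPUT statement matters, because the observation is written from the PDV at the moment OUTPUT executes and the implicit OUTPUT at the end of the step is then no longer performed. The sketch below illustrates this under stated assumptions: the data set names `example1` and `early` and the in-line values (Barbara's weight taken as 170) are chosen for illustration only.

```sas
/* Sketch: an explicit OUTPUT placed before the assignment.           */
/* Assumption: example1 is built in-line; "early" is a made-up name.  */
data example1;
  input name $ height weight;
datalines;
Barbara 61 170
John 62 175
;
run;

data early;
  set example1;
  output;                             /* observation is written here  */
  BMI = 700*weight/(height*height);   /* computed after the write     */
run;

proc print data=early;
run;
```

BMI is missing on both observations of `early`: the value is calculated in the PDV only after the observation has already been written, and it is reset to missing at the start of the next iteration.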
THE DIFFERENCE BETWEEN READING A RAW DATASET AND READING A SAS DATASET
Reading a raw dataset: each record passes through the input buffer before it reaches the PDV.
    data ex1;
      infile 'C:rthurxample1.txt';
      input name $ 1-7 height 9-10 weight 12-14;
      BMI = 700*weight/(height*height);
      output;
    run;
    Raw data:
      Barbara 61 12D
      John    62 175
    SAS dataset (Ex1):
      Obs   Name      Height   Weight   BMI
      1     Barbara   61       .        .
      2     John      62       175      31.8678
Reading a SAS dataset: there is no input buffer; each observation is read from the input dataset straight into the PDV.
    data ex1;
      set example1;
      BMI = 700*weight/(height*height);
      output;
    run;
    Input dataset: Example1 (after "set")
      Obs   Name      Height   Weight
      1     Barbara   61       .
      2     John      62       175
    Output dataset: Ex1 (after "data")
      Obs   Name      Height   Weight   BMI
      1     Barbara   61       .        .
      2     John      62       175      31.8678
Trace of the SET version, using an Example1 in which Barbara's weight is 170:
    Example1:
      Obs   Name      Height   Weight
      1     Barbara   61       170
      2     John      62       175
1st iteration:
    1. _N_ is set to 1, _ERROR_ is set to 0, and the remaining variables are missing.
    2. The SET statement reads the first observation into the PDV: Name = Barbara, Height = 61, Weight = 170.
    3. BMI = 700*170/(61*61) = 31.9807.
    4. The OUTPUT statement writes the observation to Ex1.
2nd iteration:
    1. _N_ is incremented to 2 and _ERROR_ is reset to 0.
    2. Variables that exist in the input dataset (Name, Height, Weight) are not reset to missing; they keep their values until the SET statement overwrites them.
    3. Variables being created in the DATA step (BMI) are set to missing.
    4. The SET statement reads the second observation into the PDV: Name = John, Height = 62, Weight = 175.
PDV during the trace:
    Point in time                 _N_   _ERROR_   Name      Height   Weight   BMI
    1st iteration, start          1     0                   .        .        .
    1st iteration, after SET      1     0         Barbara   61       170      .
    1st iteration, after BMI      1     0         Barbara   61       170      31.9807
    2nd iteration, start          2     0         Barbara   61       170      .
    2nd iteration, after SET      2     0         John      62       175      .
Ex1 after the 1st iteration:
    Obs   Name      Height   Weight   BMI
    1     Barbara   61       170      31.9807
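The different treatment of variables read by SET and variables created in the DATA step can be checked by printing the PDV at the top of every iteration. The following is a sketch under stated assumptions: `example1` is built in-line with Barbara's weight taken as 170, and the PUTLOG statement is added purely for tracing.

```sas
/* Sketch: what is left in the PDV at the top of each iteration.      */
/* Assumption: example1 is built in-line; PUTLOG is a tracing aid.    */
data example1;
  input name $ height weight;
datalines;
Barbara 61 170
John 62 175
;
run;

data ex1;
  putlog 'Top of iteration: ' _all_;  /* runs before SET overwrites */
  set example1;
  BMI = 700*weight/(height*height);
run;
```

At the top of the 2nd iteration the log should still show Barbara's name, height and weight, while BMI is already back to missing. The same holds at the top of the 3rd iteration with John's values, just before SET finds no more observations and the step ends.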
THE RETAIN STATEMENT
Consider the following dataset (Ex2). The goal is to create TOTAL, the running sum of SCORE:
    Obs   ID    SCORE   TOTAL
    1     A01   3       3
    2     A02   .       3
    3     A03   4       7
Problem: TOTAL is a new variable that you want to create, so TOTAL will be set to missing in the PDV at the beginning of every iteration of the execution.
Solution: the RETAIN statement.
    RETAIN VARIABLE <VALUE>;
    VARIABLE: name of the variable that we want to retain.
    VALUE: optional initial value of the variable.
    data ex2_2;
      set ex2;
      retain total 0;
      total = sum(total, score);
    run;
Because of the RETAIN statement, Total starts at 0 and is not reset to missing at the beginning of each iteration. ID and Score come from the input dataset, so they also keep their values until the SET statement overwrites them.
PDV during execution (_N_ and _ERROR_ are dropped; ID, Score and Total are kept):
    Point in time                     _N_   _ERROR_   ID    Score   Total
    1st iteration, start              1     0               .       0
    1st iteration, after SET          1     0         A01   3       0
    1st iteration, after assignment   1     0         A01   3       3
    2nd iteration, start              2     0         A01   3       3
    2nd iteration, after SET          2     0         A02   .       3
    2nd iteration, after assignment   2     0         A02   .       3
    3rd iteration, start              3     0         A02   .       3
    3rd iteration, after SET          3     0         A03   4       3
    3rd iteration, after assignment   3     0         A03   4       7
The implicit OUTPUT at the end of each iteration writes one observation to Ex2_2:
    Obs   ID    SCORE   TOTAL
    1     A01   3       3
    2     A02   .       3
    3     A03   4       7
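The program accumulates with the SUM function rather than the + operator, and the missing score of A02 shows why. The sketch below compares the two; the data set name `compare` and the variable `total_plus` are hypothetical names introduced only for this illustration, and `ex2` is rebuilt in-line.

```sas
/* Sketch: SUM function versus + operator with a retained variable.   */
/* Assumption: "compare" and "total_plus" are illustrative names.     */
data ex2;
  input id $ score;
datalines;
A01 3
A02 .
A03 4
;
run;

data compare;
  set ex2;
  retain total 0 total_plus 0;
  total      = sum(total, score);   /* SUM ignores a missing argument */
  total_plus = total_plus + score;  /* + propagates a missing value   */
run;

proc print data=compare;
run;
```

TOTAL follows the values 3, 3, 7, whereas TOTAL_PLUS becomes missing at A02 and, because the missing value is retained, stays missing for A03.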
THE SUM STATEMENT
    VARIABLE + EXPRESSION;
The variable on the left is automatically initialized to 0 and retained across iterations, and a missing EXPRESSION is ignored.
The previous program
    data ex2_2;
      set ex2;
      retain total 0;
      total = sum(total, score);
    run;
can be re-written as
    data ex2_2;
      set ex2;
      total + score;
    run;
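A self-contained version of the rewritten program is sketched below, with one extra accumulator to show that the right-hand side can be any numeric expression. The data set name `ex2_3` and the variable `n_missing` are assumptions added for illustration; they do not appear in the slides.

```sas
/* Sketch: two sum statements in one DATA step.                       */
/* Assumption: "ex2_3" and "n_missing" are illustrative names.        */
data ex2;
  input id $ score;
datalines;
A01 3
A02 .
A03 4
;
run;

data ex2_3;
  set ex2;
  total + score;               /* running sum of SCORE: 3, 3, 7       */
  n_missing + missing(score);  /* running count of missing scores     */
run;

proc print data=ex2_3;
run;
```

Neither variable needs a RETAIN statement: the sum statement initializes each accumulator to 0 and keeps its value from one iteration to the next.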
THE SUBSETTING IF STATEMENT
    IF EXPRESSION;
If EXPRESSION is true, SAS continues with the statements that follow. If it is false, SAS stops processing the current observation, does not write it to the output dataset, and returns to the beginning of the DATA step.
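As a small illustration of the subsetting IF, the sketch below keeps only the observations with a non-missing score. The data set name `nonmissing` is hypothetical, and `ex2` is rebuilt in-line so that the example runs on its own.

```sas
/* Sketch: subsetting IF in front of a sum statement.                 */
/* Assumption: "nonmissing" is an illustrative data set name.         */
data ex2;
  input id $ score;
datalines;
A01 3
A02 .
A03 4
;
run;

data nonmissing;
  set ex2;
  if not missing(score);  /* false for A02: the iteration ends here   */
  total + score;          /* reached only for A01 and A03             */
run;

proc print data=nonmissing;
run;
```

The result holds two observations, A01 with TOTAL 3 and A03 with TOTAL 7; for A02 neither the sum statement nor the implicit OUTPUT is executed.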
THE BY-GROUP PROCESSING IN THE DATA STEP
One observation per subject:
    Obs   ID    SCORE
    1     A01   3
    2     A02   .
    3     A03   4
Multiple observations per subject (longitudinal data):
    Obs   ID    SCORE
    1     A01   3
    2     A01   4
    3     A01   2
    4     A02   4
    5     A02   2
When a BY statement follows the SET statement, SAS creates two temporary variables for the BY variable. FIRST.ID is 1 when SAS reads the 1st observation for an ID; LAST.ID is 1 when SAS reads the last observation for an ID. Otherwise they are 0.
    Obs   ID    SCORE   FIRST.ID   LAST.ID
    1     A01   3       1          0
    2     A01   4       0          0
    3     A01   2       0          1
    4     A02   4       1          0
    5     A02   2       0          1
To compute the total score for each ID:
    proc sort data=ex3;
      by id;
    run;
    data ex3_1 (drop=score);
      set ex3;
      by id;
      if first.id = 1 then total = 0;
      total + score;
      if last.id = 1;
    run;
Result (Ex3_1):
    Obs   ID    TOTAL
    1     A01   9
    2     A02   6
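FIRST.ID and LAST.ID exist only in the PDV and are never written to the output dataset, so they have to be copied into ordinary variables to be inspected. The following sketch does that; the names `flags`, `first_id` and `last_id` are assumptions made for this illustration, and `ex3` is rebuilt in-line.

```sas
/* Sketch: make FIRST.ID and LAST.ID visible in a data set.           */
/* Assumption: "flags", "first_id", "last_id" are illustrative names. */
data ex3;
  input id $ score;
datalines;
A01 3
A01 4
A01 2
A02 4
A02 2
;
run;

proc sort data=ex3;
  by id;
run;

data flags;
  set ex3;
  by id;
  first_id = first.id;  /* copy the temporary flags into kept variables */
  last_id  = last.id;
run;

proc print data=flags;
run;
```

The printed FIRST_ID and LAST_ID columns should match the FIRST.ID and LAST.ID values tabulated for the five observations.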
Trace of the program. In the PDV, _N_, _ERROR_, FIRST.ID and LAST.ID are dropped automatically, Score is dropped by the DROP= option, and ID and Total are kept. Total is created by the sum statement, so it is initialized to 0 and retained. _ERROR_ is 0 throughout.
    Iteration   Point in time               FIRST.ID   LAST.ID   ID    Score   Total
    1           start                       1          1               .       0
    1           after SET                   1          0         A01   3       0
    1           FIRST.ID = 1: total = 0     1          0         A01   3       0
    1           after total + score         1          0         A01   3       3
    2           start                       1          0         A01   3       3
    2           after SET                   0          0         A01   4       3
    2           after total + score         0          0         A01   4       7
    3           start                       0          0         A01   4       7
    3           after SET                   0          1         A01   2       7
    3           after total + score         0          1         A01   2       9
    4           start                       0          1         A01   2       9
    4           after SET                   1          0         A02   4       9
    4           FIRST.ID = 1: total = 0     1          0         A02   4       0
    4           after total + score         1          0         A02   4       4
    5           start                       1          0         A02   4       4
    5           after SET                   0          1         A02   2       4
    5           after total + score         0          1         A02   2       6
In iterations 1, 2 and 4, LAST.ID is 0, so the subsetting IF is false: nothing is written and SAS returns to the beginning of the DATA step. In iterations 3 and 5, LAST.ID is 1, so the implicit OUTPUT writes ID and Total to Ex3_1.
Ex3_1:
    Obs   ID    TOTAL
    1     A01   9
    2     A02   6
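The same pattern extends to other per-subject summaries. As a sketch, the step below also counts the observations of each ID and derives a mean from the two accumulators; the data set name `ex3_2` and the variables `n` and `mean` are hypothetical additions, and `ex3` is rebuilt in-line, already in ID order.

```sas
/* Sketch: total, count and mean per ID with FIRST./LAST. flags.      */
/* Assumption: "ex3_2", "n" and "mean" are illustrative names; n      */
/* counts every observation, which is fine here because no SCORE in   */
/* ex3 is missing.                                                    */
data ex3;
  input id $ score;
datalines;
A01 3
A01 4
A01 2
A02 4
A02 2
;
run;

data ex3_2 (drop=score);
  set ex3;
  by id;
  if first.id then do;  /* reset both accumulators for a new ID */
    total = 0;
    n = 0;
  end;
  total + score;
  n + 1;
  if last.id;           /* continue only on the last row of an ID */
  mean = total / n;
run;

proc print data=ex3_2;
run;
```

With these five observations the step should write A01 with a total of 9 over 3 observations and A02 with a total of 6 over 2 observations, so the mean is 3 in both groups.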
RESTRUCTURING DATASETS
Data with one observation per subject (the wide format):
    Obs   ID    S1   S2   S3
    1     A01   3    4    5
    2     A02   4    .    2
Data with multiple observations per subject (the long format):
    Obs   ID    TIME   SCORE
    1     A01   1      3
    2     A01   2      4
    3     A01   3      5
    4     A02   1      4
    5     A02   3      2
S1-S3 in the wide format correspond to SCORE in the long format. TIME distinguishes the different measurements for each subject.
FROM WIDE FORMAT TO LONG FORMAT

Wide:

    Obs  ID   S1  S2  S3
    1    A01  3   4   5
    2    A02  4   .   2

    data long (drop=s1-s3);
        set wide;
        time = 1;
        score = s1;
        if not missing(score) then output;
        time = 2;
        score = s2;
        if not missing(score) then output;
        time = 3;
        score = s3;
        if not missing(score) then output;
    run;

Long:

    Obs  ID   TIME  SCORE
    1    A01  1     3
    2    A01  2     4
    3    A01  3     5
    4    A02  1     4
    5    A02  3     2
FROM WIDE FORMAT TO LONG FORMAT

1st iteration (_N_ = 1). Kept (K): ID, TIME, SCORE. Dropped (D): S1, S2, S3, _N_.

    Step                                ID   S1  S2  S3  TIME  SCORE  Written to Long
    Start of iteration                       .   .   .   .     .
    set wide; (reads observation 1)     A01  3   4   5   .     .
    time = 1;                           A01  3   4   5   1     .
    score = s1;                         A01  3   4   5   1     3
    if not missing(score) then output;  A01  3   4   5   1     3      A01  1  3
    time = 2;                           A01  3   4   5   2     3
    score = s2;                         A01  3   4   5   2     4
    if not missing(score) then output;  A01  3   4   5   2     4      A01  2  4
    time = 3;                           A01  3   4   5   3     4
    score = s3;                         A01  3   4   5   3     5
    if not missing(score) then output;  A01  3   4   5   3     5      A01  3  5

At the end of the iteration nothing more is written: the step contains explicit OUTPUT statements, so there is no implicit output.

Long after the 1st iteration:

    Obs  ID   TIME  SCORE
    1    A01  1     3
    2    A01  2     4
    3    A01  3     5
FROM WIDE FORMAT TO LONG FORMAT

2nd iteration (_N_ = 2). Variables read by SET (ID, S1 - S3) keep their values from the 1st iteration; TIME and SCORE are reset to missing.

    Step                                ID   S1  S2  S3  TIME  SCORE  Written to Long
    Start of iteration                  A01  3   4   5   .     .
    set wide; (reads observation 2)     A02  4   .   2   .     .
    time = 1;                           A02  4   .   2   1     .
    score = s1;                         A02  4   .   2   1     4
    if not missing(score) then output;  A02  4   .   2   1     4      A02  1  4
    time = 2;                           A02  4   .   2   2     4
    score = s2;                         A02  4   .   2   2     .
    if not missing(score) then output;  A02  4   .   2   2     .      (SCORE is missing: no output)
    time = 3;                           A02  4   .   2   3     .
    score = s3;                         A02  4   .   2   3     2
    if not missing(score) then output;  A02  4   .   2   3     2      A02  3  2

Long after the 2nd iteration:

    Obs  ID   TIME  SCORE
    1    A01  1     3
    2    A01  2     4
    3    A01  3     5
    4    A02  1     4
    5    A02  3     2
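The three repeated time/score blocks in the program above can also be written as a loop over an array. This is an alternative sketch, not the version shown on the slides; the array name s and the DATALINES step that rebuilds Wide are assumptions added for illustration.

```sas
/* Assumed setup: recreate Wide from the values shown on the slides. */
data wide;
    input id $ s1 s2 s3;
    datalines;
A01 3 4 5
A02 4 . 2
;
run;

/* Array-based variant: one pass through S1-S3 per observation. */
data long (drop=s1-s3);
    set wide;
    array s[3] s1-s3;            /* hypothetical array name grouping S1-S3 */
    do time = 1 to 3;
        score = s[time];
        if not missing(score) then output;  /* skip missing measurements */
    end;
run;

proc print data=long;
run;
```

Because OUTPUT is executed inside the loop, TIME holds the loop index (1, 2, or 3) in every row that is written, and the missing S2 for A02 produces no row, as in the slide version.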
FROM LONG FORMAT TO WIDE FORMAT

Long:

    Obs  ID   TIME  SCORE
    1    A01  1     3
    2    A01  2     4
    3    A01  3     5
    4    A02  1     4
    5    A02  3     2

Wide:

    Obs  ID   S1  S2  S3
    1    A01  3   4   5
    2    A02  4   .   2

Each SCORE is assigned to S1, S2, or S3 according to TIME, and S1 - S3 are kept across the observations of one subject with RETAIN:

    proc sort data=long;
        by id;
    run;

    data wide (drop=time score);
        set long;
        by id;
        retain s1 - s3;
        if time = 1 then s1 = score;
        else if time = 2 then s2 = score;
        else s3 = score;
        if last.id;
    run;
FROM LONG FORMAT TO WIDE FORMAT

PDV trace for the program with RETAIN but without a reset (K = kept, D = dropped). Between iterations every value is carried over unchanged: ID, TIME, and SCORE are read by SET, and S1 - S3 are named in RETAIN.

    _N_  Step                          FIRST.ID  LAST.ID  ID   TIME  SCORE  S1   S2   S3
    (D)                                (D)       (D)      (K)  (D)   (D)    (K)  (K)  (K)
    1    Start of iteration            1         1             .     .      .    .    .
    1    set long; by id; (obs 1)      1         0        A01  1     3      .    .    .
    1    s1 = score;                   1         0        A01  1     3      3    .    .
    1    if last.id; (false)           1         0        A01  1     3      3    .    .
    2    set long; by id; (obs 2)      0         0        A01  2     4      3    .    .
    2    s2 = score;                   0         0        A01  2     4      3    4    .
    2    if last.id; (false)           0         0        A01  2     4      3    4    .
    3    set long; by id; (obs 3)      0         1        A01  3     5      3    4    .
    3    s3 = score;                   0         1        A01  3     5      3    4    5
    3    if last.id; (true) output     0         1        A01  3     5      3    4    5
    4    set long; by id; (obs 4)      1         0        A02  1     4      3    4    5
    4    s1 = score;                   1         0        A02  1     4      4    4    5
    4    if last.id; (false)           1         0        A02  1     4      4    4    5
    5    set long; by id; (obs 5)      0         1        A02  3     2      4    4    5
    5    s3 = score;                   0         1        A02  3     2      4    4    2
    5    if last.id; (true) output     0         1        A02  3     2      4    4    2

Wide:

    Obs  ID   S1  S2  S3
    1    A01  3   4   5
    2    A02  4   4   2

S2 for A02 should be missing, but the value 4 retained from A01 is written instead. How to fix this?
FROM LONG FORMAT TO WIDE FORMAT

    data wide (drop=time score);
        set long;
        by id;
        retain s1 - s3;
        if first.id then do;
            s1 = .;
            s2 = .;
            s3 = .;
        end;
        if time = 1 then s1 = score;
        else if time = 2 then s2 = score;
        else s3 = score;
        if last.id;
    run;
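The corrected program can be checked end to end with a small self-contained run. This is only a sketch: the DATALINES step that rebuilds Long and the PROC PRINT call are assumptions added for illustration, while the data values and the DATA step logic follow the slides.

```sas
/* Assumed setup: recreate Long from the values shown on the slides. */
data long;
    input id $ time score;
    datalines;
A01 1 3
A01 2 4
A01 3 5
A02 1 4
A02 3 2
;
run;

proc sort data=long;
    by id;
run;

data wide (drop=time score);
    set long;
    by id;
    retain s1 - s3;
    /* Clear the retained values whenever a new subject starts. */
    if first.id then do;
        s1 = .;
        s2 = .;
        s3 = .;
    end;
    if time = 1 then s1 = score;
    else if time = 2 then s2 = score;
    else s3 = score;
    if last.id;   /* write one row per subject */
run;

proc print data=wide;
run;
```

With the reset in place, the A02 row should show a missing S2 instead of the value carried over from A01.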
FROM LONG FORMAT TO WIDE FORMAT

PDV trace for the corrected program, 4th and 5th iterations (K = kept, D = dropped):

    _N_  Step                            FIRST.ID  LAST.ID  ID   TIME  SCORE  S1   S2   S3
    (D)                                  (D)       (D)      (K)  (D)   (D)    (K)  (K)  (K)
    4    Start of iteration              0         1        A01  3     5      3    4    5
    4    set long; by id; (obs 4)        1         0        A02  1     4      3    4    5
    4    if first.id then do; (true)     1         0        A02  1     4      .    .    .
    4    s1 = score;                     1         0        A02  1     4      4    .    .
    4    if last.id; (false)             1         0        A02  1     4      4    .    .
    5    set long; by id; (obs 5)        0         1        A02  3     2      4    .    .
    5    if first.id then do; (false)    0         1        A02  3     2      4    .    .
    5    s3 = score;                     0         1        A02  3     2      4    .    2
    5    if last.id; (true) output       0         1        A02  3     2      4    .    2

Wide:

    Obs  ID   S1  S2  S3
    1    A01  3   4   5
    2    A02  4   .   2
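The same reset-and-assign pattern can be written more compactly with an array and CALL MISSING. This is an alternative sketch rather than the version on the slides; the array name s, the range check on TIME, and the DATALINES step that rebuilds Long are assumptions added for illustration.

```sas
/* Assumed setup: recreate Long (already sorted by ID) from the slide values. */
data long;
    input id $ time score;
    datalines;
A01 1 3
A01 2 4
A01 3 5
A02 1 4
A02 3 2
;
run;

data wide (drop=time score);
    set long;
    by id;
    array s[3] s1-s3;                        /* hypothetical array name grouping S1-S3 */
    retain s1-s3;                            /* keep values across rows of one subject */
    if first.id then call missing(of s[*]);  /* reset all three at each new subject */
    if 1 <= time <= 3 then s[time] = score;  /* guard against out-of-range TIME values */
    if last.id;                              /* write one row per subject */
run;

proc print data=wide;
run;
```

One difference from the slide version is worth noting: the slide's final ELSE sends any TIME other than 1 or 2 to S3, whereas this sketch ignores TIME values outside 1 to 3.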
CONCLUSION
