2. SURVIVAL:
• It is the probability of remaining alive for a specific length of time.
• When the point of interest is the prognosis of a disease, survival is often reported as 5-year survival,
e.g. a 5-year survival of 0.19 for AML indicates that 19% of patients with AML will
survive for 5 years after diagnosis.
3. e.g. for 2-year survival:
S = (A − D)/A = (6 − 1)/6 = 5/6 ≈ 0.83 = 83%
(A = number of subjects at the start, D = number who died within 2 years)
4. CENSORING:
• Subjects are said to be censored if
  - they are lost to follow-up,
  - they drop out of the study, or
  - the study ends before they die or have the outcome of interest.
• They are counted as alive or disease-free for the time they were
enrolled in the study.
• In simple words, some important information required to make a
calculation (the exact event time) is not available to us, i.e. it is censored.
5. Types of censoring:
Three types of censoring:
• Right censoring
• Left censoring
• Interval censoring
6. Right Censoring:
• Right censoring is the most common type and the one of greatest concern.
• It means that we are not certain what happened to people after
some point in time.
• This happens when some people cannot be followed for the entire
time because they died of an unrelated cause, were lost to follow-up, or withdrew from
the study.
7. Left Censoring:
• Left censoring is when we are not certain what
happened to people before some point in time.
• The most common example is when people already have
the disease of interest when the study starts.
8. Interval/Random Censoring
• Interval/random censoring is when we know that something
happened within an interval (i.e. not before the start and not after
the end of the study), but do not know exactly when in the
interval it happened.
• For example, we know that the patient was well at time of start of the
study and was diagnosed with disease at time of end of the study, so
when did the disease actually begin?
• All we know is the interval.
10. What is survival analysis?
• Statistical methods for analyzing longitudinal data on the occurrence
of events.
• Events may include death, onset of illness, recovery from illness
(binary variables), failure, etc.
• Accommodates data from randomized clinical trial or cohort study
designs.
11. Need for survival analysis:
• Investigators frequently must analyze data before all patients have died;
otherwise, it may be many years before they know which treatment is better.
• Survival analysis gives patients credit for how long they have been in the study,
even if the outcome has not yet occurred.
• The Kaplan–Meier procedure is the most commonly used method to illustrate
survival curves.
12. Objectives of survival analysis:
• To estimate time-to-event for a group of individuals:
  - such as time until death for heart transplant patients (mortality studies)
  - time of remission for leukemic patients (in therapy trials)
• To compare time-to-event between two or more groups:
  - such as treated vs. placebo MI patients in a randomized controlled
trial.
• To assess prognostic covariables (survival models):
  - such as: do weight, insulin resistance, or cholesterol influence the survival
time of MI patients?
13. Survival Analysis: Terms
• Time-to-event: The time from entry into a study until a
subject has a particular outcome.
• Censoring: Subjects are said to be censored if they are
lost to follow up or drop out of the study, or if the
study ends before they die or have an outcome of
interest. They are counted as alive or disease-free for
the time they were enrolled in the study.
14. Importance of censoring in survival analysis?
• Example:
We want to know the survival rates of a disease in two groups, and our
outcome of interest is death due to the disease.
Group 1                      Group 2
Time in months   Event       Time in months   Event
5                death       9                death
6                death       8                death
8                death       12               death
9                death       20               death
10               death       6                death
12               death       7                death
16               death       4                death
These data do not need a survival analysis method, as there are no
censored data. In this case all patients died, so we can take the mean
time to death in each group and see which group has the longer
survival time.
Also, the data shouldn't have >50% censored observations.
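As a minimal sketch of the comparison described in the example above, the mean time to death of each group can be computed directly, because every patient has an observed death time. The variable names are illustrative and not part of the slides.

```python
from statistics import mean

# Times to death (months) copied from the two groups in the example.
group_1 = [5, 6, 8, 9, 10, 12, 16]
group_2 = [9, 8, 12, 20, 6, 7, 4]

# With no censoring, a plain average of the death times is well defined.
mean_1 = mean(group_1)
mean_2 = mean(group_2)

print(f"group 1 mean time to death: {mean_1:.2f} months")
print(f"group 2 mean time to death: {mean_2:.2f} months")
```

With these particular numbers both groups total 66 months over 7 patients, so the two means coincide at about 9.43 months. If some times were censored, this simple average would no longer be valid.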
15. SURVIVAL FUNCTION:
Let T = time of death (or disease)
• Survival function S(t) = 1 − F(t)
= prob.(alive at time t)
= prob.(T > t)
where F(t) = prob.(T ≤ t) is the cumulative distribution function of T.
In simple terms it can be defined as
S(t) = (No. of patients surviving longer than t) / (Total no. of patients)
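The simple ratio above can be sketched in a few lines. This is an illustration only: it assumes every patient's death time is known (no censoring), and the function name and the sample times are hypothetical choices, reusing the group-1 times from the earlier example.

```python
def empirical_survival(death_times, t):
    """Fraction of patients whose death time is strictly greater than t."""
    survivors = sum(1 for x in death_times if x > t)
    return survivors / len(death_times)

# Death times in months; valid only because nobody is censored here.
death_times = [5, 6, 8, 9, 10, 12, 16]

s_8 = empirical_survival(death_times, 8)  # 4 of the 7 patients die after month 8
print(f"S(8) = {s_8:.3f}")
```

Once censored observations are present, this ratio is no longer appropriate and an estimator such as Kaplan-Meier is needed.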
16. Kaplan-Meier estimate of survival function:
• Calculates the survival of the study population.
• Easy to calculate.
• Non-parametric estimate of the survival function.
• Commonly used to compare two study populations.
• Applicable to small, moderate and large samples.
17. Kaplan-Meier Estimate:
• The survival probability can be calculated in the following way:
P1 = Probability of surviving for at least 1 day after transplant
P2 = Probability of surviving the second day after having survived the
first day
P3 = Probability of surviving the third day after having survived the
second day
18. • To calculate S(t) we need to estimate each of P1, P2, P3, ..., Pt.
• Pt, the probability of surviving day t given survival up to day (t−1), is calculated as:
Pt = (No. of patients followed for at least (t−1) days who also survived day t) / (No. of patients alive at the end of day (t−1))
• S(t) = P1 × P2 × P3 × ... × Pt
19. Example: 10 tumor patients (remission time, in months)
Event time (T)   Number at risk (ni)   Number of events (di)   (ni − di)/ni   Survival S(t) = Π (ni − di)/ni
3                10                    1                       9/10           9/10
4+
5.7+
6.5              7                     2                       5/7            9/10*5/7
8.4+
10               4                     1                       3/4            9/10*5/7*3/4
10+
12               2                     1                       1/2            9/10*5/7*3/4*1/2
15               1                     1                       0              0
+ denotes censored data.
• In this method the first step is to list the times when a death or drop
out occurs, as in the column “Event Time”.
• One patient's disease progressed at 3 months, two at 6.5 months, and one each at
10, 12 and 15 months; these are listed under the column “Number
of Events” (di), and ni denotes the number of patients at risk at that point of
time.
• Then, each time an event or outcome occurs, the probability of survival
at that point of time, (ni − di)/ni, and the cumulative survival S(t) are calculated.
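The product-limit calculation in this example can be reproduced with a short sketch. It assumes the usual convention that a patient censored at time t is still counted as at risk at t, and it takes the progression at 15 months as an observed event, as the text states. The function name `kaplan_meier` is illustrative, not from the slides.

```python
from fractions import Fraction

def kaplan_meier(times, events):
    """Return [(event_time, n_at_risk, n_events, S(t))] for each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    rows = []
    survival = Fraction(1)
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n_at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= Fraction(n_at_risk - n_events, n_at_risk)
        rows.append((t, n_at_risk, n_events, survival))
    return rows

# The 10 patients from the example; times marked "+" are censored (event = 0).
times = [3, 4, 5.7, 6.5, 6.5, 8.4, 10, 10, 12, 15]
events = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]

km_rows = kaplan_meier(times, events)
for t, n, d, s in km_rows:
    print(f"t={t:>4}  n={n:>2}  d={d}  S(t)={float(s):.3f}")
```

Exact fractions are used so that the running product can be compared directly with the hand calculation (9/10, then 9/10 × 5/7, and so on).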
20. Survival Data (right-censored)
[Figure: follow-up timelines for Subjects A to E, from beginning of study (0) to end of study; x-axis: time in months]
1. Subject E dies at 4 months (X)
21. Corresponding Kaplan-Meier Curve
[Figure: Kaplan-Meier curve starting at 100%; x-axis: time in months]
• Probability of surviving to 4 months is 100% = 5/5
• Subject E dies at 4 months
• Fraction surviving this death = 4/5
22. Survival Data
[Figure: follow-up timelines for Subjects A to E, from beginning of study to end of study; x-axis: time in months]
1. Subject E dies at 4 months (X)
2. Subject A drops out after 6 months
3. Subject C dies at 7 months (X)
24. Survival Data
[Figure: follow-up timelines for Subjects A to E, from beginning of study to end of study; x-axis: time in months]
1. Subject E dies at 4 months (X)
2. Subject A drops out after 6 months
3. Subject C dies at 7 months (X)
4. Subjects B and D survive for the whole year-long study period
25. Corresponding Kaplan-Meier Curve
[Figure: Kaplan-Meier curve starting at 100%; x-axis: time in months, 0 to 12]
Rule from probability theory:
P(A&B) = P(A)*P(B) if A and B are independent
In Kaplan-Meier, intervals are defined by failures (2 intervals leading to failures here).
P(surviving intervals 1 and 2) = P(surviving interval 1) * P(surviving interval 2)
Product-limit estimate of survival =
P(surviving interval 1 | at risk up to failure 1) *
P(surviving interval 2 | at risk up to failure 2)
= 4/5 * 2/3 = 0.5333
(Subject A dropped out at 6 months, so only 3 subjects are at risk when Subject C dies at 7 months, hence 2/3.)
The probability of surviving the entire year, taking censoring into account,
= (4/5) (2/3) = 53%
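To see what "taking censoring into account" changes, the sketch below recomputes the product-limit estimate for the five subjects and sets it beside two naive alternatives. The two naive calculations are added here for contrast and are not from the slides.

```python
from fractions import Fraction

# Five subjects from the timeline slides: (label, months followed, died?)
subjects = [
    ("A", 6, False),   # dropped out at 6 months (censored)
    ("B", 12, False),  # alive at the end of the study
    ("C", 7, True),    # died at 7 months
    ("D", 12, False),  # alive at the end of the study
    ("E", 4, True),    # died at 4 months
]

# Product-limit estimate: one factor per death time.
km = Fraction(1)
for t in sorted(m for _, m, died in subjects if died):
    at_risk = sum(1 for _, m, _ in subjects if m >= t)
    deaths = sum(1 for _, m, died in subjects if died and m == t)
    km *= Fraction(at_risk - deaths, at_risk)

# Two naive alternatives that mishandle the censored subject A:
count_a_as_survivor = Fraction(3, 5)  # pretends A survived the whole year
drop_a_entirely = Fraction(2, 4)      # discards A's 6 months of follow-up

print(km, float(km))
```

The Kaplan-Meier value of 8/15 falls between the two naive answers (50% and 60%), because Subject A contributes to the at-risk count only for the months actually observed.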
26. Properties of survival function:
1. Step function
2. Gives an estimate of the median survival time (i.e. the time by which 50% of patients have had the event)
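As a rough illustration of reading the median off a step function, the sketch below scans the Kaplan-Meier steps of the 10-patient example and returns the first event time at which S(t) is at or below 0.5. The helper name `median_survival` is hypothetical, and other conventions for ties at exactly 0.5 exist.

```python
from fractions import Fraction

def median_survival(steps):
    """First event time at which the step function S(t) is at or below 0.5.

    steps: list of (event_time, S(t)) pairs sorted by time.
    Returns None if the curve never reaches 50%.
    """
    for t, s in steps:
        if s <= Fraction(1, 2):
            return t
    return None

# Step values taken from the 10-patient example (times in months).
steps = [
    (3, Fraction(9, 10)),
    (6.5, Fraction(9, 14)),
    (10, Fraction(27, 56)),
    (12, Fraction(27, 112)),
    (15, Fraction(0)),
]

median = median_survival(steps)
print("median survival time:", median, "months")
```

Here the curve first drops below 50% at 10 months (27/56 is about 0.48). When the curve never falls to 50%, the median cannot be estimated and the function returns None.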
27. Median survival? 12 & 22
Which has better survival? (2nd one)
What proportion survives 20 days? (in 1st graph = around 35% and in
2nd one around 62%)
28. Limitations of Kaplan-Meier:
1. Must have >50% uncensored observations.
2. Median survival time can be estimated only if the curve falls to 50%.
3. Doesn’t control for covariates.
4. Assumes that censoring occurs independently of survival
times (what if a person develops an adverse effect of the
treatment and is forced to leave the study, or dies?).
30. Comparison between 2 survival curves
• Don’t make judgments simply on the
basis of the amount of separation
between two lines.
31. Comparison between 2 survival curves:
• Several methods may be used to compare survival curves:
• Logrank statistic
• Breslow statistic
• Tarone-Ware statistic
32. LOGRANK TEST:
• The logrank statistic is one of the most commonly used methods to
learn if two curves are significantly different.
• This method is also known as the Mantel-logrank statistic or the Cox-Mantel-logrank
statistic.
• Under the H0 that the survival functions of the two groups are the same,
the logrank statistic is approximately distributed as χ2 (with 1 degree of freedom for two groups).
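A minimal two-group logrank calculation might look like the sketch below. It assumes the common textbook form (observed minus expected events in group 1, with a hypergeometric variance, referred to χ2 with 1 degree of freedom). The function name is illustrative, and the data are the two all-death groups from the earlier example.

```python
import math

def logrank_test(times_1, events_1, times_2, events_2):
    """Two-group logrank test; returns (chi-square statistic, p-value) for 1 df."""
    pooled = list(zip(times_1, events_1, [1] * len(times_1))) + \
             list(zip(times_2, events_2, [2] * len(times_2)))
    event_times = sorted({t for t, e, _ in pooled if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for x, _, g in pooled if g == 1 and x >= t)  # at risk, group 1
        n2 = sum(1 for x, _, g in pooled if g == 2 and x >= t)  # at risk, group 2
        d1 = sum(1 for x, e, g in pooled if g == 1 and x == t and e == 1)
        d2 = sum(1 for x, e, g in pooled if g == 2 and x == t and e == 1)
        n, d = n1 + n2, d1 + d2
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value

group_1 = [5, 6, 8, 9, 10, 12, 16]
group_2 = [9, 8, 12, 20, 6, 7, 4]
all_died = [1] * 7  # no censoring in this example

chi2, p_value = logrank_test(group_1, all_died, group_2, all_died)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

For real analyses an established statistics package is preferable; this sketch is only meant to make the observed-versus-expected logic visible.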
33. LOG-RANK TEST
• Emphasizes failures in the tail of the survival curve, where the no. at risk
decreases over time, yet equal weight is given to each failure time.
• Usually gives statistically significant results.
BRESLOW STATISTICS
• Gives greater weight to early observations. It is less sensitive than the
Log-Rank test to late events when few subjects remain in the study.
TARONE-WARE STATISTICS
• Provides a compromise between the Log-Rank test and Breslow statistics with an
intermediate weighting scheme. This test maintains power across a wider range of
alternatives than do the other two tests.
• Usually applied.
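The three weighting schemes can be made concrete with a small sketch. It assumes the usual textbook weights, which the slides do not state explicitly: 1 for the Log-Rank test, the number at risk n for Breslow, and the square root of n for Tarone-Ware, each applied to the observed-minus-expected term at an event time. The at-risk counts are hypothetical.

```python
import math

# Hypothetical numbers at risk at successive event times (shrinking over time).
numbers_at_risk = [10, 7, 4, 2, 1]

weights = {
    "Log-Rank": [1.0 for _ in numbers_at_risk],              # equal weight everywhere
    "Breslow": [float(n) for n in numbers_at_risk],          # weight = number at risk
    "Tarone-Ware": [math.sqrt(n) for n in numbers_at_risk],  # weight = sqrt(number at risk)
}

for name, w in weights.items():
    print(f"{name:12s}", [round(x, 2) for x in w])
```

Because the Breslow weights shrink fastest as subjects leave the risk set, early events dominate that statistic, while the Tarone-Ware weights sit between the other two at every event time.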
34. Hazard function:
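• Complementary to the survival function: it describes the risk of failure rather than the probability of survival.
• The hazard function is the negative derivative of the survival function over time, divided by the survival function:
h(t) = −[dS(t)/dt] / S(t)
• Instantaneous risk of the event at time t (conditional failure rate).
• For a short interval Δt, h(t) × Δt is approximately the probability that a person will die in the next interval of time,
given that he survived until the beginning of the interval.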
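A quick numerical check can make the link between hazard and survival concrete. The sketch uses the standard relation h(t) = −S'(t)/S(t) = −d/dt ln S(t) and an exponential survival curve, whose hazard is constant. The rate value and function names are hypothetical.

```python
import math

def hazard(survival, t, dt=1e-5):
    """Numerical h(t) = -d/dt ln S(t), using a central difference."""
    return -(math.log(survival(t + dt)) - math.log(survival(t - dt))) / (2 * dt)

rate = 0.2  # hypothetical constant event rate per month

def exponential_survival(t):
    """S(t) = exp(-rate * t), the survival curve with a constant hazard."""
    return math.exp(-rate * t)

h_at_1 = hazard(exponential_survival, 1.0)
h_at_5 = hazard(exponential_survival, 5.0)
print(f"h(1) = {h_at_1:.4f}, h(5) = {h_at_5:.4f}")
```

Both values come out at the chosen rate, which shows that the hazard is a rate per unit time rather than a probability, and that it stays flat even while S(t) keeps falling.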
35. Hazard function
• Hazard function given by
$$h(t, x_1, x_2, \ldots, x_p) = \lambda_0(t)\, e^{\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p}$$
• $\lambda_0(t)$ is the baseline hazard at time t.
• For any individual subject i the hazard at time t is $h_i(t)$.
• $h_i(t)$ is linked to the baseline hazard $\lambda_0(t)$ by
$$\log_e\{h_i(t)\} = \log_e\{\lambda_0(t)\} + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$$
• where $x_1, x_2, \ldots, x_p$ are variables associated with the subject.
36. Cox proportional hazards:
Hazard ratio
$$HR_{i,j} = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_1 x_{i1} + \cdots + \beta_k x_{ik}}}{\lambda_0(t)\, e^{\beta_1 x_{j1} + \cdots + \beta_k x_{jk}}} = e^{\beta_1 (x_{i1} - x_{j1}) + \cdots + \beta_k (x_{ik} - x_{jk})}$$
The numerator is the hazard for person i (e.g. a smoker); the denominator is the hazard for person j (e.g.
a non-smoker).
Hazard functions should be strictly parallel!
Produces covariate-adjusted hazard ratios!
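37. The model: binary predictor
$$HR_{\text{lung cancer/smoking}} = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_{\text{smoking}}(1) + \beta_{\text{age}}(60)}}{\lambda_0(t)\, e^{\beta_{\text{smoking}}(0) + \beta_{\text{age}}(60)}} = e^{\beta_{\text{smoking}}(1 - 0)}$$
$$HR_{\text{lung cancer/smoking}} = e^{\beta_{\text{smoking}}}$$
This is the hazard ratio for smoking adjusted for age.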
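The cancellation in this hazard ratio can be checked numerically. In the sketch below the coefficients and the baseline hazard are made-up values chosen only for illustration, not estimates from any study, and `cox_hazard` is a hypothetical helper.

```python
import math

def cox_hazard(baseline_hazard, betas, covariates):
    """h(t) = lambda0(t) * exp(sum of beta_k * x_k), evaluated at one fixed time t."""
    linear_predictor = sum(betas[name] * covariates[name] for name in betas)
    return baseline_hazard * math.exp(linear_predictor)

# Hypothetical coefficients and baseline hazard (illustration only).
betas = {"smoking": 0.9, "age": 0.05}
baseline_hazard = 0.01

smoker = {"smoking": 1, "age": 60}      # person i
non_smoker = {"smoking": 0, "age": 60}  # person j, same age

hazard_ratio = cox_hazard(baseline_hazard, betas, smoker) / \
               cox_hazard(baseline_hazard, betas, non_smoker)
print(f"HR = {hazard_ratio:.3f}, exp(beta_smoking) = {math.exp(betas['smoking']):.3f}")
```

The baseline hazard and the age term appear in both numerator and denominator, so they cancel and the ratio depends only on the smoking coefficient, whatever baseline value is chosen.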
39. Importance
• Provides a valid method of predicting a time-dependent
outcome, and many health-related outcomes are related to time.
• Results can be interpreted like a relative risk or odds ratio.
• Gives survival curves with control of confounding variables.
• Can be used with multiple events for a subject.
40. Take Home Message
• Survival analysis is used to estimate time-to-event for a group of individuals and to
compare time-to-event between two or more groups.
• In survival analysis, data are classified as censored or uncensored:
• all those who achieve the outcome of interest are “uncensored” data
• those who do not achieve the outcome are “censored” data
41. Take Home Message
• The Kaplan-Meier method uses the next death, whenever it occurs, to
define the end of the last class interval and the start of the new class
interval.
• The Log-Rank test is used to compare 2 survival curves but does not control
for confounding.
• To control for confounding, use another method called ‘Cox
Proportional Hazards Regression’.