PROBABILITY MODELS : PROJECT REPORT
Srikanth Popuri M12388241
Poorvi Deshpande M12388313
CONTENTS
INTRODUCTION
OVERVIEW OF DATASET
PROBLEM STATEMENT
ANALYSIS
ECDF OF SAMPLE
HYPOTHESIS TESTING FOR DIFFERENCE IN MEANS : WALD TEST
BAYESIAN APPROACH
CONCLUSION
APPENDIX
INTRODUCTION
The objective of this project is to put our theoretical knowledge of non-parametric
inference into practice. We apply several non-parametric methods to a real data set
and study how accurate they are.
OVERVIEW OF DATASET
The data set used in this study is the Flights data available in R (the flights table of the
nycflights13 package). It contains details about the flights that departed from NYC, such as
the times of departure and arrival, the distance of the flight, etc. It has 336776
observations of 19 variables, one observation per flight, covering each day. We study the
variable ‘air_time’, which records the time spent in the air by a given flight, in minutes.
PROBLEM STATEMENT
The air_time variable is studied to determine how close our sample mean is to the
true population mean.
ANALYSIS
Distribution of sample of airtime:
Figure 2 : Histogram of the sample of air_time (Day 1 in January)
Figure 3 : Summary statistics of sample of air_time (Day 1 in January)
We see that the sample does not follow a normal distribution. We therefore plot the
empirical cumulative distribution function (ECDF) of the sample.
ECDF OF SAMPLE
Figure 3 : ECDF of air_time sample with 95% confidence interval
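The band in the figure uses the half-width computed as Eps in the appendix code. That expression has the form of the Dvoretzky-Kiefer-Wolfowitz (DKW) confidence band; the report does not name it, so the label is an inference from the code. As a sketch, with n the sample size and α = 0.05:

```latex
\hat{F}_n(x) = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{X_i \le x\},
\qquad
\epsilon_n = \sqrt{\frac{1}{2n}\,\ln\frac{2}{\alpha}}

L(x) = \max\{\hat{F}_n(x) - \epsilon_n,\ 0\},
\qquad
U(x) = \min\{\hat{F}_n(x) + \epsilon_n,\ 1\}

P\bigl(L(x) \le F(x) \le U(x)\ \text{for all } x\bigr) \ge 1 - \alpha
```

The band is simultaneous: it covers the whole true CDF F with probability at least 1 − α, rather than covering F at one point x at a time.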
We do not know the distribution that generated the sample. Thus, we take a
non-parametric approach to comparing the population mean and the sample mean.
Using the non-parametric bootstrap with 1000 resamples, we estimate the mean air_time;
the mean of the bootstrap means comes out to be 169.6914.
Distribution of the bootstrap means :
Figure 4 : Histogram of the means obtained by bootstrap
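The resampling step behind this histogram can be sketched in base R. This is a minimal illustration, not the report's code: the vector x is simulated stand-in data (an assumption made so the block runs on its own), and the names n_boot, boot_means, boot_est and boot_se are hypothetical.

```r
# Minimal sketch of the non-parametric bootstrap for the mean (base R only).
# `x` is a simulated stand-in for the air_time sample, NOT the flights data.
set.seed(1)
x <- rgamma(800, shape = 3, rate = 0.02)   # skewed, positive values

n_boot <- 1000
# Resample x with replacement and record the mean of each resample.
boot_means <- replicate(
  n_boot,
  mean(sample(x, size = length(x), replace = TRUE))
)

boot_est <- mean(boot_means)  # bootstrap estimate of the mean
boot_se  <- sd(boot_means)    # bootstrap standard error of the mean
```

The bootstrap estimate stays close to the mean of the resampled data itself; what the bootstrap adds is boot_se, an estimate of how much the sample mean varies from sample to sample.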
HYPOTHESIS TESTING FOR DIFFERENCE IN MEANS : WALD TEST
A hypothesis test is performed to see whether the mean air_time of the sample differs
from the mean air_time of the population. Since the data do not appear to be normally
distributed, the Wald test is used, which relies only on the approximate normality of the
estimator. The hypotheses are:
H0 : μ1 – μ2 = 0
Ha : μ1 – μ2 ≠ 0
where μ1 is the mean air_time estimated by the bootstrap and μ2 is the mean air_time of
the population.
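For reference, the statistic computed in the appendix code can be written as follows. The symbols are notation introduced here for illustration: the estimate is the mean of the bootstrap means, the standard error is the standard deviation of the bootstrap means, and Φ is the standard normal CDF. The absolute value is the usual two-sided form.

```latex
W = \frac{\hat{\mu}_1 - \mu_2}{\widehat{\mathrm{se}}(\hat{\mu}_1)},
\qquad
p\text{-value} = 2\,\bigl(1 - \Phi(|W|)\bigr),
\qquad
\text{reject } H_0 \text{ at level } \alpha \text{ if } |W| > z_{\alpha/2}
```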
The results of the test, conducted in R, are shown in the figure:
Figure 1: Results from Wald test for hypothesis testing on difference of means
As observed, the p-value is far smaller than 0.05. Thus, we have enough evidence
to reject the null hypothesis at the 5% level. Hence, the sample mean and the
population mean of air_time differ.
BAYESIAN APPROACH
The frequentist approach says that the means are different. We cross-check this
conclusion using a different approach, Bayesian analysis.
We test for a difference between the means of the sample and the population, using the
Jeffreys prior (prior = "jeffreys" in bayes.t.test). We assume that the population and
the sample have the same variance (a reasonable assumption, since the sample comes from
the population itself).
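One common way to formalise an equal-variance comparison under a Jeffreys-type prior is sketched below. It is stated as an assumption about the model, not as a description of the internals of bayes.t.test. Here x̄, s_x, n_1 refer to the sample, ȳ, s_y, n_2 to the population data, and σ² is the common variance.

```latex
p(\mu_1, \mu_2, \sigma^2) \propto \frac{1}{\sigma^2}

\left.\frac{(\mu_1 - \mu_2) - (\bar{x} - \bar{y})}
           {s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}\ \right|\ \text{data}
\ \sim\ t_{\,n_1 + n_2 - 2},
\qquad
s_p^2 = \frac{(n_1 - 1)s_x^2 + (n_2 - 1)s_y^2}{n_1 + n_2 - 2}
```

Under this model the posterior for μ1 − μ2 is centred at the difference of the sample means, so a posterior interval that excludes 0 points to the same conclusion as a two-sample t-statistic that is large in absolute value.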
The result of the Bayesian analysis was as follows:
According to the t-statistic observed above, there was a significant difference between
the population mean and the sample mean.
CONCLUSION
Since the whole population of air_time values in the Flights data is known, we can look
at the population summary statistics, which are as follows :
Comparing this to the sample statistics :
We conclude that the sample mean and the population mean are significantly different.
We confirmed this with tests from two approaches, frequentist and Bayesian. In the Wald
test, we reject the null hypothesis that the means are the same. The Bayesian test
statistic confirms this.
APPENDIX
R Code :
library("nycflights13")
library(ACSWR)
library(bootstrap)
library(Bolstad)
airtime <- flights$air_time
airtime <- na.omit(airtime)
airtime_sample <- flights$air_time[flights$day==1 & flights$month==1]
airtime_sample <- na.omit(airtime_sample)
at <- hist(airtime)
summary(airtime)
sd(airtime)
hist(airtime_sample)
summary(airtime_sample)
#### ECDF with 95% confidence band
airtime_ecdf <- ecdf(airtime_sample)
plot(airtime_ecdf)
Alpha <- 0.05
n <- length(airtime_sample)
# half-width of the band: sqrt(log(2/Alpha) / (2n))
Eps <- sqrt(log(2/Alpha)/(2*n))
grid <- seq(0, 1000, length.out = 10000)
band_upper <- pmin(airtime_ecdf(grid) + Eps, 1)  # upper limit, capped at 1
band_lower <- pmax(airtime_ecdf(grid) - Eps, 0)  # lower limit, floored at 0
lines(grid, band_upper)
lines(grid, band_lower)
#### bootstrap
mean.boot<-bootstrap(airtime_sample, nboot = 1000, mean)
hist(mean.boot$thetastar)
mean(mean.boot$thetastar)
var(mean.boot$thetastar)
sd(mean.boot$thetastar)
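# Wald test: (bootstrap mean - population mean) / bootstrap standard error
w_test <- (mean(mean.boot$thetastar) - mean(airtime))/sd(mean.boot$thetastar)
# two-sided p-value; abs() keeps it valid when w_test is negative
p_value <- 2*(1-pnorm(abs(w_test)))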
# Bayesian analysis
bayes.t.test(as.vector(airtime_sample), y =as.vector(airtime), alternative = c("two.sided"),
mu = 0, paired = FALSE, var.equal = TRUE, conf.level = 0.95, prior = c("jeffreys"))