Analyzing the Energy Efficiency Data Set
Ankit Ghosalkar
Introduction
The main purpose of this report is to study and analyze the Energy Efficiency Dataset. We performed
linear regression on this dataset with Heating Load as the response variable. We also analyzed the
other variables in the dataset and studied their significance in the linear model. The model was
fitted by Ordinary Least Squares (OLS), and two model selection criteria, the Akaike Information
Criterion (AIC) and the Bayesian Information Criterion (BIC), were used to choose the predictors.
Data Set Information
The energy analysis uses 12 different building shapes simulated in Ecotect. The buildings differ
with respect to Glazing Area, Glazing Area Distribution and Orientation, amongst other parameters. The
dataset comprises 768 samples and 8 features, aiming to predict two real-valued responses. The
original dataset labels the predictor variables X1, ..., X8 and the response variables Y1 and Y2.
The variables have been renamed, and the 10 variables are:
1. Relative Compactness
2. Surface Area
3. Wall Area
4. Roof Area
5. Overall Height
6. Orientation
7. Glazing Area
8. Glazing Area Distribution
9. Heating Load
10. Cooling Load
Of these variables, Heating Load and Cooling Load are the response variables. We first calculated
the correlation between these two response variables and the other variables, and also studied the
correlations among all the variables.
> cor(file)
As shown in the correlation matrix, the correlation between the two response variables is very high
(0.975). Hence, by studying one of the response variables we can largely predict the value of the
other, so we consider Heating Load as the only response variable. Overall Height is strongly
correlated with Heating Load (0.889), so Overall Height is a good predictor of Heating Load.
Relative Compactness also has a good correlation with the response variable, so it is also a good
predictor. These two predictor variables are also well correlated with each other, so each of them
could itself be predicted from the other.
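The correlation step above can be sketched as follows. This is a minimal illustration on a synthetic data frame named `toy` (a hypothetical stand-in for `file`, with made-up values and only four columns), not the report's actual data or results.

```r
# Synthetic stand-in for the real data frame `file` (all values are made up).
set.seed(1)
n <- 200
toy <- data.frame(
  Overall.Height = rep(c(3.5, 7), each = n / 2),
  Glazing.Area   = sample(c(0, 0.1, 0.25, 0.4), n, replace = TRUE)
)
toy$Heating.Load <- 2 + 4 * toy$Overall.Height + 15 * toy$Glazing.Area + rnorm(n, sd = 2)
toy$Cooling.Load <- 3 + 0.9 * toy$Heating.Load + rnorm(n, sd = 1)

# Full correlation matrix (Pearson by default), analogous to cor(file).
cm <- cor(toy)

# Correlation of every other variable with the chosen response, strongest first.
with_response <- cm[, "Heating.Load"]
with_response <- with_response[names(with_response) != "Heating.Load"]
print(round(sort(abs(with_response), decreasing = TRUE), 3))
```

Sorting by absolute value makes strongly negative correlations as visible as strongly positive ones.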
Study of Variables
Before performing linear regression on the dataset, we need to study its variables. We do this by
creating pairwise scatterplots of the variables. A scatterplot helps to visualize the relationship
between variables and is used to make an initial diagnosis before any statistical analysis is
conducted.
The scatterplot of the dataset is as follows:
The scatterplot confirms that there is a linear relationship between the two response variables,
in line with their very high correlation. So instead of analyzing both variables we analyze only
one response variable, and expect the same analysis to apply to the other. The relationship between
most other pairs of variables does not look linear, because most of the variables take only a few
distinct values.
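A pairwise scatterplot like the one discussed above can be produced with base R's `pairs()`. The sketch below uses a synthetic data frame `toy` (hypothetical, made-up values) and also counts the distinct values per column, which is what makes the panels look striped rather than cloud-like.

```r
# Synthetic stand-in for the real data (values are made up).
set.seed(2)
toy <- data.frame(
  Overall.Height = sample(c(3.5, 7), 150, replace = TRUE),
  Orientation    = sample(2:5, 150, replace = TRUE)
)
toy$Heating.Load <- 5 + 4 * toy$Overall.Height + rnorm(150, sd = 2)

# Number of distinct values per column; small counts give striped scatterplots.
distinct <- sapply(toy, function(col) length(unique(col)))
print(distinct)

# Scatterplot matrix of every pair of columns.
pairs(toy, main = "Pairwise scatterplots (synthetic data)")
```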
Linear Regression
lm.mo <- lm(Heating.Load ~ Relative.Compactness + Surface.Area + Wall.Area + Roof.Area +
            Overall.Height + Orientation + Glazing.Area + Glazing.Area.Distribution,
            data = file)
summary(lm.mo)
Linear regression was performed with Heating Load as the response variable and the eight
predictor variables listed in the model formula. The summary shows no relationship between Roof Area
and the response variable, so Roof Area does not help to predict it. The coefficient of Orientation
is not statistically significant. All other predictors have significant p-values. The R-squared
value is also very high: 91.62%, meaning the model explains about 91.62% of the variance in Heating
Load. The overall significance of the regression is tested against the hypothesis that all
regression coefficients (other than the intercept) are zero:
Null Hypothesis: All regression coefficients are zero.
Alternative Hypothesis: At least one coefficient is not zero.
Since the p-value for the model (< 2.2e-16) is less than alpha, we can reject the Null Hypothesis.
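The overall test described above can be reproduced from a fitted model's summary. The following is a minimal sketch on synthetic data (the data frame `toy`, its columns, and `alpha = 0.05` are assumptions for illustration); it recomputes the p-value of the overall F-test that `summary()` prints on its last line.

```r
# Synthetic data: y depends on x1 only (values are made up).
set.seed(3)
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = toy)
s <- summary(fit)

# F statistic with its numerator and denominator degrees of freedom.
f <- s$fstatistic   # named vector: value, numdf, dendf

# Upper-tail F probability = p-value for H0 "all slope coefficients are zero".
p_overall <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

alpha <- 0.05
cat("R-squared:", round(s$r.squared, 4), "\n")
cat("Overall F-test p-value:", format.pval(p_overall), "\n")
cat("Reject H0 at alpha =", alpha, ":", p_overall < alpha, "\n")
```

R prints very small p-values as `< 2.2e-16` because that is the machine precision threshold used for display, not the exact p-value.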
Residual Analysis
The residual analysis of the model resulted in the following output:
> lm.mo <- lm(Heating.Load ~ Relative.Compactness + Surface.Area + Wall.Area + Roof.Area +
              Overall.Height + Orientation + Glazing.Area + Glazing.Area.Distribution,
              data = file)
> par(mfrow = c(2, 2))
> plot(lm.mo)
The Residuals vs. Fitted plot shows that the residuals are scattered on both sides of the
reference line, with a very high variance. The scattered appearance is due to the fact that most
variables take only a few distinct values. The plot also appears to fan out to the right, which
suggests that the residual variance is not constant. Hence the linear model is not a good fit for
the data. The points in the Normal Q-Q plot do not follow a straight line, which suggests that the
residuals are not normally distributed.
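The visual checks above can be complemented with simple numeric checks. The sketch below is an informal illustration on synthetic data whose noise grows with the predictor (an assumption chosen to mimic the fanning-out pattern); the rank-correlation check is a rough heuristic, not a formal heteroscedasticity test.

```r
# Synthetic data whose noise grows with x, so the residuals fan out.
set.seed(4)
n <- 300
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, sd = 0.5 * x)
fit <- lm(y ~ x)

res <- residuals(fit)
fit_vals <- fitted(fit)

# Informal spread check: rank correlation between |residual| and fitted value.
# A clearly positive value means the spread grows with the fitted value.
spread_trend <- cor(abs(res), fit_vals, method = "spearman")

# Shapiro-Wilk normality test on the residuals (accepts 3 to 5000 observations).
sw <- shapiro.test(res)

cat("Spearman correlation of |residuals| with fitted values:", round(spread_trend, 3), "\n")
cat("Shapiro-Wilk p-value:", format.pval(sw$p.value), "\n")
```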
Stepwise Regression
We have performed stepwise regression in the backward direction.
Akaike Information Criterion (AIC)
> step(lm.mo, direction = "backward")
At each step, the attribute whose removal gives the lowest AIC is removed; this is typically the
least significant attribute, i.e. the one with the highest p-value. A lower AIC indicates a better
model. After removing Roof Area the AIC remains unchanged, because Roof Area does not play a part
in determining the response variable. After removing Orientation the model improves: the AIC
decreases by 1.94 to 1659.48. Attributes like Relative Compactness, Surface Area, Wall Area,
Overall Height, Glazing Area and Glazing Area Distribution play a significant part in deciding the
value of the response variable. This agrees with the p-values obtained from the regression.
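One situation in which dropping a predictor leaves the AIC exactly unchanged is when that predictor is an exact linear combination of the others, so that `lm()` cannot estimate a separate coefficient for it. Whether this applies to Roof Area in the real data is an assumption here, not something the report states. A minimal sketch with synthetic columns (`wall`, `roof`, `surface` are hypothetical names and values):

```r
# Synthetic data in which `surface` is an exact linear combination of
# `wall` and `roof` (an assumed relationship, for illustration only).
set.seed(5)
n <- 120
toy <- data.frame(wall = runif(n, 200, 400), roof = runif(n, 100, 220))
toy$surface <- toy$wall + 2 * toy$roof
toy$y <- 10 + 0.05 * toy$wall - 0.03 * toy$roof + rnorm(n)

full    <- lm(y ~ surface + wall + roof, data = toy)
reduced <- lm(y ~ surface + wall, data = toy)

# The redundant term gets an NA coefficient in the full model.
print(coef(full))

# extractAIC() returns c(edf, AIC) on the scale that step() reports.
aic_full    <- extractAIC(full)[2]
aic_reduced <- extractAIC(reduced)[2]
cat("AIC with roof:", aic_full, " AIC without roof:", aic_reduced, "\n")
```

Both fits have the same residual sum of squares and the same number of estimated coefficients, so the criterion cannot change.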
Bayesian Information Criterion (BIC)
> step(lm.mo, direction = "backward", k = log(nrow(file)))
The behavior of BIC is similar to that of AIC. Initially the value of BIC was 1698.57. It remains
unchanged after the removal of Roof Area, since Roof Area is not significant in determining the
value of the response variable. After the removal of Orientation the model improves: the BIC
decreases by 6.58. Thus AIC and BIC give similar results, and the significant variables remain the
same for both.
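The only difference between the two `step()` calls is the penalty multiplier `k`. For `lm` fits, `step()` relies on `extractAIC()`, which computes n * log(RSS / n) + k * edf, with k = 2 for AIC and k = log(n) for BIC. The sketch below checks this on synthetic data (the data frame `toy` and the helper `crit` are hypothetical names used only for illustration).

```r
# Synthetic data (values are made up).
set.seed(6)
n <- 150
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = toy)
rss <- sum(residuals(fit)^2)
edf <- n - df.residual(fit)   # number of estimated coefficients

# Criterion on the scale used by step(): n * log(RSS / n) + k * edf
crit <- function(k) n * log(rss / n) + k * edf

aic_manual <- crit(2)
bic_manual <- crit(log(n))

# The same quantities as computed by R itself.
aic_step <- extractAIC(fit, k = 2)[2]
bic_step <- extractAIC(fit, k = log(n))[2]

cat("AIC manual vs extractAIC:", aic_manual, aic_step, "\n")
cat("BIC manual vs extractAIC:", bic_manual, bic_step, "\n")
```

For n = 768, log(n) is about 6.64, so dropping one coefficient saves about 4.64 more under BIC than under AIC. This matches the gap between the reported improvements after removing Orientation (6.58 versus 1.94).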
Conclusion
Because Heating Load and Cooling Load are highly correlated, they can be expected to depend on the
same variables. The variables which play an important part in predicting their values are Relative
Compactness, Surface Area, Wall Area, Overall Height, Glazing Area and Glazing Area Distribution.
The paper associated with this dataset is:
A. Tsanas, A. Xifara: 'Accurate quantitative estimation of energy performance of residential buildings
using statistical machine learning tools', Energy and Buildings, Vol. 49, pp. 560-567, 2012.
