Guru Nanak Dev Engineering College,
Bidar
(Department of Information Science & Engineering)
HEALTH INSURANCE COST PREDICTION BY
USING REGRESSION MODELS.
Major Project
Arjun Singh (3GN18IS007)
Gourishanker (3GN18IS009)
Prabhakar (3GN18IS015)
Sai Krishna (3GN18IS029)
Under the guidance of,
Prof. Sangameshwar Kawdi
AIM
• The main aim of this project is to predict, as closely as possible, the
health insurance cost of an individual based on the collected data.
• Several algorithms are implemented and compared so that the predicted
health insurance amount is as accurate as possible.
Objective
• To implement efficient algorithms that predict the right insurance
amount more accurately.
• To compare different regression models and identify the one that gives
the most accurate outcome.
Problem Statement
• The premium for a health insurance policy varies from person to person,
because many factors affect it. Take age: a young person is much less
likely to have major health problems than an older person, so treating
an older person is likely to be more expensive. That is why an older
person is required to pay a higher premium than a young person. A good
prediction model must therefore take such factors, including daily
habits, into account, so that people get an idea of their likely health
insurance cost.
Literature Survey
Paper title: Predict Health Insurance Cost by using Machine
Learning and DNN Regression Models. (Publisher: IEEE,
source: https://ieeexplore.ieee.org/document/703922)
Major Observations:
• Regression analysis allows us to quantify the
relationship between an outcome and its
associated variables. Many techniques for
performing statistical predictions have been
developed, but in this project three models,
Multiple Linear Regression (MLR), Decision
Tree Regression and Gradient Boosting
Regression, were tested and compared.
Paper title: Health Insurance Amount Prediction.
(Publisher: International Journal of Engineering Research &
Technology (IJERT))
Major Observations:
• In this paper, a method was developed, using
large-scale health insurance claims data, to
predict the number of hospitalization days in
a population. The authors used a regression
decision tree algorithm, along with insurance
claim data from 242 075 individuals over
three years, to provide predictions. The
proposed method performs well in the
general population as well as in
subpopulations.
Hardware & Software Requirements:
Hardware Requirements:
• Standard Pentium Series Processor
• Minimum 4 GB RAM
• 256 GB HDD Storage capacity
Software Requirements:
• Windows 7
• Chrome or any other web browser
• Text Editor
• Anaconda Software
Important Methods & Approaches:
The following regression models are used:
1. Multiple Linear Regression.
2. Decision Tree Regression.
3. Gradient Boosting Regression.
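Comparing the three models needs a common error measure. The slides do not say which one was used, so the sketch below assumes two common choices, root mean squared error (RMSE) and the coefficient of determination (R²). The charges and predictions are made-up numbers for illustration only.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: the typical size of a prediction error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r2_score(actual, predicted):
    """R^2: 1.0 is a perfect fit, 0.0 is no better than predicting the mean."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical charges and two hypothetical sets of model predictions.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
model_a = [1100.0, 1900.0, 3100.0, 3900.0]   # off by 100 each time
model_b = [2500.0, 2500.0, 2500.0, 2500.0]   # always predicts the mean

for name, preds in [("A", model_a), ("B", model_b)]:
    print(name, round(rmse(actual, preds), 2), round(r2_score(actual, preds), 3))
```

A lower RMSE and a higher R² both point to the better model; here model A is clearly preferable to model B.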
What is regression?
Regression analysis is primarily used for two conceptually distinct purposes. First,
regression analysis is widely used for prediction and forecasting, where its use
has substantial overlap with the field of machine learning. Second, in some
situations regression analysis can be used to infer causal relationships between
the independent and dependent variables. Importantly, regressions by
themselves only reveal relationships between a dependent variable and a
collection of independent variables in a fixed dataset. To use regressions for
prediction or to infer causal relationships, respectively, a researcher must carefully
justify why existing relationships have predictive power for a new context or why
a relationship between two variables has a causal interpretation. The latter is
especially important when researchers hope to estimate causal relationships
using observational data.
Multiple Linear Regression?
Multiple linear regression (MLR), also known simply as multiple regression,
is a statistical technique that uses several explanatory variables to predict
the outcome of a response variable. The goal of multiple linear regression
is to model the linear relationship between the explanatory (independent)
variables and the response (dependent) variable. In essence, multiple
regression extends simple ordinary least-squares (OLS) regression, which
has a single explanatory variable, to more than one explanatory variable.
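As a rough illustration of the least-squares idea, the sketch below fits an MLR model by solving the normal equations with plain Python. The feature names (age, BMI), the data, and the helper names `fit_mlr` and `predict_mlr` are assumptions made for this example and do not come from the project; in practice a library such as scikit-learn would normally be used.

```python
def fit_mlr(X, y):
    """Fit y ~ b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    Solves the normal equations (A^T A) b = A^T y, where A is X with a
    leading column of ones for the intercept.
    """
    A = [[1.0] + list(row) for row in X]
    k = len(A[0])
    # Build the augmented matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in A) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(A, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict_mlr(coef, row):
    """Intercept plus the weighted sum of the explanatory variables."""
    return coef[0] + sum(c * x for c, x in zip(coef[1:], row))

# Hypothetical data: (age, bmi) -> charge, generated from
# charge = 500 + 40*age + 100*bmi so the fit is easy to check.
X = [(20, 22.0), (30, 27.5), (40, 24.0), (50, 31.0), (60, 26.5)]
y = [500 + 40 * age + 100 * bmi for age, bmi in X]

coef = fit_mlr(X, y)
print([round(c, 3) for c in coef])            # intercept, age weight, bmi weight
print(round(predict_mlr(coef, (35, 25.0)), 2))
```

Because the toy data is exactly linear, the fitted coefficients come back as the values used to generate it; real insurance data would leave a residual error.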
Key Takeaways
• Multiple linear regression (MLR), also known simply as multiple
regression, is a statistical technique that uses several explanatory
variables to predict the outcome of a response variable.
• Multiple regression is an extension of simple linear (OLS) regression,
which uses just one explanatory variable.
• MLR is used extensively in econometrics and financial inference.
Decision Tree Regression?
• A decision tree builds regression or classification models in the form
of a tree structure. It breaks a dataset down into smaller and smaller
subsets while an associated decision tree is incrementally developed.
The final result is a tree with decision nodes and leaf nodes. A
decision node (e.g., Outlook) has two or more branches (e.g., Sunny,
Overcast and Rainy), each representing a value of the attribute
tested. A leaf node (e.g., Hours Played) represents a decision on the
numerical target. The topmost decision node in the tree, which
corresponds to the best predictor, is called the root node. Decision
trees can handle both categorical and numerical data.
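The splitting process described above can be sketched in a few lines. This is a simplified version that assumes numeric features, binary splits on a threshold, and squared error as the split criterion; the (age, smoker) data and all function names are hypothetical and chosen only for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y):
    """Return the (feature, threshold) that most reduces squared error, or None."""
    best, best_err = None, sse(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            err = sse(left) + sse(right)
            if err < best_err:
                best, best_err = (j, thr), err
    return best

def build_tree(X, y, depth=0, max_depth=2):
    """Decision nodes are (feature, threshold, left, right); leaves are floats."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return sum(y) / len(y)               # leaf: mean of the targets
    j, thr = split
    left = [(r, t) for r, t in zip(X, y) if r[j] <= thr]
    right = [(r, t) for r, t in zip(X, y) if r[j] > thr]
    return (j, thr,
            build_tree([r for r, _ in left], [t for _, t in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [t for _, t in right], depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root node down to a leaf."""
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if row[j] <= thr else right
    return node

# Hypothetical data: (age, smoker flag) -> charge.
X = [(25, 0), (30, 0), (45, 0), (50, 0), (28, 1), (35, 1), (48, 1), (55, 1)]
y = [2000, 2200, 4000, 4200, 15000, 15500, 30000, 31000]

tree = build_tree(X, y)
print(tree[0])                       # index of the feature tested at the root
print(predict_tree(tree, (27, 0)))   # young non-smoker
print(predict_tree(tree, (52, 1)))   # older smoker
```

On this toy data the root node tests the smoker flag, since that single split removes the most error, and each leaf predicts the mean charge of the training rows that reach it.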
Gradient Boosting Regression?
• Gradient boosting is a machine learning technique used in regression
and classification tasks, among others. It gives a prediction model in
the form of an ensemble of weak prediction models, which are
typically decision trees. When a decision tree is the weak learner, the
resulting algorithm is called gradient-boosted trees; it often
outperforms random forest. A gradient-boosted trees model is built in
a stage-wise fashion, as in other boosting methods, but it generalizes
those methods by allowing optimization of an arbitrary
differentiable loss function.
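The stage-wise idea can be illustrated with a minimal sketch. It assumes squared-error loss (so the negative gradient is simply the residual), a single numeric feature, and one-split trees ("stumps") as weak learners; the ages, charges, stage count and learning rate are hypothetical values picked for the example.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split of a 1-D feature under squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    return best[1:]

def fit_gbr(xs, ys, n_stages=50, learning_rate=0.1):
    """Start from the mean, then add one small stump per stage."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_stages):
        # For the loss 1/2 * (y - p)^2 the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lm, rm = fit_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + learning_rate * (lm if x <= thr else rm)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate

def predict_gbr(model, x):
    """Sum the base value and the scaled output of every stump."""
    base, stumps, lr = model
    return base + sum(lr * (lm if x <= thr else rm) for thr, lm, rm in stumps)

# Hypothetical data: age -> charge.
ages = [20, 30, 40, 50, 60]
charges = [2000, 2500, 4000, 6500, 9000]

model = fit_gbr(ages, charges)
print([round(predict_gbr(model, a)) for a in ages])
```

Each stage fits a stump to the errors left by the previous stages, so the training error shrinks as stages are added; a different differentiable loss would only change how the residuals line is computed.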
Thank You :)
