A Medical Price Prediction System using Boosting Algorithms through Machine Learning Techniques
9 Feb 2023
https://www.irjet.net/archives/V10/i1/IRJET-V10I1174.pdf
IRJET Journal
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056 | p-ISSN: 2395-0072 | Volume: 10 Issue: 01 | Jan 2023 | www.irjet.net
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1035

A Medical Price Prediction System using Boosting Algorithms through Machine Learning Techniques

Renu Manoj, M.Tech. Scholar, Department of CSE, SISTec-R, Bhopal (M.P), India
Prof. Ajeet Shrivastava, Professor, Department of CSE, SISTec-R, Bhopal (M.P)
Prof. Rohit Bansal, Asst. Professor, Department of CSE, SISTec-R, Bhopal (M.P)

ABSTRACT
The cost of health care insurance plays a vital role in developing medical facilities. To provide better medical facilities, it is essential to forecast the cost of medical insurance. This paper deals with predicting the cost of health insurance that has to be paid by the patient. Several data mining regression algorithms (decision tree, random forest, polynomial regression and linear regression) are implemented to find the best predictor. The actual and predicted premium expenses are compared and plotted, which helps in choosing the best-suited regression algorithm for insurance premium prediction. After running these regression algorithms, the correctness of each is measured by the coefficient of determination (r2_score), Root Mean Squared Error (RMSE) and Mean Squared Error (MSE). Random Forest Regression was the best algorithm, with an r2_score of 0.862533, and can be used for predicting health insurance cost.
Keywords: Natural Language Processing, Tokens, Features, Training & Testing Data, Model or classifier, Naïve Bayes, SVM, Bernoulli.

I. INTRODUCTION
The health insurance market is a crucial market because one-third of GDP [1] is spent on health insurance in the United States, and everyone needs some level of health care. Health insurance is one of the most significant investments an individual makes every year. This study is an effort to find mathematical models that predict future premiums and to verify the results using regression models. Medical costs arising from illness, accidents or other health reasons are considerably expensive; with health insurance, an individual is not liable for paying the entire medical cost of a procedure. According to the Department of Health and Human Services (HHS) [2], the total health service budget for fiscal year 2015 is around 1100 billion dollars. There are several health care systems around the world: for example, the single-payer system followed by Canada, where premiums are paid through taxes, and the government health care system followed by the United Kingdom, where health care is the responsibility of the central government. To the best of our knowledge, there are no existing tools that can predict future premiums based on historical data. Therefore, research is needed to predict premiums across the United States.

1.1 Health Service Area
This section describes the main service areas in which health insurance is used. The most important ones are given below.

Figure 1: Service Area (labels: Premium, QHP, Deductible, Co-Insurance)

Premium: The amount that each family member generally has to pay for health insurance every month is known as the premium. Other costs are also paid for health care, including a deductible, co-payments and coinsurance.

Qualified Health Plan (QHP): A QHP is a government-certified insurance plan that provides essential benefits such as emergency services and maternity care. In India, Ayushman Bharat Yojana can be considered similar.

Deductible: The amount that has to be paid for covered health care services before the insurance plan starts to pay.
1.2 Data Mining Tools
Figure 2: Data Mining Tools (labels: RapidMiner, Orange, Weka, KNIME, SSDT, Apache Mahout)

Figure 2 shows the different data mining tools available for processing the given data. Weka is a Java-based tool that is efficient for data processing and for finding behaviour and patterns in the data. Apache Mahout is efficient for big data.

1.3 Objective of Work
In today's hectic life, everybody has to pay attention to their health status, because it determines how much money will be required in the near future. An insurance company therefore reaches its conclusions after studying a person's health behaviour. In recent years, people across the globe, from poor to rich, have been taking out health plans, because they know that if they fall ill, their economic condition may hurt them and their families. For their safety, they try to choose a good plan that has a low premium cost and gives better cover in the near future. This is a fast-growing industry.

II. RELATED WORK
This section describes earlier research efforts in information retrieval and machine learning techniques. Many papers have addressed the issue of claim prediction. "Predicting motor insurance claims using telematics data" (2019) examined the performance of logistic regression (LR) and XGBoost in forecasting the presence of accident claims, which occur in small numbers, and the results showed that LR is very effective in some cases. Another system takes pictures of a damaged car as input and produces relevant details, such as the cost of repair used to decide the amount of the insurance claim, and the locations of damage. That analysis therefore did not predict the car insurance claim itself but focused on calculating repair costs [4].

The health care sector is one of the biggest sectors of the global economy. According to the World Bank, in the 21st century health care expenditure has accounted for approximately 10% of the world's total gross domestic product (GDP). Additionally, per capita health expenditure has been increasing over the last 10 to 20 years. In the United States, the Centers for Medicare & Medicaid Services (CMS) reported that health care accounted for 17.5% of the national GDP [5].

III. PROBLEM IDENTIFICATION
Many researchers are working in this area, and the authors learned several things from this study. We conclude that the insurance sector is growing quickly, both in the number of customers and in the number of new insurance companies that come into existence every year. These industries make a huge contribution to GDP, so considerable attention and analysis are needed. Besides looking at various approaches and models, we also focus on important aspects of machine learning, which will make it possible to use these methodologies in the future.

IV. ALGORITHMS
Step 01: Store data from the Kaggle repository
Step 02: Import the required libraries
Step 03: Import the required dataset
Step 04: Apply feature extraction
   a) Bivariate analysis
   b) Multivariate analysis
Step 05: Visualize the data for better understanding
   a) Descriptive features
   b) Distribution of different features
Step 06: Apply machine learning algorithms
Step 07: Apply different models
   a) Linear Regression
   b) Random Forest
   c) Gradient Boosting
Step 08: Repeat Step 07 several times with different algorithms
Step 09: Compare the results using performance parameters such as RMSE score and R2 score
Step 10: Stop
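The performance parameters named in Step 09 can be illustrated with a minimal sketch in plain Python. The function names and the sample values below are hypothetical and are not taken from the paper; the formulas are the standard definitions of MSE, RMSE and the coefficient of determination.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted expenses (illustrative only, not the paper's data).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(mse(actual, predicted), rmse(actual, predicted), r2_score(actual, predicted))
```

A lower RMSE and an R2 score closer to 1 indicate a better fit, which is how the models in Step 07 would be ranked against each other.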
IV. Flow Diagram
Figure 3: Flow Diagram

Figure 3 shows the workflow. In the first step, data must be fetched from an external source or collected from the local market; for better analysis we fetch our data from Kaggle, a reliable data source used worldwide. In the second step, we load the libraries needed to process the data. In the third step, we preprocess the data; many preprocessing mechanisms are available, and we use numerical-to-nominal data conversion as well as univariate and multivariate data processing. In the fourth step, we repeat this for different data splits. In the fifth step, we apply different machine learning algorithms and finally our own proposed method, Gradient Boosting with weighted average values, which combines the weighted average of the values produced by the previously implemented models. In the sixth and final step, we compute the performance measures, RMSE score and R2 score, which give a clear view of the proposed method and the existing ones, and compare the results. We can say that our proposed method gives better results.

V. PROCESS OF IMPLEMENTATION
Figure 4: Distribution of smoker, children and region (Pie & Bar Graph)

Figure 4 shows what the distributions tell us:
1) There is an equal number of people of all ages.
2) Most people have a BMI of around 30.
3) Expenses appear to be right-skewed (see skewness in probability and statistics).

Note: Because expenses are skewed, this column can be transformed using either a log transformation or a square root transformation, which brings it closer to a normal distribution.
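The note on skewed expenses can be made concrete with a small sketch in plain Python. The expense values and the `skewness` helper are hypothetical illustrations, not the paper's dataset or code; the helper computes the usual population skewness (third central moment divided by the cube of the standard deviation).

```python
import math


def skewness(values):
    """Population skewness: third central moment over the standard deviation cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed expenses: most values are small, a few are very large.
expenses = [1200.0, 1500.0, 1800.0, 2100.0, 2500.0, 3000.0, 9000.0, 40000.0]

# The two transformations mentioned in the note.
log_expenses = [math.log1p(v) for v in expenses]   # log(1 + x), safe at x = 0
sqrt_expenses = [math.sqrt(v) for v in expenses]

print(skewness(expenses), skewness(sqrt_expenses), skewness(log_expenses))
```

On this sample both transformations reduce the skewness, the log transform more strongly. If a model is trained on the log-transformed column, its predictions have to be mapped back with `math.expm1` before they are compared with actual expenses.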
Figure 5: Distribution of smoker, children and region (Area Graph)

[Figure 3 flow diagram labels: Input dataset; Preprocessing (numerical to nominal); Desired data; Training data; Apply different algorithms (1. Linear Regression, 2. Random Forest, 3. Gradient Boosting with Weighted Average values (Proposed Method)); Performance parameters (RMSE Score & R2 Score); Evaluate; Repeat; Compare; Final comparison]
Figure 5 gives a clear picture of how the distribution of smokers, children and region is affected.

Output Results
Table 1 shows the accuracy results of the different algorithms we implemented.

Table 1
Algorithm                                                          Train Size (in %)   Test Size (in %)   Accuracy
Linear Regression                                                  70                  30                 0.79
Random Forest                                                      70                  30                 0.87
Gradient Boosting with Weighted Average values (Proposed Method)   70                  30                 0.89

Figure 6: Different Algorithms Comparison Line Graph

From the graph in Figure 6, we find that Linear Regression, Random Forest and Gradient Boosting with Weighted Average values (proposed method) give different results. We propose a tuning mechanism to improve the results: we implemented a weighted averaging model that gives 50% weight to gradient boosting, 30% weight to random forest and 20% weight to linear regression. With this choice of weights, we obtain better results.

VI. CONCLUSION
We implemented a number of machine learning algorithms to find the best results in terms of performance. We find that the major features of the given dataset are smoking behaviour, age group, region of residence and many more. The most important factors for predicting the medical expenses of a patient are smoking behaviour, age, gender and number of children; the region also has a good impact on the medical expenses. The machine learning algorithms we implemented, Linear Regression, Random Forest and our proposed method, give results of 79%, 87% and 89% respectively. On this basis, we can claim that our proposed method gives better accuracy than the existing methods.
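The weighted averaging model described in the output results (50% gradient boosting, 30% random forest, 20% linear regression) can be sketched as follows. This is only an illustration under assumptions: the function name, the dictionary keys and the per-model predictions are hypothetical, and the paper does not show how its own blend was implemented.

```python
def weighted_average_predictions(predictions, weights):
    """Blend per-model prediction lists into one list using the given weights."""
    if set(predictions) != set(weights):
        raise ValueError("predictions and weights must name the same models")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in predictions)
        for i in range(n)
    ]


# Weights reported in the paper for the proposed method.
WEIGHTS = {"gradient_boosting": 0.5, "random_forest": 0.3, "linear_regression": 0.2}

# Hypothetical predicted expenses for two patients from each fitted model.
predictions = {
    "gradient_boosting": [1000.0, 2000.0],
    "random_forest": [1100.0, 1900.0],
    "linear_regression": [900.0, 2500.0],
}

blended = weighted_average_predictions(predictions, WEIGHTS)
print(blended)
```

For the first patient the blend is 0.5 * 1000 + 0.3 * 1100 + 0.2 * 900 = 1010, and for the second it is 0.5 * 2000 + 0.3 * 1900 + 0.2 * 2500 = 2070. The blended predictions would then be scored on the 30% test split in the same way as each individual model.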
Apart from accuracy, we presented various graphs that give a better understanding of the implemented models.

[Figure 6 chart: x-axis "Different Algorithms", y-axis "Accuracy in %"; values: Linear Regression 0.79, Random Forest 0.87, Gradient Boosting with Weighted Average values (Proposed Method) 0.89]

VII. FUTURE SCOPE
Future work will focus on applying other techniques to improve the performance of these methods as far as possible. One option is to use deep learning in place of conventional machine learning, since it is among the most efficient techniques in use today and is becoming more popular for classification. Deep learning can therefore also be implemented in future work.

REFERENCES
[1] Sonali Vyas, Rajeev Ranjan, Navdeep Singh, Arohan Mathur, "Review of Predictive Analysis Techniques for Analysis Diabetes Risk," 978-1-5386-9346-9, ©2019 IEEE.
[2] Gaurav Tripath, Rakesh Kumar, "Early Prediction of Diabetes Mellitus Using Machine Learning," 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Amity University, Noida, India, June 4-5, 2020.
[3] Messan Komi, Jun Li, "Application of Data Mining Methods in Diabetes Prediction," 2017 2nd International Conference on Image, Vision and Computing.
[4] Bakshi Rohit Prasad, Sonali Agarwal, "Modeling Risk Prediction of Diabetes – A Preventive Measure."
[5] J. N. Myhre, I. K. Launonen, S. Wei and F. Godtliebsen, "Controlling blood glucose levels in patients with type 1 diabetes using fitted Q-iterations and functional features," 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, pp. 1-6, 2018.
[7] Zhang, Y., Lin, Z., Kang, Y., Ning, R., & Meng, Y. A Feed-Forward Neural Network Model for the Accurate Prediction of Diabetes Mellitus.
[8] Kadhm, M. S., Ghindawi, I. W., & Mhawi, D. E. (2018). An Accurate Diabetes Prediction System Based on K-means Clustering and Proposed Classification Approach. International Journal of Applied Engineering Research, 13(6), 4038-4041.
[9] Sundaram, N. M. (2018). An Improved Elman Neural Network Classifier for classification of Medical Data for Diagnosis of Diabetes. International Journal of Engineering Science, 16317.
[10] Deeraj Shetty, Kishor Rit, Sohail Shaikh, Nikita Patil, "Diabetes Disease Prediction Using Data Mining," 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS).
[11] Muhammad Azeem Sarwar, Nasir Kamal, Wajeeha Hamid, Munam Ali Shah, "Prediction of Diabetes Using Machine Learning Algorithms in Healthcare," Proceedings of the 24th International Conference on Automation & Computing, Newcastle University, Newcastle upon Tyne, UK, 6-7 September 2018.
[12] Md. Faisal Faruque, Asaduzzaman, Iqbal H. Sarker, "Performance Analysis of Machine Learning Techniques to Predict Diabetes Mellitus," 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE), 7-9 February 2019.
[13] C. C. a. A. Semanskee, "Analysis of UnitedHealth Group's Premiums and Participation in ACA Marketplaces," 2016.
[14] "How the Affordable Care Act Has Improved America's Ability to Buy Health Insurance on Their Own," The Commonwealth Fund, 2017.
[15] I. D. Henry G. Dove, and Arthur Robb, "A prediction model for targeting low-cost, high-risk members of managed care organizations," The American Journal of Managed Care, 2003.