1. DISEASE PREDICTION SYSTEM USING DATA MINING
Under the guidance of Asst. Prof. Ashutosh Pandey
Presented by:
Anand Kumar Mishra (1616210020)
Siddhesh Shukla (1616210112)
Shivani Yadav (1616210103)
B.Tech. (CS), 4th year
2. OUTLINE
• Introduction
▪ Objective
▪ Motivation
• Literature Survey
• Some Key Features of Disease
• Plan of Action
• Methodology Adopted
▪ Data Collection
▪ Steps to be Performed
▪ Functional Architecture
• Expected Result
• Conclusion
• References
3. OBJECTIVE
Identifying hidden patterns and relationships
among various attributes that can lead to:
▪ better diagnosis,
▪ better medicines,
▪ better treatment
▪ Early diagnosis may indicate the likelihood of disease and allow
preventive measures to be taken before the situation becomes critical.
4. MOTIVATION
• The prevalence of Diabetes is increasing in all
countries and its prevention has become a public
health priority.
• The predictors of Diabetes risk are insufficiently
understood.
5. WORLDWIDE STATISTICS
❏ The number of people with diabetes has risen
from 108 million in 1980 to 422 million in 2014.
❏ The global prevalence of diabetes among adults
over 18 years of age has risen from 4.7% in 1980
to 8.5% in 2014
❏ Diabetes is a major cause of blindness, kidney
failure, heart attacks, stroke and lower limb
amputation.
❏ In 2016, an estimated 1.6 million deaths were
directly caused by diabetes.
❏ Almost half of all deaths attributable to high
blood glucose occur before the age of 70 years.
7. INDIAN STATISTICS
▪ Diabetes currently affects more than 62 million Indians, which is more
than 7.1% of the adult population.
▪ Nearly 1 million Indians die due to diabetes every year.

TOTAL DEATHS (in million) | SEX    | % DEATHS DUE TO DIABETES
40.2                      | MALE   | 8
23.8                      | FEMALE | 12
8. MOTIVATION
❖ Recent research has shown that the onset of diabetes can be postponed or
prevented with lifestyle intervention or by medication.
❖ Identifying individuals at high risk of the disease has therefore become a
priority for targeting preventive measures effectively.
❖ Symptoms are often not very marked, so the disease may be diagnosed
several years after onset, once complications have already arisen.
9. SYMPTOMS OF DIABETES
▪ increased urine output,
▪ excessive thirst,
▪ weight loss,
▪ hunger,
▪ fatigue,
▪ skin problems,
▪ slow-healing wounds,
▪ yeast infections, and
▪ tingling or numbness in the feet or toes.
11. PLAN OF WORK
• Data Collection / Data Analysis & Literature Review (3-4 months)
• Data Warehouse Construction (1-2 months)
• Building Classifier (3-3.5 months)
• Experimental Result Analysis (1-2 months)
• Report Preparation (1-1.5 months)
12. METHODOLOGY ADOPTED
Step 1 - Preprocess the data
Step 2 - Divide the patients into different groups
Step 3 - Apply fuzzy inference
Step 4 - Use the Apriori algorithm to find related patterns
Step 5 - Build the classifier
14. PREPROCESSING
-- MERGING DATA FROM MULTIPLE SOURCES INTO A SINGLE FORMAT
-- MISSING VALUE HANDLING
Use the attribute mean of all samples belonging to the same class as the
given tuple.
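The missing-value rule above can be sketched as follows. This is only an illustration, assuming each record is a dictionary, a missing value is stored as None, and every record already carries a class label; the names impute_class_mean, "glucose" and "label" are hypothetical and not taken from the project.

```python
from collections import defaultdict

def impute_class_mean(rows, attribute, class_key):
    """Return copies of `rows` where a missing (None) `attribute` is replaced
    by the mean of that attribute over rows with the same class label."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    # First pass: accumulate per-class sums and counts of the known values.
    for row in rows:
        value = row[attribute]
        if value is not None:
            sums[row[class_key]] += value
            counts[row[class_key]] += 1
    # Second pass: fill the gaps, leaving the input rows untouched.
    filled = []
    for row in rows:
        new_row = dict(row)
        label = row[class_key]
        if new_row[attribute] is None and counts[label] > 0:
            new_row[attribute] = sums[label] / counts[label]
        filled.append(new_row)
    return filled

# Hypothetical sample data, for illustration only.
patients = [
    {"glucose": 100.0, "label": "normal"},
    {"glucose": 120.0, "label": "normal"},
    {"glucose": None, "label": "normal"},
    {"glucose": 200.0, "label": "serious"},
    {"glucose": None, "label": "serious"},
]
cleaned = impute_class_mean(patients, "glucose", "label")
```

A class with no known values at all is left unfilled here; how such cases should be handled is not stated on the slide.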
15. DIVIDE THE DATA SET INTO 3 CLUSTERS
Divide the patients into three groups according to their condition:
▪ Very serious
▪ Serious
▪ Normal
16. FUZZY LOGIC
▪ Intuitionistic Fuzzy Set (IFS): in an ordinary fuzzy set, the claim that an
element x belongs to a fuzzy set A to a degree μA(x) implies that x does not
belong to A to the extent 1 - μA(x). An IFS drops this assumption and gives
each element a separate non-membership degree νA(x), with μA(x) + νA(x) ≤ 1.
▪ α-cut: let α be a number between 0 and 1. The α-cut of a fuzzy set A at
level α is the set of those elements of A whose membership degree is greater
than or equal to α.
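The α-cut definition can be illustrated with a short sketch. It assumes a fuzzy set is stored as a dictionary mapping each element to its membership degree; the function name alpha_cut and the patient identifiers are hypothetical.

```python
def alpha_cut(membership, alpha):
    """Return the elements whose membership degree is >= alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {x for x, degree in membership.items() if degree >= alpha}

# Hypothetical membership degrees in a fuzzy set "high glucose".
high_glucose = {"p1": 0.9, "p2": 0.5, "p3": 0.2, "p4": 0.7}
selected = alpha_cut(high_glucose, 0.5)
```

Raising α shrinks the cut, so α acts as a threshold on how strongly a patient must belong to the set before being counted.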
17. APRIORI ALGORITHM
We use the Apriori algorithm to find related patterns.
Suppose we obtain the association rule
A → B, confidence 95%
This means that 95% of the patients who have attribute A also have
attribute B.
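A compact sketch of Apriori frequent-itemset mining and of rule confidence is given below. It assumes each patient record is a set of attribute names; the sample records and the minimum support of 0.3 are made up for illustration and are not project data.

```python
from itertools import combinations

def count(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= frozenset(t) for t in transactions)

def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    n = len(transactions)
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the minimum support.
        level = {}
        for c in current:
            s = count(transactions, c) / n
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent

def confidence(transactions, antecedent, consequent):
    """confidence(A -> B) = count(A and B) / count(A)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return count(transactions, both) / count(transactions, antecedent)

# Hypothetical patient records, for illustration only.
records = [
    {"thirst", "fatigue", "weight_loss"},
    {"thirst", "fatigue"},
    {"thirst", "fatigue", "hunger"},
    {"thirst", "hunger"},
    {"fatigue"},
]
frequent = apriori(records, min_support=0.3)
conf_rule = confidence(records, {"thirst"}, {"fatigue"})
```

In this sample, three of the four records containing "thirst" also contain "fatigue", so the rule thirst → fatigue has a confidence of 0.75.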
18. BUILD CLASSIFIER
▪ Various techniques may be applied:
✔ Multilayer feed-forward ANN with backpropagation
-- Train the ANN using attribute weights calculated from the confidence
values of the association rules.
✔ Simple Weighted Sum Method
▪ Expected result: class label prediction of the patient as either
normal, serious, or very serious.
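One possible reading of the Simple Weighted Sum Method is sketched below. It assumes attribute values are scaled to [0, 1] and that each attribute weight is a rule confidence; the weights, the thresholds 0.4 and 0.7, and the function names are all hypothetical choices for illustration, not values from the project.

```python
def weighted_sum_score(features, weights):
    """Weighted average of attribute values; an attribute missing from
    `features` counts as 0."""
    total_weight = sum(weights.values())
    return sum(weights[a] * features.get(a, 0.0) for a in weights) / total_weight

def classify(score, serious_at=0.4, very_serious_at=0.7):
    """Map a score in [0, 1] to one of the three class labels.
    The thresholds are assumed values, not taken from the slides."""
    if score >= very_serious_at:
        return "very serious"
    if score >= serious_at:
        return "serious"
    return "normal"

# Hypothetical weights, e.g. confidences of rules involving each attribute.
weights = {"thirst": 0.95, "fatigue": 0.60, "weight_loss": 0.80}
patient = {"thirst": 1.0, "fatigue": 1.0, "weight_loss": 0.0}
label = classify(weighted_sum_score(patient, weights))
```

Dividing by the total weight keeps the score in [0, 1], so the thresholds stay meaningful when attributes are added or removed.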
19. EXPECTED RESULT
• Class label prediction of the patient as either:
▪ normal,
▪ serious, or
▪ very serious
• Association among various attributes with their respective confidence
level (A → B, CL)
20. CONCLUSION
We find related patterns among the patients of a hospital, using IFS,
α-cuts, and the Apriori algorithm to discover knowledge about the patients.
Our approach is expected to protect the patients' personal data privacy and
to achieve satisfactory results in the experiments.
The approach is not limited to a single disease; in the long run it can be
used in other fields as well.
21. REFERENCES
• Yao Liu and Yuk Ying Chun, "Mining Cancer Data with Discrete Particle
Swarm Optimization and Rule Pruning".
• Ankit Agrawal and Alok Choudhary, "Identifying HotSpots in Lung Cancer
Data Using Association Rule Mining".
• Xiaobo Li, Sihua Peng, and Xiaosi Zhan, "Comparison of feature selection
methods for multiclass cancer classification based on microarray data".
• "Lung cancer statistics," Centers for Disease Control and Prevention,
URL: http://www.cdc.gov/cancer/lung/statistics
• en.wikipedia.org/wiki/World_Health_Organization
• www.whoindia.org
• A. Jemal, F. Bray, M.M. Center, J. Ferlay, E. Ward, D. Forman (2011).
"Global cancer statistics". CA: A Cancer Journal for Clinicians 61.