Data Mining With Big Data
Guide: Prof. Prashant G. Ahire
Presented by :
Miss. Rupa Solapure
Roll no. 259
Agenda
Problem Definition
Objectives
Literature Survey
Architecture/Big Data mining algorithm
Existing System/Mathematical model
Advantages
Disadvantages/Limitations
Characteristics of Big Data
Big Data and its challenges
Big Data mining Tools
Applications of Big Data
References
Problem Definition:
Big Data consists of large-volume, complex, growing data sets with
multiple, autonomous sources. With the fast development of networking,
data storage, and data collection capacity, Big Data are now rapidly
expanding in all science and engineering domains, including the physical,
biological, and biomedical sciences. This paper presents a HACE theorem
that characterizes the features of the Big Data revolution, and
proposes a Big Data processing model from the data mining perspective.
Objective:
Mining Big Data requires carefully designed algorithms to analyze model
correlations between distributed sites and to fuse decisions from multiple sources
to obtain the best model out of the Big Data. Developing a safe and sound
information sharing protocol is a major challenge.
To support Big Data mining, high-performance computing platforms are
required, which impose systematic designs to unleash the full power of the Big
Data. Big Data is an emerging trend, and the need for Big Data mining is rising in
all science and engineering domains.
Literature Survey
Title/Year: “Data Mining With Big Data,” Jan 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This paper presents a HACE theorem that characterizes the features of the Big Data revolution and proposes a Big Data processing model from the data mining perspective.
Author: Xindong Wu, Fellow, IEEE, Xingquan Zhu, Senior Member, IEEE, Gong-Qing Wu, and Wei Ding

Title/Year: “The Survey of Data Mining Applications And Feature Scope,” June 2012
Keywords: Data mining task, data mining life cycle, visualization of the data mining model, data mining methods, data mining applications.
Concept/Abstract: This paper presents a number of data mining applications and also focuses on the scope of data mining, which will be helpful in further research.
Author: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi

Title/Year: “Review on Data Mining with Big Data,” Dec 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This data-driven model involves demand-driven aggregation of information sources, mining and analysis, and security and privacy considerations.
Author: Savita Suryavanshi, Prof. Bharati Kale

Title/Year: “SURVEY ON BIG DATA MINING PLATFORMS, ALGORITHMS AND CHALLENGES,” Sep 2014
Keywords: big data, big data mining platforms, big data mining algorithms, big data mining challenges, data mining.
Concept/Abstract: This paper reviews various big data mining platforms and algorithms and also discusses the challenges.
Author: SHERIN A, Dr S UMA, SARANYA K, SARANYA VANI M
Architecture:
Fig.: Big data Memory evolution
Data Mining Algorithm
• Decision tree induction classification algorithms
• Evolutionary based classification algorithms
• Partitioning based clustering algorithms
• Hierarchical based clustering algorithms
• Model based clustering algorithms
Existing System:
Big Data applications are on the rise: data collection has grown tremendously
and is beyond the ability of commonly used software tools to capture,
manage, and process within a “tolerable elapsed time.”
The most fundamental challenge for Big Data applications is to explore the large
volumes of data and extract useful information or knowledge for future actions.
In many situations, the knowledge extraction process has to be very efficient and
close to real time because storing all observed data is nearly infeasible.
The unprecedented data volumes require an effective data analysis and prediction
platform to achieve fast response and real-time classification for such Big Data.
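The near-real-time constraint above can be illustrated with a one-pass summary. This is only a minimal sketch, assuming a numeric stream; it uses Welford's online update, which is a standard technique and not something taken from the slides, and the names RunningStats and summarize are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Mean and variance of a stream, updated one value at a time."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Welford's update: each observation is used once, then discarded.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 until at least one value has been seen.
        return self.m2 / self.count if self.count else 0.0


def summarize(stream):
    """Consume any iterable of numbers without keeping it in memory."""
    stats = RunningStats()
    for x in stream:
        stats.update(x)
    return stats
```

Because only three numbers are kept per stream, memory use stays constant no matter how many observations arrive.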
At the model level, each site produces local patterns by mining its
local data.
By sharing these local patterns with other local sites, we can produce a single
global pattern.
At the knowledge level, model correlation analysis investigates the relevance
between models generated from various data sources to determine how closely
the data sources are related to each other, and how to form accurate decisions
based on models built from autonomous sources.
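The idea of sharing local patterns to form one global pattern can be sketched as follows. This is an illustrative assumption, not the presenter's method: the "pattern" here is simply per-item support, and the function names, site data, and min_support threshold are all hypothetical.

```python
from collections import Counter


def mine_local_pattern(transactions):
    """Local pattern: how many local transactions contain each item."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count an item once per transaction
    return counts, len(transactions)


def merge_global_pattern(local_patterns, min_support):
    """Combine the shared local patterns into one global pattern."""
    total_counts = Counter()
    total_transactions = 0
    for counts, n in local_patterns:
        total_counts += counts
        total_transactions += n
    # Keep items whose global support meets the threshold.
    return {
        item: count / total_transactions
        for item, count in total_counts.items()
        if count / total_transactions >= min_support
    }


# Hypothetical data held at two autonomous sites.
site_a = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
site_b = [["milk", "eggs"], ["milk", "bread"]]

# Only the summaries leave each site, never the raw transactions.
shared = [mine_local_pattern(site_a), mine_local_pattern(site_b)]
global_pattern = merge_global_pattern(shared, min_support=0.6)
```

Each site transmits a small table of counts instead of its records, which addresses the transmission cost and keeps raw data local.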
Big Data
Big Data is a broad term for any collection of data sets so large and complex
that it becomes difficult to process them using conventional data processing applications.
There are two types of Big Data: structured and unstructured.
Structured data
Structured data are numbers and words that can be easily categorized and analyzed.
These data are generated by things like network sensors embedded in electronic
devices, smart phones, and global positioning system (GPS) devices. Structured data
also include things like sales figures, account balances, and transaction data.
Unstructured data
Unstructured data include more complex information, such as customer reviews
from commercial websites, photos and other multimedia, and comments on social
networking sites. These data cannot be separated into categories or analyzed
numerically.
Big Data Characteristics (HACE Theorem)
Figure: The blind men and the giant elephant: the limited view
of each blind man leads to a biased conclusion.
The HACE theorem suggests that the key characteristics of
Big Data are:
A. Huge volume with heterogeneous and diverse data sources
B. Autonomous sources with distributed and decentralized control
C. Complex and Evolving associations
Applications of Data Mining
Marketing
• Analysis of consumer behaviour
• Advertising campaigns
• Targeted mailings
• Segmentation of customers, stores, or products
Finance
• Creditworthiness of clients
• Performance analysis of finance investments
• Fraud detection
Manufacturing
• Optimization of resources
• Optimization of manufacturing processes
• Product design based on customer requirements
Health Care
• Discovering patterns in X-ray images
• Analyzing side effects of drugs
• Effectiveness of treatments
Big Data Mining Algorithm
Big Data applications gather information from many sources.
• To mine the data, we would ordinarily need to gather all distributed data at a
centralized site, but this is prohibitive because of high data transmission costs
and privacy concerns.
Model mining and correlation analysis are the key steps that ensure patterns
discovered from a variety of sources can be combined.
Global data mining is done through a two-step process:
• Model level
• Knowledge level
At the data level, each local site uses its local data to calculate data statistics
and shares this information with the other sites in order to obtain a global view
of the data distribution.
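The data-level exchange of statistics can be sketched as below. This is a minimal illustration under stated assumptions: each site holds one numeric attribute, and the statistics exchanged are a count, a sum, and a sum of squares. The names LocalStats, local_statistics, and global_distribution are hypothetical, and the sum-of-squares formula is chosen for brevity rather than numerical robustness.

```python
from typing import Iterable, NamedTuple


class LocalStats(NamedTuple):
    n: int            # number of local records
    total: float      # sum of the values
    total_sq: float   # sum of the squared values


def local_statistics(values: Iterable[float]) -> LocalStats:
    """Computed at each site; only this summary is shared."""
    values = list(values)
    return LocalStats(len(values), sum(values), sum(v * v for v in values))


def global_distribution(site_stats):
    """Combine the shared summaries into a global mean and variance."""
    n = sum(s.n for s in site_stats)
    total = sum(s.total for s in site_stats)
    total_sq = sum(s.total_sq for s in site_stats)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return mean, variance


# Hypothetical values held at two sites.
site_1 = local_statistics([2.0, 4.0, 4.0, 4.0])
site_2 = local_statistics([5.0, 5.0, 7.0, 9.0])
mean, variance = global_distribution([site_1, site_2])
```

The combined result matches what a centralized computation over all eight values would give, even though no site ever sees another site's records.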
Data Mining Challenges With Big Data
Fig.: A conceptual view of the Big Data processing framework
DISADVANTAGES OF EXISTING SYSTEM
To explore Big Data, we have analysed several challenges at the
data, model, and system levels.
The challenges at Tier I focus on data accessing and arithmetic
computing procedures. Because Big Data are often stored at
different locations and data volumes may continuously grow, an
effective computing platform will have to take distributed large-scale
data storage into consideration for computing.
PROPOSED SYSTEM
We propose a HACE theorem to model Big Data characteristics. The
characteristics of HACE make discovering useful knowledge from
Big Data an extreme challenge.
ADVANTAGES OF PROPOSED SYSTEM
Provides the most relevant and accurate social sensing feedback to
better understand our society in real time.
Characteristics of Big Data
Fig. Five Vs of BIG DATA
Volume - the quantity of data
Variety - the categories (types) of data
Velocity - the speed at which data are generated or processed
Variability - inconsistency in the data
Complexity - the difficulty of managing the data
BIG Data Mining Tools
Hadoop
Apache S4
Storm
Apache Mahout
MOA
Fig.: Big Data processing
Conclusion:
Because of the increase in the amount of data in fields such as genomics,
meteorology, biology, and environmental research, it becomes difficult to handle
the data, to find associations and patterns, and to analyze the large data sets.
As an organization collects more data at this scale, formalizing the process of
big data analysis will become paramount. The paper describes different
algorithms used to handle such large data sets and gives an
overview of the architecture and algorithms used with them.
References
• McKinsey Global Institute, Big Data: The next frontier for
innovation, competition and productivity, May 2011
• Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, 2013,
Data Mining with Big Data
• Rezwan Ahmed, George Karypis, 2012,
Algorithms for mining the evolution of conserved relational states in
dynamic networks
• IEEE, Data Mining with Big Data, January 2014
• Oracle, June 2013, Unstructured Data Management with Oracle
Database 12c
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 

Data mining with Big Data analysis

  • 1. Data Mining With Big Data. Guide: Prof. Prashant G. Ahire. Presented by: Miss Rupa Solapure, Roll no. 259
  • 2. Agenda: Problem Definition; Objectives; Literature Survey; Architecture/Big Data mining algorithm; Existing System/Mathematical model; Advantages; Disadvantages/Limitations; Characteristics of Big Data; Big Data and its challenges; Big Data mining Tools; Applications of Big Data; References
  • 3. Problem Definition: Big Data consists of large-volume, complex, growing data sets with numerous independent sources. With the fast development of networking, data storage, and data gathering capacity, Big Data is now growing quickly in all science and engineering domains, as well as in animal, genetic and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model from the data mining perspective.
  • 4. Objective: Big Data mining requires carefully designed algorithms to analyze model correlations between distributed sites and to fuse decisions from multiple sources into the best possible model of the Big Data. Developing a safe and sound information sharing protocol is a major challenge. Supporting Big Data mining also requires high-performance computing platforms, which call for systematic designs to unleash the full power of Big Data. Big Data is an emerging trend, and the need for Big Data mining is rising in all science and engineering domains.
  • 5. Literature Survey
      Title/Year: “Data Mining With Big Data”, Jan 2014. Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations. Concept/Abstract: This paper presents a HACE theorem that characterizes the features of the Big Data revolution and proposes a processing model from the data mining perspective. Authors: Xindong Wu, Fellow, IEEE; Xingquan Zhu, Senior Member, IEEE; Gong-Qing Wu; and Wei Ding.
      Title/Year: “The Survey of Data Mining Applications And Feature Scope”, June 2012. Keywords: data mining tasks, data mining life cycle, visualization of the data mining model, data mining methods, data mining applications. Concept/Abstract: This paper describes a number of applications of data mining and also focuses on the scope of data mining, which will be helpful in further research. Authors: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi.
      Title/Year: “Review on Data Mining with Big Data”, Dec 2014. Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations. Concept/Abstract: This data-driven model involves demand-driven aggregation of information sources, mining and analysis, and security and privacy considerations. Authors: Savita Suryavanshi, Prof. Bharati Kale.
      Title/Year: “Survey on Big Data Mining Platforms, Algorithms and Challenges”, Sep 2014. Keywords: big data, big data mining platforms, big data mining algorithms, big data mining challenges, data mining. Concept/Abstract: This paper reviews various big data mining platforms, algorithms, and challenges. Authors: Sherin A, Dr S Uma, Saranya K, Saranya Vani M.
  • 6. Architecture: Fig.: Big data Memory evolution
  • 7. Data Mining Algorithm: Decision tree induction classification algorithms; Evolutionary based classification algorithms; Partitioning based clustering algorithms; Hierarchical based clustering algorithms; Model based clustering algorithms
  • 8. Existing System: Big Data applications are on the rise: data collection has grown tremendously and is now beyond the ability of commonly used software tools to capture, manage, and process within a “tolerable elapsed time.” The most fundamental challenge for Big Data applications is to explore the large volumes of data and extract useful information or knowledge for future actions. In many situations, the knowledge extraction process has to be very efficient and close to real time because storing all observed data is nearly infeasible. The unprecedented data volumes require an effective data analysis and prediction platform to achieve fast response and real-time classification for such Big Data.
  • 9. At the model level, each site produces local patterns by mining its own local data. By sharing these local patterns with the other sites, a single global pattern can be produced. At the knowledge level, model correlation analysis investigates the relevance between models generated from various data sources to determine how closely the data sources are correlated with each other, and how to form accurate decisions based on models built from autonomous sources.
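The model-level step described above can be sketched in Python. This is only an illustration under stated assumptions, not the method of the paper: the "local pattern" is assumed to be a simple table of item counts, and the function names `mine_local_pattern` and `merge_local_patterns`, the sample transactions, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter

def mine_local_pattern(transactions):
    """Mine one site's local data: count in how many transactions each item occurs.

    Only these counts (not the raw transactions) leave the site.
    """
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count each item once per transaction
    return counts, len(transactions)

def merge_local_patterns(local_patterns, min_support):
    """Combine the shared local patterns into one global pattern.

    Returns the items whose global support (fraction of all transactions
    across all sites) reaches the hypothetical threshold min_support.
    """
    total_counts = Counter()
    total_transactions = 0
    for counts, n in local_patterns:
        total_counts.update(counts)
        total_transactions += n
    return {
        item: count / total_transactions
        for item, count in total_counts.items()
        if count / total_transactions >= min_support
    }

# Hypothetical data held at two autonomous sites.
site_a = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
site_b = [["milk", "eggs"], ["milk", "bread"]]

global_pattern = merge_local_patterns(
    [mine_local_pattern(site_a), mine_local_pattern(site_b)],
    min_support=0.5,
)
print(global_pattern)
```

In this sketch each site shares only aggregate counts, which reflects the idea that raw data stays local while patterns are exchanged.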
  • 10. Big Data: Big Data is a comprehensive term for any collection of data sets so large and complex that it becomes difficult to process them using conventional data processing applications. There are two types of Big Data: structured and unstructured. Structured data: Structured data are numbers and words that can be easily categorized and analyzed. These data are generated by things like network sensors embedded in electronic devices, smart phones, and global positioning system (GPS) devices. Structured data also include things like sales figures, account balances, and transaction data. Unstructured data: Unstructured data include more complex information, such as customer reviews from websites, photos and other multimedia, and comments on social networking sites. These data cannot easily be separated into categories or analyzed numerically.
  • 11. Big Data Characteristics (HACE Theorem). Figure: The blind men and the enormous elephant: the restricted view of each blind man leads to a biased conclusion.
  • 12. The HACE theorem suggests that the key characteristics of Big Data are: A. Huge volume with heterogeneous and diverse data sources B. Autonomous sources with distributed and decentralized control C. Complex and evolving associations
  • 13. Applications of Data Mining
      Marketing: analysis of consumer behaviour; advertising campaigns; targeted mailings; segmentation of customers, stores, or products
      Finance: creditworthiness of clients; performance analysis of finance investments; fraud detection
      Manufacturing: optimization of resources; optimization of manufacturing processes; product design based on customer requirements
      Health Care: discovering patterns in X-ray images; analyzing side effects of drugs; effectiveness of treatments
  • 14. Big Data Mining Algorithm: Big Data applications gather information from many sources. Mining the data centrally would require gathering all of the distributed data at one site, but this is prohibitive because of high data transmission costs and privacy concerns. Instead, correlations and patterns can be discovered by combining results from a variety of sources. Global data mining is done through a two-step process: the model level and the knowledge level. At the data level, each local site uses its local data to calculate data statistics and shares this information with the other sites in order to obtain the global data distribution.
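The data-level step above, in which each site shares summary statistics rather than raw data, can be illustrated with a short Python sketch. It assumes, purely for illustration, that the shared statistics are a count, a mean, and a sum of squared deviations per site; the function names and sample values are hypothetical. The merge uses the standard pairwise formula for combining means and variances.

```python
def local_statistics(values):
    """Summarize one site's (non-empty) local data as (count, mean, sum of squared deviations)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return n, mean, m2

def combine_statistics(site_stats):
    """Merge per-site summaries into the global count, mean, and population variance.

    No raw values are needed: each site contributes only its three numbers.
    """
    total_n, total_mean, total_m2 = 0, 0.0, 0.0
    for n, mean, m2 in site_stats:
        new_n = total_n + n
        delta = mean - total_mean
        # Pairwise update: add within-site spread plus the spread between the two means.
        total_m2 += m2 + delta * delta * total_n * n / new_n
        total_mean += delta * n / new_n
        total_n = new_n
    return total_n, total_mean, total_m2 / total_n

# Hypothetical measurements held at two separate sites.
site_1 = [2.0, 4.0]
site_2 = [6.0, 8.0, 10.0]

n, mean, variance = combine_statistics(
    [local_statistics(site_1), local_statistics(site_2)]
)
print(n, mean, variance)
```

The combined result matches what would be computed over the pooled values 2, 4, 6, 8, 10, while only three numbers per site cross the network.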
  • 15. Data Mining Challenges With Big Data. Fig.: A conceptual view of the Big Data processing framework
  • 16. DISADVANTAGES OF EXISTING SYSTEM: To explore Big Data, we have analysed several challenges at the data, model, and system levels. The challenges at Tier I focus on data accessing and arithmetic computing procedures. Because Big Data are often stored at different locations and data volumes may continuously grow, an effective computing platform will have to take distributed large-scale data storage into consideration for computing.
  • 17. PROPOSED SYSTEM: We propose a HACE theorem to model Big Data characteristics. The characteristics of HACE make it an extreme challenge to discover useful knowledge from Big Data.
  • 18. ADVANTAGES OF PROPOSED SYSTEM: Provides the most relevant and most accurate social sensing feedback to better understand our society in real time.
  • 20. Characteristics of Big Data Fig. Five Vs of BIG DATA
  • 21. Volume: the quantity of data. Variety: the categories of data. Velocity: the speed at which data is generated or processed. Variability: inconsistency in the data. Complexity: the difficulty of managing the data.
  • 22. Big Data Mining Tools: Hadoop, Apache S4, Storm, Apache Mahout, MOA
  • 23. Fig.: Big Data processing
  • 24. Conclusion: Because of the increase in the amount of data in fields such as genomics, meteorology, biology, and environmental research, it becomes difficult to handle the data, to find associations and patterns, and to analyze the large data sets. As organizations collect more data at this scale, formalizing the process of big data analysis will become paramount. The paper describes different algorithms used to handle such large data sets, and it gives an overview of the architecture and algorithms used with large data sets.
  • 25. References
      McKinsey Global Institute, Big Data: The next frontier for innovation, competition and productivity, May 2011
      Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, Data Mining with Big Data, 2013
      Rezwan Ahmed, George Karypis, Algorithms for mining the evolution of conserved relational states in dynamic networks, 2012
      IEEE, Data Mining with Big Data, January 2014
      Oracle, Unstructured Data Management with Oracle Database 12c, June 2013