Concept of Big Data
Presented by
MTech-CE(Boys Group)
What is Data
The word data is the plural of datum, from the Latin dare,
meaning "to give"; a datum is thus "something given".
Data as an abstract concept can be viewed as the
lowest level of abstraction from
which information and then knowledge are derived.
Information in raw or unorganized form (such as
letters, numbers, or symbols) that refers to,
or represents, conditions, ideas, or objects. Data is
limitless and present everywhere in the universe.
Computers: Symbols or signals that are input,
stored, and processed by a computer, for output as
usable information.
Types of Data
Relational Data (Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
Social Network, Semantic Web (RDF), …
Streaming Data
You can only scan the data once
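The single-scan constraint on streaming data can be illustrated with a small sketch. The function name `running_mean` and the sample readings are hypothetical, chosen only for illustration; the point is that each value is read once and the statistic is updated incrementally, so nothing has to be stored or revisited.

```python
from typing import Iterable, Iterator


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all values seen so far, reading each value once."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        # Incremental update: no earlier value is stored or revisited.
        mean += (value - mean) / count
        yield mean


# Hypothetical readings; a generator can be consumed only once, like a stream.
readings = (x for x in [4.0, 8.0, 6.0])
means = list(running_mean(readings))
```

Memory use stays constant no matter how long the stream is, which is why one-pass algorithms of this kind suit data that cannot be scanned twice.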
Big Data
Definition
Big data is a massive volume of both structured and
unstructured data that is so large that it's difficult to
process with traditional database and software
techniques.
Big data is the term for a collection of data sets so
large and complex that it becomes difficult to
process using on-hand database management tools
or traditional data processing applications.
Big data is data whose scale, diversity, and
complexity require new architecture, techniques,
algorithms, and analytics to manage it and extract
value and hidden knowledge from it…
Walmart handles more than 1 million customer
transactions every hour.
Facebook handles 40 billion photos from its
user base.
Decoding the human genome originally took 10
years to process; now it can be achieved in one
week.
Google processes 20 PB a day (2008)
Wayback Machine has 3 PB + 100TB/month
(3/2009)
Facebook has 2.5 PB of user data + 15TB/day
(4/2009)
eBay has 6.5 PB of user data + 50TB/day (5/2009)
Where Is the
Big Data?
Data Units
Big Data is Data growing
faster than Moore’s law
1 Byte - 8 Bits
1 Kilobyte (KB) - 10^3 Bytes
1 Megabyte (MB) - 10^6 Bytes
1 Gigabyte (GB) - 10^9 Bytes
1 Terabyte (TB) - 10^12 Bytes
Big Big Big
Data
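Petabyte (PB) - 10^15 Bytes
Exabyte (EB) - 10^18 Bytes
Zettabyte (ZB) - 10^21 Bytes
Yottabyte (YB) - 10^24 Bytes
Xenottabyte (XB) - 10^27 Bytes (informal name; the SI prefix for 10^27 is ronna)
Shilentnobyte (SB) - 10^30 Bytes (informal name; the SI prefix for 10^30 is quetta)
Domegrottebyte (DB) - 10^33 Bytes (informal name; no SI prefix is defined)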
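The decimal units listed above can be applied mechanically. The following is a minimal sketch; the helper name `human_readable` is hypothetical, and the list stops at the yottabyte because the larger names above are not standard.

```python
# Decimal (power-of-ten) units, as in the list of data units above.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_readable(num_bytes: float) -> str:
    """Format a byte count with the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the last unit.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000
    return f"{value:.1f} {UNITS[-1]}"  # not reached; kept for completeness


google_daily = human_readable(20 * 10**15)   # the 20 PB/day figure quoted earlier
small_file = human_readable(1500)
```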
Characteristics
of Big Data
Volume
Data Volume
44x increase from 2009 to 2020
From 0.8 zettabytes to 35 zettabytes
Data volume is increasing exponentially
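To make "exponentially" concrete, the two figures quoted above can be turned into a compound annual growth rate. This is a back-of-the-envelope sketch that assumes a constant growth rate between 2009 and 2020, which the slide itself does not state.

```python
import math

start_zb, end_zb = 0.8, 35.0      # zettabytes, figures quoted above
years = 2020 - 2009               # 11 years

growth_factor = end_zb / start_zb                   # about 44x overall
annual_rate = growth_factor ** (1 / years) - 1      # compound annual growth
doubling_years = math.log(2) / math.log(1 + annual_rate)

print(f"{growth_factor:.1f}x overall, {annual_rate:.0%} per year, "
      f"doubling every {doubling_years:.1f} years")
```

Under that assumption the volume grows by roughly 41% per year, doubling about every two years.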
Variety
Various formats, types, and structures
Text, numerical, images, audio, video,
sequences, time series, social media data,
multi-dimensional arrays, etc.
Static data vs. streaming data
A single application can be
generating/collecting many types of data
Velocity
Data is being generated fast and needs to be
processed fast
Online Data Analytics
Late decisions → missed opportunities
Examples
E-Promotions: Based on your current location,
your purchase history, what you like → send
promotions right now for the store next to you
Healthcare monitoring: sensors monitoring
your activities and body → any abnormal
measurements require immediate reaction
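The healthcare monitoring example can be sketched as a check applied to each reading as it arrives, instead of a batch job run later. The thresholds, the `alerts` function and the sample readings below are all hypothetical and only illustrate the idea of reacting immediately.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical safe range for a heart-rate sensor, in beats per minute.
LOW_BPM, HIGH_BPM = 50, 120


def alerts(readings: Iterable[Tuple[int, int]]) -> Iterator[Tuple[int, int]]:
    """Yield (timestamp, bpm) as soon as a reading leaves the safe range."""
    for timestamp, bpm in readings:
        if bpm < LOW_BPM or bpm > HIGH_BPM:
            # React immediately, before any later reading is seen.
            yield timestamp, bpm


sample = [(0, 72), (1, 75), (2, 134), (3, 80), (4, 45)]
flagged = list(alerts(sample))
```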
Big Data
(3-V)
Some Make It
4 Vs
Harnessing
Big Data
OLTP: Online Transaction Processing
(DBMSs)
OLAP: Online Analytical Processing
(Data Warehousing)
RTAP: Real-Time Analytics
Processing (Big Data Architecture &
technology)
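The difference between the first two workloads can be shown with Python's built-in `sqlite3` module. This is only an illustrative sketch: the `sales` table and its rows are made up, and SQLite stands in for a real DBMS or data warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, store TEXT, amount REAL)")

# OLTP-style work: many small transactions, each touching a few rows.
for store, amount in [("north", 20.0), ("south", 35.5), ("north", 14.5)]:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sales (store, amount) VALUES (?, ?)", (store, amount)
        )

# OLAP-style work: one query that scans and aggregates the whole table.
totals = conn.execute(
    "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
).fetchall()
conn.close()
```

RTAP differs from both in that the aggregation is maintained continuously as records arrive, not computed over a stored table afterwards.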
Layout
Who’s
Generating Big
Data
Social media and networks
(all of us are generating data)
Scientific instruments
(collecting all sorts of data)
Mobile devices
(tracking all objects all the time)
Sensor technology and networks
(measuring all kinds of data)
Implementation
of Big Data
Parallel DBMS technologies
Proposed in late eighties
Matured over the last two decades
Multi-billion dollar industry: Proprietary
DBMS Engines intended as Data
Warehousing solutions for very large
enterprises
MapReduce
Pioneered by Google
Popularized by Yahoo! (Hadoop)
MetaData
Management
of Big Data
MapReduce:
• Data-parallel programming model
• An associated parallel and distributed implementation for commodity clusters
• Popularized by open-source Hadoop
• Used by Yahoo!, Facebook, Amazon, and the list is growing …
Parallel DBMS technologies:
• Popularly used for more than two decades
• Research Projects: Gamma, Grace, …
• Commercial: Multi-billion dollar industry but access to only a privileged few
• Relational Data Model
• Indexing
• Familiar SQL interface
• Advanced query optimization
• Well understood and studied
Comparison
MapReduce
Advantages
Automatic Parallelization:
Depending on the size of the RAW INPUT DATA →
instantiate multiple MAP tasks
Similarly, depending upon the number of
intermediate <key, value> partitions →
instantiate multiple REDUCE tasks
Run-time:
Data partitioning
Task scheduling
Handling machine failures
Managing inter-machine communication
Completely transparent to the programmer / analyst
/ end user
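The MAP, partition and REDUCE steps described above can be sketched in a single Python process. This is not Hadoop's API: the function names, the byte-sum partitioner and the sample input are assumptions made for illustration, and a real framework would run the tasks in parallel across machines and handle failures.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_task(chunk: str) -> Iterator[Tuple[str, int]]:
    """MAP: emit an intermediate <key, value> pair for every word."""
    for word in chunk.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]],
            n_partitions: int) -> List[Dict[str, List[int]]]:
    """Group values by key and assign each key to exactly one partition."""
    partitions: List[Dict[str, List[int]]] = [
        defaultdict(list) for _ in range(n_partitions)
    ]
    for key, value in pairs:
        # Deterministic toy partitioner (sum of bytes) so runs are repeatable.
        index = sum(key.encode("utf-8")) % n_partitions
        partitions[index][key].append(value)
    return partitions


def reduce_task(partition: Dict[str, List[int]]) -> Dict[str, int]:
    """REDUCE: combine all values that share a key."""
    return {key: sum(values) for key, values in partition.items()}


def word_count(chunks: List[str], n_partitions: int = 2) -> Dict[str, int]:
    # One MAP task per input chunk; here they simply run one after another.
    intermediate = [pair for chunk in chunks for pair in map_task(chunk)]
    result: Dict[str, int] = {}
    # One REDUCE task per partition of intermediate keys.
    for partition in shuffle(intermediate, n_partitions):
        result.update(reduce_task(partition))
    return result


counts = word_count(["big data is big", "data is everywhere"])
```

The programmer writes only `map_task` and `reduce_task`; splitting the input, grouping by key and scheduling are what the run-time takes over.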
Big dataset
(Hadoop)
Why Hadoop
Big Data analytics and the Apache Hadoop
open source project are rapidly emerging as
the preferred solution to address business &
technology trends that are disrupting
traditional data management & processing.
Hadoop
Adoption in
Industry
What is
Hadoop???
Challenge in
Big Data
• Big Data Integration is Multidisciplinary
Less than 10% of the Big Data world is genuinely
relational
Meaningful data integration in the real, messy, schema-less
and complex Big Data world of databases and the
semantic web, using multidisciplinary and
multi-technology methods
The Linked Open Data Ripper
Mapping, Ranking, Visualization, Key Matching,
Snappiness
Demonstrate the Value of Semantics: let data integration
drive DBMS technology
Large volumes of heterogeneous data, like linked data
and RDF
Provocations
for Big Data
1. Automating Research Changes the Definition of
Knowledge
2. Claims to Objectivity and Accuracy are
Misleading
3. Bigger Data are not always Better data
4. Not all Data are equivalent
5. Just because it is accessible doesn’t make it
ethical
6. Limited access to big data creates new digital
divides
Who is
collecting all
Big Data
Web Browsers
Search Engines
Who is
collecting all
Big Data
Smartphones & Apps
Apple’s iPhone
(Apple O/S)
Samsung, HTC,
Nokia, Motorola
(Android O/S)
RIM Corp’s BlackBerry
(BlackBerry O/S)
Tablet Computers & Apps
Apple’s iPad
Samsung’s Galaxy
Amazon’s Kindle Fire
Who is
collecting for
what?
Credit Card Companies
What data are they getting?
Restaurant check
Grocery Bill
Airline ticket
Hotel Bill
Why are they
collecting all
this data?
Targeted Marketing
• To send you catalogs for exactly
the merchandise you typically
purchase.
• To suggest medications that
precisely match your medical
history.
• To “push” television channels to
your set instead of your
“pulling” them in.
• To send advertisements on
those channels just for you!
Targeted Information
• To know what you need before
you even know you need it
based on past purchasing
habits!
• To notify you of your expiring
driver’s license or credit cards
or last refill on a Rx, etc.
• To give you turn-by-turn
directions to a shelter in case of
emergency.
Future
Enhancement
Smartphones and tablets outsold desktop and
laptop computers in 2011. There are more
smartphones in the U.S. in 2012 than people!
The phone in your pocket has more programmable
memory, more storage and more capability than
several large IBM computers.
It takes dozens of microprocessors running 100 million
lines of code to get a premium car out of the
driveway, and this software is only going to get more
complex. In fact, the cost of software and electronics
accounts for 30-40% of the car's price.
Conclusion
Big Data and Big Data Analytics – Not Just for Large
Organizations
It Is Not Just About Building Bigger Databases
Moving Processing to the Data Source Yields Big Dividends
Choose the Most Appropriate Big Data Scenario
• Complete data scenarios, whereby entire data sets can
be properly managed and factored into analytical
processing, complete with in-database or in-memory
processing and grid technologies.
• Targeted data scenarios that use analytics and data
management tools to determine the right data to feed
into analytic models, for situations where using the entire
data set isn’t technically feasible or adds little value.
Closing
Thought
Big data is not just about helping an organization be
more successful – to market more effectively or improve
business operations.
High-performance analytics is designed to support
big data initiatives, with in-memory, in-database and grid
computing options.
Those organizations can benefit from cloud computing,
where big data analytics is delivered as a service and IT
resources can be quickly adjusted to meet changing
business demands.
On-demand delivery gives customers the option to push
big data analytics off premises, greatly reducing the time, capital
expense and maintenance associated with on-premises
deployments.
Thank you

A Big Data Concept

  • 1. Concept of Big Data Presented by MTech-CE(Boys Group)
  • 2. What is Data The word Data is plural of datum in the Latin dare which meant "to give", that is to “something given”. Data as an abstract concept can be viewed as the lowest level of abstraction from which information and then knowledge are derived. Information in raw or unorganized form(such as alphabets, numbers, or symbols) that refer to, or represent, conditions, ideas, or objects. Data is limitless and present everywhere in the universe. See also information and knowledge. Computers: Symbols or signals that are input, stored, and processed by a computer, for output as usable information.
  • 3. Type of Data Relational Data (Tables/Transaction/Legacy Data) Text Data (Web) Semi-structured Data (XML) Graph Data Social Network, SemanticWeb (RDF), … Streaming Data You can only scan the data once
  • 4. Big Data Definition Big data is a massive volume of both structured and unstructured data that is so large that it's difficult to process with traditional database and software techniques. Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications Big data is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…
  • 5. Walmart handles more than 1 million customer transactions every hour. Facebook handles 40 billion photos from its user base. Decoding the human genome originally took 10 years to process; now it can be achieved in one week. Google processes 20 PB a day (2008). The Wayback Machine has 3 PB + 100 TB/month (3/2009). Facebook has 2.5 PB of user data + 15 TB/day (4/2009). eBay has 6.5 PB of user data + 50 TB/day (5/2009). Where is the Big Data?
  • 6. Data Units: Big Data is data growing faster than Moore’s law. 1 Byte = 8 bits; 1 Kilobyte (KB) = 10^3 bytes; 1 Megabyte (MB) = 10^6 bytes; 1 Gigabyte (GB) = 10^9 bytes; 1 Terabyte (TB) = 10^12 bytes.
  • 7. Big Big Big Data Petabyte(PB) - 10^15 Bytes Exabyte (EB) - 10^18 Bytes Zettabyte(ZB) - 10^21 Bytes Yottabyte (YB) - 10^24 Bytes Xenottabyte(XB) - 10^27 Bytes Shilentnobyte (SB) - 10^30 Bytes Domegrottebyte (DB) - 10^33 Bytes
  • 9. Volume DataVolume 44x increase from 2009 2020 From 0.8 zettabytes to 35zb Data volume is increasing exponentially
  • 10. Varity Various formats, types, and structures Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc… Static data vs. streaming data A single application can be generating/collecting many types of data
  • 11. Velocity Data is begin generated fast and need to be processed fast Online Data Analytics Late decisions  missing opportunities Examples E-Promotions: Based on your current location, your purchase history, what you like  send promotions right now for store next to you Healthcare monitoring: sensors monitoring your activities and body  any abnormal measurements require immediate reaction
  • 14. Harnessing Big Data OLTP: OnlineTransaction Processing (DBMSs) OLAP: Online Analytical Processing (DataWarehousing) RTAP: Real-TimeAnalytics Processing (Big DataArchitecture & technology)
  • 16. Who’s Generating Big Data: Social media and networks (all of us are generating data); scientific instruments (collecting all sorts of data); mobile devices (tracking all objects all the time); sensor technology and networks (measuring all kinds of data).
  • 17. Implementation of Big Data Parallel DBMS technologies Proposed in late eighties Matured over the last two decades Multi-billion dollar industry: Proprietary DBMS Engines intended as Data Warehousing solutions for very large enterprises Map Reduce pioneered by Google popularized byYahoo! (Hadoop)
  • 19. Comparison: MapReduce: a data-parallel programming model; an associated parallel and distributed implementation for commodity clusters; popularized by open-source Hadoop; used by Yahoo!, Facebook, Amazon, and the list is growing. Parallel DBMS technologies: popularly used for more than two decades; research projects: Gamma, Grace, …; commercial: a multi-billion dollar industry, but accessible to only a privileged few; relational data model; indexing; familiar SQL interface; advanced query optimization; well understood and studied.
  • 20. MapReduce Advantages Automatic Parallelization: Depending on the size of RAW INPUT DATA  instantiate multiple MAP tasks Similarly, depending upon the number of intermediate <key, value> partitions  instantiate multiple REDUCE tasks Run-time: Data partitioning Task scheduling Handling machine failures Managing inter-machine communication Completely transparent to the programmer / analyst / end user
  • 22. Why Hadoop Big Data analytics and the apache hadoop open source project are rapidly emerging as the preferred solution to address business & technology trends that’s are disrupting traditional data management & processing
  • 25. Challenge in Big Data: Big Data integration is multidisciplinary. Less than 10% of the Big Data world is genuinely relational. Meaningful data integration in the real, messy, schema-less and complex Big Data world of databases and the semantic web, using multidisciplinary and multi-technology methods. The Linked Open Data Ripper: mapping, ranking, visualization, key matching, snappiness. Demonstrate the value of semantics: let data integration drive DBMS technology. Large volumes of heterogeneous data, like linked data and RDF.
  • 26. Provocations for Big Data: 1. Automating research changes the definition of knowledge. 2. Claims to objectivity and accuracy are misleading. 3. Bigger data are not always better data. 4. Not all data are equivalent. 5. Just because it is accessible doesn’t make it ethical. 6. Limited access to big data creates new digital divides.
  • 27. Who is collecting all Big Data: Web browsers; search engines.
  • 28. Who is collecting all Big Data: Smartphones and apps: Apple’s iPhone (Apple O/S); Samsung, HTC, Nokia, Motorola (Android O/S); RIM Corp’s BlackBerry (BlackBerry O/S). Tablet computers and apps: Apple’s iPad; Samsung’s Galaxy; Amazon’s Kindle Fire.
  • 29. Who is collecting for what? Credit card companies. What data are they getting? Restaurant checks, grocery bills, airline tickets, hotel bills.
  • 30. Why are they collecting all this data? Target marketing: to send you catalogs for exactly the merchandise you typically purchase; to suggest medications that precisely match your medical history; to “push” television channels to your set instead of your “pulling” them in; to send advertisements on those channels just for you. Targeted information: to know what you need before you even know you need it, based on past purchasing habits; to notify you of your expiring driver’s license or credit cards, or the last refill on a prescription; to give you turn-by-turn directions to a shelter in case of emergency.
  • 31. Future Enhancement Smartphones and tablets outsold desktop and laptop computers in 2011. There are more Smartphones in the U.S. in 2012 than people! The phone in your pocket has more programmable memory, more storage and more capability than several large IBM computers. It takes dozens of microprocessors running 100 million lines of code to get a premium car out of the driveway, and this software is only going to get more complex. In fact, the cost of software and electronics accounts for 30-40% of the price.
  • 32. Conclusion Big Data and Big Data Analytics – Not Just for Large Organizations It Is Not Just About Building Bigger Databases Moving Processing to the Data SourceYields Big Dividends Choose the Most Appropriate Big Data Scenario  Complete data scenario whereby entire data sets can be properly managed and factored into analytical processing, complete with in-database or in-memory processing and grid technologies.  Targeted data scenarios that use analytics and data management tools to determine the right data to feed into analytic models, for situations where using data set isn’t technically feasible or adds little value.
  • 33. Closing Thought Big data is not just about helping an organization be more successful – to market more effectively or improve business operations. High-performance analytics from designed to support big data initiatives, with in-memory, in-database and grid computing options. Those organizations can benefit from cloud computing, where big data analytics is delivered as a service and IT resources can be quickly adjusted to meet changing business demands. On Demand provides customers with the option to push big data analytics to greatly eliminating the time, capital expense and maintenance associated with on-premises deployments.