2. • 'Big Data' is still data, but at a very large scale.
• 'Big Data' is a term used to describe a collection of data that is huge in
size and yet growing exponentially with time.
• In short, such data is so large and complex that traditional data
management tools cannot store or process it efficiently.
3. Examples Of 'Big Data'
The New York Stock Exchange generates about one terabyte of new
trade data per day.
Statistics show that 500+ terabytes of new data are ingested into the
databases of the social media site Facebook every day. This data is mainly
generated through photo and video uploads, message exchanges,
comments, etc.
A single jet engine can generate 10+ terabytes of data in 30 minutes of
flight time. With many thousands of flights per day, the data generated
reaches many petabytes.
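To get a feel for the scale in the jet engine example, the arithmetic can be sketched in Python. Only the rate of 10 TB per 30 minutes comes from the slide; the flight count and average flight duration are illustrative assumptions, and the function name `daily_volume_pb` is hypothetical.

```python
# Back-of-the-envelope estimate of daily jet engine data volume.
# Only TB_PER_HALF_HOUR comes from the slide; the other inputs are
# hypothetical values chosen purely for illustration.

TB_PER_HALF_HOUR = 10      # from the slide: 10+ TB per 30 minutes of flight
FLIGHTS_PER_DAY = 5_000    # assumption standing in for "many thousands of flights"
AVG_FLIGHT_HOURS = 2.0     # assumption: average flight duration
TB_PER_PB = 1_000          # decimal units: 1 PB = 1,000 TB


def daily_volume_pb(flights: int, hours_per_flight: float) -> float:
    """Estimate data generated per day in petabytes, counting one engine per flight."""
    tb_per_flight = TB_PER_HALF_HOUR * (hours_per_flight * 2)  # two half-hours per hour
    return flights * tb_per_flight / TB_PER_PB


estimate = daily_volume_pb(FLIGHTS_PER_DAY, AVG_FLIGHT_HOURS)
print(f"Estimated volume: {estimate:.0f} PB per day")
```

Under these assumed inputs the estimate lands in the hundreds of petabytes per day, which is consistent with the slide's "many petabytes" claim.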
5. Structured -- data that can be stored, accessed and
processed in a fixed format
Unstructured -- any data whose form or structure is
unknown
Semi-structured -- data that can contain
both forms
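The three forms can be illustrated with a small Python sketch. The sample records are invented for illustration: CSV stands in for structured data, JSON for semi-structured data, and free text for unstructured data.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
structured = "id,name,balance\n1,Asha,2500\n2,Ravi,1800\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing keys, but fields may differ per record.
semi_structured = '[{"id": 1, "name": "Asha", "tags": ["vip"]}, {"id": 2, "name": "Ravi"}]'
records = json.loads(semi_structured)

# Unstructured: free text with no predefined fields; it can only be
# tokenized or searched until some structure is extracted from it.
unstructured = "Asha called on Monday to ask about her account balance."
words = unstructured.split()

print(rows[0]["name"])                 # column lookup works for every row
print(records[1].get("tags", []))      # a field may be missing, so use a default
print(len(words))                      # only generic operations are available
```

The structured rows can be queried by column name directly, the semi-structured records need a default for optional fields, and the unstructured text offers no fields at all.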
8. • Big Data analytics is the process of collecting, organizing and analyzing
large sets of data (called Big Data) to discover patterns and other useful
information.
• It helps organizations better understand the information contained
within the data.
• Analysts working with Big Data typically want the knowledge that comes
from analyzing the data.
• Big Data analytics is typically performed using specialized software tools
and applications for predictive analytics, data mining, text mining,
forecasting and data optimization.
9. How Big Data Analytics is Used Today
Today's advances in analyzing big data allow researchers to:
• Decode human DNA in minutes
• Predict where terrorists plan to attack
• Determine which gene is most likely to be responsible for certain
diseases
• Determine which ads you are most likely to respond to on Facebook
10. The Challenges
• The first challenge is breaking down data silos to access all the data an
organization stores in different places and often in different systems.
• The second challenge is creating platforms that can pull in unstructured
data as easily as structured data.
12. Hadoop:
An open source, Java-based programming framework that supports
the processing and storage of extremely large data sets in a distributed
computing environment.
Lumify:
A relatively new open source project for Big Data fusion, described as
a great alternative to Hadoop.
Elasticsearch:
A reliable and secure open source platform that lets users take any
data from any source, in any format, and search, analyze and visualize
it in real time.
MongoDB:
MongoDB is also a great tool for storing and analyzing big data, as well
as for building applications.
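Hadoop's processing side is built around the MapReduce model, and the idea can be sketched in a few lines of plain Python. This is a single-machine illustration of the map, shuffle and reduce steps, not Hadoop's actual API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is big", "data is growing"])
print(counts)
```

In a real cluster the map and reduce steps would run in parallel on many nodes, each working on its own slice of the data; the sketch only shows how the steps fit together.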
14. Applications of Big Data:
1. Banking and securities
2. Communications, Media and Entertainment
3. Healthcare providers
4. Education
5. Manufacturing and Natural Resources
6. Government
7. Insurance
8. Retail and Wholesale Trade
19. • Amazon uses big data to build its personalized recommendation
system.
• Amazon recently obtained a patent for the concept of predictive
dispatch.
• Google uses big data analytics to provide predictive search results.
• Netflix relies on the data it collects from its customers to determine
which genres of programs are likely to be viewed more than others.
20. Future Of Big Data
• Machine Learning will be the next big thing in Big Data.
• Privacy will be the biggest challenge.
• Data Scientists will be in high demand (The Hindu predicts that by the
end of 2018, India alone will face a shortage of close to two lakh
Data Scientists).
• Big Data will be replaced by fast and actionable data.