2. UNIT III
BIG DATA & INDUSTRY 4.0
What Is the Fourth Industrial Revolution
The original industrial revolution took place in the 18th and 19th centuries and
involved innovations such as steam engines and mechanical production.
The second industrial revolution came along toward the end of the 19th century and
just before World War I. It included advancements such as the telegraph and industrial
Fast forward to the late 1970s, and we have the third industrial revolution, a period
that is still ongoing and has brought us things such as the internet and smartphones.
Since the buildup to the fourth industrial revolution is still happening, the idea of it is a
bit abstract. However, it will involve a future in which artificial intelligence allows
machines of all types to communicate with and learn from each other, an idea that
could potentially have a huge impact on production.
3. What is Data?
The quantities, characters, or symbols on which operations are performed by a
computer, which may be stored and transmitted in the form of electrical signals and
recorded on magnetic, optical, or mechanical recording media.
S.No | DATA | INFORMATION
1 | Data is unorganised raw facts that need processing, without which it is seemingly random and useless to humans. | Information is processed, organised data presented in a given context and is useful to humans.
2 | Data is an individual unit that contains raw material which does not carry any specific meaning. | Information is a group of data that collectively carries a logical meaning.
3 | Data doesn’t depend on information. | Information depends on data.
4 | It is measured in bits and bytes. | Information is measured in meaningful units like time, quantity, etc.
5 | Data is never suited to the specific needs of a designer. | Information is specific to the expectations and requirements because all the irrelevant facts and figures are removed during processing.
6 | An example of data is a student’s test score. | The average score of a class is the information derived from the given data.
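The last row of the comparison can be illustrated with a short Python sketch. The student names and scores are made up for illustration and do not come from the slides; the individual scores play the role of data, and the class average is the information derived from them.

```python
from statistics import mean

# Hypothetical raw data: individual test scores (illustrative values only).
test_scores = {"Asha": 72, "Ben": 85, "Chitra": 90, "Dev": 65}


def class_average(scores):
    """Turn raw scores (data) into one summary figure (information)."""
    return mean(scores.values())


average = class_average(test_scores)
print(f"Class average: {average}")
```

On its own, a single score says little. The average places every score in context, which is what makes it useful to a reader.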
4. TYPES OF BIG DATA
1. Structured Data
Any data that can be stored, accessed and processed in a fixed format is termed
'structured' data. Over time, computer science has become very successful at developing
techniques for working with this kind of data (where the format is well known in advance)
and deriving value from it. However, problems are now emerging as the size of such data
grows to a huge extent, with typical sizes in the range of multiple zettabytes.
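As a minimal sketch of what "fixed format" means in practice, the Python example below stores rows in a relational table whose columns are declared in advance. The table name, columns and values are hypothetical and chosen only for illustration; an in-memory SQLite database is used so that the example is self-contained.

```python
import sqlite3

# In-memory database; the schema (the fixed format) is declared before any data arrives.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "department TEXT NOT NULL, "
    "salary INTEGER NOT NULL)"
)

# Hypothetical rows: every record has exactly the same fields.
rows = [
    (1, "Asha", "Finance", 65000),
    (2, "Ben", "Finance", 55000),
    (3, "Chitra", "Admin", 50000),
]
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", rows)

# Because the format is known in advance, deriving value is a simple query.
avg_by_dept = dict(
    conn.execute(
        "SELECT department, AVG(salary) FROM employees "
        "GROUP BY department ORDER BY department"
    )
)
print(avg_by_dept)
```

The query works without inspecting individual records because every row is guaranteed to follow the declared schema.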
2. Unstructured Data
Any data whose form or structure is unknown is classified as unstructured data. In addition
to being huge in size, unstructured data poses multiple challenges when it is processed to
derive value. A typical example of unstructured data is a heterogeneous data source
containing a combination of simple text files, images, videos, etc. Nowadays, organizations
have a wealth of data available to them but, unfortunately, they don't know how to derive
value from it since this data is in its raw or unstructured form.
5. 3. Semi-structured Data
Semi-structured data is a form of structured data that does not conform to the formal
structure of data models associated with relational databases or other forms of data tables,
but nonetheless contains tags or other markers to separate semantic elements and enforce
hierarchies of records and fields within the data. It is therefore also known as a
self-describing structure. JSON and XML are examples of semi-structured data.
This third category exists (between structured and unstructured data) because
semi-structured data is considerably easier to analyse than unstructured data.
Many Big Data solutions and tools can ‘read’ and process either JSON or XML, which
reduces the complexity of analysing semi-structured data compared to unstructured data.
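The self-describing nature of semi-structured data can be sketched with Python's standard json module. The records below are hypothetical: each one carries its own field names (the tags), and the records do not share a single fixed schema.

```python
import json

# Hypothetical JSON document: records share a "name" field but otherwise differ.
raw = """
[
  {"name": "Asha", "email": "asha@example.com"},
  {"name": "Ben", "phones": ["555-0100", "555-0101"]},
  {"name": "Chitra", "address": {"city": "Pune"}}
]
"""

records = json.loads(raw)


def field_names(record):
    """Return the keys a record carries, i.e. its self-described structure."""
    return sorted(record.keys())


for record in records:
    print(record["name"], field_names(record))

# Nested fields form a hierarchy; missing fields must be handled explicitly.
cities = [record.get("address", {}).get("city") for record in records]
print(cities)
```

A parser can recover the structure from the markers alone, which is why such data is easier to process than free text or images, even though no table definition exists.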
4. Metadata – Data about Data
A last category of data type is metadata. From a technical point of view, this is not a
separate data structure, but it is one of the most important elements for Big Data analysis
and big data solutions. Metadata is data about data. It provides additional information
about a specific set of data.
For example, in a set of photographs, metadata could describe when and where the photos
were taken. The metadata then provides fields for dates and locations which, by
themselves, can be considered structured data. For this reason, metadata is
frequently used by Big Data solutions for initial analysis.
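The photograph example can be sketched in Python as follows. The class name, file names, dates and locations are all hypothetical. The point is that the image contents are never touched: an initial analysis runs on the structured date and location fields alone.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PhotoMetadata:
    """Structured fields describing a photo, not the photo itself."""
    filename: str
    taken_on: date
    location: str


# Hypothetical metadata records for three photographs.
photos = [
    PhotoMetadata("IMG_001.jpg", date(2023, 5, 14), "Mumbai"),
    PhotoMetadata("IMG_002.jpg", date(2023, 5, 14), "Pune"),
    PhotoMetadata("IMG_003.jpg", date(2023, 8, 2), "Mumbai"),
]


def filenames_by_location(items, location):
    """Select photos by where they were taken, using metadata only."""
    return [p.filename for p in items if p.location == location]


def filenames_by_date(items, day):
    """Select photos by when they were taken, using metadata only."""
    return [p.filename for p in items if p.taken_on == day]


print(filenames_by_location(photos, "Mumbai"))
print(filenames_by_date(photos, date(2023, 5, 14)))
```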
6. What is Big Data?
Big Data is a collection of data that is huge in volume and still growing exponentially
with time. It is data of such size and complexity that no traditional data
management tool can store or process it efficiently. In short, big data is still data, but of
huge size.
The following are some examples of Big Data:
Stock exchanges: The New York Stock Exchange generates about one terabyte of new trade
data per day. BSE and NSE are other stock exchanges of this kind.
Social media: Statistics show that 500+ terabytes of new data are ingested into the
databases of the social media site Facebook every day. This data is mainly generated
through photo and video uploads, message exchanges, comments, etc.
Jet engines: A single jet engine can generate 10+ terabytes of data in 30 minutes of flight
time. With many thousands of flights per day, data generation reaches many petabytes.
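The jump from terabytes to petabytes can be checked with a rough calculation. In the Python sketch below, only the figure of 10 terabytes per 30 minutes comes from the text. The number of flights per day, the average flight duration, the restriction to one engine per flight and the decimal conversion (1 petabyte = 1,000 terabytes) are assumptions made purely for illustration.

```python
TB_PER_HALF_HOUR = 10     # from the text: 10+ TB per 30 minutes of flight time
FLIGHTS_PER_DAY = 5_000   # assumption: "many thousands" of flights
AVG_FLIGHT_HOURS = 2.0    # assumption: average flight duration
TB_PER_PB = 1_000         # assumption: decimal units


def daily_volume_pb(flights, hours_per_flight, tb_per_half_hour=TB_PER_HALF_HOUR):
    """Estimate the daily data volume in petabytes, counting one engine per flight."""
    half_hours = hours_per_flight * 2
    return flights * half_hours * tb_per_half_hour / TB_PER_PB


volume = daily_volume_pb(FLIGHTS_PER_DAY, AVG_FLIGHT_HOURS)
print(f"{volume:.0f} PB per day")  # prints "200 PB per day" under these assumptions
```

Different assumptions change the result, but any plausible inputs of this size land in the petabyte range, which is consistent with the claim above.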
Definition: Big data differs from "Business Intelligence" and "data mining" in that its
data volumes, number of transactions and number of data sources are very large
and complex. Hence, big data requires special methods and technologies in order to
draw insight from the data.
7. CHARACTERISTIC FEATURES OF BIG DATA
(i) Volume – The name Big Data itself relates to size, which is enormous. The size of data plays
a crucial role in determining the value that can be drawn from it. Whether a particular data set
can be considered Big Data at all also depends on its volume. Hence, 'Volume' is
one characteristic that needs to be considered when dealing with Big Data.
(ii) Variety – The next aspect of Big Data is its variety.
Variety refers to heterogeneous sources and the nature of data, both structured and
unstructured. In earlier days, spreadsheets and databases were the only sources of data
considered by most applications. Nowadays, data in the form of emails, photos, videos,
monitoring devices, PDFs, audio, etc. is also considered in analysis applications.
This variety of unstructured data poses certain issues for storing, mining and analyzing data.
(iii) Velocity – The term 'velocity' refers to the speed at which data is generated. How fast data
is generated and processed to meet demand determines the real potential in the data.
Big Data velocity deals with the speed at which data flows in from sources like business
processes, application logs, networks, social media sites, sensors, mobile devices, etc.
The flow of data is massive and continuous.
(iv) Variability – This refers to the inconsistency that the data can show at times, which
hampers the ability to handle and manage the data effectively.
Example: The major big data landscapes are Twitter, Facebook and YouTube. Moreover, big data
volume is increasing day by day due to the creation of new websites, emails, domain
registrations, tweets, etc.
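Velocity becomes more concrete when a daily volume is converted into a sustained rate. The Python sketch below uses the figure of 500 terabytes per day quoted earlier for Facebook. It assumes decimal units (1 terabyte = 1,000 gigabytes) and a perfectly even flow over the day, which real traffic would not follow.

```python
SECONDS_PER_DAY = 24 * 60 * 60
GB_PER_TB = 1_000  # assumption: decimal units


def ingest_rate_gb_per_s(tb_per_day):
    """Average ingest rate in GB/s, assuming data arrives evenly over the day."""
    return tb_per_day * GB_PER_TB / SECONDS_PER_DAY


rate = ingest_rate_gb_per_s(500)
print(f"{rate:.2f} GB/s")  # prints "5.79 GB/s"
```

A system handling this flow has to absorb several gigabytes every second without pause, which is what "massive and continuous" means in practice.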
8. Benefits or Advantages of Big Data
Following are the benefits or advantages of Big Data:
➨Big data analysis derives innovative solutions. Big data analysis helps in
understanding and targeting customers. It helps in optimizing business processes.
➨It helps in improving science and research.
➨It improves healthcare and public health through the availability of patient records.
➨It helps in financial trading, sports, polling, security/law enforcement, etc.
➨Anyone can access vast information via surveys and deliver an answer to any query.
➨New data is added every second.
➨A single platform can carry unlimited information.
9. Drawbacks or disadvantages of Big Data
Following are the drawbacks or disadvantages of Big Data:
➨Storing big data on traditional storage can cost a lot of money.
➨A lot of big data is unstructured.
➨Big data analysis can violate principles of privacy.
➨It can be used for manipulation of customer records.
➨It may increase social stratification.
➨Big data analysis is not useful in the short run. Data needs to be analyzed over a longer
duration to leverage its benefits.
➨Big data analysis results are sometimes misleading.
➨Rapid updates in big data can cause mismatches with real figures.
10. WHAT ROLE BIG DATA WILL PLAY
In fact, some say big data is Industry 4.0. In manufacturing, for example,
improvements and efficiencies in the analysis of big data are expected to
bring billions of dollars to the industry over the next five years.
Others look at it as an equation in which artificial intelligence plus big data
equals the fourth industrial revolution.
On one hand, you can see the possibility of job losses as autonomous
machines take over tasks that humans have handled for years.
On the other, there could be a slew of new jobs created when it comes to
harnessing the power of data and using it in a meaningful way.
11. BEYOND MANUFACTURING
Throughout history, industrial revolutions have often been judged by their impact on
the production and manufacturing of goods and products. That’s no different with the
looming Industry 4.0, but it will affect many other industries as well.
Take financial services, for example. In this field, experts view big data as “the new
electricity”: the power source driving change in the way that steam, actual electricity
and digital technology did before it.
In one example, a company in Chile uses big data and machine learning to predict the
likelihood individual customers will be able to repay loans. If you look back 20 or 30
years, it took groups of human employees time and effort to determine your credit
score. Now, using information such as automotive credit history, utility bills and
census data, combined with predictive machine learning, that process can be almost
instantaneous.
12. BIG DATA, AI AND THE INTERNET OF THINGS
Part of the fourth industrial revolution is the manner in which all types of
machines and devices interact, communicate and learn from each other.
At this early stage, it’s similar to the Internet of Things, or IoT — the
concept in which everyday objects such as cars, refrigerators, TVs, ovens
and home security systems are all connected to the internet.
Add on top of that a layer of artificial intelligence that saves time by
making decisions for you, and you see how these products and ideas can
make your life easier and even create new business opportunities.
As one IBM analyst puts it, the thing that AI and IoT have in common is
the use and interpretation of big data. Companies that invest in all three
areas — AI, IoT and big data — stand a good chance at becoming leaders
and innovators in the fourth industrial revolution.
13. WHY SMALL BUSINESSES SHOULD BE
USING BIG DATA
Big businesses aren’t the only ones who can make data-driven decisions
using big data these days. Small businesses can reap the benefits, too.
Analyzing all the online and offline information available to them can help
them grow their business.
Big data is defined as very large datasets that can be analyzed
computationally to reveal patterns, trends, and associations – especially
in connection with human behavior and interactions. A big data
revolution has arrived with the growth of the Internet, wireless networks,
smartphones, social media and other technology.
Organizations that discuss using big data usually have the resources to
hire research firms and data scientists to do the work for them. But, if
one knows where to look, small businesses can finally step up to the
plate and utilize big data themselves.
14. 7 BENEFITS TO USING BIG DATA
FOR SMALL BUSINESSES
1. Using big data cuts your costs
2. Using big data increases efficiency
3. Using big data improves your pricing
4. Small companies can compete with big businesses
5. Allows companies to focus on local preferences
6. Using big data helps the small companies to increase sales and
7. Using big data ensures you hire the right employees
15. HOW BUSINESSES CAN ANALYZE BIG DATA
In order to analyze big data, we first need to identify the issues that need
solutions or answers. Then, we attempt to identify the answer to our question and
ask ourselves, ‘how can we get the data to solve it?’ or ‘what can big data do for
us?’
Your big data solutions need to be user-friendly, match what you had in mind
for pricing, and be flexible enough to serve your business both now and in the
future.
Research what the most reliable tool is for the problem we need to solve. For
example, if you want to launch more effective promotions and marketing
campaigns, you can use Canopy Labs, which predicts customer behavior and
There are many tools out there that are inexpensive or even free that you can
use. Google has user-friendly tools like Google AdWords and Google BigQuery.
Administering a survey is simple and cheap using tools such as SurveyMonkey