Big data is data that is too large or complex for traditional data processing applications to analyze in a timely manner. It is characterized by high volume, velocity, and variety. Big data comes from a variety of sources, including business transactions, social media, sensors, and call center notes. It can be structured, unstructured, or semi-structured. Tools used for big data include NoSQL databases, MapReduce, HDFS, and analytics platforms. Big data analytics extracts useful insights from large, diverse data sets. It has applications in various domains like healthcare, retail, and transportation.
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
Chapter 1 big data
1. What is Data?
The quantities, characters, or symbols on which operations are
performed by a computer, which may be stored and
transmitted in the form of electrical signals and recorded on
magnetic, optical, or mechanical recording media.
2.
3. What is Big Data?
Big Data is also data but with a huge size.
Big Data is a term used to describe a collection of data that is huge in volume
yet growing exponentially with time.
That means, Data is so large and complex that none of the traditional data
management tools are able to store it or process it efficiently.
8. (i) Volume – The name Big Data itself is related to a size which is enormous. Size
of data plays a very crucial role in determining value out of data. Also,
whether a particular data can actually be considered as a Big Data or not, is
dependent upon the volume of data. Hence, 'Volume' is one characteristic
which needs to be considered while dealing with Big Data.
(ii) Variety – The next aspect of Big Data is its variety.
Variety refers to heterogeneous sources and the nature of data, both
structured unstructured. During earlier days, spreadsheets and databases
were the only sources of data considered by most of the applications.
Nowadays, data in the form of emails, photos, videos, monitoring devices,
PDFs, audio, etc. are also being considered in the analysis applications. This
variety of unstructured data poses certain issues for storage, mining and
analyzing data.
(iii) Velocity – The term 'velocity' refers to the speed of generation of data. How
fast the data is generated and processed to meet the demands, determines real
potential in the data.
Big Data Velocity deals with the speed at which data flows in from sources
like business processes, application logs, networks, and social media sites,
sensors, Mobile devices, etc. The flow of data is massive and continuous.
9. (iv) Variability – This refers to the inconsistency which can be
shown by the data at times, thus hampering the process of
being able to handle and manage the data effectively.
(iv) Value – After having the 4 V’s into account there comes
one more V which stands for Value!. The bulk of
Data having no Value is of no good to the company, unless
you turn it into something useful.
Data in itself is of no use or importance but it needs to
be converted into something valuable to extract
Information. Hence, you can state that Value! is the most
important V of all the 5V’s.
11. Types of Big Data(Digital Data)
Digital data can be classified into three forms as shown in following figure .
12. Structered - Structured is one of the types of big data.
we mean data that can be processed, stored, and retrieved in a
fixed format. It refers to highly organized information that can be readily
and seamlessly stored and accessed from a database by simple
search engine algorithms.
For e.g :- For instance, the employee table in a company
database will be structured as the employee details, their job
positions, their salaries, etc., will be present in an organized
manner.
13. UnStructered - UnStructured is one of the types of big data.
we mean data that can be processed, stored, and retrieved in a
which is not in fixed format. It refers to highly unorganized information that
can be readily and seamlessly stored and accessed from a database.
This makes it very difficult and time-consuming to process and analyze
unstructured data.
For e.g :- Emails, facebook , will be present in an unorganized
manner.
14. Semi-Structered - Semi structured is the third type of big data
Semi-structured data pertains to the data containing both the
formats mentioned above, that is, structured and unstructured data.
16. Big Data Analytics
Big Data analytics is a process used to extract usefull insights, such as
hidden patterns, unknown correlations, market trends, and customer
preferences. Big Data analytics provides various advantages—it can be
used for better decision making, preventing fraudulent activities, among
other things.