Big data overview
1.
2. What's Driving Data Deluge?
• Big Data is data whose scale, distribution, diversity, and/or
timeliness require the use of new technical architectures and
analytics to enable insights that unlock new sources of business
value.
3. Three attributes defining Big Data characteristics:
• Huge volume of data: Rather than thousands or millions of
rows, Big Data can be billions of rows and millions of columns.
• Complexity of data types and structures: Big Data reflects the
variety of new data sources, formats, and structures, including
digital traces being left on the web and other digital repositories
for subsequent analysis.
• Speed of new data creation and growth: Big Data can describe
high-velocity data, with rapid data ingestion and near-real-time
analysis.
Big Data is sometimes described as having 3 Vs: volume,
variety, and velocity.
4. • Social media and genetic sequencing are among the fastest-growing
sources of Big Data and are examples of nontraditional sources of data
being used for analysis.
• Genomics illustrates the trend: while the volume of data has grown,
the cost to perform this work has fallen dramatically. The cost to
sequence one human genome fell from $100 million in 2001 to $10,000
in 2011, and the cost continues to drop.
5. Data Structures
• Most Big Data is unstructured or semi-structured in nature, which
requires different techniques and tools to process and analyze.
6. Data Structures
• Although analyzing structured data tends to be the most familiar
technique, different techniques are required to analyze
semi-structured data (shown as XML), quasi-structured data (shown as
a clickstream), and unstructured data. Here are examples of how each
of the four main types of data structures may look.
• Structured data: Data containing a defined data type, format,
and structure (for example, transaction data, online analytical
processing [OLAP] data cubes, traditional RDBMS, CSV files,
and even simple spreadsheets).
8. Data Structures
• Semi-structured data: Textual data files with a discernible pattern
that enables parsing (such as Extensible Markup Language [XML] data
files that are self-describing and defined by an XML schema).
See Figure 1-5.
9. Data Structures
• Quasi-structured data: Textual data with erratic data formats that
can be formatted with effort, tools, and time (for instance, web
clickstream data that may contain inconsistencies in data values and
formats).
See Figure 1-6.
10. Data Structures
• Unstructured data: Data that has no inherent structure, which may
include text documents, PDFs, images, and video.
See Figure 1-7.
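As a rough illustration of how the first three data types differ in practice, the Python sketch below parses a small CSV snippet (structured), an XML snippet (semi-structured), and a clickstream URL (quasi-structured). The sample data, field names, and URL are all made up for this sketch and do not come from the figures. Unstructured data such as images or free text has no comparable one-step parser, so it is left out.

```python
import csv
import io
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

# Hypothetical sample data, invented for this sketch.
CSV_TEXT = "order_id,amount\n1001,25.50\n1002,14.00\n"
XML_TEXT = "<orders><order id='1001'><amount>25.50</amount></order></orders>"
CLICK = "http://www.example.com/search?q=big+data&page=2"


def parse_structured(text):
    """Structured: fixed columns and types, read directly as rows."""
    return [
        {"order_id": int(row["order_id"]), "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def parse_semi_structured(text):
    """Semi-structured: self-describing tags make the pattern parseable."""
    root = ET.fromstring(text)
    return [
        {"order_id": int(order.get("id")),
         "amount": float(order.findtext("amount"))}
        for order in root.findall("order")
    ]


def parse_quasi_structured(url):
    """Quasi-structured: fields must be dug out of a loosely formatted string."""
    parts = urlparse(url)
    return {"host": parts.netloc, "path": parts.path,
            "query": parse_qs(parts.query)}


if __name__ == "__main__":
    print(parse_structured(CSV_TEXT))
    print(parse_semi_structured(XML_TEXT))
    print(parse_quasi_structured(CLICK))
```

The effort needed grows from one type to the next: the CSV reader relies on a known column layout, the XML parser relies on the tags, and the clickstream parser only recovers whatever the URL happens to contain.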
11. BI Versus Data Science
BI tends to provide reports, dashboards, and queries on business
questions for the current period or in the past. BI systems make it
easy to answer questions related to quarter-to-date revenue, progress
toward quarterly targets, and how much of a given product was sold in
a prior quarter or year. These questions tend to be closed-ended and
explain current or past behavior, typically by aggregating historical
data and grouping it in some way.
12.
13.
14. • Organizations and data collectors are realizing that the data
they can gather from individuals contains intrinsic value and, as
a result, a new economy is emerging. As this new digital
economy continues to evolve, the market sees the introduction
of data vendors and data cleaners that use crowdsourcing
(such as Mechanical Turk and GalaxyZoo) to test the outcomes
of machine learning techniques. As the new ecosystem takes
shape, there are four main groups of players within this
interconnected web. These are shown in Figure 1-11.
• Data devices
• Data collectors
• Data aggregators
• Data users and buyers
15. What is Analytics?
Raw data in itself does not have a meaning until it
is contextualized and processed into useful
information.
Analytics is this process of extracting and creating
information from raw data by filtering, processing,
categorizing, condensing and contextualizing the
data.
16. What is Analytics?
The choice of the technologies, algorithms, and frameworks for
analytics is driven by the analytics goals of the application. For
example, the goals of the analytics task may be: (1) to predict
something (for example whether a transaction is a fraud or not,
whether it will rain on a particular day, or whether a tumor is benign or
malignant), (2) to find patterns in the data (for example, finding the top
10 coldest days in the year, finding which pages are visited the most on
a particular website, or finding the most searched celebrity in a
particular year), (3) to find relationships in the data (for example,
finding similar news articles, finding similar patients in an electronic
health record system, finding related products on an eCommerce
website, finding similar images, or finding correlation between news
items and stock prices).
18. Descriptive Analytics
• Descriptive analytics comprises analyzing past data to present it
in a summarized form that can be easily interpreted.
Descriptive analytics aims to answer the question: What has happened?
Examples include computing the total number of likes for a particular
post, computing the average monthly rainfall, or finding the
average number of visitors per month on a website.
A major portion of analytics done today is descriptive analytics
through the use of statistics functions such as counts, maximum,
minimum, mean, top-N, and percentage.
These functions help describe patterns in the data and present the
data in a summarized form.
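The statistics functions named above can be sketched in a few lines of Python. This is a minimal illustration only: the monthly visitor counts and the helper names (`describe`, `top_n`, `percentage_share`) are assumptions made up for the example, not part of any particular analytics tool.

```python
from statistics import mean

# Hypothetical monthly visitor counts, invented for this sketch.
visitors = {"Jan": 1200, "Feb": 950, "Mar": 1430,
            "Apr": 1100, "May": 1720, "Jun": 1600}


def describe(values):
    """Return count, minimum, maximum, and mean of a list of numbers."""
    return {"count": len(values), "min": min(values),
            "max": max(values), "mean": mean(values)}


def top_n(data, n):
    """Return the n (label, value) pairs with the largest values."""
    return sorted(data.items(), key=lambda item: item[1], reverse=True)[:n]


def percentage_share(data):
    """Return each label's share of the total, as a percentage."""
    total = sum(data.values())
    return {label: 100 * value / total for label, value in data.items()}


if __name__ == "__main__":
    print(describe(list(visitors.values())))
    print(top_n(visitors, 2))
    print(percentage_share(visitors))
```

Each function only summarizes what has already happened; none of them says anything about future months, which is where predictive analytics comes in.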
19. What is Predictive Data Analytics?
The term predictive analytics refers to the use of statistics and
modeling techniques to make predictions about future outcomes and
performance. Predictive analytics looks at current and historical data
patterns to determine if those patterns are likely to emerge again. This
allows businesses and investors to adjust where they use their
resources to take advantage of possible future events. Predictive
analytics can also be used to improve operational efficiencies and
reduce risk.
20. Key Takeaways of PDA
• Predictive analytics uses statistics and modeling techniques to
predict future performance.
• Industries and disciplines, such as insurance and marketing, use
predictive techniques to make important decisions.
• Predictive models help make weather forecasts, develop video games,
translate voice to text, inform customer service decisions, and
develop investment portfolios.
• People often confuse predictive analytics with machine learning even
though the two are different disciplines.
• Types of predictive models include decision trees, regression, and
neural networks.
21. Understanding Predictive Analytics
• Predictive analytics is a form of technology that
makes predictions about certain unknowns in the
future. It draws on a series of techniques to make
these determinations, including artificial intelligence
(AI), data mining, machine learning, modeling, and
statistics.
• For instance, data mining involves the analysis of
large sets of data to detect patterns in them. Text
analysis does the same, except for large blocks of
text.
22. Applications of Predictive Models
• Weather forecasts
• Creating video games
• Translating voice to text for mobile phone messaging
• Customer service
• Investment portfolio development
All of these applications use descriptive statistical models of
existing data to make predictions about future data.
23. Applications of Predictive Models
Predictive models are also useful for helping businesses manage
inventory, develop marketing strategies, and forecast sales. They
also help businesses survive, especially those in highly competitive
industries such as health care and retail. Investors and financial
professionals can draw on this technology to help craft investment
portfolios and reduce the potential for risk.
24. Uses of Predictive Analytics
• Forecasting
Forecasting is essential in manufacturing because it ensures the optimal
utilization of resources in a supply chain. Predictive modeling is often used
to clean and optimize the quality of data used for such forecasts. Modeling
ensures that more data can be ingested by the system, including from
customer-facing operations, to ensure a more accurate forecast.
• Credit
Credit scoring makes extensive use of predictive analytics. When a consumer
or business applies for credit, data on the applicant's credit history and the
credit record of borrowers with similar characteristics are used to predict the
risk that the applicant might fail to perform on any credit extended.
25. • Underwriting
Data and predictive analytics play an important role in
underwriting. Insurance companies examine policy
applicants to determine the likelihood of having to pay out
for a future claim based on the current risk pool of similar
policyholders, as well as past events that have resulted in
payouts. Actuaries routinely use predictive models that
compare an applicant's characteristics with data about past
policyholders and claims.
26. Applications of Predictive Models
• Marketing
Individuals who work in this field look at how consumers
have reacted to the overall economy when planning a
new campaign. They can use shifts in
demographics to determine if the current mix of products
will entice consumers to make a purchase.
Active traders, meanwhile, look at a variety of metrics
based on past events when deciding whether to buy or
sell a security. Moving averages, bands,
and breakpoints are based on historical data and are
used to forecast future price movements.
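A simple moving average, one of the metrics mentioned above, can be sketched as follows. The closing prices are invented for illustration, and using the most recent average as a naive forecast of the next price is an assumption of this sketch, not a recommended trading method.

```python
def simple_moving_average(prices, window):
    """Return the mean of each run of `window` consecutive prices."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [
        sum(prices[i - window:i]) / window
        for i in range(window, len(prices) + 1)
    ]


# Hypothetical daily closing prices, invented for this sketch.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
sma = simple_moving_average(closes, 3)

# Naive assumption: treat the latest average as the next-day forecast.
naive_forecast = sma[-1]

if __name__ == "__main__":
    print(sma, naive_forecast)
```

A longer window smooths out more of the day-to-day noise but reacts more slowly to a real change in trend.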
27. Predictive Analytics vs. Machine Learning
A common misconception is that predictive analytics
and machine learning are the same thing.
Predictive analytics helps us understand possible future
occurrences by analyzing the past. At its core, predictive
analytics includes a series of statistical techniques
(including machine learning, predictive modeling, and
data mining) and uses statistics (both historical and
current) to estimate, or predict, future outcomes.
28. Predictive Analytics vs. Machine Learning
Machine learning, on the other hand, is a subfield of computer science
that, as per the 1959 definition by Arthur Samuel (an American pioneer
in the field of computer gaming and artificial intelligence) means "the
programming of a digital computer to behave in a way which, if done
by human beings or animals, would be described as involving the
process of learning."
30. Types of Predictive Analytical Models
• Decision Trees: If you want to understand what leads to
someone's decisions, then you may find decision trees
useful. This type of model places data into different
sections based on certain variables, such as price
or market capitalization. Just as the name implies, it
looks like a tree with individual branches and leaves.
Branches indicate the choices available, while individual
leaves represent a particular decision.
• Decision trees are among the simplest models because they're
easy to understand and dissect. They're also very
useful when you need to make a decision in a short
period of time.
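The branch-and-leaf structure described above can be sketched as a small hand-built tree in Python. The variables (`price`, `market_cap`), the thresholds, and the buy/sell/hold labels are all hypothetical values chosen for illustration. In practice a tree would be learned from data rather than written by hand.

```python
# Hypothetical tree: internal nodes test one variable against a threshold,
# leaves hold a decision. Thresholds and labels are invented for illustration.
TREE = {
    "feature": "price",
    "threshold": 50.0,
    "below": {"leaf": "buy"},
    "at_or_above": {
        "feature": "market_cap",
        "threshold": 10e9,
        "below": {"leaf": "sell"},
        "at_or_above": {"leaf": "hold"},
    },
}


def decide(node, record):
    """Walk from the root to a leaf, following one branch per test."""
    while "leaf" not in node:
        value = record[node["feature"]]
        branch = "below" if value < node["threshold"] else "at_or_above"
        node = node[branch]
    return node["leaf"]


if __name__ == "__main__":
    print(decide(TREE, {"price": 80.0, "market_cap": 5e9}))
```

Because every decision is just a path of simple comparisons, it is easy to read off why a given record ended up at a given leaf.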
31. Types of Predictive Analytical Models
• Regression
Regression is used when you want to determine patterns in large sets
of data and when there's a linear relationship between the inputs and
the outcome. This method works by figuring out a formula that
represents the relationship between all the inputs found in the
dataset.
For example, you can use regression to figure out how price and other
key factors can shape the performance of a security.
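To make "figuring out a formula" concrete, here is a minimal sketch of ordinary least squares with a single input, fitting y = intercept + slope * x. The one-input simplification and the sample numbers are assumptions of this sketch. Real models typically use several inputs and a statistics library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-movement of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Apply the fitted formula to a new input."""
    return intercept + slope * x


if __name__ == "__main__":
    # Hypothetical data that lies exactly on y = 1 + 2x.
    a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(a, b, predict(a, b, 5))
```

The fitted intercept and slope are the "formula"; once they are known, a prediction for a new input is a single multiplication and addition.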
32. Types of Predictive Analytical Models
• Neural Networks
Neural networks were developed as a form of predictive analytics by
imitating the way the human brain works. This model can deal with
complex data relationships using artificial intelligence and pattern
recognition.
33. Types of Predictive Analytical Models
• Artificial Neural Network
An artificial neural network (ANN) uses the way the brain processes
information as a basis to develop algorithms that can be used to
model complex patterns and prediction problems.
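The smallest building block of such a network is a single artificial neuron. The sketch below trains one neuron (a perceptron) to reproduce the logical AND of two inputs. This is a toy under stated assumptions: the task, the learning rate, and the epoch count are chosen for illustration, and a real ANN stacks many such units in layers and is usually built with a dedicated library.

```python
def step(value):
    """Threshold activation: the neuron fires (1) when its input is >= 0."""
    return 1 if value >= 0 else 0


def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(samples, epochs=20, rate=1.0):
    """Perceptron learning rule: nudge weights by the error on each sample."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Training data for logical AND (a linearly separable toy problem).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

if __name__ == "__main__":
    w, b = train_perceptron(AND_SAMPLES)
    print([predict(w, b, x) for x, _ in AND_SAMPLES])
```

A single neuron can only separate classes with a straight line; the ability to model complex patterns comes from combining many neurons across multiple layers.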
34. How Businesses Can Use Predictive Analytics
• Predictive models are frequently used by businesses to help
improve their customer service and outreach.
• Executives and business owners can take advantage of this
kind of statistical analysis to determine customer behavior. For
instance, the owner of a business can use predictive techniques
to identify and target regular customers who could defect and
go to a competitor.
• Predictive analytics plays a key role in advertising
and marketing. Companies can use models to determine which
customers are likely to respond positively to marketing and
sales campaigns. Business owners can save money by
targeting customers who will respond positively rather than
doing blanket campaigns.
35. Benefits of Predictive Analytics
• This type of analysis can help entities make predictions about
outcomes when no other (and obvious) answers are available.
Investors, financial professionals, and business
leaders are able to use models to help reduce risk. For instance, an
investor and their advisor can use certain models to help craft an
investment portfolio with minimal risk to the investor by taking
certain factors into consideration, such as age, capital, and goals.
• Businesses can determine the likelihood of success or failure of a
product before it launches.
36. Criticism of Predictive Analytics
• The use of predictive analytics has been criticized and, in some
cases, legally restricted due to perceived inequities in its
outcomes. Most commonly, this involves predictive models that
result in statistical discrimination against racial or ethnic groups
in areas such as credit scoring, home lending, employment,
or risk of criminal behavior.
• A famous example of this is the (now illegal) practice
of redlining in home lending by banks. Regardless of whether
the predictions drawn from the use of such analytics are
accurate, their use is generally frowned upon, and data that
explicitly include information such as a person's race are now
often excluded from predictive analytics.