5. Reverend Thomas Bayes (1763)
Reverend Thomas Bayes
was an English statistician,
philosopher and Presbyterian
minister who is known for
having formulated a specific
case of the theorem that
bears his name: Bayes'
theorem.
1763: his paper, “An Essay towards solving a Problem in the Doctrine of Chances”, was published posthumously
Real-world application: Uber
Priors used by Uber:
Rider prior: a prior about the individual rider
Popular place prior: restaurants, nightlife, museums
Uber prior: places that Uber riders in general tend to go
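As a rough illustration of how a prior like these could be combined with new evidence, the sketch below applies Bayes' theorem to a toy destination-guessing problem. The place names, prior weights, and likelihood values are invented for the example; this is not Uber's data or Uber's actual method.

```python
def posterior(prior, likelihood):
    """Bayes' theorem: P(place | evidence) is proportional to
    P(evidence | place) * P(place); divide by the total to normalize."""
    unnormalized = {place: prior[place] * likelihood[place] for place in prior}
    total = sum(unnormalized.values())
    return {place: weight / total for place, weight in unnormalized.items()}

# Hypothetical rider prior: how often this rider goes to each kind of place.
rider_prior = {"restaurant": 0.5, "museum": 0.25, "night_club": 0.25}

# Hypothetical likelihood of a late Friday-night request for each destination.
likelihood_friday_night = {"restaurant": 0.2, "museum": 0.0, "night_club": 0.8}

result = posterior(rider_prior, likelihood_friday_night)
for place, probability in result.items():
    print(f"{place}: {probability:.3f}")
```

With these made-up numbers the evidence shifts the belief away from the rider's most common destination (restaurant) toward the night club, which is the kind of update the priors above are meant to support.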
6. Legendre & Gauss (1805)
Adrien-Marie Legendre (1805) and Carl Friedrich
Gauss (1809) published the method of least
squares, which they applied to determining the
orbits of bodies about the Sun. The method
computes the unknown parameters of the
general regression model.
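A minimal sketch of least squares for the simplest case, a straight line y = slope * x + intercept, may make the idea concrete. The function name and the sample data are assumptions chosen for illustration; the orbit calculations of 1805 involved more parameters, but the principle of minimizing the sum of squared residuals is the same.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals
    sum((y - (slope * x + intercept)) ** 2) over the data points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for a single predictor.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```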
7. Alan Turing (1936)
An "effectively computable" procedure is
supposed to be one that can be performed by
systematic application of clearly specified rules,
without requiring any inspirational leaps or
spontaneous intellectual insights.
8. Hirotugu Akaike & the AIC (1974)
Hirotugu Akaike (November 5, 1927 – August 4,
2009) was a Japanese statistician.
In the early 1970s he formulated
a criterion for model selection,
the Akaike information criterion,
which he thought of while riding
the train.
“On the morning of March 16, 1971, while
taking a seat in a commuter train, I suddenly
realized that the parameters of the factor
analysis model were estimated by
maximizing the likelihood and that the
mean value of the logarithmus of the
likelihood was connected with the Kullback-
Leibler information number. This was the
quantity that was to replace the mean
squared error of prediction”
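The criterion that grew out of this insight is usually written AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; the model with the lower AIC is preferred. The sketch below computes it for two candidate models. The log-likelihood values and parameter counts are invented for illustration.

```python
def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln(L).
    log_likelihood is ln(L) at the maximum-likelihood estimate."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical fits: the larger model fits slightly better
# but pays a penalty for its three extra parameters.
aic_small = aic(log_likelihood=-120.0, num_params=2)
aic_large = aic(log_likelihood=-118.5, num_params=5)

preferred = "small" if aic_small < aic_large else "large"
print(aic_small, aic_large, preferred)
```

Here the small improvement in fit does not justify the extra parameters, so the smaller model is selected.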
9. Data Science History
1943 McCulloch and Pitts wrote “A Logical Calculus of the Ideas Immanent in Nervous Activity”, in which they describe the idea of a neuron in a network. Each of
these neurons can do 3 things: receive inputs, process inputs and generate output.
1989 The term “Knowledge Discovery in Databases” (KDD) is coined by Gregory Piatetsky-Shapiro. It was also
at this time that he co-founded the first workshop, also named KDD.
1990s The term “data mining” appeared in the database community. Retail companies
and the financial community began using data mining to analyze data and recognize trends
in order to increase their customer base.
1992 Boser, Guyon and Vapnik suggested an improvement on the original support
vector machine which allows for the creation of nonlinear classifiers. Support vector machines are a supervised
learning approach that analyzes data and recognizes patterns, used for classification
and regression analysis.
2001 Although the term “data science” has existed since the 1960s, it wasn’t until 2001 that William S.
Cleveland introduced it as an independent discipline. DJ Patil and Jeff Hammerbacher then
used the term to describe their roles at LinkedIn and Facebook.
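The three steps attributed to the 1943 neuron in the timeline above (receive inputs, process inputs, generate output) can be sketched as a simple threshold unit. The function name, weights, and thresholds below are illustrative assumptions, not notation taken from the original paper.

```python
def threshold_neuron(inputs, weights, threshold):
    """Receive binary inputs, process them as a weighted sum,
    and generate output 1 if the sum reaches the threshold, else 0."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# With unit weights, a threshold of 2 behaves like logical AND
# and a threshold of 1 behaves like logical OR.
and_outputs = [threshold_neuron([a, b], [1, 1], threshold=2) for a, b in pairs]
or_outputs = [threshold_neuron([a, b], [1, 1], threshold=1) for a, b in pairs]
print(and_outputs, or_outputs)
```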
10. Anthony Goldbloom, Kaggle (2010)
Anthony Goldbloom (born 1983 in Australia) founded
Kaggle in 2010 as a Silicon Valley startup
focused on predictive analytics.
11. Andrew Ng of Baidu, Coursera (2011)
Andrew Yan-Tak Ng (born 1976) is Chief Scientist
at Baidu Research in Silicon Valley. In addition,
he is an associate professor in the Department of
Computer Science at Stanford University. He is
chairman of the board of Coursera, an online
education platform that provides data science
courses online.
In 2011, Ng founded the Google Brain project, which developed very large
scale artificial neural networks using Google's distributed computer
infrastructure. Among its notable results was a neural network trained
using deep learning algorithms that learned to recognize cats after
watching only YouTube videos.
12. Branches of Data Science
Natural Language Processing (NLP)
Deep Learning
Predictive Analytics
Text Analytics
Social Media Analytics
Image Processing