1. The document discusses the history and development of artificial intelligence and machine learning, from early concepts in probability and statistics in the 18th century to modern algorithms and applications.
2. It outlines important early milestones like the McCulloch-Pitts neural network model from 1943 and the Turing Test in 1950. Major algorithms like perceptron and modern frameworks like TensorFlow are also mentioned.
3. The text advocates for applying machine learning to solve real-world business problems by understanding the problem domain, acquiring relevant data, selecting an appropriate algorithm, and iterating through the problem-solving process.
5. “man’s dependence on probability was simply a consequence of imperfect
knowledge. A being who could follow every particle in the universe, and who had
unbounded powers of calculation, would be able to know the past and to predict
the future with perfect certainty”
- Pierre-Simon Laplace, A Philosophical Essay on Probabilities (1825)
7. “Statistics is about gathering data and working out what the numbers can tell us.
From the earliest farmer estimating whether he had enough grain to last the
winter to the scientists of the Large Hadron Collider confirming the probable
existence of new particles, people have always been making inferences from data.
Statistical tools like the mean or average summarise data, and standard
deviations measure how much variation there is within a set of numbers.
Frequency distributions - the patterns within the numbers or the shapes they
make when drawn on a graph - can help predict future events.
Knowing how sure or how uncertain your estimates are is a key part of statistics”
- Julian Champkin, Significance Magazine
11. 1957: Perceptron
• The M-P model was a simple function with multi-dimensional input and
binary output
• The perceptron has two layers of nodes
• Weights and thresholds were not all identical
• Outputs are in {-1, 1} rather than {0, 1}
• Adds an extra input that represents the bias (sometimes called theta)
• Most important, it has a learning rule
• It was a machine that could take input and create output
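A minimal sketch may make the learning rule concrete. This is an illustrative reconstruction, not Rosenblatt's original formulation: outputs are in {-1, +1}, a bias input stands in for the threshold, and the weights are nudged toward each misclassified example. The AND-style data and learning rate are made up.

```python
# Perceptron sketch: output is sign(w.x + bias) in {-1, +1}, and the
# learning rule moves the weights toward each misclassified example.

def predict(weights, bias, x):
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s >= 0 else -1

def train(samples, labels, lr=0.1, epochs=20):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if predict(weights, bias, x) != y:   # misclassified example
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias

# Learn a linearly separable AND-like function with {-1, +1} labels
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [-1, -1, -1, 1]
w, b = train(samples, labels)
print([predict(w, b, x) for x in samples])  # [-1, -1, -1, 1]
```

For linearly separable data like this, the perceptron convergence theorem guarantees the loop eventually stops making updates.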
12. 1970s: AI Winter
• 1973: UK Parliament commissioned Sir James Lighthill to evaluate the state
of AI research in the United Kingdom
• “Computers have been oversold… Indeed, it is big business… Continuous
failures occurred in language translation, image recognition, human
speech, handwritten letters, and so on… A robot can only mimic a certain
range of human activities… Specialised problems are best treated by
specialised methods rather than generalised intelligence… The general
purpose robot is a mirage”
– Sir James Lighthill
• The Lighthill report led to the near-complete dismantling of AI research in
England.
• The assessment, coupled with slow progress, contributed to a loss of
confidence and a drop in resources for AI research
14. Image source: “You and AI – The History, Capabilities and Frontiers of AI” YouTube
23. Automate the analysis
• Manual analysis is tedious
• Bandicoot is an open-source Python toolbox
used to analyze mobile phone metadata
• Bandicoot computes behavioural indicators
• It stratifies the data between weekday and
weekend, or day and night
• The strategy is to generate features that an
algorithm can process to identify behaviour;
for example, a 2015 study titled "Predicting
Gender from Mobile Phone Metadata" did
exactly this
• Learning algorithms use features for
prediction and clustering tasks – decide
which features can predict what
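As a rough illustration of the weekday/weekend stratification idea (hand-rolled, not bandicoot's actual API), the sketch below splits hypothetical call records into strata and computes a mean-duration indicator for each:

```python
# Stratify call records by weekday vs weekend and compute a
# mean-duration indicator per stratum, mimicking how bandicoot-style
# features are generated. The (timestamp, seconds) records are made up.
from datetime import datetime
from statistics import mean

records = [
    ("2015-03-02 09:15", 120),  # Monday
    ("2015-03-03 18:40", 300),  # Tuesday
    ("2015-03-07 11:05", 60),   # Saturday
    ("2015-03-08 20:30", 180),  # Sunday
]

def stratify(records):
    strata = {"weekday": [], "weekend": []}
    for ts, duration in records:
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M").weekday()  # Mon=0 .. Sun=6
        key = "weekend" if day >= 5 else "weekday"
        strata[key].append(duration)
    return {k: mean(v) for k, v in strata.items() if v}

print(stratify(records))  # weekday mean 210s, weekend mean 120s
```

Each stratum mean becomes one feature; repeating this across many record types and splits is how a toolbox arrives at hundreds of indicators per user.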
24. Machine Learning
• Bandicoot generates 1400 indicators. The next question is: “Can we do
something useful with these indicators (variables, features)?” For example,
can mobile phone data answer the global development call?
• The SAS Institute (2016) defines machine learning as “a method of data
analysis that automates analytical model building. Using algorithms that
iteratively learn from data, machine learning allows computers to find
hidden insights without being explicitly programmed where to look.” There
are two main classes of machine learning algorithms:
• Unsupervised Learning: Infer a function to describe a hidden structure or similarity
of patterns in unlabelled data
• Supervised Learning: Provide not only a set of features (xi for i = 1,…,N) but also a set
of labels (yi for i = 1,…,N), where each yi is the label corresponding to xi. One uses the
pairs to learn a function f that can be used to predict the unknown target value of some
input vector: y = f(x)
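The supervised setup y = f(x) can be sketched with a deliberately simple learner. Here f is a 1-nearest-neighbour rule over hypothetical feature/label pairs; nothing below is specific to the deck's tooling.

```python
# Supervised learning in miniature: from labelled pairs (X_i, y_i),
# build a function f so that y = f(x) on new inputs. f here is a
# 1-nearest-neighbour rule; the data are hypothetical.

def fit_1nn(X, y):
    def f(query):
        # predict the label of the closest training example
        def dist(a):
            return sum((ai - qi) ** 2 for ai, qi in zip(a, query))
        i = min(range(len(X)), key=lambda i: dist(X[i]))
        return y[i]
    return f

X = [[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [7.5, 8.2]]  # features
y = ["low", "low", "high", "high"]                     # labels
f = fit_1nn(X, y)
print(f([1.1, 1.0]), f([8.1, 7.9]))  # low high
```

Dropping the labels y and asking only which points cluster together would turn the same data into an unsupervised problem.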
25. What is Learning?
• Can we extract answers to meaningful questions from vast amounts of data?
• How susceptible are customers to marketing?
• What is the probability of a person using our new service?
• Which members of a community are most at risk in an epidemic outbreak?
• Deriving a theorem or law directly is difficult: these problems are complex and require
measurements of large amounts of data over time
• What is learning? Specifying a model f that can extract the regularities of the problem
– with an appropriate objective (loss) function to optimize
• Learning (or fitting) the model essentially means finding the optimal parameters of
the model structure, using the provided input and target data – fitting the model to
perform well on given or seen data (training data)
• However, our primary goal is for the model to perform well on unseen data
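The seen/unseen distinction can be illustrated with a toy contrast between a model that merely memorises its training data and one that has captured the underlying regularity; the house-size/price numbers are made up.

```python
# Seen vs unseen data: a model that memorises the training set scores
# perfectly on it but says nothing about new inputs, while a model that
# learned the regularity generalises. Hypothetical data: price = 3 * size.

train_set = {50: 150, 80: 240, 100: 300}   # seen (training) examples
test_set = {60: 180, 90: 270}              # unseen examples

def memorised(size):
    return train_set.get(size)             # pure lookup, no generalisation

def fitted(size):
    return 3 * size                        # the learned regularity

print(all(memorised(s) == p for s, p in train_set.items()))  # True: perfect on seen data
print(any(memorised(s) == p for s, p in test_set.items()))   # False: useless on unseen data
print(all(fitted(s) == p for s, p in test_set.items()))      # True: generalises
```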
26. Problem
• Think of one to three business problems that are unsolved or could be solved
better – problems that are:
• Complicated
• Require learning from data
• Sufficiently self-contained
Once you know that a problem fits in the ML domain, two further important
questions to answer are:
• Q: Does the right data exist for the problem? Where does it come
from? Is the data feed for the machine sufficient to solve the problem?
• Q: Which ML model makes more sense for the problem?
source: HBR
28. Cast the use case (problem)
• As an ML problem:
1. What is being predicted?
2. What data is needed?
• As software:
1. What is the API for the problem during prediction?
2. Who will use this service? How are they doing it today?
• As a Data problem:
1. What data are we analyzing?
2. What data are we predicting?
3. What data are we reacting to?
source: Google Coursera ML course
29. Journey
1. Understand AI
‐ Short-term courses
‐ Events
‐ Blogs
2. Follow a master
3. Find a problem
4. Problem fits in ML domain
5. Data Strategy
6. Design Thinking
31. About 70% of the brain's cortical activity deals with visually related information; vision is
the widest of the gates into the human brain, while others such as hearing, touch, and taste are
narrower channels. Vision is like a highway with eight traffic lanes, and the other senses
are like sidewalks on both sides. If you cannot deal with visual information, the whole AI
system is an empty shell. It can only do symbolic reasoning, such as playing chess and
proving theorems, but cannot enter the real world. Computer vision is like a door-opening
spell for AI. The door is inside, and if you fail to open it, there is no way to study AI in the
real world.
- Songchun Zhu, Professor of Statistics and Computer Science at the University of
California, Los Angeles
32. First AI Project – Two recommendations
“Not all AI projects are created equal. Some can provide incremental
improvements and are good places to start whereas some provide competitive
advantage”
- Bern Elliot, Gartner
“Build intelligence to solve one business problem. Use the intelligence and
experience for everything else”
- Demis Hassabis, DeepMind
33. AI Startup/Project
Horizontal
- Very science driven
- Solve one fundamental problem
- Serve many industries
- Example: NLP
- Players: Google, Facebook,
Amazon, Baidu, Microsoft, and
DeepMind
Vertical
- Customer segmentation &
targeting
- Solve the problem of a specific
customer
- Success depends on democratized
base technology and a strong
community around customers of
the technology
Ref - https://www.techinasia.com/talk/vertical-horizontal-ai-startup
40. Linear Regression
• Regression – find an equation that fits the data
• The learning step is function estimation
• Reducing the error is gradient descent
• Supervised learning
• Input training data:
• Input, x – size of house
• Output, y – price of house
• m – number of training examples
• Build a hypothesis (predict y for a given x) over the input data: hθ(x) = θ0 + θ1x
42. • If α is too small, gradient descent can be slow
• If α is too large, gradient descent can overshoot the
minimum
• The derivative is the slope of the cost function J
• Using calculus, the derivative term works out to:
θj := θj − α (1/m) Σi=1..m (hθ(x(i)) − y(i)) xj(i)
• Batch: each step uses all training examples
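Putting the update rule together, here is a small batch gradient descent sketch for hθ(x) = θ0 + θ1x; the learning rate, step count, and toy data are illustrative choices.

```python
# Batch gradient descent for h(x) = theta0 + theta1 * x: each step
# uses all m training examples (hence "batch"). alpha and the data
# below are illustrative.

def gradient_descent(xs, ys, alpha=0.05, steps=5000):
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # partial derivatives of the cost J w.r.t. each parameter
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

xs = [1.0, 2.0, 3.0, 4.0]   # e.g. house size
ys = [3.0, 5.0, 7.0, 9.0]   # price, generated from y = 1 + 2x
theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))  # ≈ 1.0 2.0
```

Shrinking alpha slows convergence; enlarging it past a data-dependent limit makes the iterates diverge, which is exactly the overshoot the slide warns about.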
43. Elementary Algebra
• If you recall from elementary algebra,
the equation for a line is y = mx + b
• An alternative to gradient descent is an
algebraic equation that calculates the
minimum of the cost function directly
• To calculate the linear regression line
y = a + bx, use:
b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², a = ȳ − bx̄
• In the language of AP Statistics, we may see
the equation written as: ŷ = b0 + b1x
• In machine learning, it is referred to as the
hypothesis: hθ(x) = θ0 + θ1x
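The algebraic alternative to gradient descent can be sketched in a few lines using the standard least-squares formulas b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − bx̄; the data are illustrative.

```python
# Closed-form least-squares fit of y = a + b*x: no iteration, the
# slope and intercept come straight from the data.

def least_squares(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # generated from y = 1 + 2x
a, b = least_squares(xs, ys)
print(a, b)  # 1.0 2.0
```

On this toy data the closed form recovers the same line gradient descent converges to; the closed form is exact but, unlike gradient descent, it scales poorly once there are very many features.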
44. Multiple variables
Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
2104           5                    1                  45                    460
1416           3                    2                  40                    232
1534           3                    2                  30                    315
852            2                    1                  36                    178
n = number of features
x(i) = input (features) of the ith training example
xj(i) = value of feature j in the ith training example
Hypothesis: hθ(x) = θ0 + θ1x1 + θ2x2 + … + θnxn
45. For convenience of notation, define x0 = 1. The features then form a vector (x0, x1, …, xn) in Rn+1, and the parameters
likewise form a vector (θ0, θ1, …, θn) in Rn+1.
The hypothesis equation can be written as: hθ(x) = θ0x0 + θ1x1 + … + θnxn = θTx
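The θTx form with x0 = 1 can be sketched directly; the parameter values and the example row (size, bedrooms) are invented for illustration.

```python
# Vectorised hypothesis h(x) = theta^T x, with x0 = 1 prepended so
# theta0 acts as the intercept.

def hypothesis(theta, x):
    x = [1.0] + list(x)                           # define x0 = 1
    return sum(t * xi for t, xi in zip(theta, x))

theta = [80.0, 0.1, 25.0]   # [theta0, weight per sq ft, weight per bedroom]
x = [2104.0, 5.0]           # features of one example: size, bedrooms
print(hypothesis(theta, x)) # 80 + 0.1*2104 + 25*5 ≈ 415.4
```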
48. References
• The Lighthill Debate (1973)
• You and AI – The History, Capabilities and Frontiers of AI
• MIT SA+P Big Data and Social Analytics
• An Easy Introduction to Artificial Intelligence, Machine Learning and
Deep Learning
• Scala and Spark for Big Data and Machine Learning
• Machine Learning — Andrew Ng, Stanford University
In Europe, the 17th century was an important time for quantitative studies of diseases, population, and wealth, including the work done by John Graunt
More examples: https://towardsdatascience.com/mcculloch-pitts-model-5fdf65ac5dd1
Image Source: https://www.pbs.org/newshour/science/short-history-ai-schooling-humans-games
In 1997, Deep Blue exploited increasing computing power to perform large-scale searches of potential moves – 200 million moves per second
Ref - https://www.youtube.com/watch?v=NRYLPmy8V1k
AI is a broad term for the field in which human intelligence is simulated in machines. ML is the term applied to systems that learn from experience (data).