MACHINE LEARNING IN FINANCE
CREATED BY: HAMED VAHEB
FALL 2018
1
1. Fundamentals of Machine Learning
2. ML in Tech vs ML in Finance
3. Example: Bank Rating Prediction
4. Deep Learning and Neural Networks
5. Example: Neural Net Copula in Markowitz Problem
What is ML?
2
3
1. Fundamentals of Machine Learning
Major AI Approaches
• Logic and Rules-Based Approach
• Hard-code knowledge about the world in formal languages
• Top-down rules are created for computers
• Computers reason about these rules automatically.
Example: Project Cyc (Lenat and Guha, 1989)
4
1. Fundamentals of Machine Learning
Major AI Approaches
Example within law – Expert Systems
• Turbotax
• Personal income tax laws
• Represented as logical computer rules
• Software computes tax liability
• Logic and Rules-Based Approach
5
1. Fundamentals of Machine Learning
Learning: Process of converting experience into expertise or knowledge
We wish to program “agents” so that they can “learn” from input data
ML is what computers use to learn about the outside world. Much like humans
use math and physics for the same purpose.
Agent = Architecture + Algorithm
AI systems need the ability to acquire their own knowledge, by extracting
patterns from raw data.
• Machine Learning (Pattern-Based Approach)
Major AI Approaches
6
1. Fundamentals of Machine Learning
Machine Learning in our daily life
7
1. Fundamentals of Machine Learning
Example: Email Spam Filter
8
1. Fundamentals of Machine Learning
Example: Email Spam Filter
9
1. Fundamentals of Machine Learning
Example: AARON
10
1. Fundamentals of Machine Learning
Formal Definition
Field of study that gives computers the ability to learn without being
explicitly programmed
Arthur Samuel (1959):
Well-posed learning problem: A computer program is said to learn
from experience E with respect to some task T and some
performance measure P, if its performance on T, as measured by P,
improves with experience E.
Tom Mitchell (1998):
Example: Chess
T: playing chess, E: agent playing with itself, P: number of wins / number of games
11
1. Fundamentals of Machine Learning
Studies “intelligent agents” that perceive their environment and perform different
actions to solve tasks that involve mimicking cognitive functions of the human brain
(Russell, Norvig)
Artificial Intelligence
Goals of AI
• Knowledge Representation: ontology, i.e. the set of objects, relations, and concepts
• Taking Actions, Planning: acting with visualizing the future to achieve goals
• Perception and Learning (where ML sits): perception from sensors, learning from experience
• Natural Language Processing: ability to read and understand human language
• Automated Reasoning: mimicking human reasoning for logical deductions
12
1. Fundamentals of Machine Learning
Applied AI (present): perception (learning) and actions; communication (NLP);
knowledge/ontologies; reasoning and planning. Learns and acts autonomously;
uses sub-symbolic information.
Artificial General Intelligence, AGI (future): an algorithmic theory of cognitive
acts; solves any intellectual task.
13
1. Fundamentals of Machine Learning
[Diagram: an agent interacts with its environment through perception and actions]
Perception Tasks: there is a fixed action.
(Perception happens via the physical world (through sensors) or via digital
data (read from a disk))
Action Tasks: there are multiple possible actions;
they involve planning and forecasting the future, and
they involve sub-tasks of learning for sequential (multi-step) problems
(Actions can be fixed, or can vary. They may or may not change the
environment)
14
1. Fundamentals of Machine Learning
When do we need ML (instead of programming directly)?
• Complexity
1. Tasks performed by Animals/Humans: can’t extract a well-defined
program. (Driving, speech recognition, image understanding)
2. Tasks beyond Human Capabilities: Analysis of very large and
complex datasets (Astronomical and genomic data, turning medical
archives to medical knowledge, weather prediction)
• Adaptivity
Programs that are adaptive to changes in the environment they interact with.
(handwritten text, spam detection, speech recognition)
15
1. Fundamentals of Machine Learning
Types of learning
• Supervised: an environment (teacher) “supervises” the learner by
providing extra information (“labels”). We have train (seen) and test
(unseen) data. Learns p(y|x)
• Unsupervised: come up with a summary or a compressed version of the
data, learn a probability distribution, clustering (denoising, synthesis)
• Reinforcement: intermediary. There is a teacher, but only with partial feedback
(reward), over a sequence of actions. (Examples: valuing a chess position,
self-driving)
16
1. Fundamentals of Machine Learning
Supervised Learning
Most common types
17
1. Fundamentals of Machine Learning
Linear Regression Example: satisfaction rate of company employees
Training data: company employees have rated their satisfaction on a scale of 1 to 100
Predictor:
18
1. Fundamentals of Machine Learning
Linear Regression Example: satisfaction rate of company employees
Let’s start with
19
1. Fundamentals of Machine Learning
Cost Function:
As we minimize J (using gradient descent), the fitted line gets better and better
Linear Regression Example: satisfaction rate of company employees
20
1. Fundamentals of Machine Learning
Best Line:
Linear Regression Example: satisfaction rate of company employees
21
1. Fundamentals of Machine Learning
Minimization Algorithm: Gradient Descent
Linear Regression Example: satisfaction rate of company employees
$g(\theta) = \frac{\partial}{\partial \theta} J(\theta)$; update $\theta := \theta - \alpha\, g(\theta)$ and repeat until convergence ($\alpha$: learning rate)
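A minimal sketch of this update for the one-feature predictor $h(x) = \theta_0 + \theta_1 x$; the satisfaction data and the learning rate below are hypothetical, just to make the loop concrete:

import numpy as np

# Hypothetical data: one feature x vs. satisfaction score y (scale 1-100)
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([35.0, 48.0, 60.0, 71.0, 83.0])

theta0, theta1 = 0.0, 0.0   # parameters of h(x) = theta0 + theta1 * x
alpha = 0.01                # learning rate (assumed)

for _ in range(5000):
    pred = theta0 + theta1 * x
    # gradients of J(theta) = (1/2m) * sum((h(x_i) - y_i)^2)
    g0 = (pred - y).mean()
    g1 = ((pred - y) * x).mean()
    # simultaneous update: theta := theta - alpha * g(theta)
    theta0 -= alpha * g0
    theta1 -= alpha * g1

print(theta0, theta1)  # fitted intercept and slope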
22
1. Fundamentals of Machine Learning
Plot of J
Linear Regression Example: satisfaction rate of company employees
In this case, J is convex and therefore there are no local minima other than the global one!
23
1. Fundamentals of Machine Learning
J contours
Linear Regression Example: satisfaction rate of company employees
24
1. Fundamentals of Machine Learning
Iterations
Linear Regression Example: satisfaction rate of company employees
For more visualization:
https://towardsdatascience.com/machine-learning-fundamentals-via-linear-regression-41a5d11f5220
25
1. Fundamentals of Machine Learning
Unsupervised Learning
Example: Dimension Reduction
26
1. Fundamentals of Machine Learning
Supervised vs Unsupervised
Clustering
Classification
27
1. Fundamentals of Machine Learning
Reinforcement
28
1. Fundamentals of Machine Learning
Linear Regression for Stock Market
29
1. Fundamentals of Machine Learning
Machine Learning Landscape
Supervised Learning
• Regression: learn a regression function. Given: input/output pairs
• Classification: learn a class function. Given: input/output pairs
Unsupervised Learning
• Clustering: learn k clusters (k: the number of clusters). Given: inputs only
• Representation Learning: learn a representer function. Given: inputs only
Perception Tasks
30
1. Fundamentals of Machine Learning
Machine Learning Landscape
Reinforcement Learning
• RL: optimization of strategy for a task
• IRL (Inverse Reinforcement Learning): learn objectives from behavior
Action Tasks
31
1. Fundamentals of Machine Learning
Machine Learning Examples
Supervised Learning
• Regression: demand forecast
• Classification: spam detection; image recognition; document classification
Unsupervised Learning
• Clustering: customer segmentation
• Representation Learning: anomaly detection; text recognition; machine translation
Perception Tasks
32
1. Fundamentals of Machine Learning
Machine Learning Examples
Reinforcement Learning
• RL (optimization of strategy for a task): robotics; computational advertising
• IRL (learn objectives from behavior): imitation learning for robotics
Action Tasks
33
1. Fundamentals of Machine Learning
Machine Learning Methods
Supervised Learning
• Regression: linear regression; trees (CART); SVM/SVR; ensemble methods; neural networks
• Classification: logistic regression; naive Bayes; nearest neighbors; SVM; decision trees;
ensemble methods; neural networks
Unsupervised Learning
• Clustering: K-means; hierarchical clustering; Gaussian mixture models; hidden Markov
models; neural networks
• Representation Learning: PCA; factor models; ICA; dimension reduction; manifold learning;
neural networks
Perception Tasks
34
1. Fundamentals of Machine Learning
Machine Learning Methods
Reinforcement Learning
• RL (optimization of strategy for a task): model-based RL; model-free RL; batch/online RL;
RL with linear models; neural networks
• IRL (learn objectives from behavior): model-based IRL; model-free IRL; batch/online IRL;
MaxEnt IRL; neural networks
Action Tasks
35
1. Fundamentals of Machine Learning
Machine Learning in Finance
Supervised Learning
• Regression: earnings prediction; credit loss forecast; algorithmic trading
• Classification: rating prediction; default modeling; credit card fraud; anti-money laundering
Unsupervised Learning
• Clustering: customer segmentation; stock segmentation
• Representation Learning: factor modeling; de-noising; regime change detection
Perception Tasks
36
1. Fundamentals of Machine Learning
Machine Learning in Finance
Reinforcement Learning
• RL (optimization of strategy for a task): trading strategies; asset management
• IRL (learn objectives from behavior): reverse engineering of consumer behavior,
trading strategies, …
Action Tasks
37
1. Fundamentals of Machine Learning
ML by Financial Application Areas
• Banking (retail, P2P lending, commercial and investment): customer segmentation;
loan defaults; credit card defaults; fraud detection; anti-money laundering; rating
prediction; default modeling; client data mining; recommender systems
• Asset Management: portfolio optimization; factor modeling; de-noising; regime change
detection; stock segmentation; multi-period portfolio optimization; derivatives trading
Perception Tasks
38
1. Fundamentals of Machine Learning
ML by Financial Application Areas
• Quantitative Trading: profit-maximizing trading execution; optimal trade execution;
quantitative trading strategies; earnings prediction; algorithmic trading; optimal
market making
Action Tasks
ML in Tech
• Perception (image recognition, NLP tasks, etc.).
Methods: SL/UL
• Action (computational advertising, robotics, self-driving cars, etc.). Methods:
SL/UL/RL
39
2. ML in Tech vs ML in Finance
ML in Tech: image recognition; NLP tasks; computational advertising; robotics
ML in Finance: forecasting tasks; valuation tasks
ML in Finance
Perception: Forecasting tasks
• Security price predictions (stocks, bonds, commodities, etc.).
Methods: SL/UL
• Corporate actors’ action prediction (dividends, mergers, defaults, etc.).
Methods: SL/UL/RL
• Individual actors’ action prediction (loan defaults, fraud, AML, etc.).
Methods: SL/UL/RL
40
2. ML in Tech vs ML in Finance
ML in Finance
Perception: Valuation tasks
• Asset valuation (stocks, futures, commodities, bonds, etc.). Related to forecasting.
Methods: SL/UL
• Derivatives valuation.
Methods: SL/UL/RL
41
2. ML in Tech vs ML in Finance
42
2. ML in Tech vs ML in Finance
Big Data? ML in Tech: typically yes. ML in Finance: typically no.
Data for ML in Tech are of huge size.
Most data for ML in Finance are medium-sized, except in HFT.
43
2. ML in Tech vs ML in Finance
Stationary Data? ML in Tech: typically yes. ML in Finance: typically no.
As most financial data are non-stationary, collecting more data, even when
possible, is not always helpful.
44
2. ML in Tech vs ML in Finance
Noise-to-signal ratio: ML in Tech: typically low. ML in Finance: typically high.
Financial data are typically quite noisy,
and “true” signals are unobservable!
45
2. ML in Tech vs ML in Finance
Interpretability of results: ML in Tech: typically not important, or not the main
focus. ML in Finance: typically either desired or required.
Interpretability of results is:
• Desired for trading
• Required for regulation (General Data Protection Regulation, 2018)
46
2. ML in Tech vs ML in Finance
Action (RL) tasks: ML in Tech: low-dimensional state-action space, low uncertainty.
ML in Finance: high-dimensional state-action space, high uncertainty.
• ML in Tech: dimensionality of the state-action space is usually in the hundreds.
The action space is often more discrete (except in robotics).
Uncertainty is low to moderate (think self-driving cars!)
• ML in Finance: dimensionality of the state-action space is often in the thousands.
The action space is usually continuous.
Uncertainty is low to high (think Brexit!)
47
1. Fundamentals of Machine Learning
A Gentle Model (Statistical Learning Framework)
The learner’s input:
 Domain set $\mathcal{X}$: features
 Label set $\mathcal{Y}$ (discrete or continuous)
 Training data $S = ((x_1, y_1), \dots, (x_m, y_m))$: also called the training set (seen)
 Data-generation model: a probability distribution $\mathcal{D}$ over $\mathcal{X}$
 Measure of success: the error of a predictor, i.e. a loss function
The learner’s output:
 Prediction function (hypothesis) $h : \mathcal{X} \to \mathcal{Y}$
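In the notation of the cited Shalev-Shwartz and Ben-David text, the learner's output is typically the Empirical Risk Minimization (ERM) rule; the following standard formulation is implied but not spelled out on the slide:

$L_S(h) = \frac{1}{m} \sum_{i=1}^{m} \ell\big(h(x_i), y_i\big), \qquad h_S \in \operatorname*{argmin}_{h \in \mathcal{H}} L_S(h) \quad \text{(ERM over the class } \mathcal{H}\text{)}$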
48
1. Fundamentals of Machine Learning
Types of Error
• The ability to perform well on previously unobserved inputs is called generalization
• What separates machine learning from optimization is that we want the generalization
error to be low as well
• Estimate generalization error by a test set of examples that were collected separately
from the training set
Training error: the error measure on the training set,
$L_S(h) = \frac{|\{\, i \in [m] : h(x_i) \neq y_i \,\}|}{m}$
Generalization error (test error):
$L_{\mathcal{D},f}(h) \overset{\text{def}}{=} \mathbb{P}_{x \sim \mathcal{D}}\big[\, h(x) \neq f(x) \,\big]$
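A small sketch contrasting the two errors with a held-out test set; the synthetic data and the choice of an unrestricted decision tree (which can memorize its training set) are illustrative assumptions, not from the deck:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical noisy binary task
rng = np.random.RandomState(0)
X = rng.randn(500, 10)
y = (X[:, 0] + 0.5 * rng.randn(500) > 0).astype(int)

# Test examples are collected separately from the training set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = DecisionTreeClassifier().fit(X_tr, y_tr)        # unrestricted: can memorize
print("training error:", 1 - model.score(X_tr, y_tr))   # close to 0
print("test error:    ", 1 - model.score(X_te, y_te))   # noticeably larger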
49
1. Fundamentals of Machine Learning
• We sample the training set, then use it to choose the parameters to
reduce training-set error. Under this process, the expected test error is
greater than or equal to the expected training error.
• The factors determining how well a machine learning algorithm will
perform are its ability to
1. Make the training error small (failing this: underfitting)
2. Make the gap between training and test error small (failing this: overfitting)
Types of Error
50
1. Fundamentals of Machine Learning
Papayas Example
No matter what the sample $S$ is, the memorizing predictor $h_S$ satisfies
$L_{\mathcal{D}}(h_S) = \frac{1}{2}$ while $L_S(h_S) = 0$
• It predicts label 1 on only a finite number of instances
• We have found a predictor whose performance on the training set is excellent, yet its
performance on the true “world” is very poor
51
1. Fundamentals of Machine Learning
Example
52
1. Fundamentals of Machine Learning
• Overfitting occurs when our hypothesis fits the training data “too well” (perhaps
like the everyday experience that a person who provides a perfect, detailed
explanation for each of his single actions may raise suspicion).
Altering Capacity
• A model’s capacity is its ability to fit a wide variety of functions.
• Capacity is controlled by restricting the hypothesis class (its size or complexity):
VC dimension, techniques, program bits, …
• Restricting to axis-aligned rectangles guarantees not to overfit
• If H is a finite class, then ERM over H will not overfit
53
1. Fundamentals of Machine Learning
Bias – Complexity Tradeoff
Error Decomposition
Approximation Error
• Due to underfitting
• the minimum risk achievable by a predictor in the hypothesis class.
• how much risk we have because we restrict ourselves to a specific class (bias)
• depends on the chosen hypothesis class
• Reflects the quality of prior knowledge
Estimation Error
• Due to overfitting
• the difference between the predictor’s error and the approximation error
• It exists because the training error is only an estimate of the generalization error
• depends on the training set size and on the size or complexity of the hypothesis class
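Written out (following the cited Shalev-Shwartz and Ben-David text), the decomposition of the risk of the learned predictor $h_S$ reads:

$L_{\mathcal{D}}(h_S) = \underbrace{\min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)}_{\text{approximation error}} + \underbrace{\Big( L_{\mathcal{D}}(h_S) - \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) \Big)}_{\text{estimation error}}$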
54
1. Fundamentals of Machine Learning
Bias – Complexity Tradeoff
55
1. Fundamentals of Machine Learning
[Figure: error vs. model capacity, for increasing data complexity]
Bias – Complexity Tradeoff
56
1. Fundamentals of Machine Learning
Generalization
Design Matrix
• A model is trained using only a training set
• A test set is used to estimate the algorithm’s ability to generalize, i.e. to perform well
on unseen data.
57
1. Fundamentals of Machine Learning
• To generalize well, machine learning algorithms need to be guided by prior beliefs
about what kind of function they should learn.
• the stronger the prior knowledge (or prior assumptions) that one starts the learning
process with, the easier it is to learn from further examples. However, the stronger
these prior assumptions are, the less flexible the learning is (it is bound, a priori, by the
commitment to these assumptions.)
Prior Knowledge
Examples:
• Restricting our hypothesis class (finiteness, VC dimension)
• Assumptions on the distribution
58
1. Fundamentals of Machine Learning
Prior Knowledge
Bait Shyness
The rats seem to have some “built in” prior knowledge telling them that, while temporal
correlation between food and nausea can be causal, it is unlikely that there would be a
causal relationship between food consumption and electrical shocks or between sounds
and nausea.
59
1. Fundamentals of Machine Learning
Pigeon Superstition
Prior Knowledge
60
1. Fundamentals of Machine Learning
ML vs Statistical Modeling
61
3. Bank Failures Example
FDIC
• US-based commercial banks are regulated by the FDIC
• FDIC provides insurance for commercial banks, and charges them an insurance premium
according to an internal (and non-public) rating based on the CAMELS supervisory
system
62
3. Bank Failures Example
Importance
63
3. Bank Failures Example
CAMELS
• Rating 1: best; Rating 5: worst
• A bank rated 4 or 5 is likely to be closed soon
Capital inadequacy is the most common cause of a
bank closure (other reasons: violation of financial
rules, management failures)
If the FDIC decides to close a bank, it takes over both
its assets and its liabilities and then tries to sell the
assets at the best price possible to pay off the
liabilities.
• CAMELS ratings are not publicly known; however,
Call Reports are available.
• In addition, FDIC provides historical data for failed
banks:
(https://www.fdic.gov/bank/individual/failed/)
64
3. Bank Failures Example
Call Report
• 28 schedules in total
• Form FFIEC 031: for banks with both domestic (US) and foreign offices
• Form FFIEC 041: for banks with domestic (US) offices only
65
3. Bank Failures Example
Call Report Content (Schedules)
66
3. Bank Failures Example
Call Report Content (Schedules)
67
3. Bank Failures Example
Correlation Matrix of features
In this problem we want to predict failed (defaulter) banks vs. non-failed banks
NI: net income
log_TA: logarithm of total assets
TL: total loans
NPL: non-performing loans
Assessment Base: average consolidated assets minus tangible equity
…
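A hedged sketch of computing such a correlation matrix with pandas; the file name and exact column names below are assumptions chosen to mirror the features listed above, not the deck's actual schema:

import pandas as pd

# Hypothetical call-report data; path and column names are assumptions
df = pd.read_csv("call_reports.csv")
features = ["log_TA", "NI", "TL", "NPL", "assessment_base", "defaulter"]

corr = df[features].corr()          # Pearson correlation matrix
print(corr.round(2))

# How strongly each feature co-moves with the default indicator
print(corr["defaulter"].sort_values(ascending=False))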
68
3. Bank Failures Example
Defaulter by log_TA in Training data
3. Bank Failures Example
Defaulter by log_TA in Test data
69
3. Bank Failures Example
70
3. Bank Failures Example
Logistic Regression used for classification
71
3. Bank Failures Example
Training
72
3. Bank Failures Example
Training
73
3. Bank Failures Example
Testing
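A minimal sketch of this train/test workflow with scikit-learn's logistic regression; the file names, feature list, and label column are assumptions standing in for the deck's (unshown) notebook:

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical train (seen) and test (unseen) bank data
train = pd.read_csv("banks_train.csv")
test = pd.read_csv("banks_test.csv")
features = ["log_TA", "NI", "TL", "NPL"]

clf = LogisticRegression(max_iter=1000)
clf.fit(train[features], train["defaulter"])

# Predicted default probabilities p(y=1|x) for unseen banks
probs = clf.predict_proba(test[features])[:, 1]
print("test AUC:", roc_auc_score(test["defaulter"], probs))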
Deep Learning
74
75
4. Deep Learning and Neural Networks
The performance of simple machine learning algorithms depends heavily on the
representation of the data they are given.
Goal: separate the factors of variation
Problem: the factors of variation influence every single piece of data we are able to
observe (e.g., a red car may look black in an image taken at night)
Most applications require us to disentangle the factors of variation and discard
the ones that we do not care about
Representation Learning: use ML to discover not only the mapping from
representation to output but also the representation itself.
Quintessential example: the Autoencoder,
the combination of an encoder function, which converts the input data into a
different representation, and a decoder function, which converts the new
representation back into the original format.
 Representation
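A minimal encoder/decoder sketch in Keras; the 64-dimensional random inputs and the 8-dimensional code size are illustrative assumptions, not the deck's example:

import numpy as np
from tensorflow import keras

x = np.random.rand(1000, 64).astype("float32")  # hypothetical input data

# Encoder: input -> a different (compressed) representation
encoder = keras.Sequential([keras.Input(shape=(64,)),
                            keras.layers.Dense(8, activation="relu")])
# Decoder: new representation -> back to the original format
decoder = keras.Sequential([keras.Input(shape=(8,)),
                            keras.layers.Dense(64, activation="sigmoid")])
autoencoder = keras.Sequential([encoder, decoder])

autoencoder.compile(optimizer="adam", loss="mse")  # learn to reconstruct the input
autoencoder.fit(x, x, epochs=5, batch_size=32, verbose=0)

code = encoder.predict(x[:1])    # the learned representation
recon = decoder.predict(code)    # mapped back to the input space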
76
4. Deep Learning and Neural Networks
Example
 Representation
77
4. Deep Learning and Neural Networks
Deep learning solves this problem by introducing representations that are
expressed in terms of other, simpler representations.
(Build complex concepts out of simpler concepts.)
Example
4. Deep Learning and Neural Networks
 Depth
Depth enables the computer to learn a multistep computer program
Layer: state of the computer’s memory after executing another set of instructions in
parallel
Networks with greater depth can execute more instructions in sequence. (Later
instructions can refer back to the results of earlier instructions.)
Measuring Depth
1. Depth of the computational graph: the number of sequential instructions (the length
of the longest path through a flow chart)
2. Depth of the concept graph: describes how concepts are related to each other
• The depth of the flowchart of the computations needed to compute the representation of
each concept may be much greater than the depth of the graph of the concepts themselves
78
4. Deep Learning and Neural Networks
[Figure: the same model measured as a computational graph (depth = 3) and as a concept graph (depth = 2)]
79
4. Deep Learning and Neural Networks
80
4. Deep Learning and Neural Networks
81
4. Deep Learning and Neural Networks
History of DL
• Dates back to 1940s (only appears to be new)
• Different Names:
1. 1940s - 1960: Cybernetics
2. 1980s – 1990s: Connectionism
3. Beginning of 2006: Deep Learning
4. learning algorithms for biological learning (models of how learning happens or
could happen in brain): Artificial Neural Networks
Neural Perspective on DL
1. Brain provides a proof that intelligent behavior is possible
2. Reverse engineer the computational principles behind the brain
• Today, neuroscience is regarded as an important source of inspiration for DL
researchers, but it is no longer the predominant guide for the field: to obtain a
deep understanding of the actual algorithms used by the brain, we would need to be
able to monitor the activity of (at the very least) thousands of interconnected neurons
simultaneously.
• The basic idea of having many computational units that become intelligent only via their
interactions with each other is inspired by the brain
• 1980s algorithms work quite well, but this was not apparent circa 2006 because they
were too computationally costly.
82
4. Deep Learning and Neural Networks
• Increasing Dataset sizes: Some skill is required to get good performance from a DL
algorithm. Fortunately, the amount of skill required reduces as the amount of training
data increases.
The age of “Big Data” has made ML much easier because the key burden of statistical
estimation (generalizing to new data after observing only a small amount) has been
considerably lightened.
• Increasing Model Sizes: animals become intelligent when many of their neurons work
together. Larger networks are able to achieve higher accuracy on more complex tasks.
History of DL
83
4. Deep Learning and Neural Networks
Challenges motivating DL
• Curse of Dimensionality
[Figure: the number of distinct regions to distinguish grows exponentially with the dimension]
A statistical challenge arises because the number of possible configurations of x is much
larger than the number of training examples.
84
4. Deep Learning and Neural Networks
www.playground.tensorflow.org
• Local Constancy and Smoothness
Among the most widely used of these implicit “priors” is the smoothness
prior, or local constancy prior.
It states that the function we learn should not change very much within a small region.
Much of the modern motivation for deep learning is derived from studying the limitations of
local template matching and how deep models are able to succeed in cases where local
template matching fails (Bengio et al., 2006b).
85
4. Deep Learning and Neural Networks
Neural Networks
Feedforward Neural Network (MLP)
Goal: approximate some function $f^*$ with some $f(x; \theta)$
Feedforward: information flows through the function with no feedback connections
Neural: loosely inspired by neuroscience
Network: composing together many different functions, e.g. $f(x) = f^{(3)}(f^{(2)}(f^{(1)}(x)))$
($f^{(i)}$ is the $i$’th layer and the final layer is the output layer)
Depth: overall length of the chain
Width: dimensionality of the hidden layers
Hidden Layer: the training data does not show the desired output for each of these layers
• During NN training, we drive $f(x)$ to match $f^*(x)$
• Each hidden layer is vector valued
86
4. Deep Learning and Neural Networks
Feedforward Neural Network (MLP)
[Figure: the chain $x \to f^{(1)} \to f^{(2)} \to f^{(3)}$; depth runs along the chain,
width is the dimensionality of each layer]
87
Feedforward Neural Network (MLP)
MLP as a kernel technique
Extend linear models to represent nonlinear functions of $x$ by applying the linear model
not to $x$ itself, but to a transformed input $\phi(x)$
How to choose $\phi$?
1. Generic $\phi$: infinite-dimensional (based on the RBF kernel).
Enough capacity but poor generalization
2. Manually engineer $\phi$: requires decades of human effort for each separate task
3. Learn $\phi$:
this is an example of a deep feedforward network, with $\phi$ defining a hidden layer
• The advantage of the third approach is that the human designer only needs to find the right
general function family rather than finding precisely the right function.
88
4. Deep Learning and Neural Networks
Feedforward Neural Network (MLP)
Example: Learning XOR
• After solving: $f(x; W, c, w, b) = w^{\top} \max\{0,\, W^{\top} x + c\} + b$,
where $W = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, $c = \begin{bmatrix} 0 \\ -1 \end{bmatrix}$, $w = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$ and $b = 0$
• Most neural networks establish a nonlinear function by using an affine transformation
controlled by learned parameters, followed by a fixed nonlinear function called an
activation function:
$h = g(W^{\top} x + c)$, where $g$ is the activation function (here the ReLU, $g(z) = \max\{0, z\}$)
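A quick numpy check of this closed-form solution; the parameter values follow the XOR example in the Goodfellow et al. text cited in the references:

import numpy as np

# f(x) = w^T max(0, W^T x + c) + b
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([0.0, -1.0])
w = np.array([1.0, -2.0])
b = 0.0

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

h = np.maximum(0.0, X @ W + c)  # affine transformation + ReLU activation
f = h @ w + b
print(f)  # [0. 1. 1. 0.], exactly XOR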
89
4. Deep Learning and Neural Networks
When $x_2 = 0$, the model’s output must increase as $x_1$ increases. When
$x_2 = 1$, the model’s output must decrease as $x_1$ increases.
90
4. Deep Learning and Neural Networks
91
4. Deep Learning and Neural Networks
Recurrent Neural Network (RNN)
• For processing a sequence of values $x^{(1)}, \dots, x^{(\tau)}$ (the length $\tau$ can be variable)
• Parameter sharing: using the same parameters for more than one function in a
model (tied weights).
If we had separate parameters for each value of the time index, we could
not generalize to sequence lengths not seen during training, nor share
statistical strength across different sequence lengths and across different
positions in time. Such sharing is particularly important when a specific piece
of information can occur at multiple positions within the sequence. (“I went
to Nepal in 2009” and “In 2009, I went to Nepal”)
• Each member of the output is a function of the previous members of the output. Each
member of the output is produced using the same update rule applied to the previous
outputs.
• Include cycles that represent the influence of the present value of a variable on its own
value at a future time step.
• Any function involving recurrence can be considered a recurrent neural network.
92
4. Deep Learning and Neural Networks
Parameter Sharing
Recurrent Neural Network (RNN)
93
4. Deep Learning and Neural Networks
Unfolding Computational Graphs
The unfolding process thus introduces two major advantages:
1. Regardless of the sequence length, the learned model always has the same
input size, because it is specified in terms of transition from one state to
another state, rather than specified in terms of a variable-length history of
states.
2. It is possible to use the same transition function f with the same parameters
at every time step.
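A minimal numpy sketch of advantage 2: one transition function with one shared parameter set applied at every time step, so sequences of different lengths run through the same model (all sizes are arbitrary assumptions):

import numpy as np

rng = np.random.RandomState(0)
W = rng.randn(16, 16) * 0.1  # hidden-to-hidden weights (shared)
U = rng.randn(16, 8) * 0.1   # input-to-hidden weights (shared)
b = np.zeros(16)

def run_rnn(xs):
    h = np.zeros(16)                    # initial state
    for x in xs:                        # one transition per time step
        h = np.tanh(W @ h + U @ x + b)  # same parameters at every step
    return h                            # a summary of the whole sequence

print(run_rnn(rng.randn(5, 8)).shape)   # works for length 5 ...
print(run_rnn(rng.randn(9, 8)).shape)   # ... and for length 9 alike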
Recurrent Neural Network (RNN)
94
4. Deep Learning and Neural Networks
Some types of RNNs
Recurrent Neural Network (RNN)
I. Produce an output at each time step and have recurrent connections between hidden
units
II. Produce an output at each time step and have recurrent connections only from the
output at one time step to the hidden units at the next time step.
III. With recurrent connections between hidden units, that read an entire sequence and
then produce a single output
• The network with recurrent connections only from the output at one time step to
the hidden units at the next time step is strictly less powerful because it lacks hidden-to-
hidden recurrent connections. For example, it cannot simulate a universal Turing
machine. It requires that the output units capture all the information about the past that
the network will use to predict the future.
95
4. Deep Learning and Neural Networks
I
96
4. Deep Learning and Neural Networks
II
97
4. Deep Learning and Neural Networks
98
4. Deep Learning and Neural Networks
III
99
4. Deep Learning and Neural Networks
Teacher Forcing
Recurrent Neural Network (RNN)
A procedure that emerges from the maximum likelihood criterion, in which during training
the model receives the ground-truth output $y^{(t)}$ as input at time $t + 1$.
$\log p\big(y^{(1)}, y^{(2)} \mid x^{(1)}, x^{(2)}\big) = \log p\big(y^{(2)} \mid y^{(1)}, x^{(1)}, x^{(2)}\big) + \log p\big(y^{(1)} \mid x^{(1)}, x^{(2)}\big)$
• Teacher forcing lets us avoid back-propagation through time in models that lack
hidden-to-hidden connections.
Teacher forcing may still be applied to models that have hidden-to-hidden connections
as long as they have connections from the output at one time step to values computed
in the next time step.
• As soon as the hidden units become a function of earlier time steps, however, the BPTT
algorithm is necessary.
• Some models may thus be trained with both teacher forcing and BPTT.
100
4. Deep Learning and Neural Networks
101
4. Deep Learning and Neural Networks
Last Note
Any time we choose a specific machine learning algorithm, we are implicitly stating some
set of prior beliefs we have about what kind of function the algorithm should learn.
Choosing a deep model encodes a very general belief that the function we want to learn
should involve composition of several simpler functions. This can be interpreted from a
representation learning point of view as saying that we believe the learning problem
consists of discovering a set of underlying factors of variation that can in turn be described
in terms of other, simpler underlying factors of variation. Alternately, we can interpret the
use of a deep architecture as expressing a belief that the function we want to learn is a
computer program consisting of multiple steps, where each step makes use of the previous
step’s output. These intermediate outputs are not necessarily factors of variation but can
instead be analogous to counters or pointers that the network uses to organize its internal
processing. Empirically, greater depth does seem to result in better generalization.
102
References
1. Understanding Machine Learning: From Theory to Algorithms (Shai Shalev-Shwartz
and Shai Ben-David)
2. Deep Learning (Ian Goodfellow, Yoshua Bengio, and Aaron Courville)
3. “Machine Learning in Finance” course (www.coursera.org)
4. Advances in Financial Machine Learning (Marcos López de Prado)

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningDr. Radhey Shyam
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...ananth
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Madhav Mishra
 
Introduction To Applied Machine Learning
Introduction To Applied Machine LearningIntroduction To Applied Machine Learning
Introduction To Applied Machine Learningananth
 
Hot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisHot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisWriteMyThesis
 
Machine learning
Machine learningMachine learning
Machine learningRohit Kumar
 
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...Madhav Mishra
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its ApplicationsBhuvan Chopra
 
Machine learning ppt
Machine learning ppt Machine learning ppt
Machine learning ppt Poojamanic
 
acai01-updated.ppt
acai01-updated.pptacai01-updated.ppt
acai01-updated.pptbutest
 
An Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGIAn Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGIAnirban Santara
 
Eick/Alpaydin Introduction
Eick/Alpaydin IntroductionEick/Alpaydin Introduction
Eick/Alpaydin Introductionbutest
 
The IPO Model of Evaluation (Input-Process-Output)
The IPO Model of Evaluation (Input-Process-Output)The IPO Model of Evaluation (Input-Process-Output)
The IPO Model of Evaluation (Input-Process-Output)Janilo Sarmiento
 
introducción a Machine Learning
introducción a Machine Learningintroducción a Machine Learning
introducción a Machine Learningbutest
 
Building a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to ZBuilding a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to ZCharles Vestur
 
Machine Learning SPPU Unit 1
Machine Learning SPPU Unit 1Machine Learning SPPU Unit 1
Machine Learning SPPU Unit 1Amruta Aphale
 

Was ist angesagt? (20)

Launching into machine learning
Launching into machine learningLaunching into machine learning
Launching into machine learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
 
Introduction To Applied Machine Learning
Introduction To Applied Machine LearningIntroduction To Applied Machine Learning
Introduction To Applied Machine Learning
 
Hot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisHot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesis
 
Machine learning
Machine learningMachine learning
Machine learning
 
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its Applications
 
Lesson 33
Lesson 33Lesson 33
Lesson 33
 
Machine learning ppt
Machine learning ppt Machine learning ppt
Machine learning ppt
 
acai01-updated.ppt
acai01-updated.pptacai01-updated.ppt
acai01-updated.ppt
 
Machine learning
Machine learningMachine learning
Machine learning
 
An Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGIAn Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGI
 
Eick/Alpaydin Introduction
Eick/Alpaydin IntroductionEick/Alpaydin Introduction
Eick/Alpaydin Introduction
 
Machine learning
Machine learningMachine learning
Machine learning
 
The IPO Model of Evaluation (Input-Process-Output)
The IPO Model of Evaluation (Input-Process-Output)The IPO Model of Evaluation (Input-Process-Output)
The IPO Model of Evaluation (Input-Process-Output)
 
introducción a Machine Learning
introducción a Machine Learningintroducción a Machine Learning
introducción a Machine Learning
 
Building a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to ZBuilding a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to Z
 
Machine Learning SPPU Unit 1
Machine Learning SPPU Unit 1Machine Learning SPPU Unit 1
Machine Learning SPPU Unit 1
 

Ähnlich wie Machine Learning in Finance

Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research OverviewKathirvel Ayyaswamy
 
Machine learning with ADA Boost
Machine learning with ADA BoostMachine learning with ADA Boost
Machine learning with ADA BoostAman Patel
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.pptARVIND SARDAR
 
Machine Learning Chapter one introduction
Machine Learning Chapter one introductionMachine Learning Chapter one introduction
Machine Learning Chapter one introductionARVIND SARDAR
 
Machine Learning Contents.pptx
Machine Learning Contents.pptxMachine Learning Contents.pptx
Machine Learning Contents.pptxNaveenkushwaha18
 
chapter1-introduction1.ppt
chapter1-introduction1.pptchapter1-introduction1.ppt
chapter1-introduction1.pptSeshuSrinivas2
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationAnkit Gupta
 
Machine learning Chapter 1
Machine learning Chapter 1Machine learning Chapter 1
Machine learning Chapter 1JagadishPogu
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeOsama Ghandour Geris
 
Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning Saurabh Kaushik
 
A quick peek into the word of AI
A quick peek into the word of AIA quick peek into the word of AI
A quick peek into the word of AISubhendu Dey
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptxHchethankumar
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptxHchethankumar
 
1 machine learning demystified
1 machine learning demystified1 machine learning demystified
1 machine learning demystifiedDr Nisha Arora
 
Machine learning introduction
Machine learning introductionMachine learning introduction
Machine learning introductionAnas Jamil
 

Ähnlich wie Machine Learning in Finance (20)

Machine learning
Machine learningMachine learning
Machine learning
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research Overview
 
Lec1 intoduction.pptx
Lec1 intoduction.pptxLec1 intoduction.pptx
Lec1 intoduction.pptx
 
Machine learning with ADA Boost
Machine learning with ADA BoostMachine learning with ADA Boost
Machine learning with ADA Boost
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.ppt
 
ML_Lecture_1.ppt
ML_Lecture_1.pptML_Lecture_1.ppt
ML_Lecture_1.ppt
 
Machine Learning Chapter one introduction
Machine Learning Chapter one introductionMachine Learning Chapter one introduction
Machine Learning Chapter one introduction
 
Machine Learning Contents.pptx
Machine Learning Contents.pptxMachine Learning Contents.pptx
Machine Learning Contents.pptx
 
chapter1-introduction1.ppt
chapter1-introduction1.pptchapter1-introduction1.ppt
chapter1-introduction1.ppt
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Machine learning Chapter 1
Machine learning Chapter 1Machine learning Chapter 1
Machine learning Chapter 1
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-code
 
Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning
 
A quick peek into the word of AI
A quick peek into the word of AIA quick peek into the word of AI
A quick peek into the word of AI
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptx
 
Internship - Python - AI ML.pptx
Internship - Python - AI ML.pptxInternship - Python - AI ML.pptx
Internship - Python - AI ML.pptx
 
Machine_Learning.pptx
Machine_Learning.pptxMachine_Learning.pptx
Machine_Learning.pptx
 
1 machine learning demystified
1 machine learning demystified1 machine learning demystified
1 machine learning demystified
 
recent.pptx
recent.pptxrecent.pptx
recent.pptx
 
Machine learning introduction
Machine learning introductionMachine learning introduction
Machine learning introduction
 

Kürzlich hochgeladen

MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 

Kürzlich hochgeladen (20)

MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 

Machine Learning in Finance

  • 1.
  • 2. MACHINE LEARNING IN FINANCE CREATED BY: HAMED VAHEB FALL 2018
  • 3. 1 2. ML in Tech vs ML in Finance 3. Example: Bank Rating Prediction 4. Deep Learning and Neural Networks 5. Example: Neural Net Copula in Markoviz Problem 1. Fundamentals of Machine Learning
  • 5. 3 1. Fundamentals of Machine Learning Major AI Approaches • Logic and Rules-Based Approach • Hard-code knowledge about the world in formal languages • Top-down rules are created for computers • Computers reason about these rules automatically. Example: Project Cyc (Lenat and Guha, 1989)
  • 6. 4 1. Fundamentals of Machine Learning Major AI Approaches Example within law – Expert Systems • Turbotax • Personal income tax laws • Represented as logical computer rules • Software computers tax liability • Logic and Rules-Based Approach
  • 7. 5 1. Fundamentals of Machine Learning Learning: Process of converting experience into expertise or knowledge We wish to program “agents” that they can “learn” from input data ML is what computers use to learn about the outside world. Much like humans use math and physics for the same purpose. Agent = Architecture + Algorithm AI systems need the ability to acquire their own knowledge, by extracting patterns from raw data. • Machine Learning (Pattern-Based Approach) Major AI Approaches
  • 8. 6 1. Fundamentals of Machine Learning Machine Learning in our daily life
  • 9. 7 1. Fundamentals of Machine Learning Example: Email Spam Filter
  • 10. 8 1. Fundamentals of Machine Learning Example: Email Spam Filter
  • 11. 9 1. Fundamentals of Machine Learning Example: AARON
  • 12. 10 1. Fundamentals of Machine Learning Formal Definition Field of study that gives computers the ability to learn without being explicitly programmed Arthur Samuel (1959): Well posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. Tom Mitchell (1998): Example: Chess T: playing chess, E: agent playing with itself, P: number of wins / number of games
  • 13. 11 1. Fundamentals of Machine Learning Studies “intelligent agents” that perceive their environment and perform different actions to solve tasks that involve mimicking cognitive functions of human brain (Russell, Norvig) Artificial Intelligence Goals of AI Knowledge Representation Taking Actions, Planning Perception and Learning Natural Language Processing Automated Reasoning Ontology: the set of objects, relations, concepts Acting with visualizing future to achieve goals Perception from sensors, learning from experience Ability to read and understand human language Mimicking human reasoning for logical deductions M L
  • 14. 12 1. Fundamentals of Machine Learning Perception (learning), actions Communication (NLP) Knowledge/Ont ologies Reasoning, planning Applied AI Learns and acts autonomously Use sub- symbolic information Algorithmic theory of cognitive acts Solves any intellectual tasks Artificial General Intelligence (AGI) Present Future
  • 15. 15 13 1. Fundamentals of Machine Learning Agent Environment Perception Actions Perception Tasks: There is a fixed action (Perception via The physical world (through sensors), or digital data (read from a disk)) Action Tasks: There are multiple possible actions involve planning and forecasting the future involve sub-tasks of learning, for sequential (multi-step) problems (Actions can be fixed, or can vary. May or may not change the environment)
  • 16. 14 1. Fundamentals of Machine Learning When do we need ML (instead of directly program)? • Complexity 1. Tasks performed by Animals/Humans: Can’t extract a well defined program. (Driving, speech recognition, image understanding) 2. Tasks beyond Human Capabilities: Analysis of very large and complex datasets (Astronomical and genomic data, turning medical archives to medical knowledge, weather prediction) • Adaptivity adaptive to changes in the environment they interact with. (handwritten text, spam detection, speech recognition)
  • 17. 15 1. Fundamentals of Machine Learning Types of learning • Supervised: environment (teacher) that “supervises” the learner by providing the extra information (“labels”). We have train (seen) and test (unseen) data. p(y|x) • Unsupervised: come up with summary or a compressed version of data, learn probability distribution, clustering (denoise, synthesis) • Reinforcement: Intermediary. There is teacher but with partial feedback (reward), sequence of actions. (describe chess’s setting position value, Self-drive)
  • 18. 16 1. Fundamentals of Machine Learning Supervised Learning Most common types
  • 19. 17 1. Fundamentals of Machine Learning Linear Regression Example: satisfaction rate of company employees Training data: company employees have rated their satisfaction on a scale of 1 to 100 Predictor:
  • 20. 18 1. Fundamentals of Machine Learning Linear Regression Example: satisfaction rate of company employees Let’s start with
  • 21. 19 1. Fundamentals of Machine Learning Cost Function: As we minimized J (using Gradient Descent, the fitting line gets better and better Linear Regression Example: satisfaction rate of company employees
  • 22. 20 1. Fundamentals of Machine Learning Best Line: Linear Regression Example: satisfaction rate of company employees
  • 23. 21 1. Fundamentals of Machine Learning Minimization Algorithm: Gradient Descent Linear Regression Example: satisfaction rate of company employees $g(\theta) = \frac{\partial}{\partial \theta} J(\theta)$, update rule: $\theta \leftarrow \theta - \alpha\, g(\theta)$
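A minimal numpy sketch of this minimization, assuming the squared-error cost above; the data values, learning rate, and iteration count are illustrative stand-ins, not the slides' actual numbers:

```python
import numpy as np

# Hypothetical stand-in for the employee-satisfaction data.
x = np.array([31.0, 42.0, 55.0, 63.0, 78.0, 86.0])  # e.g. salary in $k (assumed)
y = np.array([22.0, 33.0, 41.0, 48.0, 59.0, 71.0])  # satisfaction on a 1-100 scale

theta0, theta1 = 0.0, 0.0   # start from an arbitrary line
alpha = 1e-4                # learning rate

for _ in range(200_000):
    err = theta0 + theta1 * x - y   # h_theta(x) - y for every example
    g0 = err.mean()                 # dJ/dtheta0
    g1 = (err * x).mean()           # dJ/dtheta1
    theta0 -= alpha * g0            # step opposite the gradient
    theta1 -= alpha * g1

print(theta0, theta1)  # approaches the least-squares line as J shrinks
```

With a convex J like this one, every sufficiently small step decreases the cost, which is why the fitted line keeps improving across iterations.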
  • 24. 22 1. Fundamentals of Machine Learning Plot of J Linear Regression Example: satisfaction rate of company employees In this case, J is convex, so there are no local minima other than the global minimum!
  • 25. 23 1. Fundamentals of Machine Learning J contours Linear Regression Example: satisfaction rate of company employees
  • 26. 24 1. Fundamentals of Machine Learning Iterations Linear Regression Example: satisfaction rate of company employees For more visualizations: https://towardsdatascience.com/machine-learning-fundamentals-via-linear-regression-41a5d11f5220
  • 27. 25 1. Fundamentals of Machine Learning Unsupervised Learning Example: Dimension Reduction
  • 28. 26 1. Fundamentals of Machine Learning Supervised vs Unsupervised Clustering Classification
  • 29. 27 1. Fundamentals of Machine Learning Reinforcement
  • 30. 28 1. Fundamentals of Machine Learning Linear Regression for Stock Market
  • 31. 29 1. Fundamentals of Machine Learning Machine Learning Landscape (Perception Tasks) Supervised Learning. Regression: learn a regression function; given input/output pairs. Classification: learn a classification function; given input/output pairs. Unsupervised Learning. Clustering: learn a cluster assignment (k: the number of clusters); given inputs only. Representation Learning: learn a representer function; given inputs only.
  • 32. 30 1. Fundamentals of Machine Learning Machine Learning Landscape (Action Tasks) Reinforcement Learning: optimization of a strategy for a task. Inverse Reinforcement Learning (IRL): learn objectives from behavior.
  • 33. 31 1. Fundamentals of Machine Learning Machine Learning Examples (Perception Tasks) Supervised Learning. Regression: demand forecasting. Classification: spam detection, image recognition, document classification. Unsupervised Learning. Clustering: customer segmentation, anomaly detection. Representation Learning: text recognition, machine translation.
  • 34. 32 1. Fundamentals of Machine Learning Machine Learning Examples (Action Tasks) Reinforcement Learning: robotics, computational advertising. IRL: imitation learning for robotics.
  • 35. 33 1. Fundamentals of Machine Learning Machine Learning Methods (Perception Tasks) Supervised Learning. Regression: linear regression, trees (CART), SVM/SVR, ensemble methods, neural networks. Classification: logistic regression, naive Bayes, nearest neighbors, SVM, decision trees, ensemble methods, neural networks. Unsupervised Learning. Clustering: K-means, hierarchical clustering, Gaussian mixture models, hidden Markov models, neural networks. Representation Learning: PCA, factor models, ICA, dimension reduction, manifold learning, neural networks.
  • 36. 34 1. Fundamentals of Machine Learning Machine Learning Methods (Action Tasks) Reinforcement Learning: model-based RL, model-free RL, batch/online RL, RL with linear models, neural networks. IRL: model-based IRL, model-free IRL, batch/online IRL, MaxEnt IRL, neural networks.
  • 37. 35 1. Fundamentals of Machine Learning Machine Learning in Finance (Perception Tasks) Supervised Learning. Regression: earnings prediction, credit loss forecasting, algorithmic trading. Classification: rating prediction, default modeling, credit card fraud detection, anti-money laundering. Unsupervised Learning. Clustering: customer segmentation, stock segmentation. Representation Learning: factor modeling, de-noising, regime change detection.
  • 38. 36 1. Fundamentals of Machine Learning Machine Learning in Finance (Action Tasks) Reinforcement Learning: trading strategies, asset management. IRL: reverse engineering of consumer behavior, trading strategies, …
  • 39. 37 1. Fundamentals of Machine Learning ML by Financial Application Areas (Perception Tasks) Banking (retail, P2P lending, commercial and investment): customer segmentation, loan defaults, credit card defaults, fraud detection, anti-money laundering, rating prediction, default modeling, client data mining, recommender systems. Asset Management: portfolio optimization, multi-period portfolio optimization, factor modeling, de-noising, regime change detection, stock segmentation, derivatives trading.
  • 40. 38 1. Fundamentals of Machine Learning ML by Financial Application Areas (Action Tasks) Quantitative Trading: profit-maximizing trading execution, optimal trade execution, quantitative trading strategies, earnings prediction, algorithmic trading, optimal market making.
  • 41. 39 2. ML in Tech vs ML in Finance ML in Tech • Perception (image recognition, NLP tasks, etc.). Methods: SL/UL • Action (computational advertising, robotics, self-driving cars, etc.). Methods: SL/UL/RL (Diagram: ML in Tech covers image recognition, NLP tasks, computational advertising, robotics; ML in Finance covers forecasting tasks and valuation tasks.)
  • 42. 40 2. ML in Tech vs ML in Finance ML in Finance, Perception: Forecasting tasks • Security price prediction (stocks, bonds, commodities, etc.). Methods: SL/UL • Corporate actor action prediction (dividends, mergers, defaults, etc.). Methods: SL/UL/RL • Individual actor action prediction (loan defaults, fraud, AML, etc.). Methods: SL/UL/RL
  • 43. 41 2. ML in Tech vs ML in Finance ML in Finance, Perception: Valuation tasks • Asset valuation (stocks, futures, commodities, bonds, etc.); related to forecasting. Methods: SL/UL • Derivatives valuation. Methods: SL/UL/RL
  • 44. 42 2. ML in Tech vs ML in Finance Big Data? ML in Tech: typically yes. ML in Finance: typically no. Data for ML in Tech are of huge size; most data for ML in Finance are medium-sized, except in high-frequency trading (HFT).
  • 45. 43 2. ML in Tech vs ML in Finance Stationary data? ML in Tech: typically yes. ML in Finance: typically no. As most financial data are non-stationary, collecting more data, even when possible, is not always helpful.
  • 46. 44 2. ML in Tech vs ML in Finance Noise-to-signal ratio: ML in Tech: typically low. ML in Finance: typically high. Financial data are typically quite noisy, and the "true" signals are unobservable!
  • 47. 45 2. ML in Tech vs ML in Finance Interpretability of results: ML in Tech: typically not important, or not the main focus. ML in Finance: typically either desired or required. Interpretability of results is: • Desired for trading • Required for regulation (General Data Protection Regulation, 2018)
  • 48. 46 2. ML in Tech vs ML in Finance Action (RL) tasks: ML in Tech: low-dimensional state-action space, low uncertainty. ML in Finance: high-dimensional state-action space, high uncertainty. • ML in Tech: dimensionality of the state-action space is usually in the hundreds. The action space is often discrete (except in robotics). Uncertainty is low to moderate (think self-driving cars!) • ML in Finance: dimensionality of the state-action space is often in the thousands. The action space is usually continuous. Uncertainty is low to high (think Brexit!)
  • 49. 47 1. Fundamentals of Machine Learning A Gentle Model (Statistical Learning Framework) The learner's input: • Domain set X (features) • Label set Y (discrete or continuous) • Training data $S = ((x_1, y_1), \dots, (x_m, y_m))$, also called the training set (seen) The learner's output: • A prediction function (hypothesis) $h : X \to Y$ Also part of the framework: • Data-generation model: a probability distribution D over the domain set • Measure of success: the error of a predictor, given by a loss function
  • 50. 48 1. Fundamentals of Machine Learning Types of Error • The ability to perform well on previously unobserved inputs is called generalization • What separates machine learning from optimization is that we want the generalization error to be low as well • We estimate the generalization error with a test set of examples collected separately from the training set Training error: the error measured on the training set, $L_S(h)$. Generalization error (test error): $L_{D,f}(h) \stackrel{\mathrm{def}}{=} \mathbb{P}_{x \sim D}\left[\, h(x) \neq f(x) \,\right]$
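A small sketch of estimating both errors empirically; the synthetic data and choice of classifier are assumptions made only to have something concrete to measure:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Keep a test set that the learner never sees during fitting.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

h = LogisticRegression().fit(X_tr, y_tr)
train_error = 1 - h.score(X_tr, y_tr)  # L_S(h): 0-1 loss on seen data
test_error = 1 - h.score(X_te, y_te)   # empirical estimate of L_{D,f}(h)
print(train_error, test_error)         # expect test >= train on average
```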
  • 51. 49 1. Fundamentals of Machine Learning • We sample the training set, then use it to choose the parameters that reduce the training set error. Under this process, the expected test error is greater than or equal to the expected training error • The factors determining how well a machine learning algorithm will perform are its ability to 1. Make the training error small (failing here is underfitting) 2. Make the gap between training and test error small (failing here is overfitting)
  • 52. 50 1. Fundamentals of Machine Learning Papayas Example • Consider the memorizing predictor $h_S$, which predicts label 1 on only a finite number of instances (the training points) and 0 elsewhere • No matter what the sample is, $L_S(h_S) = 0$ yet $L_D(h_S) = \frac{1}{2}$ • We have found a predictor whose performance on the training set is excellent, yet its performance on the true "world" is very poor
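The claim can be checked by simulation. A sketch of the memorizing predictor under the setup described in the accompanying editor's note (gray square of area 2, inner blue square of area 1, uniform distribution); the concrete coordinates are an assumption consistent with those areas:

```python
import numpy as np

rng = np.random.default_rng(1)
s = np.sqrt(2.0)                   # gray square side, so its area is 2
lo, hi = (s - 1) / 2, (s + 1) / 2  # centered inner blue square of area 1

def f(X):
    """True labeling function: 1 iff the point lies in the inner blue square."""
    return ((X > lo) & (X < hi)).all(axis=1).astype(int)

X_train = rng.uniform(0, s, size=(50, 2))
y_train = f(X_train)

def h_S(X):
    """Memorizer: replay the stored label on training points, predict 0 elsewhere."""
    y = np.zeros(len(X), dtype=int)
    for i, x in enumerate(X):
        match = np.all(X_train == x, axis=1)
        if match.any():
            y[i] = y_train[match.argmax()]
    return y

print((h_S(X_train) != y_train).mean())      # L_S(h_S) = 0.0
X_fresh = rng.uniform(0, s, size=(20_000, 2))
print((h_S(X_fresh) != f(X_fresh)).mean())   # ~0.5, i.e. L_D(h_S) = 1/2
```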
  • 53. 51 1. Fundamentals of Machine Learning Example
  • 54. 52 1. Fundamentals of Machine Learning • Overfitting occurs when our hypothesis fits the training data "too well" (perhaps like the everyday experience that a person who provides a perfectly detailed explanation for each of their actions may raise suspicion). Altering Capacity • A model's capacity is its ability to fit a wide variety of functions. • Capacity is controlled by restricting the hypothesis class (its size or complexity), VC dimension, techniques, program bits, … • Restricting H to axis-aligned rectangles guarantees not to overfit • If H is a finite class, then ERM_H will not overfit
  • 55. 53 1. Fundamentals of Machine Learning Bias – Complexity Tradeoff Error Decomposition Approximation Error • Due to underfitting • The minimum risk achievable by a predictor in the hypothesis class • How much risk we incur because we restrict ourselves to a specific class (bias) • Depends on the chosen hypothesis class • Reflects the quality of prior knowledge Estimation Error • Due to overfitting • The difference between the approximation error and the error achieved by the learned predictor • It exists because the training error is only an estimate of the generalization error • Depends on the training set size and on the size or complexity of the hypothesis class
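A numerical illustration of the decomposition, using polynomial degree as a stand-in for hypothesis-class complexity (the target function and noise level are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def make_data(n):
    x = np.sort(rng.uniform(-1, 1, n))
    return x, np.sin(3 * x) + rng.normal(0, 0.2, n)

x_tr, y_tr = make_data(20)    # small training set
x_te, y_te = make_data(500)   # large fresh set approximating the true risk

for degree in (1, 3, 15):     # small, medium, large capacity
    coeffs = np.polyfit(x_tr, y_tr, degree)   # ERM over degree-d polynomials
    tr = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    te = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    print(f"degree {degree:2d}: train {tr:.3f}  test {te:.3f}")

# degree 1: both errors high (approximation error dominates: underfitting)
# degree 15: tiny train error, inflated test error (estimation error: overfitting)
# degree 3: a balance between the two terms
```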
  • 56. 54 1. Fundamentals of Machine Learning Bias – Complexity Tradeoff
  • 57. 55 1. Fundamentals of Machine Learning Model Capacity, Data Complexity Bias – Complexity Tradeoff
  • 58. 56 1. Fundamentals of Machine Learning Generalization / Design Matrix • A model is trained using only a training set • A test set is used to estimate the algorithm's ability to generalize, i.e., to perform well on unseen data.
  • 59. 57 1. Fundamentals of Machine Learning • To generalize well, machine learning algorithms need to be guided by prior beliefs about what kind of function they should learn. • the stronger the prior knowledge (or prior assumptions) that one starts the learning process with, the easier it is to learn from further examples. However, the stronger these prior assumptions are, the less flexible the learning is (it is bound, a priori, by the commitment to these assumptions.) Prior Knowledge • Restricting our hypothesis class (Finiteness, VC Dimension) • Assumption on distribution Examples
  • 60. 58 1. Fundamentals of Machine Learning Prior Knowledge Bait Shyness The rats seem to have some “built in” prior knowledge telling them that, while temporal correlation between food and nausea can be causal, it is unlikely that there would be a causal relationship between food consumption and electrical shocks or between sounds and nausea.
  • 61. 59 1. Fundamentals of Machine Learning Pigeon Superstition Prior Knowledge
  • 62. 60 1. Fundamentals of Machine Learning ML vs Statistical Modeling
  • 63. 61 3. Bank Failures Example FDIC • US-based commercial banks are regulated by the FDIC • The FDIC provides insurance for commercial banks and charges them an insurance premium according to an internal (and non-public) rating based on the CAMELS supervisory system
  • 64. 62 3. Bank Failures Example Importance
  • 65. 63 3. Bank Failures Example CAMELS • Rating 1: best; Rating 5: worst • A bank rated 4 or 5 is likely to be closed soon. Capital inadequacy is the most common cause of a bank closure (other reasons: violation of financial rules, management failures). If the FDIC decides to close a bank, it takes over both its assets and its liabilities and then tries to sell the assets at the best price possible to pay off the liabilities. • CAMELS ratings are not publicly known; however, Call Reports are available. • In addition, the FDIC provides historical data for failed banks: (https://www.fdic.gov/bank/individual/failed/)
  • 66. 64 3. Bank Failures Example Call Report • 28 schedules in total • Form FFIEC 031: for banks with both domestic (US) and foreign offices • Form FFIEC 041: for banks with domestic (US) offices only
  • 67. 65 3. Bank Failures Example Call Report Content (Schedules)
  • 68. 66 3. Bank Failures Example Call Report Content (Schedules)
  • 69. 67 3. Bank Failures Example Correlation Matrix of Features In this problem we want to predict failed (defaulted) banks versus non-failed banks. NI: net income; log_TA: logarithm of total assets; TL: total loans; NPL: non-performing loans; Assessment Base: average consolidated assets minus tangible equity; …
  • 70. 68 3. Bank Failures Example Defaulter by log_TA in Training data
  • 71. 68 3. Bank Failures Example Defaulter by log_TA in Test data
  • 73. 70 3. Bank Failures Example Logistic regression is used for classification.
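A hedged sketch of how such a classifier might be fit; the file name, label column, and feature list (taken from the correlation-matrix slide) are hypothetical placeholders, not the actual pipeline behind these slides:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical: call-report features with a 0/1 "failed" label per bank.
df = pd.read_csv("call_reports.csv")          # placeholder path
features = ["log_TA", "NI", "TL", "NPL"]      # names from the slide above
X, y = df[features].values, df["failed"].values

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
p_fail = clf.predict_proba(X_te)[:, 1]        # estimated P(failure | features)
print("test AUC:", roc_auc_score(y_te, p_fail))
print(dict(zip(features, clf.coef_[0])))      # direction of each feature's effect
```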
  • 74. 71 3. Bank Failures Example Training
  • 75. 72 3. Bank Failures Example Training
  • 76. 73 3. Bank Failures Example Testing
  • 78. 75 4. Deep Learning and Neural Networks • Representation The performance of simple machine learning algorithms depends heavily on the representation of the data they are given. Goal: separate the factors of variation. Problem: the factors of variation influence every single piece of data we are able to observe (e.g., the pixels of an image of a red car look almost black at night). Most applications require us to disentangle the factors of variation and discard the ones that we do not care about. Representation Learning: use ML to discover not only the mapping from representation to output but also the representation itself. Quintessential example: the autoencoder, the combination of an encoder function, which converts the input data into a different representation, and a decoder function, which converts the new representation back into the original format.
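A minimal encoder/decoder sketch in Keras; the dimensions, data, and training settings are placeholders chosen only to make the encoder-decoder structure concrete:

```python
import numpy as np
import tensorflow as tf

input_dim, code_dim = 784, 32   # e.g. flattened 28x28 images (assumed)

encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(input_dim,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(code_dim, activation="relu"),      # the new representation
])
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(code_dim,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(input_dim, activation="sigmoid"),  # back to input format
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")

x = np.random.rand(1000, input_dim).astype("float32")      # placeholder data
autoencoder.fit(x, x, epochs=5, batch_size=64, verbose=0)  # target = the input itself
codes = encoder.predict(x)  # the learned representation, shape (1000, 32)
```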
  • 79. 76 4. Deep Learning and Neural Networks Representation: Example
  • 80. 77 4. Deep Learning and Neural Networks Deep learning solves this problem by introducing representations that are expressed in terms of other, simpler representations. (build complex concepts out of simpler concepts. ) Example
  • 81. 77 4. Deep Learning and Neural Networks • Depth Depth enables the computer to learn a multistep computer program. Layer: the state of the computer's memory after executing another set of instructions in parallel. Networks with greater depth can execute more instructions in sequence (later instructions can refer back to the results of earlier instructions). Measuring Depth 1. Depth of the computational graph: the number of sequential instructions (the length of the longest path through a flow chart) 2. Depth of the concepts graph: describes how concepts are related to each other. • The depth of the flowchart of the computations needed to compute the representation of each concept may be much deeper than the graph of the concepts themselves
  • 82. 78 4. Deep Learning and Neural Networks Depth = 3 Depth = 2
  • 83. 79 4. Deep Learning and Neural Networks
  • 84. 80 4. Deep Learning and Neural Networks
  • 85. 81 4. Deep Learning and Neural Networks History of DL • Dates back to the 1940s (it only appears to be new) • Different names: 1. 1940s – 1960s: cybernetics 2. 1980s – 1990s: connectionism 3. Since 2006: deep learning 4. As learning algorithms for biological learning (models of how learning happens or could happen in the brain): artificial neural networks Neural Perspective on DL 1. The brain provides a proof that intelligent behavior is possible 2. Reverse-engineer the computational principles behind the brain • Today, neuroscience is regarded as an important source of inspiration for DL researchers, but it is no longer the predominant guide for the field, because to obtain a deep understanding of the actual algorithms used by the brain, we would need to be able to monitor the activity of (at the very least) thousands of interconnected neurons simultaneously. • The basic idea of having many computational units that become intelligent only via their interactions with each other is inspired by the brain • The 1980s algorithms work quite well, but this was not apparent circa 2006 because they were too computationally costly.
  • 86. 82 4. Deep Learning and Neural Networks • Increasing Dataset sizes: Some skill is required to get good performance from a DL algorithm. Fortunately, the amount of skill required reduces as the amount of training data increases. The age of “Big Data” has made ML much easier because the key burden of statistical estimation (generalizing to new data after observing only a small amount) has been considerably lightened. • Increasing Model Sizes: animals become intelligent when many of their neurons work together. Larger networks are able to achieve higher accuracy on more complex tasks. History of DL
  • 87. 83 4. Deep Learning and Neural Networks Challenges Motivating DL • Curse of Dimensionality: a statistical challenge arises because the number of possible configurations of x is much larger than the number of training examples.
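A back-of-the-envelope illustration; the choice of 10 distinguishable values per feature is an arbitrary assumption:

```python
# With 10 distinguishable values per feature, the number of possible
# configurations of x grows as 10**n and quickly dwarfs any training set.
m = 1_000_000  # a generously sized dataset
for n in (2, 5, 10, 20):
    cells = 10 ** n
    print(f"{n:2d} features: {cells:.1e} cells, "
          f"examples per cell <= {m / cells:.1e}")
```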
  • 88. 84 4. Deep Learning and Neural Networks www.playground.tensorflow.org • Local Constancy and Smoothness Among the most widely used of these implicit “priors” is the smoothness prior, or local constancy prior. It states that the function we learn should not change very much within a small region. Much of the modern motivation for deep learning is derived from studying the limitations of local template matching and how deep models are able to succeed in cases where local template matching fails (Bengio et al., 2006b).
  • 89. 85 4. Deep Learning and Neural Networks Neural Networks Feedforward Neural Network (MLP) Goal: approximate some function $f^*$ with some $f(x; \theta)$. Feedforward: information flows through the function with no feedback connections. Neural: loosely inspired by neuroscience. Network: composing together many different functions, e.g. $f(x) = f^{(3)}(f^{(2)}(f^{(1)}(x)))$ ($f^{(i)}$ is the $i$'th layer and the final layer is the output layer). Depth: the overall length of the chain. Width: the dimensionality of the hidden layers. Hidden layer: the training data does not show the desired output for each of these layers • During NN training, we drive $f(x)$ to match $f^*(x)$ • Each hidden layer is vector-valued
  • 90. 86 4. Deep Learning and Neural Networks Feedforward Neural Network (MLP) (Diagram: a chain of layers $f^{(1)}, f^{(2)}, f^{(3)}$ illustrating depth and width)
  • 91. 87 Feedforward Neural Network (MLP) MLP as a kernel technique: extend linear models to represent nonlinear functions of x by applying the linear model not to x itself, but to a transformed input $\phi(x)$. How to choose $\phi$: 1. Generic: infinite-dimensional (based on the RBF kernel). Enough capacity, but poor generalization 2. Manually engineer $\phi$: requires decades of human effort for each separate task 3. Learn $\phi$: this is an example of a deep feedforward network, with $\phi$ defining a hidden layer • The advantage of the third approach is that the human designer only needs to find the right general function family rather than finding precisely the right function.
  • 92. 88 4. Deep Learning and Neural Networks Feedforward Neural Network (MLP) Example: Learning XOR • After solving: $f(x; W, c, w) = w^{\top} \max\{0,\, W^{\top} x + c\}$, where $W = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, $c = (0, -1)^{\top}$ and $w = (1, -2)^{\top}$ • Most neural networks establish a nonlinear function by using an affine transformation controlled by learned parameters, followed by a fixed nonlinear function called an activation function: $h = g(W^{\top} x + c)$, where $g(z) = \max\{0, z\}$ (ReLU)
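The quoted solution can be verified directly; this sketch assumes the weights from Goodfellow et al.'s XOR example, which the slide appears to follow:

```python
import numpy as np

# f(x) = w^T max(0, W^T x + c): one hidden layer with a ReLU activation.
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([0.0, -1.0])
w = np.array([1.0, -2.0])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
h = np.maximum(0.0, X @ W + c)  # affine transformation, then fixed nonlinearity
print(h @ w)                    # -> [0. 1. 1. 0.], exactly XOR
```

Note that without the ReLU the composition of the two affine maps would collapse to a single linear model, which cannot represent XOR.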
  • 93. 89 4. Deep Learning and Neural Networks When $x_1 = 0$, the model's output must increase as $x_2$ increases. When $x_1 = 1$, the model's output must decrease as $x_2$ increases.
  • 94. 90 4. Deep Learning and Neural Networks
  • 95. 91 4. Deep Learning and Neural Networks Recurrent Neural Network (RNN) • For processing a sequence of values $x^{(1)}, \dots, x^{(\tau)}$ (the length $\tau$ can be variable) • Parameter sharing: using the same parameters for more than one function in a model (tied weights). If we had separate parameters for each value of the time index, we could not generalize to sequence lengths not seen during training, nor share statistical strength across different sequence lengths and across different positions in time. Such sharing is particularly important when a specific piece of information can occur at multiple positions within the sequence ("I went to Nepal in 2009" and "In 2009, I went to Nepal") • Each member of the output is a function of the previous members of the output. Each member of the output is produced using the same update rule applied to the previous outputs. • RNNs include cycles that represent the influence of the present value of a variable on its own value at a future time step. • Essentially any function involving recurrence can be considered a recurrent neural network.
  • 96. 92 4. Deep Learning and Neural Networks Parameter Sharing Recurrent Neural Network (RNN)
  • 97. 93 4. Deep Learning and Neural Networks Unfolding Computational Graphs The unfolding process thus introduces two major advantages: 1. Regardless of the sequence length, the learned model always has the same input size, because it is specified in terms of transition from one state to another state, rather than specified in terms of a variable-length history of states. 2. It is possible to use the same transition function f with the same parameters at every time step. Recurrent Neural Network (RNN)
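A minimal forward pass making the parameter sharing explicit; the shapes, initialization, and data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 5, 2
U = rng.normal(size=(n_hid, n_in))   # input-to-hidden weights
W = rng.normal(size=(n_hid, n_hid))  # hidden-to-hidden (the recurrence)
V = rng.normal(size=(n_out, n_hid))  # hidden-to-output weights
b, c = np.zeros(n_hid), np.zeros(n_out)

def rnn_forward(xs):
    """Unfold h(t) = tanh(W h(t-1) + U x(t) + b) with the SAME U, W, V at every step."""
    h = np.zeros(n_hid)
    outputs = []
    for x in xs:                   # works for any sequence length
        h = np.tanh(W @ h + U @ x + b)
        outputs.append(V @ h + c)  # o(t) depends on the entire past through h
    return outputs

seq = [rng.normal(size=n_in) for _ in range(4)]  # a toy 4-step sequence
print(rnn_forward(seq)[-1])
```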
  • 98. 94 4. Deep Learning and Neural Networks Some Types of RNNs Recurrent Neural Network (RNN) I. Produce an output at each time step and have recurrent connections between hidden units II. Produce an output at each time step and have recurrent connections only from the output at one time step to the hidden units at the next time step III. With recurrent connections between hidden units, read an entire sequence and then produce a single output • The network with recurrent connections only from the output at one time step to the hidden units at the next time step (II) is strictly less powerful because it lacks hidden-to-hidden recurrent connections; for example, it cannot simulate a universal Turing machine. It requires that the output units capture all the information about the past that the network will use to predict the future.
  • 99. 95 4. Deep Learning and Neural Networks I
  • 100. 96 4. Deep Learning and Neural Networks II
  • 101. 97 4. Deep Learning and Neural Networks
  • 102. 98 4. Deep Learning and Neural Networks III
  • 103. 99 4. Deep Learning and Neural Networks Teacher Forcing Recurrent Neural Network (RNN) A procedure that emerges from the maximum likelihood criterion, in which during training the model receives the ground-truth output $y^{(t)}$ as input at time $t+1$: $\log p\left(y^{(1)}, y^{(2)} \mid x^{(1)}, x^{(2)}\right) = \log p\left(y^{(2)} \mid y^{(1)}, x^{(1)}, x^{(2)}\right) + \log p\left(y^{(1)} \mid x^{(1)}, x^{(2)}\right)$ • Teacher forcing avoids back-propagation through time in models that lack hidden-to-hidden connections. It may still be applied to models that have hidden-to-hidden connections, as long as they have connections from the output at one time step to values computed in the next time step. • As soon as the hidden units become a function of earlier time steps, however, the BPTT algorithm is necessary. • Some models may thus be trained with both teacher forcing and BPTT.
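A sketch of the training-time loop; the shapes and data are assumptions, and the only point being made is which value feeds the recurrence:

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_hid, n_out, T = 3, 5, 2, 6
U = rng.normal(size=(n_hid, n_in))   # input -> hidden
R = rng.normal(size=(n_hid, n_out))  # previous OUTPUT -> hidden (no hidden-to-hidden)
V = rng.normal(size=(n_out, n_hid))  # hidden -> output

xs = rng.normal(size=(T, n_in))      # toy input sequence
ys = rng.normal(size=(T, n_out))     # toy ground-truth outputs

y_prev = np.zeros(n_out)
for t in range(T):
    h = np.tanh(U @ xs[t] + R @ y_prev)
    o = V @ h                        # model's prediction at time t
    # Teacher forcing: condition the next step on the true y(t),
    # not on the model's own output o, so each step can be trained in isolation.
    y_prev = ys[t]
# At test time one would instead feed back y_prev = o (the model's own output).
```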
  • 104. 100 4. Deep Learning and Neural Networks
  • 105. 101 4. Deep Learning and Neural Networks Last Note Any time we choose a specific machine learning algorithm, we are implicitly stating some set of prior beliefs we have about what kind of function the algorithm should learn. Choosing a deep model encodes a very general belief that the function we want to learn should involve the composition of several simpler functions. This can be interpreted from a representation learning point of view as saying that we believe the learning problem consists of discovering a set of underlying factors of variation that can in turn be described in terms of other, simpler underlying factors of variation. Alternatively, we can interpret the use of a deep architecture as expressing a belief that the function we want to learn is a computer program consisting of multiple steps, where each step makes use of the previous step's output. These intermediate outputs are not necessarily factors of variation but can instead be analogous to counters or pointers that the network uses to organize its internal processing. Empirically, greater depth does seem to result in better generalization.
  • 106. 102 References 1. Understanding Machine Learning: From Theory to Algorithms (Shai Shalev-Shwartz and Shai Ben-David) 2. Deep Learning (Ian Goodfellow, Yoshua Bengio, and Aaron Courville) 3. "Machine Learning in Finance" course (www.coursera.org) 4. Advances in Financial Machine Learning (Marcos López de Prado)

Editor's notes

  1. http://aigamedev.com/open/article/top-down-vs-bottom-up-design/
  2. Other examples: anomaly detection (fraud), suggestions on social media, Google News, learning someone's tastes
  3. Another fancy example: speech synthesis
  4. Knowledge Representation: representing information about the world in a form that a computer system can utilize to solve complex tasks, such as diagnosing a medical condition or having a dialog in a natural language. This field incorporates findings from psychology about how humans solve problems and represent knowledge in order to design formalisms that will make complex systems easier to design and build. It also incorporates findings from logic to automate various kinds of reasoning, such as the application of rules or the relations of sets and subsets. (Knowledge-based approach) Automated Reasoning: The study of automated reasoning helps produce computer programs that allow computers to reason completely, or nearly completely, automatically. Although automated reasoning is considered a sub-field of artificial intelligence, it also has connections with theoretical computer science, and even philosophy. NLP: Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data. Challenges in natural language processing frequently involve speech recognition, natural language understanding, and natural language generation.
  5. You can't say to an "Applied AI" agent: go out and find out what you do on your own. Example of sub-symbolic information (Yann LeCun): the phrase "He took his bag and left the room" implies in particular that the person walked out of the room rather than, for instance, jumping out of the window or teleporting to another planet. Other, even more remote, tasks include algorithmic theories of creativity, curiosity, and surprise, as pursued by Juergen Schmidhuber. One expects that an AGI will be able to solve arbitrary intellectual tasks, which is expected around 2045 according to Ray Kurzweil, a famous entrepreneur and futurologist.
  6. Types are based on the agent's interaction with the environment. Program synthesis is the task of automatically constructing a program that satisfies a given high-level specification. In contrast to other automatic programming techniques, the specifications are usually non-algorithmic statements in an appropriate logical calculus. Often, program synthesis employs techniques from formal verification. For a reinforcement learning example, one may try to learn a value function that describes, for each setting of a chess board, the degree by which White's position is better than Black's. Yet the only information available to the learner at training time is positions that occurred throughout actual chess games, labeled by who eventually won that game.
  7. In the IRL setting, everything is the same as in direct reinforcement learning, but there is no information on the rewards received by the agent upon taking actions. Instead, we are simply given a sequence of states of the environment and actions by the agent, and we are asked what objective the agent pursued when performing these actions.
  8. Demand Forecast: understand and predict customer demand to optimize supply decisions by corporate supply chain and business management. Machine Translation example: Google Translate
  9. As you can see, NNs are present in all categories. By the universal approximation theorem, a very wide class of functions can be represented by a NN.
  10. Regression is the most commonly used algorithm in Finance
  11. Asset management refers to a systematic approach to the governance and realization of value from the things that a group or entity is responsible for over their whole life cycles. It may apply both to tangible assets (physical objects such as buildings or equipment) and to intangible assets (such as human capital, intellectual property, goodwill, and/or financial assets).
  12. Quantitative Trading (Algorithmic Trading): Algorithmic trading is a method of executing a large order (too large to fill all at once) using automated pre-programmed trading instructions accounting for variables such as time, price, and volume
  13. Most common uses
  14. The reason that reinforcement learning has applications in perception tasks in finance: in finance, expectations regarding the future are sometimes embedded in the perception of today's environment. If this future is influenced by actions of rational agents, RL might be an appropriate framework (today's perception affects the future price). Rational financial AI agents: these agents learn to perceive the environment, that is, to digest financial and sometimes non-financial data, and to perform certain actions to maximize some measure of performance
  15. Interpretability is also important in sensitive (life-critical) or moral problems. For more information, see this:
  16. Each pair in the training data S is generated by first sampling a point xi according to D and then labeling it by f. The domain set is the set of objects that we may wish to label, for example, the set of all papayas. It is important to note that we do not assume that the learner knows anything about the distribution D. We assume that there is some "correct" labeling function, f : X -> Y, and that yi = f(xi) for all i. This assumption can be relaxed.
  17. [m] = {1, …, m}
  18. The area of the gray square in the picture is 2 and the area of the blue square is 1. Assume that the probability distribution D is such that instances are distributed uniformly within the gray square and the labeling function, f, determines the label to be 1 if the instance is within the inner blue square, and 0 otherwise.
  19. The first component reflects the quality of our prior knowledge. Choosing H to be a very rich class decreases the approximation error but at the same time might increase the estimation error, as a rich H might lead to overfitting. On the other hand, choosing H to be a very small set reduces the estimation error but might increase the approximation error, or, in other words, might lead to underfitting.
  20. Bayesian probability is a special kind of prior knowledge (prior knowledge about the distribution). Once we make no prior assumptions about the data-generating distribution, no algorithm can be guaranteed to find a predictor that is as good as the Bayes-optimal one.
  21. Advantages of representation learning: better performance, and adapting to new tasks with minimal human intervention. Factors: sources of influence, for example: 1) unobserved objects or unobserved forces in the physical world that affect observable quantities; 2) constructs in the human mind that provide useful simplifying explanations or inferred causes of the observed data. A speech recording (the speaker's age, sex, accent, and the words spoken); car image analysis (the position of the car, its color, and the angle and brightness of the sun). The individual pixels in an image of a red car might be very close to black at night. The shape of the car's silhouette depends on the viewing angle.
  22. Suppose we have a vision system that can recognize cars, trucks, and birds, and these objects can each be red, green, or blue. One way of representing these inputs would be to have a separate neuron or hidden unit that activates for each of the nine possible combinations: red truck, red car, red bird, green truck, and so on. This requires nine different neurons, and each neuron must independently learn the concept of color and object identity. One way to improve on this situation is to use a distributed representation, with three neurons describing the color and three neurons describing the object identity. This requires only six neurons total instead of nine, and the neuron describing redness is able to learn about redness from images of cars, trucks and birds, not just from images of one specific category of objects.
  23. Just as two equivalent computer programs will have different lengths depending on which language the program is written in, the same function may be drawn as a flowchart with different depths depending on which functions we allow to be used as individual steps in the flowchart. For example, an AI system observing an image of a face with one eye in shadow may initially see only one eye. After detecting that a face is present, the system can then infer that a second eye is probably present as well. In this case, the graph of concepts includes only two layers—a layer for eyes and a layer for faces—but the graph of computations includes 2n layers if we refine our estimate of each concept given the other n times. there is no single correct value for the depth of an architecture, just as there is no single correct value for the length of a computer program. Nor is there a consensus about how much depth a model requires to qualify as “deep.”
  24. While the kinds of neural networks used for machine learning have sometimes been used to understand brain function (Hinton and Shallice, 1991), they are generally not designed to be realistic models of biological function. The earliest predecessors of modern deep learning were simple linear models. One should not view deep learning as an attempt to simulate the brain. Modern deep learning draws inspiration from many fields, especially applied math fundamentals like linear algebra, probability, information theory, and numerical optimization.
  25. Larger networks are able to achieve higher accuracy on more complex tasks. The number of possible distinct configurations of a set of variables increases exponentially as the number of variables increases.
  26. we may also discuss prior beliefs as directly influencing the function itself and influencing the parameters only indirectly, as a result of the relationship between the parameters and the function. Additionally, we informally discuss prior beliefs as being expressed implicitly by choosing algorithms that are biased toward choosing some class of functions over another, even though these biases may not be expressed (or even be possible to express) in terms of a probability distribution representing our degree of belief in various functions. In other words, if we know a good answer for an input x (for example, if x is a labeled training example), then that answer is probably good in the neighborhood of x.
  27. Rather than thinking of the layer as representing a single vector-to-vector function, we can also think of the layer as consisting of many units that act in parallel, each representing a vector-to-scalar function. Each unit resembles a neuron in the sense that it receives input from many other units and computes its own activation value. The idea of using many layers of vector-valued representations is drawn from neuroscience
  28. Depth: deep and shallow networks
  29. Linear models, such as logistic regression and linear regression, are appealing because they can be fit efficiently and reliably, either in closed form or with convex optimization. Linear models also have the obvious defect that the model capacity is limited to linear functions, so the model cannot understand the interaction between any two input variables. We can think of φ as providing a set of features describing x, or as providing a new representation for x. The third approach can capture the benefit of the first by being highly generic: we do so by using a very broad family φ(x; θ). Deep learning can also capture the benefit of the second approach. Human practitioners can encode their knowledge to help generalization by designing families φ(x; θ) that they expect will perform well
  30. The only challenge is to fit the training set. By Occam's razor, we start with linear models. It may be tempting to make f(1) linear as well. Unfortunately, if f(1) were linear, then the feedforward network as a whole would remain a linear function of its input. Most neural networks establish a nonlinear function using an affine transformation controlled by learned parameters, followed by a fixed nonlinear function called an activation function
  31. The bold numbers printed on the plot indicate the value that the learned function must output at each point.
  32. If we use a sufficiently powerful neural network, we can think of the neural network as being able to represent any function f from a wide class of functions, with this class being limited only by features such as continuity and boundedness rather than by having a specific parametric form. (Universal Approximation Theorem)
  33. If we ask a machine learning model to read each sentence and extract the year in which the narrator went to Nepal, we would like it to recognize the year 2009 as the relevant piece of information, whether it appears in the sixth word or in the second word of the sentence. Suppose that we trained a feedforward network that processes sentences of fixed length. A traditional fully connected feedforward network would have separate parameters for each input feature, so it would need to learn all the rules of the language separately at each position in the sentence. By comparison, a recurrent neural network shares the same weights across several time steps. The convolution operation allows a network to share parameters across time but is shallow. The output of convolution is a sequence where each member of the output is a function of a small number of neighboring members of the input. Recurrent networks share parameters in a different way (second dot)
  34. (Top) The black arrows indicate uses of the central element of a 3-element kernel in a convolutional model. Because of parameter sharing, this single parameter is used at all input locations. (Bottom) The single black arrow indicates the use of the central element of the weight matrix in a fully connected model. This model has no parameter sharing, so the parameter is used only once. Parameter sharing is a kind of prior knowledge.
  35. the time step index need not literally refer to the passage of time in the real world. Sometimes it refers only to the position in the sequence. S(t): state of the system (dynamical system) Each node represents the state at some time t, and the function f maps the state at t to the state at t + 1. The same parameters (the same value of θ used to parametrize f) are used for all time steps. By unfolding, we avoid cycles in graph
  36. RNN has input to hidden connections parametrized by a weight matrix U, hidden-to-hidden recurrent connections parametrized by a weight matrix W , and hidden-to-output connections parametrized by a weight matrix V any function computable by a Turing machine can be computed by such a recurrent network of a finite size The output can be read from the RNN after a number of time steps that is asymptotically linear in the number of time steps used by the Turing machine and asymptotically linear in the length of the input (Siegelmann and Sontag, 1991; Siegelmann, 1995; Siegelmann and Sontag, 1995; Hyotyniemi, 1996). The functions computable by a Turing machine are discrete, so these results regard exact implementation of the function, not approximations.
  37. A loss L measures how far each o is from the corresponding training target y. When using softmax outputs, we assume o is the unnormalized log probabilities. Unless o is very high-dimensional and rich, it will usually lack important information from the past. This makes the RNN in this figure less powerful, but it may be easier to train because each time step can be trained in isolation from the others, allowing greater parallelization during training
  38. Maximum likelihood thus specifies that during training, rather than feeding the model’s own output back into itself, these connections should be fed with the target values specifying what the correct output should be
  39. Much as almost any function can be considered a feedforward neural network, essentially any function involving recurrence can be considered a recurrent neural network.