Portfolio Selection with Artificial Neural Networks

Portfolio Selection with
Artificial Neural Networks
Andrew J Ashwood
Supervisor: Dr Anup Basu
1

Outline
• Introduction
• Research Motivation
• Neural Network Primer
• Literature Review
• Research Questions
• Network specification
• Results
• Conclusions
• Further work and limitations
2

Research Motivation
• Trading in equities is big business in
Australia
– Value of All Ordinaries c. $1.2 trillion
– Stocks turned over in 2012 c. 244 billion
– Funds invested in superannuation c. $1.4
trillion
3

Research Motivation
• Outperforming the broader market is
difficult
– From 1969 to 2010, an index investor would
achieve returns 80 per cent greater than the
average investor in a managed fund.’
(Malkiel, 2011)
– ‘…if many managers have sufficient skill to
cover costs, they are hidden by the mass of
managers with insufficient skill. …true α in net
returns to investors is negative for most if not
all active funds’ (Fama & French, 2009)
4

Research Motivation
• The Australian experience is similar
– 70% of active retail funds underperformed the
benchmark over both the 1 and 3 year period
(Karaban & Maguire, 2012)
– Over a five year period 69% of active retail
funds failed to outperform the ASX200AI
(Karaban & Maguire, 2012)
5

Research Motivation
• ‘The attempt to predict accurately the future
course of stock prices and thus the
appropriate time to buy or sell a stock must
rank as one of investors’ most persistent
endeavours.’ (Malkiel, 2011)
• Advances in computing power in combination
with the widespread availability of historical
datasets has provided investors with
increased opportunity to test markets for
predictable returns
6

Research Motivation
• Artificial Neural Networks (‘ANNs’) have
several traits that suggest their suitability
for stock price prediction
– Ability to generalise and learn
– Provide an analytic alternative to other
techniques that rely on assumptions of
normality, linearity, and variable independence
7

Neural Network Primer
• An artificial neural network is a mathematical
model that mimics the structure and function
of biological brains
• A structure of highly interconnected
processing neurons
8

• Typical neural processing element
X1
X2
Xn
Neuron
Inputs
X1 to Xn
W1
W2
Wn
Input
Weights
W1 to Wn
WX i
n
i
i
v
0
Neuron
Activation
Transform-
ation
)tanh(vy
Output
Output
Learning
Error Back
propagation
9

• Neural networks comprise interconnected
structure of neurons
Input
Input
Input
Output
Output
Output
Input
Layer
Hidden
Layer
Output
Layer
10

• Network typology
– 2 fundamental classes of network architecture
• Feed-forward network – information travels in the
forward direction only
– Single layer feed-forward network
– Multi layer feed-forward network
• Recurrent network – contains at least one
feedback loop
– Many types (Fully recurrent, Hopfield, Elman, Jordan, etc
etc)
– Several studies demonstrate that feed-
forward networks outperform recurrent
networks at predicting non-linear time series
data
11

• Network learning – The Back Propagation
Algorithm
– Input weights are randomly applied to the network
– Network calculations occur and the network
produces an output signal
– Simulated output compared to target output to
provide system error
– Error is back-propagated through the network
– Process repeats until error reaches desired level
or maximum number of iterations is reached
12

Literature Review
• Neural networks are relatively new.
– No published research in the financial domain
prior to 1990. By 1993, over 10 studies per
year were being published. (Wong &
Yakup, 1998)
13

Literature Review
• Yoon & Swales (1991) used ANNs to
predict prices of 98 companies and
compare the ANN predictions to multiple
discriminant analysis
– The study concluded the ANN technique
significantly outperformed multiple
discriminant analysis
14

Literature Review
• Kryzanowski, Galler & Wright (1992) used
ANNs to pick stocks
– Fourteen fundamental ratios were used as
inputs – coded as trending
upwards/downwards or stable
– The ANN model correctly predicted price
directional movement over the next year in
66% of cases
– The ANN did not predict extreme movements
well
15

Literature Review
• Hall, 1994 used ANN to create style based
portfolio
– ANNs used to identify stocks with largest
recent price increased and apply this style to
the current market to select a portfolio
– Utilised 1,000 largest US stocks
– 37 inputs used to predict future stock prices
– The portfolio has exceeded the S&P500 since
inception after transaction costs, but, ‘Actual
performance results and specific details of the
system are proprietary’.
16

Literature Review
• Jai & Lang (1994) used ANNs to trade the
TSE
– Used 16 technical indicators to predict the
trend of the price movement over the
subsequent 6 trading days
Year 1990 1991
Type of ANN Fixed
structure
Dual
adaptive
structure
Fixed
structure
Dual
adaptive
structure
Total return % -30.66% 21.61% 25.76% 29.20%
Annual TSEWPI
return%
-39.52% 23.42%
Note the large
divergence in
performance
17

Literature Review
• Freitas et al. (2001) used neural network
to provide an estimate of future returns of
66 representative stocks listed on the
Brazilian BOVESPA stock exchange.
– Worst network prediction error 33%
– Best network prediction error 7%
– Average error almost 17%
– Neural network portfolios achieved better
performance in 19 out of 21 weeks and
outperformed the benchmark portfolio by
12.39%. 18

Literature Review
• Ellis & Wilson (2005) used ANNs to
classify Australian listed REITs as value
stocks.
– Value stocks were bought and portfolio
returns measured
– ANN portfolio outperformed the benchmark
index 6.97% per month on average
– Sharpe and Sortino ratios significantly greater
than the benchmark indices
19

Literature Review
• Ko & Lin (2008) used ANNs for portfolio
selection on Taiwan Stock Exchange
(‘TSE’).
– Focussed on 21 different equities
– Used price, variance, covariance as input
variables
– ANN used to predict equity allocation ratio
– 5yr data set including training & validation
sets
– ANN model outperformed TSE (no 20

Literature Review
• Vanstone et al. (2010) used neural networks
based on a stock trading filter rule. The filter
rule bought stocks when the following criteria
were met
• PE < 10
• Market Price < Book Value
• ROE < 12
• Dividend Payout Ratio < 25%
– System yielded low number of trading
opportunities
– The return achieved by the system exceeded the
return available using the original filter rule
– It is worth noting however that the higher return
achieved, was at least partially due to increased
risk.
21

Research Questions
• Core Research Question 1
– Can neural networks that utilise a combination
of fundamental and technical inputs predict
(a) future stock prices, (b) stock price
directional movement?
• Secondary Research Question 1
– Which network inputs contribute most to
network accuracy?
22

Research Questions
– What network parameters lead to optimal
predictive capability?
– Do neural networks predict any of the known
performance attribution factors?
• All research questions are to be answered
with reference to stocks listed on the ASX
23

Network Specification
• Kaastra & Boyd (1996) framework for
design of financial and econometric time
seriesStep 1: Variable selection
Step 2: Data collection
Step 3: Data preprocessing
Step 4: Training, testing, and
validation sets
Step 7: Neural network training
number of training iterations
learning rate and momentum
Step 6: Evaluation criteria
Step 5: Neural network paradigms
number of hidden layers
number of hidden neurons
number of output neurons
transfer functions
Step 8: Implementation
Iterative
Process
24

Variable Selection
FUNDAMENTAL INPUTS TECHNICAL INPUTS
Profit Margin DPS Open High Low Close
Dividend
Yield
CF per share 20d MA 50d MA 20d EMA 50d EMA
BV per share BV per share
growth
MACD RSI CLV ADI
Current ratio Quick Ratio 20d Momentum 50d
Momentum
6mth
Momentum
12mth
Momentum
Asset
Turnover
Debt to
Equity
Fast Stochastic
Oscillator
Fast
Stochastic
Indicator
Slow
Stochastic
Oscillator
Slow
Stochastic
Indicator
Interest
Cover
Price to CF Rate of Change
(Price 1wk)
Rate of Change
(Price 4wk)
Rate of Change
(Price 10wk)
Rate of Change
(MACD)
PE Price to Sales Rate of Change
(MACD Signal)
Rate of Change
(RSI)
Rate of Change
(ADI)
Rate of Change
(20d MA)
ROA ROE Rate of Change
(50d MA)
Rate of Change
(20d EMA)
Rate of Change
(50d EMA)
Rate of Change
(12mth Mom)
FCF per share Sales growth Rate of Change
(FSO)
Rate of Change
(FSI)
Rate of Change
(SSO)
Rate of Change
(SSI)
- - Volume - - - 25

Data Set
• Dataset obtained from Bloomberg
• Weekly data from Jan 1997 to Dec 2011
• Early data (1997-1999) used to calculate
technical indicators
• 20 widely held ASX50 stocks selected
AMC ANZ BHP CBA CCA
CSL DJS GPT LEI NAB
ORG QAN QBE RIO SUN
TLS WBC WDC WES WOW
26

Data Preprocessing
• 3 pre-processing algorithms utilised –
– Remove duplicate data
• Duplicate data provides no useful information to the
ANN
• Speeds calculation
– Scale data
• Ensures all inputs treated equally
• Avoids saturation of the tansig transformation function
– Mark missing data
• Ensures that missing data is not used in the learning
process
27

Frequency of Data
• Little theory to assist in determining
optimum frequency of data for a given
prediction horizon Deboeck (1994) –
– Long term prediction are unreliable
– Short term predictions are unreliable
– There is some period into the future for which
useful predictions can be made
28

Training, Testing, Validation Sets
• The data set needs to be divided into
training, validation, and testing datasets
(Beale et al., 2011)–
– Training set –used to compute the gradient
and updating weights
– Validation set – the data set used for
monitoring the error during training
– Testing set –out-of-sample test data set
29

Training, Testing, Validation Sets
• Walk forward testing –
t0 tn
Time
Window 1
Window 2
Window 3
Window 4
Window 5
Training Window
Validation Window
Testing Window
Testing Period
30

Network Implementation
Summing junction+ Tapped delay lineBias unitbx
Input WeightTDL IW LW Layer Weight Transfer function
Input Layer
5918 Fundamental Inputs
41 Technical Inputs
59
Tansig
b1
4 → 20
TDL
1
Hidden Layer
30-150
IW1,1
+
LW1,2
TDL
Output
Tansig
Output Layer
b2
LW2,1
1
1
+
31

Evaluation Criteria & Training
• Mean Squared Error (‘MSE’) selected as
error function to be minimised by the network
• Convergence method adopted for network
training. Training occurs until the earlier of –
– 100,000 training iterations occurs
OR
– 50 iterations with no reduction in validation set
error
32

Implementation
Parameter Options No. of
Options
Stocks/portfolios 20 stocks +
Equally and value weighted
portfolios of these stocks
22
Input type Price OR
Technical indicators OR
Fundamental indicators OR
Price + Technical OR
Price + Fundamental OR
Price + Technical +
Fundamental
6
Lookback window 4, 8, 12, 16, 20 periods 5
Hidden layer size 30, 60, 90, 120, 150 neurons 5
Training period
length
3mths, 6mths, 12mths, 24mths 4
Walk forward
testing
10 x 1-year testing periods from
2001 to 2011
10
Total networks = 132,000
• All testing undertaken on QUT
supercomputer ‘Lyra’
• At time of testing Lyra
consisted of -
• 106 compute nodes
• 1572 x 64 bit Intel Xeon
cores (consisting 11,736
core processors)
• 32.8 TeraFlop Theoretical
(double precision), 58.6
TeraFlop (single
precision)
• 10.4 TeraBytes of main
memory
33

Results
• The high number of network models specified (132,000)
meant that manual review and interpretation of individual
networks was not feasible.
• The algorithm selected the network configuration that
provides the lowest mean squared error over the one
year walk forward testing window. The algorithm
therefore provides 220 outputs. This is comprised as
follows:
– 20 stocks + 2 portfolios
– 10 walk forward testing windows for each
stock/portfolio
– Total outputs = 220
34

Input Type
0
5
10
15
20
25
30
35
40
45
50
Frequency
Input Type
Histogram of Input Type
17%
14%
10%
21%
22%
16%
35

Lookback Window
0
5
10
15
20
25
30
35
40
45
50
4 8 12 16 20
Frequency
Lookback Period (no. inputs)
Histogram of Lookback
19%
21%
19%
21%
20%
36

Hidden Layer Size
0
20
40
60
80
100
120
30 60 90 120 150
Frequency
Hidden Layer Size (no. of nodes)
Histogram of Hidden layer
Smaller hidden
layer produced
optimum network
in 47% of cases
47%
16%
9%
18%
10%
37

Training Period
0
10
20
30
40
50
60
70
80
3mths 6mths 12mths 24mths
Frequency
Training Period Length
Histogram of Training period
35%
20%
15%
30%
38

Predictive Capability
0%
5%
10%
15%
20%
25%
30%
2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
1 2 3 4 5 6 7 8 9 10
RMSE%ofSharePrice
Year/
Timestep
RMSE as % of Share Price
AMC
ANZ
BHP
CBA
CCL
CSL
DJS
GPT
LEI
NAB
ORG
QAN
QBE
RIO
SUN
TLS
WBC
WDC
WES
WOW
Generally there is a major
spike in error in 2008 &
2009
Generally predictions are
fairly consistent from 2002-
2007
39

Predictive Capability
2,000
2,500
3,000
3,500
4,000
4,500
5,000
5,500
6,000
6,500
7,000
ASX200 - 2002 to 2011
ASX200 Adjusted Close Price
Sustained
benign trading
conditions
leading up to
GFC
2008 Major
reversal, Bear
Market
2009 Major
reversal, Bull
Market
2010 Return
to benign
trading
conditions
40

Directional Movement
0%
10%
20%
30%
40%
50%
60%
70%
80%
2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
1 2 3 4 5 6 7 8 9 10
PredictionAccuracy
Year/
Timestep
Directional Movement Prediction Accuracy
AMC
ANZ
BHP
CBA
CCL
CSL
DJS
GPT
LEI
NAB
ORG
QAN
QBE
RIO
SUN
TLS
WBC
WDC
WES
WOW
EW
VW
41

Directional Movement
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
PredictionAccuracy
Stocks & Portfolios
Directional Movement Prediction Accuracy
Over 10-year Testing Horizon
AMC
ANZ
BHP
CBA
CCL
CSL
DJS
GPT
LEI
NAB
ORG
QAN
QBE
RIO
SUN
TLS
WBC
WDC
WES
WOW
EW
VW
Significant at
5% level
42

Portfolio Selection
• The simulated price returns were then
used to form a long and a long-short
portfolio
• Portfolios formed at the beginning of each
month during the 10-year testing period
and held until the end of the month
• Stocks weighted according to the absolute
value of their simulated price return
43

Portfolio Price Returns
Portfolio Price Return
(2002-2011)
ASX200 19.6%
ANN Long Portfolio 205.5%
ANN Long-Short
Portfolio
196.6%
• Raw Results –
• Do not include transaction costs
• No performance measurement
framework
44

Regression
Results
Long Portfolio Long-Short
Portfolio
α 0.016** 0.013***
RMRF 0.712*** -0.011
SMB -0.317 -0.604**
HML -0.416** -0.306
WML -0.011 0.038
R2
0.283*** 0.215***
Performance Measurement
• The portfolio price returns were then
regressed against the Carhart 4-factor
model
Both portfolios
have a
significantly
positive alpha
Both models are
significant
45

Optimal Network Portfolios
• This analysis confirms that ANNs can be
used for portfolio selection to achieve positive
alpha
HOWEVER
• These results are based upon the most
accurate network specification – which is only
know post hoc
• The problem is whether ANNs can achieve
positive alpha without this post hoc
knowledge
46

Median Network Portfolios
• For each walk-forward testing window for
each stock, 600 different network
specifications were trialled
– 6 input options
– 5 lookback windows
– 5 hidden layer sizes
– 4 training period lengths
– 600 network options per stock per testing period
• We have confirmed that the most accurate
network achieves positive alpha – what about
the median network?
47

Regression
Results
Long Portfolio Long-Short
Portfolio
α 0.012** -0.001
RMRF 0.137 0.003
SMB -0.050 -0.011
HML -0.310 -0.064
WML -0.326 -0.115
R2
0.084* 0.012
Median Network Portfolios
Only the Long
portfolio
achieved a
significant alpha
Only the Long
portfolio model is
significant and
the Carhart style-
factors explain a
lower proportion
of variation in
data
48

Distribution of Alphas
• Network parameter combinations were
applied uniformly to all stocks and the long
portfolio based on expected returns was
formed.
• Mimics a real world trading situation
• This process resulted in 600
measurements of alpha.
49

Distribution of Alphas
0
20
40
60
80
100
120 Histogram for Alpha
Median = 0.0061
Only 12% of the
networks tested
have negative alpha
127 of 529
positive alphas
(27%) were
statistically
significant (p <
0.05)
50

Conclusions
• Network parameter testing was undertaken into
several network parameters –
– input type
– hidden layer size
– lookback window
– training period length.
• No useful guidance on heuristics for the input
type, lookback window, or training period length.
• Hidden layer size showed some promise. Almost
half of the most accurate networks models utilised
the smallest hidden layer of 30 nodes.
51

Conclusions
• Price prediction performance was mixed.
• Most networks performed poorly at the
beginning of the global financial crisis
(beginning of 2008) and at the end (end of
2009). The networks fail to predict major
reversals.
• The networks did not predict the portfolio
price returns well. The value weighted
portfolio error ranged from 9% (best year) to
32% (worst year) and only achieved a RMSE
below 10% in 1 of the 10 testing years.
52

Conclusions
• Stock directional movement prediction
performance was assessed.
• The ANN models predicted directional
movement with better than 50% accuracy
for all stocks and portfolios except for
BHP.
• Significant results were achieved for 13 of
the 22 stocks and portfolios tested.
53

Conclusions
• Price predictions used for long and long-
short portfolio selection
• Both portfolios achieved significantly
positive alpha.
• The optimal network specification was
made by comparing the simulated prices
to the target prices, thus requiring post hoc
knowledge.
54

Conclusions
• A distribution of alpha for all 600 different
network specifications was produced.
• Could not reject hypothesis that the
distribution was normal (p < 0.05).
• 88% of the networks produced positive
alpha (529 out of 600). 127 of which were
significant (27%).
• 12% of networks produced negative alpha
(71 out of 600). None of which were
significant.
55

Limitations & Future Research
• Research tested several network
parameters.
• Hidden layer size provided promising
results.
• Further research is required to develop
heuristics.
• Future research should look at individual
stocks and industry sectors.
56

Limitations & Future Research
• The analysis has been undertaken on a
price return basis, rather than an overall
returns basis.
• Study utilised 20 widely held stocks.
Needs testing over wider investment
universe. Survivorship bias.
• Impact of trading costs not included in
analysis
57

Portfolio Selection with Artificial Neural Networks

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Andere mochten auch

Andere mochten auch (19)

Ähnlich wie Portfolio Selection with Artificial Neural Networks

Ähnlich wie Portfolio Selection with Artificial Neural Networks (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Portfolio Selection with Artificial Neural Networks

Hinweis der Redaktion