Stock Portfolio Optimization Report
Albert Chu
Introduction:
Financial and data analysis firms need a way to assess clients' investments and decide which
options carry the least risk and the most potential return. The main goal for mathematicians is
therefore to use numerical analysis methods to assess which portfolios are most suitable for
those clients. The two ways to optimize a portfolio are to maximize return and to minimize
risk. This experiment uses historical stock market data that records past trends and downturns,
including the company name, the number of sales/trades, and the value of the stock at the time
of trade. The method of analysis I chose is Gauss-Hermite quadrature, used to minimize loss in
a risky environment.
Method:
How can we make purchasing stock less risky and more rewarding? We can use data sets that
contain numerous factors and trends, which computers can analyze to find patterns that may
recur in the future. Gaussian quadrature can be implemented in code once the nodes and weights
are known; applied to the figures from the stock chart, it can then be used to identify
noticeable trends and choose purchases more favorably. We will be using this
general formula:
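For reference, the standard n-point Gauss-Hermite rule, which is presumably the general formula meant here, approximates an integral weighted by $e^{-x^2}$ with a weighted sum. The symbols $x_i$, $w_i$ and $H_n$ follow the usual textbook convention and are not defined elsewhere in this report:

```latex
\int_{-\infty}^{\infty} e^{-x^{2}} f(x)\,dx \;\approx\; \sum_{i=1}^{n} w_i\, f(x_i),
\qquad
w_i = \frac{2^{\,n-1}\, n!\, \sqrt{\pi}}{n^{2}\,\bigl[H_{n-1}(x_i)\bigr]^{2}}
```

Here the nodes $x_i$ are the $n$ zeros of the (physicists') Hermite polynomial $H_n$, and the rule is exact whenever $f$ is a polynomial of degree at most $2n-1$.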
We use Gauss-Hermite quadrature because finding the expected loss and the probability of
losing requires integrating against the PDF of the loss.
In practice f(x) would be a complicated function to integrate. Here it is assumed to be a
randomly chosen stock function, in this case a form of logarithmic function, which makes it
somewhat different from the actual stock market. The PDF (probability density function) is that
of a normal distribution with random variable x, and it is used to calculate the expected
profit/gain of the stock portfolio, which is derived to be:
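Under the stated assumption of a normal PDF, the expected gain takes the following standard form. This is a sketch of the usual derivation, not necessarily the exact expression used in the experiment; the mean $\mu$ and standard deviation $\sigma$ are notation introduced here, and the substitution $x = \mu + \sqrt{2}\,\sigma t$ brings the integral into Gauss-Hermite form:

```latex
p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)

E[f(X)] = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx
        = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-t^{2}}\, f\!\left(\mu + \sqrt{2}\,\sigma t\right) dt
        \approx \frac{1}{\sqrt{\pi}} \sum_{i=1}^{n} w_i\, f\!\left(\mu + \sqrt{2}\,\sigma x_i\right)
```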
To use this formula we first need to define the variables.
Minimize the variance $\sigma^2 = x^T V x$ (where $T$ denotes the transpose)
Maximize the expected return $r = a^T x$
Sample set derived from R: $\sigma^2(x) = \sigma^2 x + \sigma r x + V + \text{error}$
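The two objectives above can be evaluated directly with NumPy. The following is a minimal sketch assuming x is a vector of portfolio weights, a a vector of expected asset returns and V the covariance matrix of those returns; all numbers are made up for illustration and are not taken from the Kaggle data:

```python
import numpy as np

# Made-up numbers for a three-asset portfolio (not from the report's data)
x = np.array([0.5, 0.3, 0.2])            # portfolio weights, summing to 1
a = np.array([0.08, 0.12, 0.05])         # expected return of each asset
V = np.array([[0.040, 0.006, 0.002],     # covariance matrix of asset returns
              [0.006, 0.090, 0.004],
              [0.002, 0.004, 0.010]])

variance = x @ V @ x        # sigma^2 = x^T V x
expected_return = a @ x     # r = a^T x
print(variance, expected_return)
```

Minimizing the first quantity while maximizing the second is the trade-off between risk and return described in the introduction.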
Suppose we are given data sets named train_file and test_file, taken from the Kaggle
website. The data are the sale prices of stocks at the closing time of the stock
market. The sales are recorded for each specific company and the trends are put into the
formula. The code that I used reads in the dataset from the file and runs the Gauss-Hermite
quadrature, which gives the probability of gain. The other value it gives is
the expected return, or the amount of profit gained from the transaction. The exact value
is very close to the value predicted by the model.
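One way the closing prices could feed into the quadrature is by estimating the mean and spread of the daily price change. The sketch below is an assumption about that step, not the code used for the report; the column name "Close" and the prices are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for one company (illustrative values only)
prices = pd.DataFrame({"Close": [100.0, 101.0, 99.0, 102.0, 103.0, 101.5]})

# Daily fractional change in the closing price; the first row has no previous day
returns = prices["Close"].pct_change().dropna()

mu = returns.mean()              # sample mean of the daily return
sigma = returns.std(ddof=1)      # sample standard deviation of the daily return
share_up = (returns > 0).mean()  # fraction of days that closed higher
print(mu, sigma, share_up)
```

The estimates mu and sigma would then serve as the parameters of the normal PDF in the quadrature.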
The formula for the probability of gain is:
The formula for the expected gain is:
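As a hedged illustration of both quantities, assume the gain X is normally distributed with mean mu and standard deviation sigma. The probability of gain is then P(X > 0), which has a closed form through the error function, and the expected gain E[f(X)] can be approximated with Gauss-Hermite nodes and weights. The function names, the parameter values and the choice of log(1 + x) as the logarithmic f are assumptions made for this sketch:

```python
import math
import numpy as np

def prob_gain(mu, sigma):
    """P(X > 0) for X ~ N(mu, sigma^2), using the normal CDF written with erf."""
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))

def expected_gain(f, mu, sigma, n):
    """n-point Gauss-Hermite approximation of E[f(X)] for X ~ N(mu, sigma^2)."""
    t, w = np.polynomial.hermite.hermgauss(n)   # nodes/weights for weight exp(-t^2)
    x = mu + math.sqrt(2.0) * sigma * t         # map nodes onto the normal distribution
    return float(np.sum(w * f(x)) / math.sqrt(math.pi))

mu, sigma = 0.05, 0.10   # illustrative values only
print("P(gain) =", prob_gain(mu, sigma))
for n in (2, 3, 4, 5):
    # np.log1p(x) = log(1 + x): a stand-in logarithmic payoff, valid while 1 + x > 0
    print(n, expected_gain(np.log1p, mu, sigma, n))
```

Because an n-point rule is exact for polynomials up to degree 2n-1, a polynomial f such as x or x**2 gives a quick check of the implementation against the known moments of the normal distribution.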
As stated earlier, the
Results:
A tabular result from the output file shows:
Id    Sales      Id    Sales
1     4770.5     11    7041.0
2     7864.5     12    8110.5
3     8955.0     13    7322.0
4     6893.0     14    9156.0
5     6697.0     15    6017.0
6     5829.0     16    4847.0
7     8004.0     17    6042.0
8     8078.0     18    9735.0
9     5365.0     19    11966.0
10    6007.0     20    10178.0
etc.
Gauss-Hermite Quadrature
Number of zeros x_i    Probability of gain    Expected gain
2                      0.5                    0.081
3                      0.5168                 0.125
4                      0.5                    0.0912
5                      0.5199                 0.132
Exact probability of gain was 0.51.
As you can see, stock markets are so hard to beat because of the random behavior that occurs,
mostly due to the influence of other companies. The probability of gain is only about 51%, or
very slightly more, and its complement, the probability of loss, is almost the same. Most of
the other results are around 50%, with a few outliers. These outliers could have various
reasons for such a high probability of gain, such as start-ups that have just had a public
offering. There are only very slight ways to lower risk, and no method to remove it entirely.
Code:
import numpy as np
import pandas as pd

train_file = 'C:/Users/ACHU/input/train.csv'
test_file = 'C:/Users/ACHU/input/test.csv'
output_file = 'stock.csv'

train = pd.read_csv(train_file)
test = pd.read_csv(test_file)

# Keep only the days on which the stock went up in price
train = train.loc[train.Change > 0]

# Median sales per company and day of week ('Sales' is the value being
# aggregated, so it cannot also be a grouping key)
columns = ['Company', 'DayOfWeek']
medians = train.groupby(columns)['Sales'].median()
medians = medians.reset_index()

# A left merge keeps every test row and attaches its group median
test2 = pd.merge(test, medians, on=columns, how='left')
assert len(test2) == len(test)

# Assumption: Open == 0 marks days with no trading, which get zero sales
test2.loc[test2.Open == 0, 'Sales'] = 0
assert test2.Sales.isnull().sum() == 0

test2[['Company', 'Sales']].to_csv(output_file, index=False)


def gaussian_MC(k, M=200):
    """Monte Carlo estimate of the mean of exp(-cos x) over [0, pi].

    k is the number of experiments (sample sizes 10, 100, ..., 10**k) and
    M is the number of repetitions per sample size.
    """
    ex = np.zeros(M)
    N = 10 ** np.arange(1, k + 1)
    v = []
    e = []
    for n in N:
        for m in range(M):
            x = np.pi * np.random.rand(n)
            # Control variate: 1 - cos x + cos^2 x / 2 has exact mean 1.25
            ctv = np.exp(-np.cos(x)) - 1. + np.cos(x) - 0.5 * np.cos(x) * np.cos(x)
            ex[m] = 1.25 + sum(ctv) / n  # quadrature
        ev = sum(ex) / M      # mean of the M estimates
        vex = np.dot(ex, ex) / M
        vex -= ev ** 2        # variance of the M estimates
        v += [vex]
        e += [ev]
        print(n, vex, ev)
    return e, v
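
The Monte Carlo routine above averages exp(-cos x) over [0, pi] and uses the polynomial 1 - cos x + cos^2 x / 2, whose exact mean on that interval is 1.25, as a control variate. The self-contained sketch below repeats a single estimate with a fixed seed and compares it with the closed-form value, the modified Bessel function I_0(1). The helper name control_variate_estimate, the seed and the sample size are choices made for this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable

def control_variate_estimate(n):
    """One Monte Carlo estimate of the mean of exp(-cos x) over [0, pi] from n samples."""
    x = np.pi * rng.random(n)
    c = np.cos(x)
    # Control variate: 1 - c + c^2/2 has exact mean 1.25 over [0, pi]
    residual = np.exp(-c) - 1.0 + c - 0.5 * c * c
    return 1.25 + residual.mean()

exact = float(np.i0(1.0))        # (1/pi) * integral of exp(-cos x) over [0, pi]
estimate = control_variate_estimate(100_000)
print(estimate, exact)
```

Only the small residual is sampled at random, so the estimate varies much less than a plain average of exp(-cos x) would.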