
QUESTION

Mar 08, 2019


We will build and test a simple long-only trading strategy. First, obtain daily data for "SPY" from 2010-10-01 to 2016-02-19. Compute the daily log return using the adjusted closing price. Divide the dataset into two parts: the first 1254 observations for training and the remaining 100 days as holdout samples.

library(quantmod)
source("backtest.r")
getSymbols("SPY",from='2010-10-01',to='2016-02-19',src="yahoo")

(a) Construct the best ARMA model based on the first 1254 daily log returns using the auto.arima function in the fpp package. This function will automatically select the best orders (namely p, d, and q in the ARIMA model) for you. We call this model 1. We will then fix the order of the ARIMA model, so that at each step only the parameter estimates change. Use this model to predict one day ahead. Compare the predicted direction (namely whether the predicted return is positive or negative) with the direction of the actual return. Were you right? For the next day, re-estimate the parameters using observations 2 to 1255 and predict the next day's direction. Repeat until the test data is exhausted. Report the probability of being correct (calculated as the percentage of times the model predicts the direction correctly).

(b) Now, looking at the ACF plot of the training data only, propose an MA(q) model. Looking at the PACF plot of the training data only, propose an AR(p) model. Please give your reasons for choosing the p and q values. Call these models 2 and 3. Compare the three models using the backtest procedure with 1-step-ahead forecasts. You may use t = 1254 as the starting forecast origin. Which one has the best accuracy?
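The rolling procedure in part (a) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference solution: the helper name direction_hit_rate is hypothetical, the ARIMA order is assumed to be already chosen (for example by auto.arima, which comes from the forecast package that fpp loads), and a simulated AR(1) series stands in for the SPY log returns so the sketch runs without a data download.

```r
# Rolling-window, one-step-ahead directional accuracy for a fixed ARIMA order.
# Hypothetical helper: the name and arguments are not part of the assignment.
direction_hit_rate <- function(rt, order, n_train, include.mean = TRUE) {
  rt <- as.numeric(rt)
  n_test <- length(rt) - n_train
  stopifnot(n_test >= 1)
  pred <- numeric(n_test)
  for (i in seq_len(n_test)) {
    # Window i covers observations i .. i + n_train - 1 (1..1254, 2..1255, ...).
    window <- rt[i:(i + n_train - 1)]
    # Same order every step; only the parameter estimates are refreshed.
    fit <- arima(window, order = order, include.mean = include.mean)
    pred[i] <- as.numeric(predict(fit, n.ahead = 1)$pred[1])
  }
  actual <- rt[(n_train + 1):length(rt)]
  # A day counts as correct when predicted and realized returns share a sign.
  # A realized return of exactly zero counts as a miss unless the forecast is also zero.
  hits <- sign(pred) == sign(actual)
  list(pred = pred, actual = actual, hit_rate = mean(hits))
}

# Simulated stand-in for the log returns (assumption: not real SPY data).
# With real data the series would be something like diff(log(Ad(SPY))) with the
# leading NA dropped, order = the (p, d, q) selected on the training part, and
# n_train = 1254.
set.seed(1)
x <- arima.sim(list(ar = 0.3), n = 230)
res <- direction_hit_rate(x, order = c(1, 0, 0), n_train = 200)
print(res$hit_rate)
```

Note that this sketch slides a fixed-length window forward, as part (a) describes, whereas the backtest function supplied with the question refits on an expanding sample (observations 1 to n), so the two can give slightly different forecasts.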

Here is the backtest code to paste into R for this question:

"backtest" <- function(m1,rt,orig,h=1,xre=NULL,fixed=NULL,include.mean=TRUE){
  # m1: a fitted time-series model object; its order is reused at every step
  # orig: the starting forecast origin
  # rt: the time series
  # xre: the independent variables
  # h: forecast horizon
  # fixed: parameter constraint
  # include.mean: flag for the constant term of the model
  regor=c(m1$arma[1],m1$arma[6],m1$arma[2])
  seaor=list(order=c(m1$arma[3],m1$arma[7],m1$arma[4]),period=m1$arma[5])
  nT=length(rt)
  if(orig > nT)orig=nT
  if(h < 1) h=1
  rmse=rep(0,h)
  mabso=rep(0,h)
  nori=nT-orig
  err=matrix(0,nori,h)
  jlast=nT-1
  if(!is.null(xre))xre <- matrix(xre)
  for (n in orig:jlast){
    jcnt=n-orig+1
    # Refit on the expanding sample 1..n, keeping the order fixed.
    x=rt[1:n]
    if (is.null(xre)) pretor=NULL else pretor=xre[1:n,]
    mm=arima(x,order=regor,seasonal=seaor,xreg=pretor,fixed=fixed,include.mean=include.mean)
    if (is.null(xre)){nx=NULL}
    else {nx=matrix(xre[(n+1):(n+h),],h,ncol(xre))}
    fore=predict(mm,h,newxreg=nx)
    kk=min(nT,(n+h))
    # nof is the effective number of forecasts at the forecast origin n.
    nof=kk-n
    pred=fore$pred[1:nof]
    obsd=rt[(n+1):kk]
    # Forecast error = observed minus predicted.
    err[jcnt,1:nof]=obsd-pred
  }
  for (i in 1:h){
    iend=nori-i+1
    tmp=err[1:iend,i]
    mabso[i]=sum(abs(tmp))/iend
    rmse[i]=sqrt(sum(tmp^2)/iend)
  }
  print("RMSE of out-of-sample forecasts")
  print(rmse)
  print("Mean absolute error of out-of-sample forecasts")
  print(mabso)
  # The list is returned invisibly; assign the call to a variable to keep it.
  backtest <- list(origin=orig,error=err,rmse=rmse,mabso=mabso)
}
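The backtest function reports RMSE and mean absolute error, but the question asks for directional accuracy. Because the function stores errors as observed minus predicted, the 1-step forecasts can be recovered from its returned list, and one possible way to turn them into a hit rate is sketched below. The helper name hit_rate_from_backtest is hypothetical, and the small list in the example is a hand-made stand-in that only mimics the shape of the function's return value.

```r
# Directional hit rate from the list returned by backtest().
# Hypothetical helper; relies only on the `origin` and `error` elements.
hit_rate_from_backtest <- function(bt, rt) {
  rt <- as.numeric(rt)
  # Row j of bt$error corresponds to forecast origin bt$origin + j - 1,
  # so the matching observations are rt[(origin + 1):length(rt)].
  actual <- rt[(bt$origin + 1):length(rt)]
  # error = observed - predicted, hence predicted = observed - error.
  pred <- actual - bt$error[, 1]
  mean(sign(pred) == sign(actual))
}

# Toy stand-in for a backtest() result (assumption: made-up numbers).
rt_toy <- c(0.1, -0.2, 0.3, 0.4, -0.1, 0.2)
bt_toy <- list(origin = 3, error = matrix(c(0.3, -0.3, -0.1), ncol = 1))
hr <- hit_rate_from_backtest(bt_toy, rt_toy)
print(hr)
```

With the real function the call would look like bt <- backtest(m1, rt, 1254, 1) followed by hit_rate_from_backtest(bt, rt), where m1 is a fitted arima object and rt is the full log-return series. The assignment matters because the function returns its list invisibly.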