PLoS ONE
Index tracking strategy based on mixed-frequency financial data

Competing Interests: The authors have declared that no competing interests exist.

Article Type: Research Article
Abstract

To obtain the market average return, investment managers need to construct an index tracking portfolio that replicates the target index. Currently, most of the literature uses financial data with a homogeneous frequency when constructing the index tracking portfolio. To address this limitation, we propose a methodology based on mixed-frequency financial data, called the FACTOR-MIDAS-POET model. The proposed model can simultaneously utilize intraday return data, daily risk factor data and monthly or quarterly macro-economic data. The out-of-sample analysis demonstrates that our model can improve the tracking accuracy.

Cui, Zhang, and Gherghina: Index tracking strategy based on mixed-frequency financial data

Introduction

The index tracking strategy, which aims to track the return of a given stock index when constructing the portfolio, is a major strategy adopted by fund managers. [1–7] study the index tracking strategy theoretically and experimentally under different real-world constraints, e.g., a limit on the number of stocks in the portfolio.

The simplest and most widely used index tracking strategy is the global minimum variance strategy. Let $R_t$ be the vector of daily excess returns of the stocks over the target index and $\omega_t$ be the global minimum variance portfolio weights. The difference between the return of the index tracking strategy and the return of the target index on day t is $\omega_t^{T} R_t$. Investors aim to minimize the tracking error, which is measured by the variance of this difference, i.e.,

$$\min_{\omega_t}\ \omega_t^{T} \Sigma_t\, \omega_t \quad \text{subject to} \quad \omega_t^{T} \mathbf{1} = 1,$$

where $\Sigma_t$ is the N × N conditional covariance matrix of $R_t$ and $\mathbf{1}$ is the N-vector of ones. The optimal index tracking strategy is

$$\omega_t = \frac{\Sigma_t^{-1} \mathbf{1}}{\mathbf{1}^{T} \Sigma_t^{-1} \mathbf{1}}.$$

The key part of the global minimum variance strategy is therefore to estimate the covariance matrix or the inverse covariance matrix.
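As a minimal sketch of this step, the snippet below computes global minimum variance weights from a given covariance matrix using the standard closed form (inverse covariance times a vector of ones, rescaled so the weights sum to one). The function names and the toy 3-asset covariance matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Global minimum variance weights: Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    # Solve Sigma x = 1 instead of forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)


def tracking_variance(weights: np.ndarray, cov: np.ndarray) -> float:
    """Variance of the tracking difference w' R for covariance matrix cov."""
    return float(weights @ cov @ weights)


# Toy covariance matrix of excess returns (made-up numbers).
sigma = np.array([
    [0.040, 0.006, 0.004],
    [0.006, 0.090, 0.010],
    [0.004, 0.010, 0.160],
])
w = min_variance_weights(sigma)
equal = np.full(3, 1.0 / 3.0)
print("weights:", w)
print("variance (min-variance):", tracking_variance(w, sigma))
print("variance (equal weights):", tracking_variance(equal, sigma))
```

Solving the linear system rather than inverting the matrix is a numerical design choice; both give the same weights when the covariance matrix is well conditioned.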

In the literature, methods for estimating the covariance matrix or inverse covariance matrix mainly focus on financial data with homogeneous frequency. [8–10] estimate the covariance matrix based on quarterly, monthly and daily returns, respectively. [5, 11–18] aim to improve covariance matrix estimation using intraday data. In contrast, [19–23] focus on the estimation of the inverse covariance matrix. With improvements in high-speed computation and storage capacity, financial data streams have become increasingly real-time and complex, as with high-frequency and ultra-high-frequency data. Besides the historical data in financial markets, monthly or quarterly macro-economic factors are also valuable information sources for stocks' volatilities (see [24–29]). These studies further reveal that macro-economic factors, such as GDP growth, exchange rate and short-term interest rate, are important explanatory variables of the slow-moving component in volatilities.

However, because macro-economic factors and historical market data are sampled at heterogeneous frequencies, constructing a unified econometric model is a great challenge. Among the proposed models, the mixed data sampling (MIDAS) method in [30] has attracted great attention and has led to several important extensions, such as the GARCH mixed-data sampling (GARCH-MIDAS) model (see [31]) and the factor-based mixed-data sampling (Factor-MIDAS) model (see [32]). Unlike the MIDAS method, other econometric models handling mixed-frequency data, such as the high-frequency-based volatility (HEAVY) model (see [33]) and the Factor GARCH-Itô model (see [34]), mainly focus on integrating intraday and daily financial data. Compared with homogeneous-frequency models, mixed-frequency models retain more of the information in the original data, so they can better capture market dynamics and predict more accurately. They provide timely portfolio updates and help fund managers achieve the targeted index tracking performance.

In this paper, we propose a general framework for mixed-frequency financial data, called FACTOR-MIDAS-POET model, to estimate the covariance matrix. The proposed model combines monthly macro-economic factors, daily observable factors (market return and the innovation of VIX) and intraday returns to improve covariance matrix estimation. In empirical analysis, we compare our model with existing models in the literature, and find that the tracking accuracy of minimum variance tracking strategy is greatly improved by using the proposed model. The reason is that compared with other models, our model contains macro-economic information and option market information. We also find that when the amount of historical data decreases, performances of the index tracking portfolios based on different models all decrease, but our model is less affected. Moreover, when the estimation window is short (e.g., 3 months), integrating intraday return data into our model may yield better tracking performance than not using the intraday return data.

The remainder of the paper is organized as follows. The section on the estimations of the covariance and its inverse introduces existing methods for estimating the covariance matrix and inverse covariance matrix based on homogeneous-frequency data. The section on the FACTOR-MIDAS-POET method and estimation introduces the FACTOR-MIDAS-POET model and its estimation method. The section on data and descriptive analysis explains the data used in this paper. The empirical study section compares the performances of different models. The final section concludes the paper.

The estimations of the covariance and its inverse

There are two main ways to obtain the minimum variance index tracking strategy. The first is to estimate the covariance matrix and then invert it; the second is to estimate the inverse covariance matrix directly. The rest of this section summarizes the important existing methods for estimating the covariance matrix and the inverse covariance matrix.

Estimators of covariance matrix

Based on the daily financial data, we summarize three estimators as follows,

$S_{t,1}$ is the sample covariance matrix. $S_{t,2}$ is the weighted lead and lag covariance matrix proposed by [35], which is designed to eliminate the non-synchronous trading effect. In this paper, L is set to 3 for daily returns following [36]. $S_{t,3}$ is the backward-looking rolling estimator proposed by [10].

When we have intraday financial data, these estimators are modified as follows,

where $R_{i,t_k}$ is the vector of excess returns at time i in day $t_k$. $S_{t,1}$ is proposed by [12]. $S_{t,2}$ is proposed by [5]. If overnight returns are involved in the estimators, we have
where $R_{0,t_k}$ is the overnight return in day $t_k$.
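To make the role of intraday and overnight returns concrete, here is a hedged sketch of a one-day realized covariance: the sum of outer products of the intraday return vectors, optionally augmented with the outer product of the overnight return. This is the common realized-covariance building block; how the paper's estimators weight and average such terms across days is not reproduced here, and the function name and synthetic data are assumptions.

```python
import numpy as np


def realized_covariance(intraday, overnight=None):
    """One-day realized covariance from intraday return vectors.

    intraday:  array of shape (n_intervals, n_assets)
    overnight: optional array of shape (n_assets,), added as one more
               outer product when supplied
    """
    intraday = np.asarray(intraday, dtype=float)
    rc = intraday.T @ intraday  # sum over intervals of r_i r_i'
    if overnight is not None:
        o = np.asarray(overnight, dtype=float)
        rc = rc + np.outer(o, o)
    return rc


# Synthetic example: 72 five-minute returns for 3 assets plus one overnight return.
rng = np.random.default_rng(0)
r = rng.normal(scale=1e-3, size=(72, 3))
o = rng.normal(scale=5e-3, size=3)
rc_no = realized_covariance(r)
rc_with = realized_covariance(r, o)
print(np.diag(rc_no))
print(np.diag(rc_with))
```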

Estimators of inverse covariance matrix

The estimators of inverse covariance matrix are often built upon the estimators of covariance matrix. We summarize these estimators in this subsection.

where tr(⋅) is the trace of a matrix, $u = N \det(S_{t,1})^{1/N}\, \mathrm{tr}(S_{t,1})^{-1}$, det(⋅) is the determinant of a matrix, H is an orthogonal matrix and L is a diagonal matrix such that $S_{t,1} = H L H^{T}$, and C is a diagonal matrix with elements $c_i = T + N - 2i - 1$ for i = 1, …, N. $S_{t,4}^{inv}$ is proposed by [19]. $S_{t,5}^{inv}$ is the shrinkage estimator proposed by [19]. $S_{t,6}^{inv}$ is proposed by [20]. $S_{t,7}^{inv}$ is proposed by [21]. $S_{t,8}^{inv}$ is proposed by [22]. $S_{t,9}^{inv}$ is proposed by [23].
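The two ingredients defined in this paragraph can be sketched as follows: the statistic u, read here as N times the N-th root of the determinant divided by the trace (the ratio of the geometric to the arithmetic mean of the eigenvalues, so it lies in (0, 1]), and the eigen-factorization of a symmetric matrix into an orthogonal H and a diagonal L. The function names are hypothetical, and the sketch does not assemble any of the inverse estimators themselves.

```python
import numpy as np


def sphericity_u(s: np.ndarray) -> float:
    """u = N * det(S)^(1/N) / tr(S), computed via the log-determinant."""
    n = s.shape[0]
    sign, logdet = np.linalg.slogdet(s)
    if sign <= 0:
        raise ValueError("matrix must be positive definite")
    return float(n * np.exp(logdet / n) / np.trace(s))


def eigen_factorization(s: np.ndarray):
    """Return (H, L) with H orthogonal, L diagonal and S = H L H'."""
    eigvals, h = np.linalg.eigh(s)
    return h, np.diag(eigvals)


s = np.diag([1.0, 4.0])
print(sphericity_u(s))          # geometric mean 2 over arithmetic mean 2.5
h, l = eigen_factorization(s)
print(np.allclose(h @ l @ h.T, s))
```

Using `slogdet` avoids overflow or underflow of the determinant when N is large.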

FACTOR-MIDAS-POET method and estimation

The methods summarized in the previous section have two restrictions. The first is that these classical methods require a homogeneous sampling frequency, so they make little use of the information contained in mixed-frequency financial data; the second is that they do not take monthly or quarterly macro-economic factors into consideration. To remedy these limitations, we propose a model involving multi-frequency financial data to better reflect the financial market.

In our model, excess returns are driven by both observable and unobservable factors. Meanwhile, excess returns are influenced by the status of macro economy. More specifically, the model is shown as follows,

where $r_{i,t,m}$ is the vector of excess returns of N stocks at time i in day t and month m, $F_{\mathrm{obs},t,m}$ is the vector of daily observable factors in day t and month m, $F_{\mathrm{unobs},t,m}$ is the vector of daily unobservable factors in day t and month m, $\epsilon_{i,t,m}$ is the N-dimensional residual vector, $\tau_m$ is the long-run component associated with monthly observable macro-economic factors, $X_k$ is the vector of the macro-economic factors in month k, and $B(k, \theta)$ is the widely used Beta weighted lag structure in the MIDAS model. Note that $\tau_m$ is known for any day t in month m, and is updated at the beginning of the next month m + 1.
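A hedged sketch of a Beta weighted lag structure follows. It uses one common two-parameter form, with weights proportional to x^(θ1 − 1)(1 − x)^(θ2 − 1) and normalized to sum to one; evaluating x on the grid k/(K + 1) is an assumption made here to keep x strictly inside (0, 1), and other papers use k/K. The example parameters θ1 = 1 and θ2 = 4 are the "slow weights" the paper reports in its empirical study; the function names and the lagged values are illustrative only.

```python
def beta_weights(n_lags, theta1, theta2):
    """Normalized Beta lag weights for lags k = 1, ..., n_lags."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    raw = []
    for k in range(1, n_lags + 1):
        x = k / (n_lags + 1)  # assumed grid, strictly inside (0, 1)
        raw.append(x ** (theta1 - 1.0) * (1.0 - x) ** (theta2 - 1.0))
    total = sum(raw)
    return [r / total for r in raw]


def midas_filter(lagged_values, weights):
    """Weighted sum of lagged observations (most recent lag first)."""
    if len(lagged_values) != len(weights):
        raise ValueError("need one weight per lagged value")
    return sum(w * v for w, v in zip(weights, lagged_values))


weights = beta_weights(n_lags=3, theta1=1.0, theta2=4.0)
smoothed = midas_filter([0.2, -0.1, 0.4], weights)  # made-up macro factor lags
print(weights)
print(smoothed)
```

With θ1 = 1 the weights decline monotonically in the lag, so recent months dominate the filtered macro-economic factor.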

Based on the principal orthogonal complement thresholding (POET) method in [37], the daily covariance matrix of excess returns, $\hat{\Sigma}_{t,m}$, is estimated as follows,

where $\hat{R}_J = \sum_{j=1}^{J} \hat{\lambda}_j \hat{u}_j \hat{u}_j^{T}$, $\hat{R}_{N-J} = \sum_{k=J+1}^{N} \hat{\lambda}_k \hat{u}_k \hat{u}_k^{T} = (\hat{r}_{ij})_{N \times N}$, $\hat{\lambda}_1, \ldots, \hat{\lambda}_N$ are the eigenvalues of the covariance matrix of $r_{i,t,m} - a^{T} \tau_m F_{\mathrm{obs},t,m}$, and $\hat{u}_j$ is the eigenvector associated with $\hat{\lambda}_j$. It is unrealistic to assume that the matrix $\hat{R}_J$ is sparse because of the existence of common factors, but it is reasonable to assume that the matrix $\hat{R}_{N-J}$ is sparse. To guarantee sparsity, it is natural to set a threshold to shrink the off-diagonal elements of $\hat{R}_{N-J}$ into
where s(⋅) is a generalized shrinkage function, τ is the threshold, and I(⋅) is the indicator function. Then, we obtain the FACTOR-MIDAS-POET estimator of the covariance matrix as follows,

Based on [38], we apply the method of linear compression to obtain the shrinkage inverse covariance matrix estimator as follows:

where $c_1$, $c_2$ and $c_3$ are combination coefficients and $\hat{\Sigma}_{t,m,\mathrm{FMP}}$ is the target matrix. The optimized solution and expression can be found in [38].
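The POET idea described in this section (keep the leading J principal components, then threshold the off-diagonal entries of the remaining part) can be sketched as below. This is a generic illustration under stated assumptions: soft thresholding stands in for the generalized shrinkage function s(⋅), a single constant threshold is used for all entries, and the input is a plain sample covariance matrix rather than the paper's factor-adjusted one. It is not the full FACTOR-MIDAS-POET estimator.

```python
import numpy as np


def soft_threshold(x, tau):
    """Soft thresholding: shrink towards zero by tau, zero out |x| <= tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def poet_covariance(sample_cov, n_factors, tau):
    """Low-rank part from the top eigenpairs plus a thresholded remainder."""
    eigvals, eigvecs = np.linalg.eigh(sample_cov)  # ascending order
    order = np.argsort(eigvals)[::-1]              # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    top_vals = eigvals[:n_factors]
    top_vecs = eigvecs[:, :n_factors]
    low_rank = (top_vecs * top_vals) @ top_vecs.T  # sum of lambda_j u_j u_j'
    residual = sample_cov - low_rank               # remaining N - J components
    shrunk = soft_threshold(residual, tau)
    # Diagonal entries are kept as they are; only off-diagonals are shrunk.
    np.fill_diagonal(shrunk, np.diag(residual))
    return low_rank + shrunk


rng = np.random.default_rng(7)
x = rng.normal(size=(200, 6))        # synthetic returns, 200 days x 6 assets
s = np.cov(x, rowvar=False)
est = poet_covariance(s, n_factors=2, tau=0.05)
print(np.round(est, 3))
```

Setting `tau=0` returns the sample covariance unchanged, which is a convenient sanity check for the decomposition.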

Data and descriptive analysis

Data

In the empirical analysis, we use the stocks listed in the Dow Jones Industrial Average (DJIA) to track the S&P 500 index. The stock tickers and full company names of the 30 stocks listed in the DJIA are given in Table 1. As there are too many missing values in the TAQ data files of TRV, we remove this company and use the remaining 29 DJIA stocks to track the S&P 500 index. The sample period in our analysis is from Jan. 1st, 2006 to Dec. 31st, 2011. The daily and 5-minute (5-min) data of the S&P 500 index are obtained from Tick Data. The 5-min data of the 30 DJIA stocks are collected from the NYSE Trade and Quotations (TAQ) database. Because biases and reporting errors are more likely in the first 30 minutes after the market opens, we discard the data for those 30 minutes. Thus, there are 72 intraday 5-min returns and one overnight return in each trading day.

Table 1
Tickers and full company names of DJIA component stocks on June 8, 2009.

| Ticker | Company Name |
| --- | --- |
| AA | Alcoa Inc. |
| AXP | American Express Company |
| BA | The Boeing Company |
| BAC | Bank of America Corporation |
| CAT | Caterpillar Inc. |
| CSCO | Cisco Systems, Inc. |
| CVX | Chevron Corporation |
| DD | E.I. du Pont de Nemours & Company |
| DIS | The Walt Disney Company |
| GE | General Electric Company |
| HD | The Home Depot, Inc. |
| HPQ | Hewlett-Packard Company |
| IBM | International Business Machines Corporation |
| INTC | Intel Corporation |
| JNJ | Johnson & Johnson |
| JPM | JPMorgan Chase & Co. |
| KFT | Kraft Foods Inc. |
| KO | The Coca-Cola Company |
| MCD | McDonald’s Corporation |
| MMM | 3M Company |
| MRK | Merck & Co., Inc. |
| MSFT | Microsoft Corporation |
| PFE | Pfizer Inc. |
| PG | The Procter & Gamble Company |
| T | AT&T Inc. |
| TRV | The Travelers Companies, Inc. |
| UTX | United Technologies Corporation |
| VZ | Verizon Communications Inc. |
| WMT | Wal-Mart Stores, Inc. |
| XOM | Exxon Mobil Corporation |

The daily value-weighted market return and daily VIX are considered as daily observable factors and obtained from CRSP database and Chicago Board Options Exchange (CBOE), respectively.

The monthly macro-economic factors include the following eight:

    Short-term interest rate (Interest), which is measured by the 3-month US treasury bill rate and obtained from the Federal Reserve Board’s H.15.

    Exchange rate (Exch.), which is measured by the major currencies index collected from Federal Reserve Banks and obtained from the Federal Reserve Board’s H.10.

    Inflation (Inflation), which is measured by the Consumer Price Index (CPI) and obtained from CRSP database.

    Slope of the yield curve (Slope), which is measured by the spread between 10-year treasury rate and 3-month treasury rate and obtained from CRSP database.

    Default rate (Default), which is measured by the difference between Moody’s Baa and Aaa corporate bond yields of the same maturity and obtained from Federal Reserve Board data files in WRDS database.

    Consumer confidence (CC), which is measured by the Michigan Consumer Sentiment Index and obtained from Trading Economics.

    Growth rate in the Industrial Production Index (IPI), which is obtained from Federal Reserve Board data files in WRDS database.

    Unemployment rate (Unempl.), which is obtained from Trading Economics.

Following the method in [28], we apply Principal Component Analysis (PCA) to these eight variables and select the first two principal components as the macro-economic factors used in our model.
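The reduction from eight macro-economic variables to two principal components can be sketched as follows. The data are synthetic, and running PCA on the covariance matrix of the centered (but not standardized) variables is an assumption; the paper does not state which normalization it applies.

```python
import numpy as np


def first_principal_components(x, n_components=2):
    """PCA via eigen-decomposition of the covariance of the centered data.

    Returns (scores, loadings, explained_share).
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending order
    order = np.argsort(eigvals)[::-1]           # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]
    scores = centered @ loadings                # monthly factor series
    explained = float(eigvals[:n_components].sum() / eigvals.sum())
    return scores, loadings, explained


# Synthetic stand-in: 72 months (6 years) of 8 macro-economic variables.
rng = np.random.default_rng(1)
macro = rng.normal(size=(72, 8))
macro[:, 0] *= 10.0   # give two variables a much larger scale
macro[:, 3] *= 5.0
scores, loadings, explained = first_principal_components(macro, 2)
print(scores.shape, round(explained, 4))
```

Without standardization, variables measured on larger scales dominate the leading components, which is one reason the normalization choice matters when interpreting loadings.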

Descriptive analysis

Tables 2 and 3 present the descriptive statistics for daily returns and 5-min returns, respectively, of the 29 DJIA stocks and the S&P 500 index.

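Table 2
Descriptive statistics for daily returns.

| Stock | Mean × 1000 | 25th Percentile × 1000 | 75th Percentile × 1000 | SD × 1000 | Skewness | Kurtosis |
| --- | --- | --- | --- | --- | --- | --- |
| AA | -0.8021 | -14.7440 | 15.0712 | 33.2524 | -0.2453 | 5.8662 |
| AXP | -0.0798 | -11.0745 | 11.5691 | 29.8404 | 0.1049 | 6.4575 |
| BA | 0.0286 | -10.2271 | 10.7482 | 20.7952 | 0.1065 | 3.6824 |
| BAC | -1.3867 | -13.5525 | 10.9586 | 43.5349 | -0.2921 | 13.7690 |
| CAT | 0.3155 | -11.6219 | 12.8381 | 24.0225 | -0.0728 | 3.5591 |
| CSCO | 0.0425 | -9.8890 | 10.7245 | 21.5366 | -0.2270 | 6.6922 |
| CVX | 0.4070 | -8.9778 | 10.5710 | 19.6762 | 0.1361 | 11.2978 |
| DD | 0.0494 | -10.1948 | 10.6622 | 20.7331 | -0.4140 | 4.6500 |
| DIS | 0.3111 | -9.2823 | 9.6864 | 19.9378 | 0.3029 | 5.8237 |
| GE | -0.4444 | -8.9738 | 8.8330 | 23.1800 | -0.1263 | 6.7691 |
| HD | 0.0271 | -10.1650 | 10.0027 | 20.0926 | 0.3543 | 3.3508 |
| HPQ | 0.0150 | -9.4746 | 10.4959 | 20.4492 | 0.0518 | 4.1362 |
| IBM | 0.5299 | -6.5984 | 8.3069 | 15.1777 | 0.0212 | 4.2217 |
| INTC | -0.0186 | -10.7499 | 11.1148 | 21.0361 | -0.1867 | 3.7357 |
| JNJ | 0.0490 | -4.7435 | 5.2749 | 10.9609 | 0.1181 | 10.1205 |
| JPM | -0.1124 | -12.3190 | 11.2466 | 32.5366 | 0.3622 | 9.9782 |
| KFT | 0.1963 | -6.4856 | 7.1668 | 13.5095 | 0.0097 | 3.8798 |
| KO | 0.3546 | -5.5730 | 6.0818 | 12.7197 | 0.4636 | 8.5198 |
| MCD | 0.7355 | -6.5178 | 7.8164 | 13.4517 | -0.0439 | 3.8460 |
| MMM | 0.0280 | -6.8518 | 8.0256 | 16.3772 | -0.3299 | 4.7732 |
| MRK | 0.1455 | -8.4768 | 9.4536 | 18.6671 | -0.2957 | 6.2243 |
| MSFT | -0.0063 | -8.6208 | 8.7621 | 18.9069 | 0.1018 | 8.1691 |
| PFE | -0.0239 | -7.9898 | 8.3449 | 16.1736 | -0.0495 | 4.5433 |
| PG | 0.0897 | -4.8103 | 5.5626 | 12.1929 | -0.1406 | 7.2973 |
| T | 0.1492 | -7.7032 | 7.9086 | 16.2232 | 0.5071 | 8.2793 |
| UTX | 0.1827 | -7.6566 | 8.8274 | 17.4147 | 0.2696 | 5.8673 |
| VZ | 0.1781 | -7.0986 | 8.0542 | 15.8020 | 0.2347 | 7.0804 |
| WMT | 0.2052 | -6.7874 | 6.5878 | 13.1521 | 0.1291 | 5.5092 |
| XOM | 0.2658 | -8.1494 | 9.2605 | 18.2490 | -0.0163 | 13.0925 |
| S&P 500 | 0.0485 | -5.4778 | 6.4014 | 15.3059 | -0.1909 | 7.4016 |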
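Table 3
Descriptive statistics for 5-min returns.

| Stock | Mean × 1000 | 25th Percentile × 1000 | 75th Percentile × 1000 | SD × 1000 | Skewness | Kurtosis |
| --- | --- | --- | --- | --- | --- | --- |
| AA | -0.0170 | -1.1251 | 1.1036 | 2.6924 | -0.0922 | 15.9145 |
| AXP | 0.0135 | -0.8775 | 0.8809 | 2.5181 | 0.2917 | 22.9642 |
| BA | -0.0011 | -0.7457 | 0.7478 | 1.7395 | -0.2040 | 21.3570 |
| BAC | -0.0240 | -0.9307 | 0.8824 | 3.2044 | -0.1366 | 32.5194 |
| CAT | 0.0009 | -0.9121 | 0.9187 | 2.1086 | 0.2842 | 12.7307 |
| CSCO | -0.0089 | -0.8504 | 0.8252 | 1.8720 | 0.1516 | 15.1925 |
| CVX | 0.0011 | -0.7753 | 0.7882 | 1.7979 | -0.0055 | 20.2444 |
| DD | -0.0002 | -0.8005 | 0.8066 | 1.8509 | -0.2325 | 25.5643 |
| DIS | 0.0083 | -0.7163 | 0.7307 | 1.7045 | -0.0389 | 23.6222 |
| GE | -0.0041 | -0.7620 | 0.7504 | 2.1056 | 0.4918 | 28.2999 |
| HD | 0.0017 | -0.8085 | 0.7973 | 1.8943 | 0.1044 | 13.9980 |
| HPQ | 0.0027 | -0.7484 | 0.7581 | 1.7736 | 1.1672 | 82.0893 |
| IBM | 0.0048 | -0.5913 | 0.6060 | 1.4757 | -0.6339 | 44.7640 |
| INTC | -0.0037 | -0.8559 | 0.8544 | 1.8677 | -0.0364 | 24.9960 |
| JNJ | 0.0022 | -0.4437 | 0.4413 | 1.0765 | -0.4679 | 49.5585 |
| JPM | 0.0002 | -0.9356 | 0.9266 | 2.6568 | 0.0829 | 23.6656 |
| KFT | 0.0061 | -0.5369 | 0.5497 | 1.2937 | 0.0724 | 22.4096 |
| KO | 0.0020 | -0.4996 | 0.5029 | 1.1862 | -0.0108 | 22.3711 |
| MCD | 0.0034 | -0.5607 | 0.5792 | 1.3207 | -0.1335 | 21.7493 |
| MMM | 0.0015 | -0.6413 | 0.6412 | 1.4869 | -0.0982 | 22.6983 |
| MRK | -0.0012 | -0.7006 | 0.7023 | 1.6773 | 0.1270 | 49.9546 |
| MSFT | -0.0036 | -0.7170 | 0.7134 | 1.6315 | -0.0946 | 13.9688 |
| PFE | -0.0052 | -0.6988 | 0.6874 | 1.5323 | -0.0156 | 16.3319 |
| PG | 0.0008 | -0.5052 | 0.5132 | 1.2062 | -0.5129 | 50.5912 |
| T | -0.0004 | -0.6514 | 0.6497 | 1.6550 | -0.4433 | 32.2104 |
| UTX | 0.0046 | -0.6504 | 0.6588 | 1.5672 | -0.2740 | 37.8865 |
| VZ | 0.0064 | -0.6205 | 0.6364 | 1.5476 | 0.0266 | 27.5602 |
| WMT | -0.0010 | -0.5837 | 0.5779 | 1.3347 | 0.5412 | 30.5054 |
| XOM | 0.0003 | -0.7262 | 0.7444 | 1.6712 | -0.1574 | 25.4161 |
| S&P 500 | 0.0009 | -0.5111 | 0.5183 | 1.3497 | 0.0440 | 22.5620 |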

Although the means of the daily and 5-min returns of the S&P 500 index are both approximately zero, their standard deviations differ considerably (15.3059 versus 1.3497, both × 1000). This indicates that volatility changes dramatically as the sampling frequency increases. Moreover, the skewness switches from -0.1909 for daily returns to 0.0440 for 5-min returns, i.e., from left-skewed to slightly right-skewed. The kurtosis increases from 7.4016 for daily returns to 22.5620 for 5-min returns.

The standard deviations of individual stocks obtained from daily returns are considerably larger than those obtained from 5-min returns. Among the 29 individual stocks, 13 have left-skewed distributions for daily returns and 18 have left-skewed distributions for 5-min returns. Furthermore, all 29 stocks have larger kurtosis for 5-min returns. The distributions of individual stock returns are asymmetric, with long tails and sharp peaks.

Table 4 further displays the correlation coefficients of the eight macro-economic factors. It reveals strong correlations between the short-term interest rate and the slope of the yield curve, between the exchange rate and the default rate, and between the growth rate in the Industrial Production Index and the unemployment rate. Introducing all macro-economic variables into the covariance estimation model at once would lead to a lack of statistical significance due to multicollinearity. Thus, we use principal component analysis (PCA) to create new uncorrelated variables and estimate the inverse covariance matrix based on the principal components in the empirical study.

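Table 4
Correlation coefficients between macro-economic variables.

| | Interest | Exch. | Inflation | Slope | Default | CC | IPI | Unempl. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Interest | 1 | | | | | | | |
| Exch. | 0.056 | 1 | | | | | | |
| Inflation | 0.397 | -0.309 | 1 | | | | | |
| Slope | -0.521 | -0.039 | -0.333 | 1 | | | | |
| Default | -0.316 | 0.512 | -0.371 | 0.163 | 1 | | | |
| CC | -0.074 | -0.077 | -0.092 | -0.040 | -0.171 | 1 | | |
| IPI | 0.175 | -0.046 | 0.182 | -0.059 | -0.025 | -0.317 | 1 | |
| Unempl. | -0.053 | 0.157 | -0.142 | 0.003 | 0.087 | -0.010 | -0.417 | 1 |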

Table 5 shows the principal component matrix obtained using PCA. The first and second principal components together explain 96.4945% of the total variance. The first component is strongly correlated with the short-term interest rate and the slope of the yield curve; we call it the monetary factor and view it as an indicator of monetary policy and the bond market. The second component, which we call the economic factor, is positively correlated with the default rate and the exchange rate, and mostly captures the periodic fluctuations in economic activity.

Table 5
Principal component matrix.

| Macro-economic variables | Principal component 1 | Principal component 2 |
| --- | --- | --- |
| IPI | 0.0016 | 0.0007 |
| Default | -0.0361 | 0.1003 |
| Exch. | 0.0006 | 0.0078 |
| Inflation | 0.0019 | -0.0011 |
| Interest | 0.4148 | 0.0085 |
| Slope | -0.0257 | -0.0002 |
| Unempl. | -0.0016 | 0.0024 |
| CC | -0.0046 | -0.0212 |

Empirical study

In our model, excess returns, which could be daily returns or 5-min returns, are driven by two daily observable factors: (i) the value-weighted stock market index; (ii) the innovation of the VIX index. The VIX index is constructed from the implied volatilities of S&P 500 index options and indicates investors’ expectation of the future 30-day volatility of the S&P 500 index. We also include two monthly observable principal components of the eight macro-economic factors. We fit the proposed model, estimate the inverse covariance matrix, $S_{t,m,\mathrm{FMP}}^{inv}$, and compute the minimum variance index tracking strategy with a rolling window scheme. We then apply the derived strategy on the next day.
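The rolling window scheme can be sketched as a simple loop: estimate a covariance matrix on the most recent window, compute minimum variance weights, apply them to the next day's excess returns, and move forward one day. In this sketch the plain sample covariance stands in for the FACTOR-MIDAS-POET estimator, turnover is taken to be the sum of absolute weight changes between consecutive days, and the standard deviation is annualized with a factor of √252; all three are assumptions for illustration, as are the function names and synthetic data.

```python
import numpy as np


def gmv_weights(cov):
    """Minimum variance weights that sum to one."""
    x = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return x / x.sum()


def rolling_tracking(excess, window):
    """Out-of-sample tracking differences and turnover for a rolling window.

    excess: array (n_days, n_assets) of stock returns minus the index return.
    """
    excess = np.asarray(excess, dtype=float)
    diffs, turnovers = [], []
    prev_w = None
    for t in range(window, excess.shape[0]):
        cov = np.cov(excess[t - window:t], rowvar=False)  # placeholder estimator
        w = gmv_weights(cov)
        diffs.append(float(w @ excess[t]))                # next-day tracking difference
        if prev_w is not None:
            turnovers.append(float(np.abs(w - prev_w).sum()))
        prev_w = w
    annualized_sd = float(np.std(diffs, ddof=1) * np.sqrt(252))
    return annualized_sd, np.array(turnovers)


rng = np.random.default_rng(42)
excess = rng.normal(scale=0.01, size=(300, 5))   # synthetic excess returns
sd, turnover = rolling_tracking(excess, window=252)
print(round(sd, 4), round(float(turnover.mean()), 4))
```

Any other covariance or inverse covariance estimator can be dropped in at the marked line without changing the rest of the loop.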

Comparison of current different estimations

Here, we choose a rolling window of one year (252 trading days). Table 6 compares the out-of-sample performances of minimum variance index tracking strategies derived from different covariance matrix or inverse covariance matrix estimators based on daily returns, intraday returns with overnight returns, and intraday returns without overnight returns, respectively.

Table 6
Out-of-sample performances of minimum tracking error portfolios (T = 1 y).

Panel A: daily return

| Covariance | SD × 100 | One-tail t test | Turnover (%) | 25th Turnover (%) | 75th Turnover (%) |
| --- | --- | --- | --- | --- | --- |
| $S_{t,1}$ | 4.3973 | -0.7453 | 3.3174 | 1.6897 | 4.2155 |
| $S_{t,2}$ | 4.9495 | -3.2205 | 6.6824 | 3.7051 | 8.2612 |
| $S_{t,3}$ | 4.3569 | 0.5755 | 1.5611 | 0.7266 | 1.9814 |
| $S_{t,4}^{inv}$ | 4.3973 | -0.7453 | 3.3174 | 1.6897 | 4.2155 |
| $S_{t,5}^{inv}$ | 4.2786 | 0.9673 | 2.5055 | 1.2826 | 3.1920 |
| $S_{t,6}^{inv}$ | 4.3805 | -0.5302 | 3.2310 | 1.6396 | 4.1386 |
| $S_{t,7}^{inv}$ | 4.3837 | 0.1312 | 1.4594 | 0.7839 | 1.7540 |
| $S_{t,8}^{inv}$ | 4.3943 | -0.6026 | 3.3065 | 1.6867 | 4.2125 |
| $S_{t,9}^{inv}$ | 4.3483 | -0.4707 | 3.4507 | 1.8897 | 4.3523 |

Panel B: intraday return (with overnight)

| Covariance | SD × 100 | One-tail t test | Turnover (%) | 25th Turnover (%) | 75th Turnover (%) |
| --- | --- | --- | --- | --- | --- |
| $S_{t,1}$ | 4.5334 | -1.4928 | 1.9859 | 1.1308 | 2.2552 |
| $S_{t,2}$ | 4.4037 | 1.5562 | 2.1551 | 1.2385 | 2.4591 |
| $S_{t,3}$ | 4.5944 | -2.5103 | 1.1331 | 0.6547 | 1.2697 |
| $S_{t,4}^{inv}$ | 4.5334 | -1.4928 | 1.9859 | 1.1308 | 2.2552 |
| $S_{t,5}^{inv}$ | 4.4053 | 0.6865 | 1.1572 | 0.7627 | 1.3270 |
| $S_{t,6}^{inv}$ | 4.4553 | -0.7483 | 1.7850 | 1.0500 | 2.0424 |
| $S_{t,7}^{inv}$ | 4.5583 | 1.0620 | 0.9477 | 0.6554 | 1.0710 |
| $S_{t,8}^{inv}$ | 4.4782 | -0.6905 | 1.8969 | 1.0914 | 2.1464 |
| $S_{t,9}^{inv}$ | 4.5042 | -2.0110 | 1.9724 | 1.1696 | 2.3179 |

Panel C: intraday return (without overnight)

| Covariance | SD × 100 | One-tail t test | Turnover (%) | 25th Turnover (%) | 75th Turnover (%) |
| --- | --- | --- | --- | --- | --- |
| $S_{t,1}$ | 4.7817 | -3.8486 | 1.8787 | 1.1561 | 2.0094 |
| $S_{t,2}$ | 4.4680 | 0.1306 | 2.2587 | 1.3622 | 2.4046 |
| $S_{t,3}$ | 4.7988 | -3.1698 | 1.1020 | 0.6804 | 1.2085 |
| $S_{t,4}^{inv}$ | 4.7817 | -3.8486 | 1.8787 | 1.1561 | 2.0094 |
| $S_{t,5}^{inv}$ | 4.4603 | 0.3487 | 1.0710 | 0.7222 | 1.1658 |
| $S_{t,6}^{inv}$ | 4.6069 | -2.9174 | 1.6364 | 1.0127 | 1.7634 |
| $S_{t,7}^{inv}$ | 4.5661 | 0.4126 | 0.9483 | 0.6399 | 1.0783 |
| $S_{t,8}^{inv}$ | 4.6563 | -3.3159 | 1.7443 | 1.0761 | 1.8579 |
| $S_{t,9}^{inv}$ | 4.7317 | -4.1228 | 1.8818 | 1.1474 | 2.1217 |

Notes: standard deviations are annualized.

Several conclusions can be drawn from Table 6. First, the tracking error achieved by $S_{t,1}$ with daily returns is 0.043973, which is smaller than the tracking errors obtained using intraday returns. Second, the shrinkage inverse covariance matrix estimator, $S_{t,5}^{inv}$, has the best out-of-sample performance (i.e., the smallest tracking error). Third, when intraday returns are included, adding overnight returns often yields better tracking performance, which implies that the overnight returns are useful. Fourth, compared with covariance matrix estimators, most inverse covariance matrix estimators do not perform significantly better. Furthermore, Table 6 reports the turnover rates of the different strategies. The turnover rates of strategies based on intraday returns are smaller, which indicates the value of intraday returns. The tracking strategy with the estimator $S_{t,7}^{inv}$ has the lowest turnover rate.

Surprisingly, involving intraday returns may lower the tracking performance but improve the turnover rates for these estimators. Moreover, overnight returns help to improve the tracking performance.

Performance of FACTOR-MIDAS-POET method

We consider two models. The first is based on daily returns of stocks, two daily observable factors, and two monthly macro-economic principal components. The second is based on 5-min returns of stocks, two daily observable factors, and two monthly macro-economic principal components. Similar to [30], we estimate the models with slow weights ($\theta_1 = 1$ and $\theta_2 = 4$). We also try other forms of weights and obtain similar conclusions.

Table 7 presents the out-of-sample tracking performances of strategies based on the FACTOR-MIDAS-POET model. The lag period column reports how many monthly macro-economic principal components are included in the models. Panel A presents the results based on daily returns, while Panel B and Panel C present the results based on 5-min returns. In Tables 6 and 7, we report one-tailed t tests of the tracking errors of different portfolios based on Newey-West standard deviations with six lags. Following [39], we compare the average squared excess returns over the target index of the various estimators with those of the FACTOR-MIDAS-POET method (K = 3). To test whether the average of $S_{1,\text{F-M-P}}^2, S_{2,\text{F-M-P}}^2, \ldots, S_{N,\text{F-M-P}}^2$ is significantly smaller than the average of $S_{1,\text{others}}^2, S_{2,\text{others}}^2, \ldots, S_{N,\text{others}}^2$, we test whether the mean of the sequence $\log(S_{1,\text{F-M-P}}^2 / S_{1,\text{others}}^2), \log(S_{2,\text{F-M-P}}^2 / S_{2,\text{others}}^2), \ldots, \log(S_{N,\text{F-M-P}}^2 / S_{N,\text{others}}^2)$ is significantly smaller than zero using a one-tailed test.
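The test described here can be sketched in plain Python: form the log ratio of squared tracking differences, then divide its mean by a Newey-West standard error. The Bartlett-kernel weights 1 − j/(L + 1) are the usual Newey-West choice and are assumed here; the function names and the two short series are made up for illustration.

```python
import math


def log_ratio_series(diff_a, diff_b):
    """log(a_t^2 / b_t^2) for two series of non-zero tracking differences."""
    return [math.log((a * a) / (b * b)) for a, b in zip(diff_a, diff_b)]


def newey_west_t(x, lags=6):
    """t statistic for the mean of x with a Newey-West (Bartlett) variance."""
    n = len(x)
    if n <= lags:
        raise ValueError("series must be longer than the number of lags")
    mean = sum(x) / n
    dev = [v - mean for v in x]

    def autocov(j):
        return sum(dev[i] * dev[i - j] for i in range(j, n)) / n

    long_run_var = autocov(0)
    for j in range(1, lags + 1):
        long_run_var += 2.0 * (1.0 - j / (lags + 1)) * autocov(j)
    if long_run_var <= 0.0:
        raise ValueError("non-positive long-run variance")
    return mean / math.sqrt(long_run_var / n)


# Made-up tracking differences: model A tracks more tightly than model B.
diff_a = [0.002, -0.001, 0.003, -0.002, 0.001, -0.003,
          0.002, -0.001, 0.002, -0.002, 0.001, -0.001]
diff_b = [0.004, -0.003, 0.005, -0.006, 0.002, -0.004,
          0.006, -0.002, 0.005, -0.003, 0.004, -0.002]
x = log_ratio_series(diff_a, diff_b)
t_stat = newey_west_t(x, lags=2)
print(t_stat)
```

A strongly negative statistic supports the one-sided hypothesis that the first model's squared tracking differences are smaller on average; the critical value depends on the chosen significance level.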

Table 7
Out-of-sample performances of FACTOR-MIDAS-POET method (T = 1 y).

Panel A: daily return

| Lag period (K) | SD × 100 | One-tail t test | Turnover (%) | 25th Turnover (%) | 75th Turnover (%) |
| --- | --- | --- | --- | --- | --- |
| 3 months | 4.3068 | | 2.5313 | 1.3383 | 3.1822 |
| 6 months | 4.3037 | 0.5512 | 2.5374 | 1.3373 | 3.1671 |
| 9 months | 4.3003 | 1.5556 | 2.5259 | 1.3181 | 3.1328 |
| 1 year | 4.2983 | 1.5215 | 2.5357 | 1.3237 | 3.1416 |

Panel B: intraday return (with overnight)

| Lag period (K) | SD × 100 | One-tail t test | Turnover (%) | 25th Turnover (%) | 75th Turnover (%) |
| --- | --- | --- | --- | --- | --- |
| 3 months | 4.3492 | | 7.9411 | 3.8282 | 10.6202 |
| 6 months | 4.3732 | 0.8224 | 7.4164 | 3.2755 | 9.7167 |
| 9 months | 4.3968 | 0.8798 | 7.7529 | 2.5058 | 10.4505 |
| 1 year | 4.4043 | -0.4362 | 7.7152 | 2.6545 | 10.2169 |

Panel C: intraday return (without overnight)

| Lag period (K) | SD × 100 | One-tail t test | Turnover (%) | 25th Turnover (%) | 75th Turnover (%) |
| --- | --- | --- | --- | --- | --- |
| 3 months | 4.3711 | | 7.5717 | 2.6967 | 10.5070 |
| 6 months | 4.3968 | 0.5092 | 7.1411 | 2.7568 | 9.6251 |
| 9 months | 4.4179 | 0.0177 | 7.8259 | 2.7595 | 10.4848 |
| 1 year | 4.4240 | -0.2833 | 7.8535 | 2.8240 | 10.5540 |

Several conclusions can be drawn from Table 7. First, the tracking performance and turnover ratio based on daily returns are both better than those based on 5-min returns. Second, the tracking performance of our model depends on how many monthly macro-economic principal components are included. Third, except for $S_{t,3}$, $S_{t,5}^{inv}$ and $S_{t,7}^{inv}$, the proposed model has better tracking performance and turnover ratio than the other covariance estimation models based on daily returns, and the one-tail t values of $S_{t,3}$, $S_{t,5}^{inv}$ and $S_{t,7}^{inv}$ are not large. Similarly, compared with most models reported in Table 6, the proposed FACTOR-MIDAS-POET model has better out-of-sample tracking performance using intraday returns, because our method utilizes financial data from different sources and frequencies. However, the turnover rate of our model is high, which indicates that investment strategies constructed by our method are more active. A high turnover rate is not always a negative indicator: when the goal is to minimize the index tracking error for investors, the index investment strategy constructed by our model can reach the targeted performance.

Robustness analysis

In this robustness analysis, we change the rolling window to three months (3 m), six months (6 m) and nine months (9 m), in order to check whether the length of the rolling window affects our main results.

Tables 8–10 summarize the out-of-sample performances of strategies based on different models under three different lengths of rolling windows. When the length of the estimation window decreases, the tracking errors and turnover rates of all models increase, because less information is used. Meanwhile, as the length of the estimation window decreases, the 5-min return data become more and more valuable. Therefore, intraday information can help us estimate the covariance matrix or inverse covariance matrix and construct investment strategies when other information is lacking. Most importantly, the proposed FACTOR-MIDAS-POET model generally has better tracking performance than other models, especially when the rolling window is three months. Thus, when the estimation window is relatively short, introducing macro-economic information and option market information can greatly improve the index tracking strategy.

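Table 8
Out-of-sample performances of portfolios with T = 9 m.

| | Daily: SD × 100 | Daily: One-tail t test | Daily: Turnover (%) | Intraday (with overnight): SD × 100 | Intraday (with overnight): One-tail t test | Intraday (with overnight): Turnover (%) | Intraday (without overnight): SD × 100 | Intraday (without overnight): One-tail t test | Intraday (without overnight): Turnover (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $S_{t,1}$ | 4.3962 | -0.7695 | 4.3867 | 4.5215 | -2.5842 | 2.4893 | 4.8109 | -3.8418 | 2.2793 |
| $S_{t,2}$ | 5.0352 | -4.2256 | 9.6052 | 4.3443 | 0.8050 | 2.7081 | 4.4254 | -0.2626 | 2.8732 |
| $S_{t,3}$ | 4.3598 | 0.2221 | 1.5679 | 4.5997 | -3.0435 | 1.1360 | 4.7988 | -3.5596 | 1.1020 |
| $S_{t,4}^{inv}$ | 4.3962 | -0.7695 | 4.3867 | 4.5215 | -2.5842 | 2.4893 | 4.8109 | -3.8418 | 2.2793 |
| $S_{t,5}^{inv}$ | 4.2696 | 0.9205 | 3.1846 | 4.3977 | -0.2501 | 1.2425 | 4.4610 | -0.6451 | 1.1080 |
| $S_{t,6}^{inv}$ | 4.3725 | -0.4960 | 4.2268 | 4.4134 | -1.4729 | 2.1106 | 4.5698 | -3.4845 | 1.8424 |
| $S_{t,7}^{inv}$ | 4.3613 | -0.1750 | 1.8272 | 4.5454 | 0.4413 | 0.9602 | 4.5559 | -0.4769 | 0.9560 |
| $S_{t,8}^{inv}$ | 4.3921 | -0.2910 | 4.3656 | 4.4424 | -1.5873 | 2.3111 | 4.6302 | -3.6113 | 2.0235 |
| $S_{t,9}^{inv}$ | 4.3237 | -0.3848 | 4.6659 | 4.4805 | -3.4180 | 2.4042 | 4.7448 | -4.0267 | 2.2522 |
| FACTOR-MIDAS-POET method | | | | | | | | | |
| K = 3 m | 4.3079 | | 2.9523 | 4.3034 | | 7.6426 | 4.3709 | | 7.1357 |
| K = 6 m | 4.3172 | 1.3872 | 2.9439 | 4.2937 | -0.3152 | 9.3044 | 4.3606 | 1.2395 | 9.0471 |
| K = 9 m | 4.3202 | 1.2288 | 2.9333 | 4.2861 | 0.5328 | 10.2419 | 4.3515 | -0.3981 | 10.2960 |
| K = 1 y | 4.3181 | 0.7466 | 2.9268 | 4.2861 | 1.8189 | 10.5034 | 4.3491 | -0.3919 | 10.8934 |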
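Table 9
Out-of-sample performances of portfolios with T = 6 m.

| | Daily: SD × 100 | Daily: One-tail t test | Daily: Turnover (%) | Intraday (with overnight): SD × 100 | Intraday (with overnight): One-tail t test | Intraday (with overnight): Turnover (%) | Intraday (without overnight): SD × 100 | Intraday (without overnight): One-tail t test | Intraday (without overnight): Turnover (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $S_{t,1}$ | 4.5932 | -1.5837 | 7.1564 | 4.5772 | -1.3815 | 3.4341 | 4.9135 | -3.0659 | 2.9827 |
| $S_{t,2}$ | 5.6487 | -5.9653 | 17.1580 | 4.3574 | 0.0090 | 3.7923 | 4.5930 | -0.9154 | 4.1070 |
| $S_{t,3}$ | 4.3676 | 0.3467 | 1.5805 | 4.6044 | -2.5855 | 1.1412 | 4.7986 | -2.8346 | 1.1020 |
| $S_{t,4}^{inv}$ | 4.5932 | -1.5837 | 7.1564 | 4.5772 | -1.3815 | 3.4341 | 4.9135 | -3.0659 | 2.9827 |
| $S_{t,5}^{inv}$ | 4.3597 | -0.5670 | 4.8490 | 4.4063 | 1.0346 | 1.3557 | 4.4801 | 0.8104 | 1.1356 |
| $S_{t,6}^{inv}$ | 4.5404 | -1.7153 | 6.7657 | 4.3990 | -0.3520 | 2.6126 | 4.5555 | -1.5450 | 2.0895 |
| $S_{t,7}^{inv}$ | 4.3750 | 1.1381 | 2.9238 | 4.5292 | 0.2777 | 0.9860 | 4.5485 | 0.0680 | 0.9668 |
| $S_{t,8}^{inv}$ | 4.5829 | -2.1251 | 7.1017 | 4.4431 | -0.5275 | 3.0179 | 4.6282 | -2.0229 | 2.4208 |
| $S_{t,9}^{inv}$ | 4.4795 | -0.4644 | 7.6952 | 4.5050 | -1.6695 | 3.2569 | 4.8155 | -3.2856 | 3.0981 |
| FACTOR-MIDAS-POET method | | | | | | | | | |
| K = 3 m | 4.3536 | | 4.1701 | 4.2871 | | 13.4000 | 4.3858 | | 13.8514 |
| K = 6 m | 4.3482 | 1.3738 | 4.2596 | 4.3092 | 0.3213 | 12.9060 | 4.4069 | 0.1706 | 13.3304 |
| K = 9 m | 4.3475 | 1.5734 | 4.1916 | 4.3106 | 1.1315 | 12.5355 | 4.4064 | 0.0551 | 13.0920 |
| K = 1 y | 4.3467 | 1.6594 | 4.2397 | 4.3075 | 0.8631 | 12.2162 | 4.4041 | -0.0793 | 12.8645 |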
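Table 10
Out-of-sample performances of portfolios with T = 3 m.

| | Daily: SD × 100 | Daily: One-tail t test | Daily: Turnover (%) | Intraday (with overnight): SD × 100 | Intraday (with overnight): One-tail t test | Intraday (with overnight): Turnover (%) | Intraday (without overnight): SD × 100 | Intraday (without overnight): One-tail t test | Intraday (without overnight): Turnover (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $S_{t,2}$ | 7.5510 | -9.9487 | 54.5114 | 4.7384 | -0.1167 | 9.3569 | 5.1382 | 0.2119 | 10.7838 |
| $S_{t,3}$ | 4.3720 | 1.3574 | 1.6026 | 4.6092 | -2.6760 | 1.1474 | 4.7990 | -3.1201 | 1.1023 |
| $S_{t,4}^{inv}$ | 5.7823 | -4.8004 | 24.0316 | 4.7317 | -1.4858 | 6.9206 | 4.9736 | -3.0404 | 5.5949 |
| $S_{t,5}^{inv}$ | 4.6812 | -1.1423 | 13.8312 | 4.4119 | 1.1202 | 1.4502 | 4.4645 | -0.1240 | 1.1311 |
| $S_{t,6}^{inv}$ | 5.4748 | -3.7443 | 21.6904 | 4.2959 | 2.1670 | 3.4687 | 4.3263 | 1.3167 | 2.3255 |
| $S_{t,7}^{inv}$ | 4.6957 | -0.6223 | 13.1463 | 4.3981 | 0.6004 | 1.1569 | 4.4206 | -0.7299 | 1.0495 |
| $S_{t,8}^{inv}$ | 5.7178 | -4.4180 | 23.6444 | 4.3432 | 1.8964 | 4.7818 | 4.3408 | 0.8790 | 3.0793 |
| $S_{t,9}^{inv}$ | 5.2228 | -3.1534 | 24.9688 | 4.5047 | -0.6196 | 6.0299 | 4.7502 | -3.4748 | 5.3643 |
| FACTOR-MIDAS-POET method | | | | | | | | | |
| K = 3 m | 4.5092 | | 7.5565 | 4.3852 | | 14.8179 | 4.4330 | | 14.2007 |
| K = 6 m | 4.5023 | 0.0629 | 7.6250 | 4.3960 | -0.1793 | 14.8875 | 4.4467 | -0.3520 | 13.9836 |
| K = 9 m | 4.4996 | -0.3466 | 7.6462 | 4.4041 | -0.9606 | 14.6158 | 4.4541 | -0.6767 | 13.7511 |
| K = 1 y | 4.4993 | -0.7947 | 7.6341 | 4.4032 | -0.7387 | 14.0725 | 4.4540 | -0.6693 | 13.1841 |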

Conclusions and discussion

In this paper, we propose the FACTOR-MIDAS-POET model, which simultaneously integrates intraday return data, daily risk factor data and monthly or quarterly macro-economic data. In the empirical analysis, we show that the FACTOR-MIDAS-POET model with macro-economic factors has better out-of-sample tracking performance than most models currently used in the literature. The proposed model can fully utilize financial data from different sources and frequencies. However, the better tracking performance is often accompanied by higher turnover rates. Meanwhile, we find that our model performs better relative to other models as the length of the estimation window decreases.

Our work is a preliminary study of index tracking with mixed-frequency data, and many aspects remain uncovered. For instance, we do not consider the noise of intraday data in the model. Also, the transaction cost does not appear explicitly in the model. These problems are worth exploring in future studies.

References

1. M. Rudolf, H. Wolter, and H. Zimmermann. A linear model for tracking error minimization. Journal of Banking and Finance, 23(1):85–103, 1999. doi:10.1016/S0378-4266(98)00076-4

2. R. Jansen and R. van Dijk. Optimal benchmark tracking with small portfolios. The Journal of Portfolio Management, 28(2):33–39, 2002. doi:10.3905/jpm.2002.319830

3. T. F. Coleman, Y. Li, and J. Henniger. Minimizing tracking error while restricting the number of assets. Journal of Risk, 8(4):33, 2006. doi:10.21314/JOR.2006.134

4. F. Corielli and M. Marcellino. Factor based index tracking. Journal of Banking & Finance, 30(8):2215–2233, 2006. doi:10.1016/j.jbankfin.2005.07.012

5. Q. Liu. On portfolio optimization: How and when do we benefit from high-frequency data? Journal of Applied Econometrics, 24(4):560–582, 2009. doi:10.1002/jae.1062

6. N. A. Canakgoz and J. E. Beasley. Mixed-integer programming approaches for index tracking and enhanced indexation. European Journal of Operational Research, 196(1):384–399, 2009. doi:10.1016/j.ejor.2008.03.015

7. G. Guastaroba and M. G. Speranza. Kernel search: An application to the index tracking problem. European Journal of Operational Research, 217(1):54–68, 2012. doi:10.1016/j.ejor.2011.09.004

8. T. Bollerslev, R. F. Engle, and J. M. Wooldridge. A capital asset pricing model with time-varying covariances. Journal of Political Economy, 96(1):116–131, 1988. doi:10.1086/261527

9. O. Ledoit and M. Wolf. Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. Journal of Empirical Finance, 10(5):603–621, 2003. doi:10.1016/S0927-5398(03)00007-0

10. J. Fleming, C. Kirby, and B. Ostdiek. The economic value of volatility timing using realized volatility. Journal of Financial Economics, 67(3):473–509, 2003. doi:10.1016/S0304-405X(02)00259-3

11. R. C. Merton. On estimating the expected return on the market: An exploratory investigation. Journal of Financial Economics, 8(4):323–361, 1980. doi:10.1016/0304-405X(80)90007-0

12. T. G. Andersen and T. Bollerslev. Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review, 39(4):885–905, 1998. doi:10.2307/2527343

13. T. G. Andersen, T. Bollerslev, F. X. Diebold, and H. Ebens. The distribution of realized stock return volatility. Journal of Financial Economics, 61(1):43–76, 2001. doi:10.1016/S0304-405X(01)00055-1

14 

TGAndersen, TBollerslev, FXDiebold, and PLabys. Modeling and forecasting realized volatility. Econometrica, 71(2):579625, 2003. 10.1111/1468-0262.00418

15 

FMBandi, JRRussell, and YZhu. Using high-frequency data in dynamic portfolio choice. Econometric Reviews, 27(1-3):163198, 2008. 10.1080/07474930701870461

16 

MDPooter, MMartens, and DVDijk. Predicting the daily covariance matrix for s&p 100 stocks using intraday data but which frequency to use? Econometric Reviews, 27(1-3):199229, 2008. 10.1080/07474930701873333

17 

RChiriac and VVoev. Modelling and forecasting multivariate realized volatility. Journal of Applied Econometrics, 26(6):922947, 2011. 10.1002/jae.1152

18 

NHautsch, LMKyj, and PMalec. Do high-frequency data improve high-dimensional portfolio allocations? Journal of Applied Econometrics, 30(2):263290, 2015. 10.1002/jae.2361

19 

BEfron, CMorris. Multivariate empirical bayes and estimation of covariance matrices. The Annals of Statistics, 4(1):2232, 1976. 10.1214/aos/1176343345

20 

LRHaff. Minimax estimators for a multinormal precision matrix. Journal of Multivariate Analysis, 7(3):374385, 1977. 10.1016/0047-259X(77)90079-3

21 

LRHaff. Estimation of the inverse covariance matrix: Random mixtures of the inverse wishart matrix and the identity. Annals of Statistics, 7(6):12641276, 1979. 10.1214/aos/1176344845

22 

DKDey, MGhosh, and CSrinivasan. A new class of improved estimators of a multinormal precision matrix. Statistics & Risk Modeling, 8(2):141152, 1990.

23 

TKubokawa. A revisit to estimation of the precision matrix of the wishart distribution. Journal of Statistical Research, 39:91114, 2005.

24 

MJFlannery and AAProtopapadakis. Macroeconomic factors do influence aggregate stock returns. The review of financial studies, 15(3):751782, 2002. 10.1093/rfs/15.3.751

25 

KLChang. Do macroeconomic variables have regime-dependent effects on stock return dynamics? evidence from the markov regime switching model. Economic Modelling, 26(6):12831299, 2009. 10.1016/j.econmod.2009.06.003

26 

LBaele, GBekaert, and KInghelbrecht. The determinants of stock and bond return comovements. The Review of Financial Studies, 23(6):23742428, 2010. 10.1093/rfs/hhq014

27 

RFEngle, EGhysels, and BSohn. Stock market volatility and macroeconomic fundamentals. Review of Economics and Statistics, 95(3):776797, 2013. 10.1162/REST_a_00300

28 

HAsgharian, AJHou, and FJaved. The importance of the macroeconomic variables in forecasting stock return variance: A garch-midas approach. Journal of Forecasting, 32(7):600612, 2013. 10.1002/for.2256

29 

HAsgharian, CChristiansen, and AJHou. Macro-finance determinants of the long-run stock–bond correlation: The dcc-midas specification. Journal of Financial Econometrics, 14(3):617642, 2015. 10.1093/jjfinec/nbv025

30 

EGhysels, ASinko, and RValkanov. Midas regressions: Further results and new directions. Econometric Reviews, 26(1):5390, 2007. 10.1080/07474930600972467

31 

Engle RF, Ghysels E, and Sohn B. On the economic sources of stock market volatility. AFA 2008 New Orleans Meetings Paper, 2008.

32 

MMarcellino and CSchumacher. Factor midas for nowcasting and forecasting with ragged-edge data: A model comparison for german gdp. Oxford Bulletin of Economics and Statistics, 72(4):518550, 2010. 10.1111/j.1468-0084.2010.00591.x

33 

NShephard and KSheppard. Realising the future: Forecasting with high-frequency-based volatility (heavy) models. Journal of Applied Econometrics, 25(2):197231, 2010. 10.1002/jae.1158

34 

DKim and JFan. Factor garch-ito models for high-frequency data with application to large volatility matrix prediction. Journal of Econometrics, 208(2):395417, 2019. 10.1016/j.jeconom.2018.10.003

35 

WKNewey and KDWest. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55:703708, 1987. 10.2307/1913610

36 

KJCohen, GAHawawini, SFMaier, RASchwartz, and DKWhitcomb. Friction in the trading process and the estimation of systematic risk. Journal of Financial Economics, 12(2):263278, 1983. 10.1016/0304-405X(83)90038-7

37 

JFan, YLiao, and MMincheva. Large covariance estimation by thresholding principal orthogonal complements. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75(4):603680, 2013. 10.1111/rssb.12016

38 

AKourtis, GDotsis, and RNMarkellos. Parameter uncertainty in portfolio selection: Shrinking the inverse covariance matrix. Journal of Banking & Finance, 36(9):25222531, 2012. 10.1016/j.jbankfin.2012.05.005

39 

Basak GK, Ma T, and Jagannathan R. Assessing the risk in sample minimum risk portfolios. Working paper, Northwestern University, 2004.