msrc¶
hf.msrc(tick_series_list, M=None, N=None, pairwise=True)[source]¶
The multi-scale realized volatility (MSRV) estimator of Zhang (2006), extended to multiple dimensions following Zhang (2011). If pairwise=True, correlations are estimated with pairwise-refresh-time previous-tick sampling and variances with all available ticks of each asset.
- Parameters
- tick_series_list : list of pd.Series
Each pd.Series contains the tick log-prices of one asset with a datetime index. Must not contain NaNs.
- M : int, >=1, default=None
The number of scales. If M=None, all scales \(i = 1, ..., M\) are used, where \(M\) is chosen as \(M = n^{1/2}\) according to Eqn (34) of Zhang (2006).
- N : int, >=0, default=None
The constant \(N\) of Tao et al. (2013). If N=None, \(N = n^{1/2}\). Lam and Qian (2019) need \(N = n^{2/3}\) for non-sparse integrated covariance matrices, in which case the rate of convergence reduces to \(n^{1/6}\).
- pairwise : bool, default=True
If True, the estimator is applied to each pair individually. This increases data efficiency but may result in an estimate that is not positive semi-definite (p.s.d.).
- Returns
- out : numpy.ndarray
The MSRV estimate of the integrated covariance matrix.
Notes
Realized variance estimators based on multiple scales exploit the fact that the proportion of the observed realized variance over a specified interval that is due to microstructure noise increases with the sampling frequency, while the realized variance of the true underlying process stays constant. The bias can thus be corrected by subtracting a high-frequency estimate, scaled by an optimal weight, from a medium-frequency estimate. The weight is chosen such that the large bias in the high-frequency estimate, once scaled, exactly equals the bias in the medium-frequency estimate, so the two cancel.
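The logic of this correction can be sketched with the simplest two-scale version on simulated data. This is an illustration only: the `rv` helper, the noise level, and the scale choices below are assumptions for the sketch, not part of this module.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
sigma = 1.0       # vol of the efficient log-price, so IV = sigma**2 = 1
noise_sd = 0.01   # std dev of the i.i.d. microstructure noise

# observed log-price = efficient log-price + noise
x = np.cumsum(rng.normal(0.0, sigma / np.sqrt(n), n))
y = x + rng.normal(0.0, noise_sd, n)

def rv(y, K):
    """Subsample-averaged realized variance at scale K
    (lag-K squared returns, averaged over the K offsets)."""
    return np.sum((y[K:] - y[:-K]) ** 2) / K

rv_high = rv(y, 1)   # heavily biased: approx IV + 2 * n * noise_sd**2 = 21
rv_med = rv(y, 100)  # mildly biased:  approx IV + 2 * (n / 100) * noise_sd**2 = 1.2
# subtract the scaled high-frequency estimate so the noise biases cancel
tsrv = rv_med - (1 / 100) * rv_high  # close to the true IV of 1
```

The weight 1/100 is the ratio of the two noise biases, so the subtraction removes (almost exactly) the noise contribution while barely touching the signal.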
By considering \(M\) time scales, instead of just two as in
tsrc()
, Zhang (2006) improves the rate of convergence to \(n^{-1/4}\). This is the best attainable rate of convergence in this setting. The proposed multi-scale realized volatility (MSRV) estimator is defined as
\begin{equation}
\langle\widehat{X^{(j)}, X^{(j)}}\rangle^{(MSRV)}_T=\sum_{i=1}^{M} \alpha_{i}\left[Y^{(j)}, Y^{(j)}\right]^{\left(K_{i}\right)}_T,
\end{equation}
where the \(\alpha_{i}\) are weights satisfying
\begin{equation}
\begin{aligned}
&\sum_{i=1}^{M} \alpha_{i}=1,\\
&\sum_{i=1}^{M}\left(\alpha_{i} / K_{i}\right)=0.
\end{aligned}
\end{equation}
The optimal weights for the chosen number of scales \(M\), i.e., the weights that minimize the noise variance contribution, are given by
\begin{equation}
a_{i}=\frac{K_{i}\left(K_{i}-\bar{K}\right)}{M \operatorname{var}\left(K\right)},
\end{equation}
where \(\bar{K}\) denotes the mean of \(K\):
$$\bar{K}=\frac{1}{M} \sum_{i=1}^{M} K_{i} \quad \text{and} \quad \operatorname{var}\left(K\right)=\frac{1}{M} \sum_{i=1}^{M} K_{i}^{2}-\bar{K}^{2}.$$
If all scales are chosen, i.e., \(K_{i}=i\) for \(i=1, \ldots, M\), then \(\bar{K}=\left(M+1\right)/2\) and \(\operatorname{var}\left(K\right)=\left(M^{2}-1\right)/12\), and hence
\begin{equation}
a_{i}=12 \frac{i}{M^{2}} \frac{i/M-1/2-1/\left(2M\right)}{1-1/M^{2}}.
\end{equation}
In this case, as shown by the author in Theorem 4, when \(M\) is chosen optimally on the order of \(M=\mathcal{O}(n^{1/2})\), the estimator is consistent at rate \(n^{-1/4}\).
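The weight constraints above can be checked numerically. The following is a minimal sketch; the helper name `msrv_weights` is hypothetical and not part of this module.

```python
import numpy as np

def msrv_weights(M):
    """Optimal MSRV weights a_i = K_i (K_i - mean(K)) / (M var(K))
    for the full set of scales K_i = i, i = 1, ..., M."""
    K = np.arange(1, M + 1, dtype=float)
    K_bar = K.mean()
    var_K = (K ** 2).mean() - K_bar ** 2
    return K * (K - K_bar) / (M * var_K)

M = 10
a = msrv_weights(M)
i = np.arange(1, M + 1)
# the two defining constraints: weights sum to one, noise terms cancel
print(np.isclose(a.sum(), 1.0))        # True
print(np.isclose((a / i).sum(), 0.0))  # True
# agrees with the closed form 12 (i/M^2) (i/M - 1/2 - 1/(2M)) / (1 - 1/M^2)
print(np.allclose(a, 12 * i / M**2 * (i / M - 0.5 - 1 / (2 * M)) / (1 - 1 / M**2)))  # True
```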
References
Zhang, L. (2006). Efficient estimation of stochastic volatility using noisy observations: A multi-scale approach, Bernoulli 12(6): 1019–1043.
Zhang, L. (2011). Estimating covariation: Epps effect, microstructure noise, Journal of Econometrics 160.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> n = 200000
>>> returns = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], n) / n**0.5
>>> prices = 100 * np.exp(returns.cumsum(axis=0))
>>> # add Gaussian microstructure noise
>>> noise = 10 * np.random.normal(0, 1, n * 2).reshape(-1, 2) * np.sqrt(1 / n**0.5)
>>> prices += noise
>>> # sample n/2 (non-synchronous) observations of each tick series
>>> series_a = pd.Series(prices[:, 0]).sample(int(n/2)).sort_index()
>>> series_b = pd.Series(prices[:, 1]).sample(int(n/2)).sort_index()
>>> # get log prices
>>> series_a = np.log(series_a)
>>> series_b = np.log(series_b)
>>> icov = msrc([series_a, series_b], M=1, pairwise=False)
>>> icov_c = msrc([series_a, series_b])
>>> # This is the biased, uncorrected integrated covariance matrix estimate.
>>> np.round(icov, 3)
array([[11.553,  0.453],
       [ 0.453,  2.173]])
>>> # This is the unbiased, corrected integrated covariance matrix estimate.
>>> np.round(icov_c, 3)
array([[0.985, 0.392],
       [0.392, 1.112]])