Section: New Results

Affine Invariant Covariance Estimation for Heavy-Tailed Distributions

In this work we provide an estimator for the covariance matrix of a heavy-tailed multivariate distribution. We prove that the proposed estimator $\hat{S}$ admits an affine-invariant bound of the form

$\left(1-\epsilon \right)S\le \hat{S}\le \left(1+\epsilon \right)S$

with high probability, where $S$ is the unknown covariance matrix and $\le$ denotes the positive semidefinite order on symmetric matrices. The result requires only the existence of fourth-order moments, and allows for $\epsilon =O\left(\sqrt{{k}^{4}d\log \left(d/\delta \right)/n}\right)$, where ${k}^{4}$ is a measure of the kurtosis of the distribution, $d$ is the dimension of the space, $n$ is the sample size, and $1-\delta$ is the desired confidence level. More generally, we can allow for regularization at level $\lambda$, in which case $d$ is replaced by the degrees of freedom number. Denoting by $\mathrm{cond}\left(S\right)$ the condition number of $S$, the computational cost of the proposed estimator is $O\left({d}^{2}n+{d}^{3}\log \left(\mathrm{cond}\left(S\right)\right)\right)$, which is comparable to the cost of the sample covariance estimator in the statistically interesting regime $n\ge d$. We consider applications of our estimator to eigenvalue estimation with relative error and to ridge regression with heavy-tailed random design.
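To make the form of the bound concrete, the following minimal sketch computes the smallest $\epsilon$ for which a given estimate satisfies the two-sided bound relative to a known $S$, and checks numerically that this quantity is unchanged under an invertible linear map of the data. It does not implement the proposed estimator: the plain sample covariance is used as a stand-in, and the function name `relative_psd_error`, the Student-t test data, and the matrix `A` are illustrative assumptions, not taken from the work.

```python
import numpy as np

def relative_psd_error(S_hat, S):
    """Smallest eps with (1 - eps) S <= S_hat <= (1 + eps) S in the PSD order.

    Hypothetical helper; assumes S is symmetric positive definite
    and S_hat is symmetric.
    """
    # With S = L L^T, the bound holds iff every eigenvalue of
    # L^{-1} S_hat L^{-T} lies in the interval [1 - eps, 1 + eps].
    L = np.linalg.cholesky(S)
    M = np.linalg.solve(L, np.linalg.solve(L, S_hat).T)
    M = (M + M.T) / 2.0  # remove rounding asymmetry before eigvalsh
    eigvals = np.linalg.eigvalsh(M)
    return float(np.max(np.abs(eigvals - 1.0)))

rng = np.random.default_rng(0)
d, n, nu = 5, 2000, 5  # nu > 4, so fourth-order moments exist

# Heavy-tailed data with independent, zero-mean Student-t coordinates.
X = rng.standard_t(df=nu, size=(n, d))
S_x = (nu / (nu - 2)) * np.eye(d)  # exact covariance of X
S_hat_x = X.T @ X / n              # sample covariance (stand-in estimator)

# An arbitrary invertible linear map (assumption for illustration).
A = np.diag(2.0 ** np.arange(d)) + np.triu(np.ones((d, d)), k=1)
Y = X @ A.T
S_y = A @ S_x @ A.T
S_hat_y = Y.T @ Y / n

eps_x = relative_psd_error(S_hat_x, S_x)
eps_y = relative_psd_error(S_hat_y, S_y)
print(f"relative error before map: {eps_x:.6f}")
print(f"relative error after map:  {eps_y:.6f}")
```

The two printed values agree up to rounding because the whitened matrix for the transformed data is an orthogonal conjugate of the original one. An absolute error such as the operator norm of $\hat{S}-S$ would instead change with `A`, which is why a bound stated in the positive semidefinite order is the natural affine-invariant notion of accuracy.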