We introduce and evaluate mixed-frequency multivariate GARCH models for forecasting low-frequency (weekly or monthly) multivariate volatility based on high-frequency intra-day returns (at five-minute intervals) and on the overnight returns. The low-frequency conditional volatility matrix is modelled as a weighted sum of an intra-day and an overnight component, driven by the intra-day and the overnight returns, respectively. The components are specified as multivariate GARCH(1,1) models of the BEKK type, adapted to the mixed-frequency data setting. For the intra-day component, the squared high-frequency returns enter the GARCH model through a parametrically specified mixed-data sampling (MIDAS) weight function or through the sum of the intra-day realized volatilities. For the overnight component, the squared overnight returns enter the model with equal weights. Alternatively, the low-frequency conditional volatility matrix may be modelled as a single-component BEKK-GARCH model where the overnight returns and the high-frequency returns enter through the weekly realized volatility (defined as the unweighted sum of squares of overnight and high-frequency returns), or where the overnight returns are simply ignored. All model variants may further be extended by allowing for a non-parametrically estimated slowly-varying long-run volatility matrix. The proposed models are evaluated using five-minute and overnight return data on four DJIA stocks (AXP, GE, HD, and IBM) from January 1988 to November 2014. The focus is on forecasting weekly volatilities, with the week taken as the low frequency. The mixed-frequency GARCH models are found to systematically dominate the low-frequency GARCH model in terms of in-sample fit and out-of-sample forecasting accuracy. They also exhibit much lower low-frequency volatility persistence than the low-frequency GARCH model.
Among the mixed-frequency models, the low-frequency persistence estimates decrease as the data frequency increases from daily to five-minute frequency, and as overnight returns are included. That is, ignoring the available high-frequency information leads to spuriously high volatility persistence. Among the other findings are that the single-component model variants perform worse than the two-component variants; that the overnight volatility component exhibits more persistence than the intra-day component; and that MIDAS weighting performs better than no weighting at all (i.e., than realized volatility).
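To make the structure of the intra-day component concrete, the following is a minimal Python sketch, not the estimation code behind the reported results. It assumes a one-parameter beta-lag form for the MIDAS weight function (a common choice; the exact parametrization is not stated here), and the function names `beta_midas_weights`, `weighted_outer_sum`, and `bekk_step`, as well as the transposition convention in the BEKK recursion, are illustrative assumptions. With the weight parameter set to 1 the weights are equal, so the driving term is proportional to the unweighted realized volatility matrix.

```python
import numpy as np


def beta_midas_weights(m, theta):
    """Assumed one-parameter beta-lag MIDAS weights over m high-frequency lags.

    The weights are normalised to sum to one. theta = 1 gives equal weights;
    theta > 1 gives weights that decline with the lag index k.
    """
    k = np.arange(1, m + 1)
    raw = (1.0 - k / (m + 1.0)) ** (theta - 1.0)
    return raw / raw.sum()


def weighted_outer_sum(returns, weights):
    """Compute sum_k w_k * r_k r_k' for an (m, n) array of return vectors.

    With all weights equal to one this is the realized volatility matrix,
    i.e. the unweighted sum of outer products of the returns.
    """
    return np.einsum("k,ki,kj->ij", weights, returns, returns)


def bekk_step(H_prev, V_prev, C, A, B):
    """One BEKK(1,1)-type update of a conditional volatility matrix.

    H = C C' + A V_prev A' + B H_prev B', where V_prev stands in for the
    usual outer product of lagged returns (here: a weighted sum of
    high-frequency outer products). The transposition convention is assumed.
    """
    return C @ C.T + A @ V_prev @ A.T + B @ H_prev @ B.T
```

In a two-component variant, one such recursion would be run for the intra-day component and another for the overnight component (the latter with equal weights on the squared overnight returns), and the low-frequency volatility matrix would be formed as their weighted sum.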