We introduce a (de)-regularization of the Maximum Mean Discrepancy (DrMMD) and its Wasserstein gradient flow. Existing gradient flows that transport samples from a source distribution to a target distribution using only target samples either lack a tractable numerical implementation (f-divergence flows) or require strong assumptions and modifications such as noise injection to ensure convergence (Maximum Mean Discrepancy flows). In contrast, the DrMMD flow can simultaneously (i) guarantee near-global convergence for a broad class of targets in both continuous and discrete time, and (ii) be implemented in closed form using only samples. The former is achieved by leveraging the connection between the DrMMD and the χ²-divergence, while the latter follows from treating the DrMMD as an MMD with a de-regularized kernel. Our numerical scheme uses an adaptive de-regularization schedule throughout the flow to optimally trade off between discretization errors and deviations from the χ² regime. We demonstrate the potential of the DrMMD flow in several numerical experiments, including a large-scale setting of training student/teacher networks.
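To make point (ii) concrete, the sketch below runs a particle flow that takes explicit Euler steps on an MMD objective computed with a de-regularized kernel, differentiated automatically with JAX. It is a minimal illustration under stated assumptions, not the paper's exact construction: the Gaussian base kernel, the particular sample-based form k_λ(x, x') = (1+λ)[k(x, x') − k_Y(x)ᵀ(K_YY/n + λI)⁻¹ k_Y(x')/n], the geometric decay schedule for λ, and the names `deregularized_mmd2` and `flow_step` are all illustrative choices; the paper's adaptive schedule is derived from its error analysis.

```python
# Minimal sketch of an MMD-type particle flow with a de-regularized kernel.
# All modeling choices here (base kernel, kernel formula, schedule) are
# assumptions for illustration; consult the paper for the exact definitions.
import jax
import jax.numpy as jnp

def base_kernel(x, y, bandwidth=1.0):
    # Gaussian (RBF) base kernel between two points.
    return jnp.exp(-jnp.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

def gram(k, X, Y):
    # Gram matrix [k(x_i, y_j)]_{ij}.
    return jax.vmap(lambda x: jax.vmap(lambda y: k(x, y))(Y))(X)

def deregularized_mmd2(X, Y, lam):
    # MMD^2 between particles X and target samples Y under an assumed
    # de-regularized kernel built from the target Gram matrix:
    # k_lam(x, x') = (1+lam) [k(x,x') - k_Y(x)^T (K_YY/n + lam I)^{-1} k_Y(x') / n].
    n = Y.shape[0]
    K_YY = gram(base_kernel, Y, Y)
    A = K_YY / n + lam * jnp.eye(n)
    def k_lam_gram(U, V):
        K_UV = gram(base_kernel, U, V)
        K_UY = gram(base_kernel, U, Y)
        K_YV = gram(base_kernel, Y, V)
        return (1.0 + lam) * (K_UV - K_UY @ jnp.linalg.solve(A, K_YV) / n)
    return (k_lam_gram(X, X).mean()
            - 2.0 * k_lam_gram(X, Y).mean()
            + k_lam_gram(Y, Y).mean())

@jax.jit
def flow_step(X, Y, lam, step_size):
    # One explicit Euler step: move particles along the negative gradient
    # of the de-regularized discrepancy, obtained by autodiff.
    return X - step_size * jax.grad(deregularized_mmd2)(X, Y, lam)

# Usage: shrink lam along the flow (a hypothetical geometric decay, standing
# in for the adaptive schedule that trades off discretization error against
# deviation from the chi^2 regime).
key_x, key_y = jax.random.split(jax.random.PRNGKey(0))
X = jax.random.normal(key_x, (50, 2)) + 4.0   # source particles
Y = jax.random.normal(key_y, (200, 2))        # samples from the target
for t in range(300):
    lam = max(1e-3, 0.98 ** t)
    X = flow_step(X, Y, lam, 0.5)
```

Everything in the update uses only kernel evaluations against the target samples and one linear solve per step, which is the sense in which such a flow is implementable in closed form from samples alone.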