Gradient Group Lasso Identifies Sparse Functional Basis for Molecular Manifolds

Samson Jonathan Koelle

We present a method for analyzing low-energy paths between molecular conformations by combining techniques in both manifold learning, which identifies such paths, and functional regression, which can parameterize them by explanatory non-linear functions. Unsupervised manifold learning approaches are useful for understanding molecular dynamics simulations since they disregard small-scale information such as peripheral hydrogen vibrations that can nevertheless drastically affect the observed energy. However, understanding the role of covariates such as bond rotation in determining the energy landscape is made difficult by non-trivial data topology and geometry. In order to deal with these difficulties, we regress gradients of embedding coordinates on functional covariate gradients, and use a group-lasso inspired penalty for inducing sparsity. Differentiation of functional covariates is done automatically, while embedding gradients are estimated. This method replaces visual inspection for determining which bonds describe the slow dynamical modes of small molecules.