In this talk, we develop scalable learning methods for sequential data models with latent (hidden) states. Two popular examples of such models are state space models (SSMs) and recurrent neural networks (RNNs). By augmenting an observed sequence with a latent state sequence, SSMs and RNNs model complex temporal dynamics with a simpler, smaller parametrization. Unfortunately, learning the parameters of these latent state sequence models requires processing the latent states along the entire sequence, which scales poorly for long sequential data.
We consider testing marginal independence versus conditional independence in a trivariate Gaussian setting. The two models are non-nested, and their intersection is a union of two marginal independence models. We consider two sequences of such models, one of each type of independence, chosen to be closest to each other in the Kullback-Leibler sense as they approach the intersection. The two sequences become indistinguishable if the signal strength, as measured by the product of two correlation parameters, decreases faster than the standard parametric rate.
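The two hypotheses can be made concrete as covariance models. A minimal numerical sketch, under assumed unit variances and the usual Gaussian parametrization (the function names and the choice of equal correlations `r_xz = r_yz = r` are illustrative, not from the abstract): under conditional independence X ⊥ Y | Z, the zero partial correlation forces cov(X, Y) = r_xz * r_yz, while under marginal independence cov(X, Y) = 0, so the KL divergence between the two closest models shrinks as the product of the correlations shrinks.

```python
import numpy as np

def cov_conditional_indep(r_xz, r_yz):
    # X independent of Y given Z: zero partial correlation implies
    # cov(X, Y) = r_xz * r_yz (unit variances assumed)
    return np.array([[1.0, r_xz * r_yz, r_xz],
                     [r_xz * r_yz, 1.0, r_yz],
                     [r_xz, r_yz, 1.0]])

def cov_marginal_indep(r_xz, r_yz):
    # X independent of Y marginally: cov(X, Y) = 0
    return np.array([[1.0, 0.0, r_xz],
                     [0.0, 1.0, r_yz],
                     [r_xz, r_yz, 1.0]])

def kl_gauss(S0, S1):
    # KL divergence between zero-mean Gaussians N(0, S0) || N(0, S1)
    d = S0.shape[0]
    inv1 = np.linalg.inv(S1)
    return 0.5 * (np.trace(inv1 @ S0) - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

# As the signal strength r_xz * r_yz shrinks, the two models converge
for r in [0.5, 0.1, 0.02]:
    kl = kl_gauss(cov_marginal_indep(r, r), cov_conditional_indep(r, r))
    print(f"signal strength {r * r:.4f}: KL = {kl:.8f}")
```

The printed KL values decrease toward zero with the signal strength, which is the sense in which the two model sequences become hard to tell apart.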
Convolutional Neural Networks (ConvNets), like most artificial neural networks, are commonly viewed as methods different in essence from kernel-based methods. In this talk I will provide a systematic translation of ConvNets into their kernel-based counterparts, Convolutional Kernel Networks (CKNs), and demonstrate, both formally and empirically, that this perception is unfounded.