State space models (SSMs) are a popular modeling approach for time series. By augmenting an observed time series with a latent state sequence, SSMs capture complex dynamics with a simpler Markov dependence structure. Unfortunately, inference in SSMs requires passing messages along the entire sequence, which scales poorly for both long and high-dimensional time series.

In the first part of the talk, we present a stochastic gradient MCMC approach for scalable Bayesian inference in SSMs given long time series. Naive stochastic gradient estimates based on subsequences are biased because they break crucial dependencies in the latent state sequence. We instead propose using "buffered" stochastic gradient estimates, which correct for bias by passing additional messages in a buffer around each subsequence, and prove error bounds for these estimators that decay geometrically in buffer size. We apply this framework to a variety of SSMs (including discrete, continuous, and mixed-type latent states) and find it provides significant speed-ups on both synthetic and real data sets with millions of time points.
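
To make the buffering idea concrete, here is a minimal sketch, not the estimator from the talk. It assumes a hypothetical two-state hidden Markov model with Gaussian emissions (the names TRANS, MEANS, INIT and all numeric values are illustrative), and it measures the error in the smoothed latent-state marginals rather than in the gradient itself, since the gradient depends on the data through those marginals. Forward-backward messages are run only on a subsequence plus a buffer on each side, and the result is compared with messages passed along the full sequence.

```python
import math
import random

# Hypothetical 2-state HMM with unit-variance Gaussian emissions (illustrative values).
TRANS = [[0.8, 0.2], [0.3, 0.7]]  # transition matrix
MEANS = [-1.0, 1.0]               # emission means
INIT = [0.6, 0.4]                 # stationary distribution of TRANS


def simulate(T, seed=0):
    """Draw T observations from the HMM."""
    rng = random.Random(seed)
    z = 0 if rng.random() < INIT[0] else 1
    ys = []
    for _ in range(T):
        ys.append(rng.gauss(MEANS[z], 1.0))
        z = 0 if rng.random() < TRANS[z][0] else 1
    return ys


def lik(y):
    """Unnormalized emission likelihood of y under each state."""
    return [math.exp(-0.5 * (y - m) ** 2) for m in MEANS]


def normalize(v):
    s = sum(v)
    return [x / s for x in v]


def smoothed_marginals(ys):
    """Forward-backward on ys, started from INIT; returns p(z_t | ys) for each t."""
    n = len(ys)
    alphas, prior = [], INIT
    for y in ys:  # forward (filtering) messages
        l = lik(y)
        a = normalize([prior[k] * l[k] for k in range(2)])
        alphas.append(a)
        prior = [sum(a[j] * TRANS[j][k] for j in range(2)) for k in range(2)]
    betas = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):  # backward messages
        l = lik(ys[t + 1])
        betas[t] = normalize(
            [sum(TRANS[j][k] * l[k] * betas[t + 1][k] for k in range(2)) for j in range(2)]
        )
    return [normalize([alphas[t][k] * betas[t][k] for k in range(2)]) for t in range(n)]


def buffered_error(ys, full, start, end, buf):
    """Max error in p(z_t = 0 | y) over [start, end) when messages are passed
    only on the subsequence extended by buf points on each side."""
    lo, hi = max(0, start - buf), min(len(ys), end + buf)
    local = smoothed_marginals(ys[lo:hi])
    return max(abs(local[t - lo][0] - full[t][0]) for t in range(start, end))


ys = simulate(2000)
full = smoothed_marginals(ys)  # reference: messages along the entire sequence
errors = {b: buffered_error(ys, full, 1000, 1020, b) for b in (0, 2, 5, 10, 20, 40)}
for b, e in errors.items():
    print(f"buffer={b:2d}  max marginal error={e:.2e}")
```

With a buffer of zero, the subsequence is treated as if it were the whole series, which is the source of the bias in the naive estimate. As the buffer grows, the printed error should shrink rapidly, because the messages forget their initialization at a rate governed by the mixing of the latent chain.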

In the second part of the talk, we present an approximate collapsed Gibbs sampling scheme for time series clustering. Existing Bayesian methods for inferring clusters of time series either mix slowly (naive Gibbs) or scale cubically in the number of series (collapsed Gibbs). Our proposed method improves mixing by approximately collapsing out parameters using expectation propagation (EP), while scaling linearly instead of cubically. We empirically show that our approximate sampler has similar performance to a collapsed Gibbs sampler at a fraction of the runtime.