Speakers: Matt Graham and Zhanxing Zhu. Chair: Charles Sutton

Tue 16 Jun 2015, 11:00 - 12:00

IF 4.31/4.33

If you have a question about this talk, please contact: Mary-Clare Mackay (mmackay3)

**Matt Graham**

**Title: Dynamics, geometry and randomness: sampling from Boltzmann machine relaxations**

The Boltzmann machine is an undirected probabilistic graphical model in which a distribution over the binary states of a set of nodes is defined by pairwise interaction terms. As exact inference in Boltzmann machines is intractable for all but the smallest networks, Markov chain Monte Carlo (MCMC) methods are typically used instead, with Gibbs sampling being a particularly popular choice.
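For context, Gibbs sampling in a Boltzmann machine resamples one binary unit at a time from its conditional distribution given all the others. A minimal sketch of a single Gibbs sweep, assuming a standard parameterisation with symmetric weights `W` and biases `b` (these names and this setup are illustrative, not taken from the talk):

```python
import numpy as np

def gibbs_sweep(s, W, b, rng):
    """One Gibbs sweep over binary states s in {0, 1}^n of a Boltzmann
    machine with symmetric weights W (zero diagonal) and biases b.
    Each unit is resampled from its conditional given the rest."""
    n = s.shape[0]
    for i in rng.permutation(n):
        # P(s_i = 1 | s_{-i}) is a logistic function of the total input.
        activation = W[i] @ s + b[i]
        p_on = 1.0 / (1.0 + np.exp(-activation))
        s[i] = rng.random() < p_on
    return s

# Small random model for illustration.
rng = np.random.default_rng(0)
n = 5
W = rng.normal(scale=0.5, size=(n, n))
W = (W + W.T) / 2          # enforce symmetry
np.fill_diagonal(W, 0.0)   # no self-interactions
b = rng.normal(size=n)
s = (rng.random(n) < 0.5).astype(float)
for _ in range(100):
    s = gibbs_sweep(s, W, b, rng)
```

Because each update conditions on the current values of all other units, the chain can only flip one unit at a time, which is exactly why it struggles to cross between well-separated modes.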

Sampling from large, fully connected Boltzmann machine models is, however, a challenging problem. Typically these distributions are multi-modal: there are multiple distinct sets of binary state vectors with high probability mass under the distribution, and these 'modes' are separated by sets of states in which all configurations have low probability. In such cases Gibbs sampling tends to perform particularly badly, having very low probability of transitioning between modes.

In this talk I will describe a method for relaxing a Boltzmann machine distribution defined over binary state vectors to an equivalent unimodal distribution defined by a density over a real-valued vector space (equivalent in the sense that samples from the relaxed distribution can be used directly to compute Monte Carlo estimates of expectations of the binary states). I will explore the geometric structure of the relaxed distribution and use it to motivate a proposed sampling dynamic closely related to existing Hamiltonian Monte Carlo methods. I will present experiments showing some of the advantages of the proposed method, while also explaining some of the challenges still to be overcome.
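The proposed dynamics are related to Hamiltonian Monte Carlo, which operates on exactly the kind of continuous density the relaxation produces. As background, a minimal sketch of a standard HMC transition (leapfrog integration plus a Metropolis correction) on a generic differentiable log-density; this is the generic method, not the speaker's relaxation, and the function names and Gaussian test target are illustrative:

```python
import numpy as np

def hmc_step(x, log_prob, grad_log_prob, eps, n_leapfrog, rng):
    """One standard HMC transition: sample a momentum, simulate
    Hamiltonian dynamics with the leapfrog integrator, then apply a
    Metropolis accept/reject step to correct discretisation error."""
    p = rng.normal(size=x.shape)
    x_new, p_new = x.copy(), p.copy()
    # Leapfrog: half momentum step, alternating full steps, half step.
    p_new += 0.5 * eps * grad_log_prob(x_new)
    for _ in range(n_leapfrog - 1):
        x_new += eps * p_new
        p_new += eps * grad_log_prob(x_new)
    x_new += eps * p_new
    p_new += 0.5 * eps * grad_log_prob(x_new)
    # Hamiltonian = negative log-density + kinetic energy.
    h_old = -log_prob(x) + 0.5 * p @ p
    h_new = -log_prob(x_new) + 0.5 * p_new @ p_new
    if np.log(rng.random()) < h_old - h_new:
        return x_new
    return x

# Illustrative target: standard 2D Gaussian.
log_prob = lambda x: -0.5 * x @ x
grad_log_prob = lambda x: -x
rng = np.random.default_rng(1)
x = np.zeros(2)
samples = []
for _ in range(500):
    x = hmc_step(x, log_prob, grad_log_prob, eps=0.3, n_leapfrog=10, rng=rng)
    samples.append(x.copy())
samples = np.asarray(samples)
```

Because the leapfrog trajectory uses gradient information to make long, coherent moves through the state space, HMC-style dynamics on a unimodal relaxed density can avoid the single-bit-flip bottleneck that traps Gibbs sampling.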

**Zhanxing Zhu**

**Title: Stochastic Parallel Block Coordinate Descent for Separable Saddle Point Problems**

We consider a general convex-concave saddle point problem with a separable structure, focusing in particular on non-strongly convex functions. For this class of problems, we propose a simple and efficient stochastic block coordinate descent method with adaptive primal-dual updates, which enables flexible parallel optimization at large scale. Our method shares the efficiency and flexibility of coordinate descent methods while retaining the simplicity of primal-dual methods and exploiting the structure of the separable convex-concave saddle point problem.

The method is capable of solving a wide range of machine learning applications, including robust principal component analysis, Lasso and feature selection by group Lasso, etc. Both theoretically and empirically, I will show that the proposed method performs significantly better than other state-of-the-art methods in all these applications.
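To make the Lasso application concrete, here is a sketch of plain randomized coordinate descent for the Lasso objective, where each step exactly minimises the objective in one randomly chosen coordinate via soft-thresholding. This is a standard baseline for comparison, not the speakers' adaptive primal-dual method; all names, the regularisation level, and the synthetic data are illustrative:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_coordinate_descent(A, y, lam, n_iters, rng):
    """Randomized coordinate descent for
    min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.
    Each iteration exactly minimises over one random coordinate."""
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)  # ||A_j||^2 per column
    r = y - A @ x                  # running residual
    for _ in range(n_iters):
        j = rng.integers(n)
        # Correlation of column j with the residual, adding back
        # coordinate j's own contribution.
        rho = A[:, j] @ r + col_sq[j] * x[j]
        x_j_new = soft_threshold(rho, lam) / col_sq[j]
        r += A[:, j] * (x[j] - x_j_new)  # keep residual up to date
        x[j] = x_j_new
    return x

# Synthetic sparse-recovery example.
rng = np.random.default_rng(2)
A = rng.normal(size=(50, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
y = A @ x_true + 0.01 * rng.normal(size=50)
x_hat = lasso_coordinate_descent(A, y, lam=0.1, n_iters=2000, rng=rng)
```

Because each coordinate update touches only one column of `A`, updates to disjoint blocks of coordinates are natural candidates for parallel execution, which is the structural property the proposed method exploits.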