Iain Murray (imurray2)

Tue 29 Nov 2016, 12:15 - 13:30

JCMB 6207

If you have a question about this talk, please contact: Dominik Csiba (s1459570)

The meeting does not cover any specific paper; it will be a hands-on discussion of the topic.

Abstract: Neural networks can be used for regression: given an input x, guess the output y. The standard optimization task is to minimize some regularized measure of mismatch between the guesses and the observed training outputs.
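As a minimal sketch of this standard task (the names and the toy linear "network" here are assumptions for illustration, not from the talk), one common choice is squared-error mismatch with an L2 penalty on the weights:

```python
import numpy as np

def regularized_loss(weights, predict, X, y, lam=1e-2):
    """Mean squared error between guesses and observed training
    outputs, plus an L2 penalty on the weights (one common choice
    of regularized mismatch)."""
    guesses = predict(X, weights)
    mismatch = np.mean((guesses - y) ** 2)
    penalty = lam * np.sum(weights ** 2)
    return mismatch + penalty

# Toy linear "network": guess y from x with a single weight vector.
predict = lambda X, w: X @ w
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])
loss = regularized_loss(w, predict, X, y)
# Here the guesses match y exactly, so only the penalty remains:
# loss = 0.01 * (1^2 + 2^2) = 0.05.
```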

Neural networks can also express their own uncertainty. For example, we can fit two functions, a guess m(x) and an "error-bar" s(x), by maximizing the total log probability of the training outputs under a Gaussian model: \sum_n log N(y_n; m(x_n), s(x_n)^2). Fitting functions representing Gaussian outputs by stochastic steepest descent can be hard: the gradients of the loss with respect to the mean depend strongly on the standard deviation, which makes step-sizes hard to adapt. Moving beyond the Gaussian assumption, we might represent p(y|x) with a mixture of Gaussians, or with quantiles. For multivariate y we can use multivariate Gaussians or RNADE. Gaussians are also fitted in stochastic variational inference, sometimes with diagonal covariances, sometimes low-rank plus diagonal. We can optimize all of these things to some extent, but it is harder than for conventional neural networks, which hinders widespread adoption of the methods.
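A minimal numpy sketch of the Gaussian objective above (function names are assumptions for illustration). The gradient of the negative log likelihood with respect to the mean is (m - y)/s^2, so two predictions with the same error but different error-bars get gradients of very different scale; this is the step-size difficulty the abstract mentions:

```python
import numpy as np

def gaussian_nll(m, log_s, y):
    """Negative of sum_n log N(y_n; m_n, s_n^2), with s parameterized
    as exp(log_s) to keep it positive (a common trick)."""
    s = np.exp(log_s)
    return np.sum(0.5 * np.log(2 * np.pi) + log_s + 0.5 * ((y - m) / s) ** 2)

def grad_mean(m, log_s, y):
    """Gradient of the NLL with respect to the mean: (m - y) / s^2.
    The 1/s^2 factor couples the mean's gradient scale to the
    predicted error-bar."""
    s = np.exp(log_s)
    return (m - y) / s ** 2

# Two data points with the same prediction error (1.0) but very
# different predicted error-bars: confident (s=0.1) vs uncertain (s=10).
y = np.array([0.0, 0.0])
m = np.array([1.0, 1.0])
log_s = np.array([np.log(0.1), np.log(10.0)])
g = grad_mean(m, log_s, y)
# g = [100.0, 0.01]: the same error yields gradients four orders of
# magnitude apart, so no single step-size suits both.
```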

Relevant papers: Mixture Density Networks (MDNs), multivariate MDNs, RNADE, Bayesian MDNs, matrix manifold optimization for Gaussian mixtures