Lucas Deecke, Chuanya Yang, Boyan Gao
Thu 28 Mar 2019, 12:45 - 14:00
IF, 4.31/33

If you have a question about this talk, please contact: Jodie Cameron (jcamero9)

Lucas Deecke

Normalization methods are a central building block in the deep learning toolbox. They accelerate and stabilize training, while decreasing the dependence on manually tuned learning rate schedules. When learning from multi-modal distributions, the effectiveness of batch normalization (BN), arguably the most prominent normalization method, is reduced. As a remedy, we propose a more flexible approach: by extending the normalization to more than a single mean and variance, we detect modes of data on-the-fly, jointly normalizing samples that share common features. We demonstrate that our method outperforms BN and other widely used normalization techniques in several experiments on single- and multi-task datasets.
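The idea of normalizing with more than a single mean and variance can be sketched as follows. This is a minimal, hypothetical simplification: the soft mode assignments are taken as given here, whereas in the talk's method they would come from a learned gating function, and the usual learnable affine parameters are omitted.

```python
import numpy as np

def mode_norm(x, gate_logits, eps=1e-5):
    """Sketch of multi-mode normalization (illustrative, not the
    speaker's exact formulation).

    x: (N, D) batch of features.
    gate_logits: (N, K) scores softly assigning each sample to K modes.
    """
    # Soft assignment of each sample to the K modes (softmax).
    g = np.exp(gate_logits - gate_logits.max(axis=1, keepdims=True))
    g /= g.sum(axis=1, keepdims=True)              # (N, K)

    out = np.zeros_like(x)
    for k in range(g.shape[1]):
        w = g[:, k:k + 1]                          # (N, 1) soft weights
        n_k = w.sum() + eps
        mu_k = (w * x).sum(axis=0) / n_k           # weighted mode mean
        var_k = (w * (x - mu_k) ** 2).sum(axis=0) / n_k
        # Each sample is normalized under mode k, weighted by its
        # assignment; with K = 1 this reduces to standard BN.
        out += w * (x - mu_k) / np.sqrt(var_k + eps)
    return out
```

With uniform gates and a single mode, this recovers ordinary batch normalization; with several modes, samples sharing a mode are normalized jointly by that mode's statistics.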

Chuanya Yang

Learning robust locomotion policies for humanoid robots

I will present my recent work on using deep reinforcement learning to learn policies for balancing and walking. The policies are trained in a physics simulator with a realistic robot model and low-level impedance control, making it easier to transfer the learned skills to real robots. For the walking policy, an additional human motion reference is used to guide the DRL agent towards a more human-like locomotion behavior. Both the balancing policy and the walking policy can withstand a certain amount of external disturbance, and both produce a wide range of adaptive, versatile and robust behaviors.
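One common way to let a motion reference guide a DRL agent is to mix an imitation term into the reward. The sketch below is an assumption about the general recipe, not the speaker's exact formulation: the weights, the exponential kernel, and the function name `walking_reward` are all illustrative.

```python
import numpy as np

def walking_reward(q, q_ref, task_reward, w_imitate=0.7, w_task=0.3):
    """Hedged sketch of reward shaping with a human motion reference.

    q: current joint positions.
    q_ref: reference joint positions from the motion clip at the
           same phase of the gait cycle.
    task_reward: the underlying locomotion objective (e.g. forward
                 velocity tracking), assumed to be in [0, 1].
    """
    # Imitation term: 1 when the pose matches the reference exactly,
    # decaying exponentially with the squared joint-space error.
    imitation = np.exp(-2.0 * np.sum((q - q_ref) ** 2))
    return w_imitate * imitation + w_task * task_reward
```

The weighted sum lets the agent trade off staying close to the human-like reference against the task objective; annealing the weights over training is a common variation.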

Boyan Gao

Title: Unsupervised domain adaptation and Deep K-means

Unsupervised domain adaptation algorithms aim to transfer knowledge learned from a labelled source domain to an unlabelled target domain. In this setting, we apply our deep clustering algorithm to the target domain to form a K-means-friendly feature space in which data instances from different categories are separable. To do so, we simultaneously update the learnable K-means cluster centers and the feature extractor, which is jointly trained on a classification task on the source domain and an unsupervised clustering task on the target domain. Compared with existing Deep K-means approaches, our algorithm estimates the gradients blocked by the non-differentiable assignment function, enabling end-to-end training of Deep K-means.
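The hard cluster assignment in K-means (an argmin over distances) blocks gradients to both the features and the centers. One standard way to estimate those gradients is to relax the assignment into a softmax over negative distances; this sketch uses that relaxation purely for illustration, and the talk's actual estimator may differ. The temperature `alpha` is an assumed parameter.

```python
import numpy as np

def soft_kmeans_loss(z, centers, alpha=10.0):
    """Hedged sketch: a differentiable surrogate for the K-means
    assignment step in a Deep K-means objective.

    z: (N, D) features from the extractor.
    centers: (K, D) learnable cluster centers.
    Returns the soft clustering loss and the hard assignments.
    """
    # Squared distances between every feature and every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (N, K)
    # Soft assignment: softmax over negative scaled distances.
    # Unlike a hard argmin, gradients flow to both z and centers,
    # so the whole pipeline can be trained end-to-end.
    a = np.exp(-alpha * d2)
    a /= a.sum(axis=1, keepdims=True)
    loss = (a * d2).sum() / len(z)
    return loss, a.argmax(axis=1)
```

As `alpha` grows, the soft assignments approach the hard argmin, so the surrogate loss approaches the standard K-means objective while remaining differentiable.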