Sharon Goldwater
Fri 04 Nov 2016, 11:00 - 12:30
Informatics Forum (IF-4.31/4.33)

If you have a question about this talk, please contact: Diana Dalla Costa (ddallac)


(Note: This is a practice talk for my Needham Award talk, a public lecture which I will be delivering on 21 Nov at the Royal Society in London. Please come along to give me feedback!)

Computer processing of speech and language has advanced enormously in the last decade, with many people now using applications such as automatic translation, voice-activated search, and even language-enabled personal assistants. Yet these systems still lag far behind human capabilities, and the success they do have relies on machine learning methods that learn from very large quantities of human-annotated data (for example, speech data with transcriptions, or text labelled with syntactic parse trees). These resource-intensive methods mean that effective technology is available for only a tiny fraction of the world's 5000 or more languages, mainly those spoken in large, rich countries. This talk will argue that to solve this problem, we need a better understanding of how humans learn and represent language in the mind, and we need to consider how human-like learning biases can be built into computational systems. I will illustrate these ideas with examples from my own research: I will discuss why language is such a difficult problem, say a bit about what we know about human language learning, and then show how my own work has taken inspiration from that research to develop better methods for computational language learning.