Pontus Stenetorp
Fri 02 Feb 2018, 11:00 - 12:30
Informatics Forum (IF-4.31/4.33)

If you have a question about this talk, please contact: Diana Dalla Costa (ddallac)


Reading Comprehension (RC) is one of the most active areas of research in Natural Language Processing, yet existing RC datasets, such as SQuAD and TriviaQA, are dominated by queries that can be answered from the content of a single paragraph or document. Enabling models to combine pieces of textual information from different sources would drastically extend the scope of RC, effectively allowing models to answer queries whose answers are never explicitly stated in any single document. In this talk, we will introduce a novel multi-hop RC task [1], in which a model has to learn to find and combine disjoint pieces of textual evidence, effectively performing multi-step (multi-hop) inference. We present two datasets from different domains, WikiHop and MedHop, both constructed using a unified methodology. We will then discuss the behaviour of several baseline models, including two established end-to-end RC models, BiDAF and FastQA. We find that one of them can integrate information across documents, but both struggle to select relevant information. Overall, the end-to-end models outperform multiple baselines, but their best accuracy still falls far short of human performance, leaving ample room for improvement. It is our hope that these new datasets will drive future RC model development, leading to new and improved applications in areas such as search, question answering, and scientific text mining.


I am a researcher and educator who finds Natural Language Processing and Machine Learning research fascinating. Currently, I am a Senior Research Associate at University College London (UCL) and a member of the Machine Reading Group. I received a PhD from the University of Tokyo in 2013 and an MSc Eng from the Royal Institute of Technology (KTH) in 2010. My current research focuses on end-to-end Deep Learning models that learn with a minimal amount of human supervision, with a particular emphasis on enabling computers to pass real-world exams, and is supported by the Paul G. Allen Family Foundation.