Christian Hardmeier
Fri 24 Apr 2015, 11:00 - 12:00
Informatics Forum (IF-4.31/4.33)

If you have a question about this talk, please contact: Diana Dalla Costa (ddallac)


Translating pronouns is a challenge that current machine translation systems are ill-prepared to handle, both because it requires handling complex cross-sentence dependencies in the target language and because pronouns often cannot be translated literally, even in a very literal translation style. My talk focuses on the problem of modelling pronominal anaphora in the framework of phrase-based statistical machine translation (SMT). Pronoun translation is first cast as a classification task that is separate from SMT. I describe a neural network classifier for this task that models anaphoric links as latent variables and can be trained on parallel bitexts without explicit coreference annotations. Then I present Docent, my document-level local search decoder for phrase-based SMT, and show some results from integrating the anaphora classifier into this framework.


Christian Hardmeier is a post-doctoral researcher at Uppsala University in Sweden, where he also completed his PhD in 2014 with an award-winning thesis entitled "Discourse in Statistical Machine Translation". Before coming to Uppsala, he worked with the MT group at FBK in Trento (Italy) after obtaining an M.A. in Nordic philology from the University of Basel (Switzerland). His current research continues to focus on both the linguistic and the technical aspects of cross-sentence, discourse-level problems in machine translation.