University Seminar Site

Roberto Navigli
Mon 05 Sep 2016, 11:00 - 12:30
Informatics Forum (IF-4.31/4.33)

If you have a question about this talk, please contact: Diana Dalla Costa (ddallac)

ABSTRACT:

In this talk I will present different kinds of representations of word senses and concepts. I will start with latent representations: sense embeddings obtained by applying word2vec to the Wikipedia corpus, sense-tagged with a multilingual disambiguation algorithm based on BabelNet. BabelNet is the largest multilingual semantic network and encyclopedic dictionary, covering 14 million concepts and entities in 271 languages.

I will then move on to two explicit vector representations of meaning (NASARI), based on lexical co-occurrence and multilingual semantic generalization, respectively, and a third latent version obtained from the word embeddings of the lexical vector.

Experimental results on several tasks, including word similarity, sense clustering, identification of sense predominance, and word sense disambiguation, demonstrate high performance and show that, whenever a comparison is possible, sense representations consistently outperform word representations.

This is joint work with José Camacho-Collados, Ignacio Iacobacci and Mohammad Taher Pilehvar.

BIOGRAPHY:

Roberto Navigli is an Associate Professor in the Department of Computer Science of the Sapienza University of Rome. He was awarded the Marco Somalvico 2013 AI*IA Prize for the best young researcher in AI. He is the first Italian recipient of an ERC Starting Grant in computer science, on multilingual word sense disambiguation (2011-2016), and a co-PI of a Google Focused Research Award on Natural Language Understanding. In 2015 he received the META prize for groundbreaking work in overcoming language barriers with BabelNet, a project also highlighted in TIME magazine this year. His research lies in the field of Natural Language Processing (including multilingual word sense disambiguation and induction, multilingual entity linking, large-scale knowledge acquisition, ontology learning from scratch, gamification for NLP, open information extraction and relation extraction). Currently he is an Associate Editor of the Artificial Intelligence Journal.

This talk is part of the Informatics: Institute for Language, Cognition and Computation/HCRC Seminar Series.

Monolingual and multilingual, explicit and latent vector representations of meaning