Lucian Popa
Fri 05 Oct 2018, 10:00 - 11:00
MF2, School of Informatics

If you have a question about this talk, please contact: Heng Guo (hguo)

Entity resolution is a key form of reasoning that makes it possible to establish
explicit connections among entities across heterogeneous
datasets. Such connections can represent "same-as" links between
different representations of the same real-world entity or, more
generally, can represent various types of relationships among
entities. Along with other ubiquitous operations such as information
extraction, data transformation and fusion, entity resolution is a
crucial step for building high-value, domain-specific knowledge bases
from raw data. In this talk, I will describe our work at IBM Research
- Almaden toward better abstractions and tools for entity
resolution. First, I will describe a declarative approach that uses
constraints and provides a logical foundation for reasoning about
various types of entity linking specifications and their expressive
power. This also forms the theoretical underpinning for a concrete
high-level language that is used in production by IBM. I will then
talk about human-in-the-loop active learning techniques that further
reduce the human effort needed to obtain high-accuracy entity resolution
algorithms in concrete application scenarios.

Lucian Popa is a Principal Research Staff Member and Manager at IBM
Research - Almaden, which he joined in 2000 after receiving his PhD in
Computer Science from the University of Pennsylvania. He is known for
his work on data exchange, schema mapping and, more recently, entity
resolution. At IBM, he has contributed to several products and leads
a research team focused on human-in-the-loop systems for structured
knowledge creation and learning.