Wayne Hayes
Tue 27 Sep 2016, 11:00 - 12:00
IF 4.31/4.33

If you have a question about this talk, please contact: Gareth Beedham (gbeedham)

Biological Network Alignment Comes of Age

Biological network alignment has the potential to be as useful as sequence alignment has in relation to learning about biology, evolution, and disease.  Although about two dozen network alignment algorithms have been proposed, none as yet have proven to fulfill this potential, due to many shortcomings.  Some of these shortcomings include: lack of knowledge about how to best use network topology to recover biological information (EC? S3? Graphlets? Spectral?); how to balance biological information such as sequence against topological information; confusion in the literature between an alignment algorithm and the objective function used to guide the alignment, as well as confusion between how to produce the alignment vs. how to measure its quality post-alignment; lack of a good multiple network alignment algorithm; lack of an effective method to eliminate the 1-to-1 nature of global network alignment, since 1-to-1 mappings are not faithful to the evolutionary relationship between biological entities such as genes and proteins; and finally, due to the NP-complete nature of the problem, a lack of knowledge about how far we are from producing the best alignments possible.

In this talk, I will introduce a novel method that already solves some of these problems and for which there is a clear path towards solving all of the others listed above, and more. We clearly delineate the measure(s) M that measure the quality of an alignment, from the algorithm S that searches the space of all alignments looking for good ones according to M. This allows us to directly compare many measures M. We also demonstrate that our new algorithm S outperforms all existing algorithms by all the various measures M that we've tried.  Furthermore, we demonstrate that on synthetic networks where the answer is known, our algorithm recovers the best possible value of all objective functions we've tried --- and thus demonstrate that none of these objective functions are capable of actually recovering the best solution because none of them have sufficiently high correlation with the only measure we're actually interested in -- biological correctness.

The path towards having network alignment add significantly to biological knowledge is still not clear, but we argue that our algorithm is the first to show significant promise.