Health and Biomedical Informatics Meetups are monthly gatherings of LASIGE members with interests in Bioinformatics, Chemoinformatics, Text Mining, Biomedical Decision Systems and related topics.
Title: Multi-domain semantic similarity
Presenter: João D Ferreira
Background: Given the increasing number of biomedical resources annotated with concepts from more than one ontology and covering multiple domains of knowledge, it is important to devise mechanisms for comparing these resources that take into account all of their domains of annotation. For example, metabolic pathways are annotated with their enzymes and their metabolites, and thus similarity measures should compare them with respect to both of those domains simultaneously.
Results: In this paper, we propose two approaches to lift existing single-ontology semantic similarity measures into multi-domain measures. The aggregative approach compares domains independently and averages the various similarity values into a final score. The integrative approach integrates all the relevant ontologies into a single one, calculating similarity in the resulting multi-domain ontology using the single-ontology measure.
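The two approaches can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy enzyme and metabolite ontologies, the helper names, the shared "ROOT" node used to join the ontologies, and the stand-in single-ontology measure (a Jaccard index over ancestor-closed term sets). Any single-ontology measure could take its place.

```python
from statistics import mean

def ancestors(term, parents):
    """Return the term together with all of its ancestors."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen

def closure(terms, parents):
    """Ancestor-closed set of a collection of annotation terms."""
    out = set()
    for t in terms:
        out |= ancestors(t, parents)
    return out

def single_sim(terms_a, terms_b, parents):
    """Stand-in single-ontology measure (assumption): Jaccard index."""
    a, b = closure(terms_a, parents), closure(terms_b, parents)
    return len(a & b) / len(a | b) if a | b else 0.0

def aggregative(ann_a, ann_b, ontologies):
    """Compare each domain independently, then average the scores."""
    return mean(
        single_sim(ann_a.get(d, set()), ann_b.get(d, set()), ontologies[d])
        for d in ontologies
    )

def integrative(ann_a, ann_b, ontologies):
    """Merge all ontologies under a shared root, then compare once."""
    merged = {}
    for parents in ontologies.values():
        for term, ps in parents.items():
            # Domain roots (no parents) are attached to a hypothetical ROOT.
            merged[term] = list(ps) or ["ROOT"]
    all_a = set().union(*ann_a.values())
    all_b = set().union(*ann_b.values())
    return single_sim(all_a, all_b, merged)

# Toy ontologies: each term maps to its list of parents (roots map to []).
ontologies = {
    "enzymes": {"enzyme": [], "kinase": ["enzyme"],
                "hexokinase": ["kinase"], "phosphatase": ["enzyme"]},
    "metabolites": {"metabolite": [], "sugar": ["metabolite"],
                    "glucose": ["sugar"], "fructose": ["sugar"]},
}
pathway_a = {"enzymes": {"hexokinase"}, "metabolites": {"glucose"}}
pathway_b = {"enzymes": {"kinase"}, "metabolites": {"fructose"}}

agg_score = aggregative(pathway_a, pathway_b, ontologies)
int_score = integrative(pathway_a, pathway_b, ontologies)
print(round(agg_score, 3), round(int_score, 3))
```

In this toy example the two strategies give different scores for the same pair of pathways, because the integrative variant counts shared ancestors across both domains at once while the aggregative variant weights each domain equally.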
Conclusions: We evaluated the two approaches on a multidisciplinary epidemiology dataset by measuring the capacity of the similarity measures to predict new annotations based on the existing ones. The results show that the multi-domain measures outperform the single-ontology ones in the vast majority of cases, and that they should be considered by the community as a starting point for studying more efficient multi-domain semantic similarity measures.
[Joint work with Francisco M. Couto presented at DTMBio, a workshop in CIKM 2018]