The paper “K-RET: Knowledgeable Biomedical Relation Extraction System”, authored by LASIGE’s PhD student Diana F. Sousa, has been published in Bioinformatics, a top-ranked journal (h5-index 142; Scimago Q1). The paper’s co-author is LASIGE’s integrated researcher Francisco M. Couto.
Relation Extraction (RE) is crucial for coping with the volume of text published daily, for example, to find associations missing from a database. RE is a text mining task whose state-of-the-art approaches use bidirectional encoders, namely BERT. However, state-of-the-art performance may be limited by the lack of efficient approaches for injecting external knowledge, a limitation with greater impact in the biomedical domain given the widespread use and high quality of biomedical ontologies. This knowledge can propel these systems forward by helping them predict more explainable biomedical associations. With this in mind, the authors developed K-RET, a novel, knowledgeable biomedical relation extraction system that, for the first time, injects knowledge while handling different types of associations, multiple knowledge sources, the choice of where to apply that knowledge, and multi-token entities.
The researchers tested K-RET on three independent, open-access corpora (DDI, BC5CDR, and PGR) using four biomedical ontologies that cover different entity types. K-RET improved state-of-the-art results by 2.68% on average. The DDI Corpus showed the largest gain, with F-measure rising from 79.30% to 87.19% (p-value of 2.91 × 10⁻¹²). The code supporting this system is available here.
The paper is available in early access here.