Fig. 2 | Journal of Biomedical Semantics

Fig. 2

From: Using predicate and provenance information from a knowledge graph for drug efficacy screening

Schematic overview of the feature extraction and classification process. For readability, this overview figure shows the process for predicates only. The input set contains the combinations of drug targets (DT) and disease proteins (DP) that are to be classified. Step 1: Extract paths. The paths between drug targets and disease proteins are extracted from the knowledge graph. Paths can be direct or indirect. Indirect paths have one intermediate protein (IP) and are split into two steps: DTIP (drug target – intermediate protein) and IPDP (intermediate protein – disease protein). Step 2: Extract features. The feature set consists of all possible predicates and provenance for each of the three scenarios (cf. Fig. 1). Based on the paths extracted for a combination, each feature is marked as present or absent. Step 3: Classify. Based on the extracted features, the combinations are classified by a random forest classifier.
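Steps 1 and 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the knowledge graph is available as directed (subject, predicate, object) triples, it covers predicates only (as the figure does), and every protein name, predicate name, and function name below is hypothetical.

```python
from itertools import product

# Toy knowledge graph as (subject, predicate, object) triples.
# All entity and predicate names are invented for illustration.
TRIPLES = [
    ("DT1", "inhibits", "DP1"),
    ("DT1", "binds", "IP1"),
    ("IP1", "activates", "DP1"),
    ("DT2", "binds", "IP1"),
]

# The three scenarios: a direct edge, and the two halves of an indirect path.
SCENARIOS = ("direct", "DTIP", "IPDP")


def extract_paths(triples, dt, dp):
    """Step 1: collect the predicates seen on direct and one-intermediate paths."""
    found = {scenario: set() for scenario in SCENARIOS}
    # Edges leaving the drug target.
    outgoing = [(p, o) for s, p, o in triples if s == dt]
    # Edges arriving at the disease protein, grouped by their source node.
    incoming = {}
    for s, p, o in triples:
        if o == dp:
            incoming.setdefault(s, set()).add(p)
    for pred, obj in outgoing:
        if obj == dp:
            found["direct"].add(pred)
        elif obj != dt and obj in incoming:
            # obj acts as the intermediate protein (IP).
            found["DTIP"].add(pred)
            found["IPDP"].update(incoming[obj])
    return found


def feature_names(triples):
    """Step 2a: one feature per (scenario, predicate) pair."""
    predicates = sorted({p for _, p, _ in triples})
    return list(product(SCENARIOS, predicates))


def feature_vector(triples, dt, dp):
    """Step 2b: mark each feature as present (1) or absent (0)."""
    found = extract_paths(triples, dt, dp)
    return [int(pred in found[scenario]) for scenario, pred in feature_names(triples)]


if __name__ == "__main__":
    for name, value in zip(feature_names(TRIPLES), feature_vector(TRIPLES, "DT1", "DP1")):
        print(name, value)
```

For Step 3, the resulting binary vectors, together with known labels, could be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`. Provenance features could be built the same way by attaching a source to each triple and enumerating (scenario, source) pairs.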
