
Table 4. Evaluation of the subgraph matching and co-occurrence baseline approaches for protein-residue relation extraction on the silver corpus.

From: Literature mining of protein-residue associations with graph rules learned through distant supervision

| Corpus | Approach | Precision (%) | Recall (%) | F-measure (%) |
|---|---|---|---|---|
| Development Corpus | E+P+A | 80.26 | 77.05 | 78.62 |
| | E+P*+A* | 79.10 | 78.10 | 78.60 |
| | E+P+A+Rule ranking | 81.20 | 76.42 | 78.74 |
| | E+P*+A*+Rule ranking | 79.35 | 77.68 | 78.51 |
| | Sentence co-occurrence baseline | 59.45 | 100 | 75.28 |
| Test Corpus | E+P+A | 84.07 | 79.43 | 81.69 |
| | E+P*+A* | 82.72 | 80.10 | 81.39 |
| | E+P+A+Rule ranking | 86.83 | 78.26 | 82.32 |
| | E+P*+A*+Rule ranking | 83.60 | 78.43 | 80.93 |
| | Sentence co-occurrence baseline | 62.42 | 100 | 76.86 |
| | Approximate subgraph matching (ASM) with distance threshold 0.6 | 81.96 | 86.62 | 84.22 |

E+P+A: match edge labels, parts of speech, and all tokens; E+P+A*: match only edge labels and parts of speech.
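
The F-measure column appears to be the balanced F-measure, i.e. the harmonic mean of precision and recall. The table itself does not state the formula, so this is an assumption. A minimal Python sketch under that assumption is given here; the function name `f_measure` and the `rows` dictionary are illustrative only and do not come from the article. The precision and recall values are copied from three Test Corpus rows.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent).

    Assumes the balanced F-measure (F1); the table does not state the formula.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the Test Corpus rows of the table.
rows = {
    "E+P+A+Rule ranking": (86.83, 78.26),
    "Sentence co-occurrence baseline": (62.42, 100.0),
    "ASM, distance threshold 0.6": (81.96, 86.62),
}

for name, (p, r) in rows.items():
    # Values may differ from the table in the last digit because the
    # published precision and recall are themselves rounded.
    print(f"{name}: {f_measure(p, r):.2f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, the co-occurrence baseline's 100% recall does not compensate for its lower precision, which is why its F-measure stays below that of the subgraph matching configurations.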