Table 2 Coverage of term recognition for concepts and relations in experimental data

From: Ranking relations between diseases, drugs and genes for a curation task

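PharmGKB data set

| Entity | N | t abs | t rel (%) | tm abs | tm rel (%) | tc abs | tc rel (%) | tmc abs | tmc rel (%) |
|---|---|---|---|---|---|---|---|---|---|
| Di | 3830 | 2550 | 66.58 | 2872 | 74.99 | 2557 | 66.76 | 2872 | 74.99 |
| Dr | 4751 | 3527 | 74.24 | 3632 | 76.45 | 3668 | 77.20 | 3668 | 77.20 |
| Ge | 7522 | 5838 | 77.61 | 5930 | 78.84 | 5989 | 79.62 | 5994 | 79.69 |
| TOTAL | 16103 | 11915 | 73.99 | 12434 | 77.22 | 12214 | 75.85 | 12534 | 77.84 |
| Di-Di | 68 | 22 | 32.35 | 24 | 35.29 | 22 | 32.35 | 24 | 35.29 |
| Di-Dr | 2715 | 1279 | 47.11 | 1484 | 54.66 | 1326 | 48.84 | 1494 | 55.03 |
| Di-Ge | 5432 | 3102 | 57.11 | 3555 | 65.45 | 3181 | 58.56 | 3585 | 66.00 |
| Dr-Dr | 181 | 128 | 70.72 | 132 | 72.93 | 135 | 74.59 | 135 | 74.59 |
| Dr-Ge | 6181 | 3858 | 62.42 | 4016 | 64.97 | 4097 | 66.28 | 4099 | 66.32 |
| Ge-Ge | 248 | 141 | 56.85 | 142 | 57.26 | 145 | 58.47 | 145 | 58.47 |
| TOTAL | 14825 | 8530 | 57.54 | 9353 | 63.09 | 8906 | 60.07 | 9482 | 63.96 |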

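CTD data set

| Entity | N | t abs | t rel (%) | tm abs | tm rel (%) | tc abs | tc rel (%) | tmc abs | tmc rel (%) |
|---|---|---|---|---|---|---|---|---|---|
| Di | 12639 | 6939 | 54.90 | 9502 | 75.18 | 6941 | 54.92 | 9502 | 75.18 |
| Dr | 38523 | 27541 | 71.49 | 29531 | 76.66 | 30119 | 78.18 | 30129 | 78.21 |
| Ge | 39150 | 28389 | 72.51 | 28975 | 74.01 | 29169 | 74.51 | 29199 | 74.58 |
| TOTAL | 90312 | 62869 | 69.61 | 68008 | 75.30 | 66229 | 73.33 | 68830 | 76.21 |
| Di-Ge | 6956 | 4117 | 59.19 | 5100 | 73.32 | 4163 | 59.85 | 5126 | 73.69 |
| Dr-Di | 12154 | 5335 | 43.90 | 8219 | 67.62 | 5700 | 46.90 | 8356 | 68.75 |
| Dr-Ge | 52746 | 31015 | 58.80 | 33971 | 64.40 | 34832 | 66.04 | 34883 | 66.13 |
| TOTAL | 71856 | 40467 | 56.32 | 47290 | 65.81 | 44695 | 62.20 | 48365 | 67.31 |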

  1. Distribution of identifiable gold standard concepts and relations given the output from our term recognizer and split according to the inclusion of metadata: text only (t), text and MeSH terms (tm), text and chemical substance list (tc), text and all metadata (tmc).
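
The relative figures in the table appear to be the absolute count divided by N, expressed as a percentage. The following minimal sketch checks that reading against the PharmGKB Di row; the function name `relative_coverage` and the dictionary layout are illustrative assumptions, not part of the original work.

```python
def relative_coverage(recognized: int, total: int) -> float:
    """Percentage of gold standard items that were recognized (assumed: abs / N * 100)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * recognized / total, 2)


# Absolute counts for the PharmGKB "Di" row (N = 3830), keyed by metadata setting.
# The dictionary layout is hypothetical; the numbers are copied from the table.
pharmgkb_di = {"t": 2550, "tm": 2872, "tc": 2557, "tmc": 2872}

for setting, recognized in pharmgkb_di.items():
    print(setting, relative_coverage(recognized, 3830))
```

Running the same function on any other row should reproduce its relative columns if the assumed formula holds.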