
Table 2 Performance assessment results of the Whatizit ANA module

From: Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles

| Database | Evaluation | #TP (New / Old) | #FP (New / Old) | #FN (New / Old) | Precision % (New / Old) | Recall % (New / Old) | F-score % (New / Old) |
|----------|------------|-----------------|-----------------|-----------------|-------------------------|----------------------|------------------------|
| ENA      | Automatic  | 276 / 267 | 10 / 7  | 170 / 181 | 96.50 / 97.45 | 61.88 / 59.60 | 75.41 / 73.96 |
| ENA      | Manual     | 286 / 274 | 0 / 0   | 170 / 181 | 100 / 100     | 62.72 / 60.22 | 77.10 / 75.17 |
| UniProt  | Automatic  | 574 / 569 | 28 / 8  | 39 / 39   | 95.35 / 98.61 | 93.64 / 93.59 | 94.49 / 96.03 |
| UniProt  | Manual     | 601 / 577 | 1 / 0   | 39 / 39   | 99.83 / 100   | 93.91 / 93.67 | 96.78 / 96.73 |
| PDBe     | Automatic  | 568 / 529 | 32 / 30 | 12 / 50   | 94.67 / 94.63 | 97.93 / 91.36 | 96.27 / 92.97 |
| PDBe     | Manual     | 620 / 559 | 0 / 0   | 12 / 50   | 100 / 100     | 98.10 / 91.79 | 99.04 / 95.72 |

  1. TP: True Positive, FP: False Positive, FN: False Negative; Old: old Whatizit-ANA settings, New: new Whatizit-ANA settings.
  2. Manual and automatic evaluation: in the automatic evaluation, we estimated the performance of the tool by assuming that the publisher-supplied accession numbers in the articles are a gold standard for annotation. However, when we manually analysed the false-positive annotations produced by our pipeline, we found that the accession numbers supplied in the articles (the annotations assumed to be the gold standard in the automatic evaluation) are not always complete or correct. Annotations made by our tool that were not already annotated in the article were therefore counted as false positives in the automatic evaluation, but on manual inspection some of them could be reassigned as true positives.
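
The precision, recall, and F-score columns follow the standard definitions computed from the TP/FP/FN counts in the table. The short sketch below (plain Python, not part of the original Whatizit-ANA pipeline) illustrates these definitions; the example counts are taken from the ENA automatic-evaluation row under the new settings.

```python
# Standard precision/recall/F-score from TP/FP/FN counts, as reported in Table 2.

def evaluate(tp: int, fp: int, fn: int) -> dict:
    """Return precision, recall, and F-score (as percentages) from raw counts."""
    precision = tp / (tp + fp)   # fraction of predicted annotations that are correct
    recall = tp / (tp + fn)      # fraction of gold-standard annotations recovered
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {name: round(value * 100, 2)
            for name, value in
            {"precision": precision, "recall": recall, "f_score": f_score}.items()}

if __name__ == "__main__":
    # ENA, automatic evaluation, new Whatizit-ANA settings (Table 2).
    print(evaluate(tp=276, fp=10, fn=170))
    # -> {'precision': 96.5, 'recall': 61.88, 'f_score': 75.41}
```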