Table 7 Performance measures used in the included studies

From: Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies

Description

Formula

n (%)

References

Confusion Matrix

Lists the True Positives (TP), True Negatives (TN), False Positives (FP), False Negatives (FN), and the total number of observations (n) in a 2 × 2 contingency table.

TP: Text annotated with the ontology concept when the concept is present in the reference standard

TN: Text not annotated with the ontology concept when the concept is absent from the reference standard

FP: Text annotated with the ontology concept when the concept is absent from the reference standard

FN: Text not annotated with the ontology concept when the concept is present in the reference standard

12 (16%)

[34, 44, 47, 51, 56, 58, 60, 61, 84, 87, 91, 93]
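
As an illustration of the four definitions above, the following minimal sketch tallies TP, TN, FP, and FN for a single ontology concept. The function name `confusion_counts`, the boolean-list input format, and the toy data are assumptions made for this example and do not come from any of the included studies.

```python
def confusion_counts(predicted, reference):
    """Tally TP, TN, FP, FN for one ontology concept.

    predicted[i] is True when the system annotated text fragment i with the concept.
    reference[i] is True when the reference standard contains the concept there.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            counts["TP"] += 1      # annotated, concept present
        elif pred and not ref:
            counts["FP"] += 1      # annotated, concept absent
        elif not pred and ref:
            counts["FN"] += 1      # not annotated, concept present
        else:
            counts["TN"] += 1      # not annotated, concept absent
    counts["n"] = len(predicted)   # total number of observations
    return counts


# Hypothetical toy data: five text fragments, one concept.
predicted = [True, True, False, False, True]
reference = [True, False, False, True, True]
counts = confusion_counts(predicted, reference)
```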

Performance measures

Recall

\( \frac{TP}{FN+ TP} \)

68 (88%)

[11, 12, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 56, 57, 58, 60, 61, 62, 63, 64, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 90, 91, 92, 93, 94, 96, 99, 100, 101, 102, 103, 104]

Precision

\( \frac{TP}{FP+ TP} \)

66 (86%)

[11, 12, 29, 30, 31, 33, 34, 35, 36, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 53, 56, 57, 58, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 90, 91, 93, 94, 96, 99, 100, 101, 102, 103, 104]

F-score

\( 2\cdot \frac{\mathrm{Precision}\cdot \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}} \)

57 (74%)

[11, 12, 30, 31, 33, 34, 35, 36, 39, 40, 41, 44, 46, 47, 48, 49, 50, 52, 53, 55, 57, 58, 59, 60, 61, 62, 63, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78, 79, 80, 82, 83, 84, 86, 87, 88, 90, 91, 95, 96, 98, 99, 100, 102, 103, 104]

Accuracy

\( \frac{TP+ TN}{n} \)

11 (14%)

[30, 32, 34, 41, 48, 52, 67, 74, 78, 92, 96]

Specificity

\( \frac{TN}{FP+ TN} \)

6 (7.8%)

[29, 34, 85, 92, 93, 96]
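
The five ratio measures listed up to this point (recall, precision, F-score, accuracy, specificity) can all be derived from the confusion-matrix counts. The sketch below shows one way to compute them. The helper names `ratio` and `basic_measures`, the choice to return NaN for a zero denominator, and the example counts are assumptions for illustration only.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def basic_measures(tp, tn, fp, fn):
    """Compute the ratio measures from the table using confusion-matrix counts."""
    recall = ratio(tp, fn + tp)
    precision = ratio(tp, fp + tp)
    return {
        "recall": recall,
        "precision": precision,
        # Harmonic mean of precision and recall.
        "f_score": ratio(2 * precision * recall, precision + recall),
        # n is taken to be the sum of the four cells.
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "specificity": ratio(tn, fp + tn),
    }


# Hypothetical counts, not taken from any included study.
measures = basic_measures(tp=8, tn=6, fp=2, fn=4)
```

With these example counts, precision is 0.8, accuracy is 0.7, and specificity is 0.75. Returning NaN rather than 0 for an empty denominator is a design choice that keeps undefined measures visibly distinct from genuinely poor ones.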

AUC

Not applicable

5 (6.5%)

[29, 39, 57, 95, 99]

Kappa

\( \frac{p_o-p_e}{1-p_e}=1-\frac{1-p_o}{1-p_e} \)

3 (3.9%)

[85, 89, 97]
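
The kappa formula compares the observed agreement p_o with the agreement p_e expected by chance. The table does not state which kappa variant the studies used, so the sketch below assumes Cohen's kappa for two annotators who each assign one label per item. The function name `cohen_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items (assumed variant)."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both label lists must be non-empty and of equal length")
    # Observed agreement: share of items on which the annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical example with 50 items: 20 yes/yes, 5 yes/no, 10 no/yes, 15 no/no.
labels_a = ["yes"] * 25 + ["no"] * 25
labels_b = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15
kappa = cohen_kappa(labels_a, labels_b)
```

In this example p_o = 0.7 and p_e = 0.5, which gives a kappa of approximately 0.4.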

Processing time

Not applicable

3 (3.9%)

[32, 47, 83]

Negative Predictive Value

\( \frac{TN}{FN+ TN} \)

3 (3.9%)

[29, 85, 93]

False Positive Rate

\( \frac{FP}{FP+ TN} \)

1 (1.3%)

[34]

False Negative Rate

\( \frac{FN}{TP+ FN} \)

1 (1.3%)

[34]

Information entropy

\( -\sum_{i=1}^{n} P_i \log\left(P_i\right) \)

1 (1.3%)

[64]
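
A minimal sketch of the entropy formula follows. The table does not specify the base of the logarithm, so the default of base 2 (entropy in bits) is an assumption, as are the function name `information_entropy` and the example distribution. Terms with P_i = 0 are skipped, following the usual convention that they contribute nothing.

```python
import math


def information_entropy(probabilities, base=2.0):
    """Entropy of a discrete distribution; the log base is an assumption."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Skip zero-probability outcomes, which contribute nothing to the sum.
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


# Hypothetical distribution over three candidate concepts.
entropy_bits = information_entropy([0.5, 0.25, 0.25])
```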

Mean Reciprocal Rank

\( \frac{1}{Q}\sum_{i=1}^{Q}\frac{1}{\operatorname{rank}_i} \)

1 (1.3%)

[74]
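
The Mean Reciprocal Rank averages, over Q queries, the reciprocal of the rank at which the first correct result appears. In the sketch below, the function name `mean_reciprocal_rank`, the use of `None` for a query with no correct result, and the decision to count such a query as contributing 0 are assumptions for illustration.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries.

    ranks[i] is the 1-based rank of the first correct concept for query i,
    or None when no correct concept was returned (assumed to contribute 0).
    """
    if not ranks:
        raise ValueError("at least one query is required")
    total = 0.0
    for rank in ranks:
        if rank is None:
            continue
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    return total / len(ranks)


# Hypothetical ranks for four queries; the last one found no correct concept.
mrr = mean_reciprocal_rank([1, 2, 4, None])
```

For these four queries the reciprocal ranks are 1, 0.5, 0.25, and 0, giving an MRR of 0.4375.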

Initial annotator agreement

Not applicable

1 (1.3%)

[79]

Match/no match (%)

Not applicable

1 (1.3%)

[89]

Overgeneration

\( \frac{FP}{TP+ FP} \)

1 (1.3%)

[93]

Undergeneration

\( \frac{FN}{TP+ FN} \)

1 (1.3%)

[68]

Error

\( \frac{FN+ FP}{TP+ FN+ FP} \)

1 (1.3%)

[68]

Fallout

\( \frac{FP}{TN+ FP} \)

1 (1.3%)

[68]

Mean Squared Error

\( \frac{1}{n}\sum_{i=1}^{n}\left(Y_i-\hat{Y}_i\right)^2 \)

1 (1.3%)

[57]
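
The formula in the final row averages the squared differences between the observed values Y_i and the predicted values Ŷ_i. A minimal sketch, assuming two equal-length lists of numbers, could look as follows. The function name `mean_squared_difference` and the example values are hypothetical.

```python
def mean_squared_difference(observed, predicted):
    """Average of (Y_i - Y_hat_i)^2 over all n observations."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("both lists must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / n


# Hypothetical values: only the third prediction is off, by 2.
error = mean_squared_difference([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```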