Table 5 Characteristics of the included studies

From: Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies

| Description | n (%) | References |
| --- | --- | --- |
| Main objective | | |
| Information extraction | 45 (58%) | [29, 32, 33, 34, 35, 36, 38, 40, 41, 42, 43, 44, 45, 49, 51, 58, 59, 60, 63, 64, 65, 66, 68, 69, 70, 72, 73, 75, 76, 78, 79, 80, 82, 84, 85, 86, 87, 89, 90, 94, 95, 100, 101, 103, 104] |
| Information enrichment | 9 (12%) | [30, 31, 39, 48, 50, 52, 56, 67, 81] |
| Classification | 8 (10%) | [11, 12, 53, 88, 92, 93, 96, 99] |
| Software development and evaluation | 6 (7.8%) | [37, 46, 47, 61, 83, 102] |
| Prediction | 4 (5.2%) | [57, 91, 97, 98] |
| Information comparison | 2 (2.6%) | [62, 77] |
| Computer-assisted coding | 2 (2.6%) | [55, 71] |
| Text processing | 1 (1.3%) | [74] |
| Part of challenge | | |
| i2b2 (Informatics for Integrating Biology and the Bedside) | 10 (13%) | [11, 44, 47, 58, 68, 69, 73, 76, 78, 83] |
| i2b2: entire system | 8 (10%) | [11, 44, 58, 68, 69, 73, 76, 78] |
| i2b2: parts of the system | 2 (2.6%) | [47, 83] |
| SemEval (Semantic Evaluation) | 2 (2.6%) | [41, 83] |
| SemEval: entire system | 1 (1.3%) | [41] |
| SemEval: parts of the system | 1 (1.3%) | [83] |
| ShARe/CLEF (Shared Annotated Resources/Conference and Labs of the Evaluation Forum) | 1 (1.3%) | [83] |
| ShARe/CLEF: parts of the system | 1 (1.3%) | [83] |
| Dataset: language | | |
| English | 60 (78%) | [11, 12, 29, 30, 32, 35, 37, 38, 39, 41, 42, 43, 44, 45, 46, 47, 49, 53, 55, 56, 58, 60, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 77, 78, 79, 80, 81, 83, 84, 85, 86, 89, 90, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104] |
| Spanish | 5 (6.5%) | [31, 36, 52, 74, 82] |
| French | 3 (3.9%) | [51, 87, 88] |
| German | 3 (3.9%) | [33, 34, 61] |
| Italian | 2 (2.6%) | [40, 43] |
| Portuguese | 2 (2.6%) | [48, 50] |
| Dutch | 1 (1.3%) | [57] |
| Japanese | 1 (1.3%) | [91] |
| Korean | 1 (1.3%) | [59] |
| Dataset: origin | | |
| Data present in institute | 55 (71%) | [12, 29, 31, 32, 34, 35, 36, 38, 39, 40, 42, 43, 45, 47, 48, 50, 51, 52, 53, 56, 57, 59, 60, 61, 62, 63, 64, 65, 66, 67, 70, 71, 74, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 88, 89, 91, 92, 93, 94, 96, 97, 99, 101, 102, 103] |
| Existing dataset | 25 (33%) | [11, 30, 33, 35, 37, 41, 44, 46, 49, 55, 58, 64, 68, 69, 72, 73, 75, 76, 83, 87, 90, 95, 98, 100, 104] |
| Included reference to dataset | 21 (27%) | [11, 30, 35, 37, 41, 44, 46, 49, 55, 58, 64, 72, 75, 76, 83, 87, 90, 95, 98, 100, 104] |
| Training of algorithm | | |
| Trained | 47 (61%) | [11, 12, 29, 31, 32, 34, 37, 39, 41, 42, 44, 45, 48, 49, 50, 51, 52, 53, 55, 56, 57, 58, 59, 62, 63, 65, 66, 68, 69, 73, 74, 76, 78, 79, 80, 81, 82, 83, 84, 87, 88, 90, 95, 96, 98, 99, 104] |
| Not listed | 3 (3.9%) | [30, 101, 102] |
| Development of algorithm | | |
| Use of development set | 16 (21%) | [12, 29, 31, 34, 37, 49, 55, 60, 63, 69, 74, 80, 87, 90, 94, 95] |
| Not listed | 4 (5.2%) | [30, 82, 83, 101] |
| Used NLP system or algorithm | | |
| New NLP system or algorithm | 29 (38%) | [31, 32, 37, 43, 45, 47, 48, 49, 50, 51, 52, 55, 57, 59, 68, 73, 74, 80, 82, 83, 85, 88, 89, 91, 94, 95, 100, 101, 102] |
| New NLP system or algorithm with existing components | 25 (33%) | [12, 29, 34, 39, 41, 42, 44, 46, 58, 60, 61, 62, 63, 66, 67, 69, 71, 75, 76, 78, 84, 87, 90, 98, 99] |
| Existing NLP system or algorithm | 23 (30%) | [11, 30, 33, 35, 36, 38, 40, 53, 56, 64, 65, 70, 72, 77, 79, 81, 86, 93, 96, 97, 103, 104] |
| Use in practice | | |
| Plans to implement / still under development and testing | 12 (16%) | [31, 33, 51, 56, 62, 66, 67, 68, 82, 91, 96, 101] |
| Implemented in practice | 10 (13%) | [34, 42, 43, 46, 47, 48, 78, 83, 87, 102] |
| Availability of code | | |
| Published algorithm or source code | 15 (20%) | [31, 45, 46, 47, 60, 78, 80, 82, 83, 84, 85, 87, 90, 97, 98] |
| Pseudocode in manuscript | 3 (3.9%) | [43, 56, 62] |
| Planning to publish algorithm or source code | 1 (1.3%) | [32] |
| Not applicable, used an existing system | 20 (26%) | [11, 30, 33, 35, 36, 38, 40, 53, 64, 65, 70, 72, 77, 79, 81, 86, 93, 96, 103, 104] |