
Table 4 Included publications and their evaluation methodologies

From: Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies

| Author | Year | Reference standard | Validation | External validation | Generalizability (a) | Ref. |
|---|---|---|---|---|---|---|
| Afshar | 2019 | Existing EHR data | Hold-out validation (train, test, development) | No | No, validation is needed | [29] |
| Alnazzawi | 2016 | Existing annotated corpus | External | ShARe/CLEF, NCBI disease, Heart failure and pulmonary embolism corpora | Yes, achieves competitive performance on other corpora | [30] |
| Atutxa | 2018 | Manual retrospective review | Hold-out validation (train, test, development) | No | Yes, easily portable to other languages | [31] |
| Barrett | 2013 | Manual annotations | 10-fold cross validation | Multiple datasets (different provider) | Yes, expect that it is generalizable | [32] |
| Becker | 2016 | Existing annotated corpus | Not used | No | Not listed | [33] |
| Becker | 2019 | Manual annotations | Hold-out validation (train, test, development) | No | Not listed | [34] |
| Bejan | 2015 | Manual annotations | External | i2b2 data (2010) | Yes, good performance on the i2b2 dataset, even though not optimized on it | [35] |
| Castro | 2010 | Manual annotations | Not used | No | Not listed | [36] |
| Catling | 2018 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [37] |
| Chapman | 2004 | Manual annotations | Not used | No | Yes, generalizable to other domains within and outside of biosurveillance | [38] |
| Chen | 2016 | Manual annotations | 10-fold cross validation | No | Not listed | [39] |
| Chiaramello | 2016 | Manual annotations | Not used | No | Not listed | [40] |
| Chodey | 2016 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [41] |
| Chung | 2005 | Manual annotations | Hold-out validation (train, test) | Reports from a second hospital | Not listed | [42] |
| Combi | 2018 | Manual annotations | Not used | No | Not listed | [43] |
| deBruijn | 2011 | Existing annotated corpus | 15-fold cross validation | No | Not listed | [44] |
| Deisseroth | 2019 | Manual annotations | Hold-out validation (train, test) | Data from a second hospital | Yes, it can be immediately incorporated into clinical practice | [45] |
| Demner-Fushman | 2017 | Existing annotated corpus | External | Multiple datasets | Not listed | [46] |
| Divita | 2014 | Manual annotations | Not used | No | Not listed | [47] |
| Duarte | 2018 | Manual annotations | Hold-out validation (train, test) | Second dataset | Not listed | [48] |
| Falis | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Yes, method is not specific to an ontology, and could be used for a graph of any formation | [49] |
| Ferrão | 2013 | Existing EHR data | Hold-out validation (train, test) | No | Not listed | [50] |
| Gerbier | 2011 | Manual annotations | Hold-out validation (train, test) | No | Yes, it could also serve other types of clinical decision support systems | [51] |
| Goicoechea Salazar | 2013 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [52] |
| Hamid | 2013 | Manual annotations | 10-fold cross validation | No | Possible, the classifier may be applicable in academic hospital samples | [53] |
| Hassanzadeh | 2016 | Existing annotated corpus | Hold-out validation (train, test) | No | Not applicable | [54] |
| Helwe | 2017 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [55] |
| Hersh | 2001 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [56] |
| Hoogendoorn | 2015 | Existing EHR data | 5-fold cross validation | No | Not listed | [57] |
| Jindal | 2013 | Existing annotated corpus | Hold-out validation (train, test) | No | Yes, broad applicability | [58] |
| Kang | 2009 | Manual annotations | Hold-out validation (train, test) | No | Yes, extensible to other languages | [59] |
| Kersloot | 2019 | Manual annotations | Hold-out validation (development, test) | No | Possible, but external validation is needed | [60] |
| König | 2019 | Existing EHR data | Not used | No | Still to be tested | [61] |
| Li | 2015 | Manual annotations | 10-fold cross validation | No | Not listed | [62] |
| Li | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [63] |
| Lingren | 2016 | Manual annotations | Hold-out validation (train, test, development) | No | Not listed | [12] |
| Liu | 2019 | Manual annotations | Not used | No (but multiple datasets / non-trained) | No, limited because of NYP/CUIMC and Mayo notes | [64] |
| Lowe | 2009 | Manual retrospective review | Hold-out validation (train, test) | No | Yes, has the potential to index other classes of clinical documents | [65] |
| Luo | 2014 | Existing EHR data | 10-fold cross validation | No | No, challenging, not currently working on it | [66] |
| Meystre | 2006 | Manual retrospective review | Not used | No | Not listed | [67] |
| Meystre | 2010 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [68] |
| Minard | 2011 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [69] |
| Mishra | 2019 | Manual annotations | Not used | No | Not listed | [70] |
| Nguyen | 2018 | Existing EHR data | Not listed | No | Not listed | [71] |
| Oellrich | 2015 | Existing annotated corpus | External | Multiple datasets | Not listed | [72] |
| Patrick | 2011 | Existing annotated corpus | 10-fold cross validation | No | Yes, adaptable to different requirements in clinical information extraction and classification by choosing relevant feature sets | [73] |
| Pérez | 2018 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Yes, extensible to different hospital-sections and hospitals | [74] |
| Reátegui | 2018 | Existing annotated corpus | Not used | No | Not listed | [75] |
| Roberts | 2011 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [76] |
| Rousseau | 2019 | Manual annotations | Not used | No | Not listed | [77] |
| Savova | 2010 | Manual annotations | 10-fold cross validation | No | Yes, implemented in several applications | [78] |
| Shivade | 2015 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [11] |
| Shoenbill | 2019 | Manual annotations | Hold-out validation (train, test) | No | Yes, can allow further evaluation and improvement in care delivery models and treatment approaches to multiple chronic illnesses | [79] |
| Sohn | 2014 | Manual annotations | Hold-out validation (train, test, development) | No | Yes, with adaptations: create flexible mechanism for adaptation process | [80] |
| Solti | 2008 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [81] |
| Soriano | 2019 | Manual annotations | Not listed | No | Not listed | [82] |
| Soysal | 2018 | Existing annotated corpus | Hold-out validation (train, test) | No | Yes, can be used to quickly develop customized clinical information extraction pipelines | [83] |
| Spasić | 2015 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [84] |
| Strauss | 2013 | Manual annotations | Not used | No | Yes, can be shared between institutions and used to support clinical + epidemiological research | [85] |
| Sung | 2018 | Manual annotations | Not listed | No | Not listed | [86] |
| Tchechmedjiev | 2018 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Yes, but not universally | [87] |
| Ternois | 2018 | Existing EHR data | 5-fold cross validation + Hold-out validation (train, test) | No | Not listed | [88] |
| Travers | 2004 | Manual retrospective review | Not used | No | Not listed | [89] |
| Tulkens | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [90] |
| Usui | 2018 | Manual annotations | Not used | No | Not listed | [91] |
| Valtchinov | 2019 | Manual annotations | Not used | No | No | [92] |
| Wadia | 2018 | Manual annotations | Not used | No | Not listed | [93] |
| Walker | 2019 | Manual retrospective review | Hold-out validation (development, test) | No | Yes, it can be incorporated in institutional data warehouse | [94] |
| Xie | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [95] |
| Xu | 2011 | Manual annotations | Hold-out validation (train, test) | No | Yes, generalizable approach to combine information from heterogeneous data sources in EHRs | [96] |
| Yadav | 2013 | Manual annotations | Not used | No | Yes, should be broadly applicable to outcomes of clinical interest | [97] |
| Yao | 2019 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [98] |
| Zeng | 2018 | Manual annotations | 5-fold cross validation + Hold-out validation (train, test) | No | Yes, potential to be replicated | [99] |
| Zhang | 2013 | Existing annotated corpus | External | Two different sets with same settings | Yes, can be adapted to different semantic categories and text genres | [100] |
| Zhou | 2006 | Manual annotations | 5-fold cross validation | No | Not listed | [101] |
| Zhou | 2011 | Manual retrospective review | Hold-out validation (train, test) | No | Not listed | [102] |
| Zhou | 2014 | Manual annotations | Not used | No | Not listed | [103] |

(a) As reported by authors