
Table 4 Included publications and their evaluation methodologies

From: Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies

| Author | Year | Reference standard | Validation | External validation | Generalizability a | Ref. |
|---|---|---|---|---|---|---|
| Afshar | 2019 | Existing EHR data | Hold-out validation (train, test, development) | No | No, validation is needed | [29] |
| Alnazzawi | 2016 | Existing annotated corpus | External | ShARe/CLEF, NCBI disease, Heart failure and pulmonary embolism corpora | Yes, achieves competitive performance on other corpora | [30] |
| Atutxa | 2018 | Manual retrospective review | Hold-out validation (train, test, development) | No | Yes, easily portable to other languages | [31] |
| Barrett | 2013 | Manual annotations | 10-fold cross validation | Multiple datasets (different provider) | Yes, expect that it is generalizable | [32] |
| Becker | 2016 | Existing annotated corpus | Not used | No | Not listed | [33] |
| Becker | 2019 | Manual annotations | Hold-out validation (train, test, development) | No | Not listed | [34] |
| Bejan | 2015 | Manual annotations | External | i2b2 data (2010) | Yes, good performance on the i2b2 dataset, even though not optimized on it | [35] |
| Castro | 2010 | Manual annotations | Not used | No | Not listed | [36] |
| Catling | 2018 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [37] |
| Chapman | 2004 | Manual annotations | Not used | No | Yes, generalizable to other domains within and outside of biosurveillance | [38] |
| Chen | 2016 | Manual annotations | 10-fold cross validation | No | Not listed | [39] |
| Chiaramello | 2016 | Manual annotations | Not used | No | Not listed | [40] |
| Chodey | 2016 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [41] |
| Chung | 2005 | Manual annotations | Hold-out validation (train, test) | Reports from a second hospital | Not listed | [42] |
| Combi | 2018 | Manual annotations | Not used | No | Not listed | [43] |
| deBruijn | 2011 | Existing annotated corpus | 15-fold cross validation | No | Not listed | [44] |
| Deisseroth | 2019 | Manual annotations | Hold-out validation (train, test) | Data from a second hospital | Yes, it can be immediately incorporated into clinical practice | [45] |
| Demner-Fushman | 2017 | Existing annotated corpus | External | Multiple datasets | Not listed | [46] |
| Divita | 2014 | Manual annotations | Not used | No | Not listed | [47] |
| Duarte | 2018 | Manual annotations | Hold-out validation (train, test) | Second dataset | Not listed | [48] |
| Falis | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Yes, method is not specific to an ontology, and could be used for a graph of any formation | [49] |
| Ferrão | 2013 | Existing EHR data | Hold-out validation (train, test) | No | Not listed | [50] |
| Gerbier | 2011 | Manual annotations | Hold-out validation (train, test) | No | Yes, it could also serve other types of clinical decision support systems | [51] |
| Goicoechea Salazar | 2013 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [52] |
| Hamid | 2013 | Manual annotations | 10-fold cross validation | No | Possible, the classifier may be applicable in academic hospital samples | [53] |
| Hassanzadeh | 2016 | Existing annotated corpus | Hold-out validation (train, test) | No | Not applicable | [54] |
| Helwe | 2017 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [55] |
| Hersh | 2001 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [56] |
| Hoogendoorn | 2015 | Existing EHR data | 5-fold cross validation | No | Not listed | [57] |
| Jindal | 2013 | Existing annotated corpus | Hold-out validation (train, test) | No | Yes, broad applicability | [58] |
| Kang | 2009 | Manual annotations | Hold-out validation (train, test) | No | Yes, extensible to other languages | [59] |
| Kersloot | 2019 | Manual annotations | Hold-out validation (development, test) | No | Possible, but external validation is needed | [60] |
| König | 2019 | Existing EHR data | Not used | No | Still to be tested | [61] |
| Li | 2015 | Manual annotations | 10-fold cross validation | No | Not listed | [62] |
| Li | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [63] |
| Lingren | 2016 | Manual annotations | Hold-out validation (train, test, development) | No | Not listed | [12] |
| Liu | 2019 | Manual annotations | Not used | No (but multiple datasets / non-trained) | No, limited because of NYP/CUIMC and Mayo notes | [64] |
| Lowe | 2009 | Manual retrospective review | Hold-out validation (train, test) | No | Yes, has the potential to index other classes of clinical documents | [65] |
| Luo | 2014 | Existing EHR data | 10-fold cross validation | No | No, challenging, not currently working on it | [66] |
| Meystre | 2006 | Manual retrospective review | Not used | No | Not listed | [67] |
| Meystre | 2010 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [68] |
| Minard | 2011 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [69] |
| Mishra | 2019 | Manual annotations | Not used | No | Not listed | [70] |
| Nguyen | 2018 | Existing EHR data | Not listed | No | Not listed | [71] |
| Oellrich | 2015 | Existing annotated corpus | External | Multiple datasets | Not listed | [72] |
| Patrick | 2011 | Existing annotated corpus | 10-fold cross validation | No | Yes, adaptable to different requirements in clinical information extraction and classification by choosing relevant feature sets | [73] |
| Pérez | 2018 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Yes, extensible to different hospital sections and hospitals | [74] |
| Reátegui | 2018 | Existing annotated corpus | Not used | No | Not listed | [75] |
| Roberts | 2011 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [76] |
| Rousseau | 2019 | Manual annotations | Not used | No | Not listed | [77] |
| Savova | 2010 | Manual annotations | 10-fold cross validation | No | Yes, implemented in several applications | [78] |
| Shivade | 2015 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [11] |
| Shoenbill | 2019 | Manual annotations | Hold-out validation (train, test) | No | Yes, can allow further evaluation and improvement in care delivery models and treatment approaches to multiple chronic illnesses | [79] |
| Sohn | 2014 | Manual annotations | Hold-out validation (train, test, development) | No | Yes, with adaptations: create flexible mechanism for adaptation process | [80] |
| Solti | 2008 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [81] |
| Soriano | 2019 | Manual annotations | Not listed | No | Not listed | [82] |
| Soysal | 2018 | Existing annotated corpus | Hold-out validation (train, test) | No | Yes, can be used to quickly develop customized clinical information extraction pipelines | [83] |
| Spasić | 2015 | Manual annotations | Hold-out validation (train, test) | No | Not listed | [84] |
| Strauss | 2013 | Manual annotations | Not used | No | Yes, can be shared between institutions and used to support clinical and epidemiological research | [85] |
| Sung | 2018 | Manual annotations | Not listed | No | Not listed | [86] |
| Tchechmedjiev | 2018 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Yes, but not universally | [87] |
| Ternois | 2018 | Existing EHR data | 5-fold cross validation + Hold-out validation (train, test) | No | Not listed | [88] |
| Travers | 2004 | Manual retrospective review | Not used | No | Not listed | [89] |
| Tulkens | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [90] |
| Usui | 2018 | Manual annotations | Not used | No | Not listed | [91] |
| Valtchinov | 2019 | Manual annotations | Not used | No | No | [92] |
| Wadia | 2018 | Manual annotations | Not used | No | Not listed | [93] |
| Walker | 2019 | Manual retrospective review | Hold-out validation (development, test) | No | Yes, it can be incorporated in institutional data warehouse | [94] |
| Xie | 2019 | Existing annotated corpus | Hold-out validation (train, test, development) | No | Not listed | [95] |
| Xu | 2011 | Manual annotations | Hold-out validation (train, test) | No | Yes, generalizable approach to combine information from heterogeneous data sources in EHRs | [96] |
| Yadav | 2013 | Manual annotations | Not used | No | Yes, should be broadly applicable to outcomes of clinical interest | [97] |
| Yao | 2019 | Existing annotated corpus | Hold-out validation (train, test) | No | Not listed | [98] |
| Zeng | 2018 | Manual annotations | 5-fold cross validation + Hold-out validation (train, test) | No | Yes, potential to be replicated | [99] |
| Zhang | 2013 | Existing annotated corpus | External | Two different sets with same settings | Yes, can be adapted to different semantic categories and text genres | [100] |
| Zhou | 2006 | Manual annotations | 5-fold cross validation | No | Not listed | [101] |
| Zhou | 2011 | Manual retrospective review | Hold-out validation (train, test) | No | Not listed | [102] |
| Zhou | 2014 | Manual annotations | Not used | No | Not listed | [103] |

a As reported by the authors