Fig. 1 | Journal of Biomedical Semantics

Fig. 1

From: De-identifying Spanish medical texts - named entity recognition applied to radiology reports

Summary of the proposed de-identification approach. (a) Corpus creation, annotation, and manual revision, further detailed in Fig. 2. (b) Selection of databases to develop a randomizer script, which is used to create the synthetic corpus. (c) Training and testing of different neural networks to select the best-performing model. (d) When a new report needs to be de-identified, the selected model labels the words that belong to one of the defined named entities. Finally, the randomizer script creates a de-identified report with synthetic information.
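As a rough illustration of the final step in the caption (labeled words replaced with synthetic information), the following minimal Python sketch may help. It is not the authors' randomizer script: the `SURROGATES` table, the label names `NAME` and `DATE`, and the `deidentify` function are all hypothetical, and the per-token labels are assumed to come from whichever model was selected during training and testing.

```python
import random

# Hypothetical surrogate lists; they stand in for the databases the
# caption mentions and are NOT taken from the paper.
SURROGATES = {
    "NAME": ["María López", "Juan Pérez"],
    "DATE": ["01/01/2000", "15/06/2010"],
}


def deidentify(tokens, labels, rng=None):
    """Replace each run of identically labeled tokens with one random surrogate.

    tokens: list of words from a report.
    labels: one label per token; "O" marks words outside any named entity.
    rng:    optional random.Random instance, for reproducible output.
    """
    rng = rng or random.Random()
    out = []
    prev = "O"
    for token, label in zip(tokens, labels):
        if label == "O":
            # Not part of an entity: keep the original word.
            out.append(token)
        elif label != prev:
            # First token of a new entity: emit a single synthetic value.
            out.append(rng.choice(SURROGATES[label]))
        # Otherwise this token continues the previous entity, which has
        # already been replaced, so it is dropped.
        prev = label
    return " ".join(out)


if __name__ == "__main__":
    words = ["Paciente", "Ana", "García", "visto", "el", "03/02/2019"]
    tags = ["O", "NAME", "NAME", "O", "O", "DATE"]
    print(deidentify(words, tags, rng=random.Random(0)))
```

One simplification to note: because runs are detected only by a change of label, two adjacent entities with the same label would be merged into one surrogate. A begin/inside tagging scheme would avoid this, but the caption does not say which scheme the authors use.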
