
Table 2 Count of entries according to each method, and count of concept codes from each source

From: MedLexSp – a medical lexicon for Spanish medical natural language processing

Method                                          # entries
1. Abbreviations / acronyms                     6679
2. Affixes / roots                              914
3. Conjugated verbs                             867
4. Derivational variants                        801
5. String distance method                       1463
6. Syntactic variants                           134
7. Terms collected using word embeddings        222
8. Terms from corpora:
   CANTEMIST                                    2619
   CODIESP                                      3384
   CWLC                                         1511
   MedlinePlus                                  1682
   PharmaCoNER                                  173
   SPCs (EasyDLP corpus)                        837
9. Thesauri, dictionaries and knowledge bases:  # codes
   DTM                                          30 816
   ATC + Nomenclátor + SDEdb                    2931
   DSM-5                                        188
   ICD-10                                       19 888
   ICPC                                         179
   MedDRA                                       20 209
   MeSH                                         20 911
   NCI                                          7621
   OMIM                                         15 143
   OrphaData                                    10 741
   SNOMED-CT                                    53 893
   WHO                                          2811
   Other                                        4939