
Table 2 ShARE/CLEF e-health test corpus semantic types

From: Concept selection for phenotypes and diseases using learn to rank

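| ID   | UMLS semantic type               | Freq. | Unique | Av. term length (tokens) |
|------|----------------------------------|-------|--------|--------------------------|
| T047 | Disease or syndrome              | 1723  | 371    | 1.88                     |
| T184 | Sign or symptom                  | 816   | 149    | 1.51                     |
| T046 | Pathologic function              | 520   | 113    | 1.59                     |
| T037 | Injury or poisoning              | 106   | 33     | 1.75                     |
| T019 | Congenital abnormality           | 96    | 18     | 1.88                     |
| T190 | Anatomical abnormality           | 125   | 26     | 1.74                     |
| T191 | Neoplastic process               | 73    | 34     | 2.02                     |
| T048 | Mental or behavioral dysfunction | 137   | 32     | 1.67                     |
| T033 | Finding                          | 13    | 6      | 1.11                     |
| T020 | Acquired abnormality             | 41    | 21     | 1.62                     |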

Distribution of UMLS semantic types across the annotations: total frequency (Freq.), frequency without duplicates (Unique), and average term length in tokens.
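The three statistics described in the table note could be computed along the following lines. This is only a minimal sketch, not the authors' procedure: the input format (pairs of semantic type ID and mention text), the function name `semantic_type_stats`, whitespace tokenization, case-insensitive de-duplication, and averaging over all mentions rather than unique ones are all assumptions, and the toy annotations are invented for illustration.

```python
from collections import defaultdict


def semantic_type_stats(annotations):
    """Summarize annotations per semantic type.

    `annotations` is an iterable of (type_id, mention_text) pairs.
    This input format is hypothetical, chosen only for illustration.
    """
    mentions = defaultdict(list)
    for type_id, text in annotations:
        mentions[type_id].append(text)

    stats = {}
    for type_id, texts in mentions.items():
        freq = len(texts)  # every annotated mention counts
        # Assumption: duplicates are detected case-insensitively.
        unique = len({t.lower() for t in texts})
        # Assumption: tokens are whitespace-separated, averaged over all mentions.
        avg_len = sum(len(t.split()) for t in texts) / freq
        stats[type_id] = {
            "freq": freq,
            "unique": unique,
            "avg_len": round(avg_len, 2),
        }
    return stats


# Toy annotations (made up, not taken from the ShARE/CLEF corpus).
toy = [
    ("T047", "heart failure"),
    ("T047", "Heart failure"),
    ("T047", "pneumonia"),
    ("T184", "chest pain"),
]
result = semantic_type_stats(toy)
```

Under a different tokenizer or de-duplication rule the counts would change, so this sketch is not expected to reproduce the figures in the table.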