Table 7 The top 5 derivational synonyms that improve performance on the CRAFT corpus

From: Gene Ontology synonym generation rules lead to increased performance in biomedical concept recognition

| GO ID | Term name | ΔTP | ΔFP | ΔFN | Generated synonyms |
|---|---|---|---|---|---|
| Cellular Component | | | | | |
| GO:0019814 | Immunoglobulin complex | +548 | +0 | −548 | Antibody, antibodies |
| GO:0005634 | Nucleus | +218 | +35 | −218 | Nuclear, nucleated |
| GO:0005739 | Mitochondrion | +135 | +0 | −135 | Mitochondrial |
| GO:0031982 | Vesicle | +11 | +3 | −11 | Vesicular |
| GO:0005856 | Cytoskeleton | +15 | +0 | −15 | Cytoskeletal |
| Molecular Function | | | | | |
| GO:0000739 | DNA strand annealing activity | +327 | +1 | −327 | Hybridized, hybridization, annealing, annealed |
| GO:0033592 | RNA strand annealing activity | +327 | +1 | −327 | Hybridized, hybridization, annealing, annealed |
| GO:0031386 | Protein tag | +6 | +79 | −6 | Tag |
| GO:0005179 | Hormone activity | +1 | +0 | −1 | Hormonal |
| GO:0043495 | Protein anchor | +1 | +10 | −1 | Anchor |
| Biological Process | | | | | |
| GO:0010467 | Gene expression | +2235 | +361 | −2235 | Expression, expressed, expressing |
| GO:0007608 | Sensory perception of smell | +445 | +1 | −445 | Olfactory |
| GO:0008283 | Cell proliferation | +97 | +71 | −97 | Cellular proliferation, proliferative |
| GO:0007126 | Meiosis | +93 | +2 | −93 | Meiotic, meiotically |
| GO:0006915 | Apoptosis | +173 | +2 | −173 | Apoptotic |

  1. The GO terms that increase performance the most on CRAFT are shown along with the change (Δ) in the number of true positives (TP), false positives (FP), and false negatives (FN) relative to baseline B2 (the baseline with “activity” removed). The generated synonyms responsible for each increase are listed under ‘Generated synonyms’.
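One way to read the Δ columns is to ask what share of the annotations newly produced by the generated synonyms are correct, that is ΔTP / (ΔTP + ΔFP). The sketch below is illustrative only: the rows are copied from the table, while the helper name `marginal_precision` and the choice of this ratio as a summary are assumptions made here, not a metric reported by the source. It also checks that each gained true positive corresponds to one removed false negative (ΔTP = −ΔFN).

```python
# Rows copied from Table 7: (GO ID, term name, dTP, dFP, dFN).
rows = [
    ("GO:0019814", "Immunoglobulin complex", 548, 0, -548),
    ("GO:0031386", "Protein tag", 6, 79, -6),
    ("GO:0043495", "Protein anchor", 1, 10, -1),
    ("GO:0010467", "Gene expression", 2235, 361, -2235),
]


def marginal_precision(d_tp: int, d_fp: int) -> float:
    """Hypothetical helper: fraction of newly added annotations that are correct."""
    return d_tp / (d_tp + d_fp)


for go_id, name, d_tp, d_fp, d_fn in rows:
    # Every recovered true positive removes exactly one false negative.
    assert d_tp == -d_fn
    print(f"{go_id}  {name:<24} {marginal_precision(d_tp, d_fp):.3f}")
```

Under this reading, a term such as Protein tag (+6 TP, +79 FP) gains recall while most of its new annotations are incorrect, whereas Immunoglobulin complex gains 548 true positives with no new false positives.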