Table 8 Statistics of annotations produced on the large literature collection by information content

From: Gene Ontology synonym generation rules lead to increased performance in biomedical concept recognition

 

| IC | Baseline B2: # Terms | Baseline B2: # Annotations | With generated synonyms: # Terms | With generated synonyms: # Annotations | Impact of synonyms: New concepts | Impact of synonyms: New annotations | Impact of synonyms: Change |
|---|---|---|---|---|---|---|---|
| Undefined | 3,548 | 16,929,911 | 4,303 | 23,653,066 | 755 | 6,723,155 | +39.7 % |
| [0,1) | 7 | 3,202,114 | 7 | 3,177,333 | 0 | −24,781 | −0.1 % |
| [1,2) | 16 | 2,655,365 | 17 | 2,801,431 | 1 | 146,066 | +0.1 % |
| [2,3) | 43 | 7,332,003 | 44 | 8,016,573 | 1 | 684,570 | +0.1 % |
| [3,4) | 94 | 4,474,422 | 101 | 5,188,968 | 7 | 714,546 | +0.2 % |
| [4,5) | 178 | 4,185,438 | 191 | 9,340,757 | 13 | 5,155,319 | +123.8 % |
| [5,6) | 354 | 13,547,423 | 373 | 22,284,670 | 19 | 8,737,247 | +64.4 % |
| [6,7) | 666 | 9,533,940 | 715 | 12,060,499 | 49 | 2,526,559 | +26.3 % |
| [7,8) | 1,044 | 18,354,299 | 1,154 | 21,251,834 | 110 | 2,897,535 | +16.8 % |
| [8,9) | 1,465 | 7,932,937 | 1,648 | 15,316,476 | 183 | 7,383,539 | +92.4 % |
| [9,10) | 1,551 | 4,813,153 | 1,813 | 7,671,601 | 262 | 2,858,448 | +58.3 % |
| [10,11) | 1,396 | 2,390,061 | 1,690 | 4,291,831 | 294 | 1,901,770 | +79.1 % |
| [11,12) | 942 | 1,246,758 | 1,162 | 2,279,005 | 220 | 1,032,247 | +83.3 % |
| [12,13) | 732 | 578,501 | 953 | 1,257,956 | 221 | 679,455 | +117.2 % |
| Total | 12,036 | 97,176,325 | 14,171 | 138,592,000 | 2,135 | 41,415,675 | +42.5 % |

  1. Shows the number of unique terms and the total number of annotations produced by baseline B2 and with both the derivational and syntactic recursive rules applied, along with the overall impact of the rules. Change is the percent change in total annotations.
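
As a rough illustration of how the impact columns relate to the annotation counts, the sketch below recomputes the new annotations and the percent change for the "Undefined" row. It assumes the conventional definition of percent change, (with synonyms − baseline) / baseline; the source does not state the exact formula, so recomputed values for other rows need not match the published column. The function name `annotation_impact` is hypothetical.

```python
def annotation_impact(baseline: int, with_synonyms: int) -> tuple[int, float]:
    """Return (new annotations, percent change) relative to the baseline.

    Assumption: percent change = 100 * (with_synonyms - baseline) / baseline.
    """
    new_annotations = with_synonyms - baseline
    percent_change = 100.0 * new_annotations / baseline
    return new_annotations, percent_change


# Annotation counts for the "Undefined" IC row of the table.
new, pct = annotation_impact(16_929_911, 23_653_066)
print(f"{new:,} new annotations ({pct:+.1f} %)")
# prints "6,723,155 new annotations (+39.7 %)"
```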