
Table 3 Basic statistics of the SNPPhenA corpus in terms of test and train parts

From: SNPPhenA: a corpus for extracting ranked associations of single-nucleotide polymorphisms and phenotypes from literature

Item                                  Train   Test   Total
Files                                   270     90     360
Sentences                              1940    685    2625
Key sentences                           362    121     483
SNPs                                    691    244     935
Phenotypes                              496    158     654
SNP-Phenotype association candidates    935    365    1300
Neutral candidates                      142    166     308
Negative candidates                      91     29     120
Positive candidates                     702    170     872
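
The counts in Table 3 can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary `TABLE3` and the helpers `check_totals` and `class_shares` are hypothetical names introduced here, not part of the SNPPhenA distribution, and the numbers are copied by hand from the table. It verifies that Train + Test equals Total for every row and computes the share of each candidate class within a split.

```python
# Counts copied by hand from Table 3 as (train, test, total).
# All names below are illustrative, not taken from the corpus tooling.
TABLE3 = {
    "Files": (270, 90, 360),
    "Sentences": (1940, 685, 2625),
    "Key sentences": (362, 121, 483),
    "SNPs": (691, 244, 935),
    "Phenotypes": (496, 158, 654),
    "SNP-Phenotype association candidates": (935, 365, 1300),
    "Neutral candidates": (142, 166, 308),
    "Negative candidates": (91, 29, 120),
    "Positive candidates": (702, 170, 872),
}

CANDIDATES = "SNP-Phenotype association candidates"
CLASSES = ("Neutral candidates", "Negative candidates", "Positive candidates")


def check_totals(table):
    """Return the items whose Train + Test does not equal Total."""
    return [
        item
        for item, (train, test, total) in table.items()
        if train + test != total
    ]


def class_shares(table, split):
    """Fraction of candidates per class; split is 0=train, 1=test, 2=total."""
    n_candidates = table[CANDIDATES][split]
    return {name: table[name][split] / n_candidates for name in CLASSES}


if __name__ == "__main__":
    print("Rows with inconsistent totals:", check_totals(TABLE3))
    for index, label in enumerate(("train", "test", "total")):
        shares = class_shares(TABLE3, index)
        summary = ", ".join(f"{k}: {v:.1%}" for k, v in shares.items())
        print(f"{label}: {summary}")
```

Run on the figures above, every row is internally consistent, and the three class counts add up to the candidate count in each split. The class balance differs between the splits: positive candidates make up roughly three quarters of the training candidates but a little under half of the test candidates.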