Table 1 Main characteristics of our target and source corpora

From: Syntax-based transfer learning for the task of biomedical relation extraction

 

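| | Corpus name | Subcorpus | Train Size (sent.) | Train Size (rel.) | Test Size (sent.) | Test Size (rel.) | #Entity Types | #Relation Types |
|---|---|---|---|---|---|---|---|---|
| Target | SNPPhenA | | 362 | 935 | 121 | 365 | 2 | 3 |
| | EU-ADR | drug-disease | 244 | 176 | | | 4 | 3 |
| | | drug-target | 247 | 310 | | | 4 | 3 |
| | | target-disease | 355 | 262 | | | 4 | 3 |
| Source | SemEval 2013 DDI | DrugBank | 5,675 | 3,805 | 973 | 889 | 4 | 4 |
| | | MEDLINE | 1,301 | 232 | 326 | 95 | 4 | 4 |
| | ADE-EXT | | 5,939 | 6,701 | | | 2 | 1 |
| | reACE | | 5,984 | 2,486 | | | 4 | 5 |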

  1. Two corpora are divided into subcorpora. The sizes of the training and test corpora are reported in terms of the number of sentences (sent.) and annotated relationships (rel.). EU-ADR, ADE-EXT and reACE have no proper test corpus.