Table 4 Mean and standard deviation \(\sigma\) of test \(F_1\) scores across 10 models trained using the best-performing configuration (by \(F_1\) on the validation dataset) found in 30 trials of hyperparameter optimization. Numbers are rounded to three decimal places; the best configuration for each disease is marked in bold

From: Comparing generative and extractive approaches to information extraction from abstracts describing randomized clinical trials

| Approach | Model | Type 2 diabetes: mean \(F_1\) (\(\pm \sigma\)) | Glaucoma: mean \(F_1\) (\(\pm \sigma\)) |
| --- | --- | --- | --- |
| Extractive | flan-t5-base | **0.547 (± 0.006)** | **0.636 (± 0.006)** |
| Extractive | led-base-16384 | 0.525 (± 0.009) | 0.572 (± 0.010) |
| Extractive | longformer-base-4096 | 0.540 (± 0.008) | 0.613 (± 0.007) |
| Generative | flan-t5-base | 0.539 (± 0.029) | 0.584 (± 0.025) |
| Generative | led-base-16384 | 0.400 (± 0.079) | 0.353 (± 0.106) |
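
As a rough illustration of how an entry of the form "mean (± \(\sigma\))" can be computed from the test \(F_1\) scores of 10 trained models, the following minimal sketch uses Python's standard `statistics` module. The function name `summarize_f1` and the score values are hypothetical and are not taken from the article. The sketch also assumes the sample standard deviation (`stdev`); the source does not state whether the sample or the population form (`pstdev`) was used.

```python
from statistics import mean, stdev


def summarize_f1(scores):
    """Return (mean, standard deviation) of F1 scores, rounded to three decimals.

    Assumption: sample standard deviation (n - 1 in the denominator).
    Swap in statistics.pstdev if the population form is intended.
    """
    return round(mean(scores), 3), round(stdev(scores), 3)


# Hypothetical test F1 scores from 10 training runs (illustrative only,
# not data from the article).
hypothetical_scores = [0.55, 0.54, 0.56, 0.55, 0.54, 0.55, 0.56, 0.54, 0.55, 0.55]

m, s = summarize_f1(hypothetical_scores)
print(f"{m:.3f} (± {s:.3f})")
```

Rounding is applied only at the end, so the reported mean and standard deviation are both derived from the unrounded per-run scores.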