Figure 4 | Journal of Biomedical Semantics

Figure 4

From: Evaluating gold standard corpora against gene/protein tagging solutions and lexical resources


The diagrams show the F1-measure performance of the different PGN taggers against the selected corpora under exact and cos98 evaluation. The ML-Tag solutions were trained on BioCreative-II (Banner, Chang2), on BioCreative-I (Abner (BC1)) and on Jnlpba (Abner (Jnlpba)), and therefore perform best on their respective training corpora. The LexTag solutions show similar performance across all corpora. In the left diagram, all solutions were measured using exact matching against the entity boundaries in the GSCs; for BioCreative-II, only the gene list was used for the evaluation. The measurements in the right diagram use a relaxed measure based on cosine similarity (cos98) between the tagged results and the GSC annotations, which leads to higher F1-measures; for BC2, both the gene list and the alternative gene list were applied. In both diagrams, the entries in the left third represent the ML-Tag solutions, the middle third the LexTag solutions, and the right third the Gnat solutions, which serve as a reference for a gene normalisation tagger. The LexTag solutions reach higher F1-measures on FsuPrge and PennBio than on BioCreative-II and Jnlpba.
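The difference between the two evaluation modes described in the caption can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the unweighted bag-of-tokens cosine, the overlap requirement and the example spans are all assumptions made for illustration. The name cos98 suggests a similarity threshold of 0.98, but the caption does not state the threshold or the term weighting, so the threshold is left as a parameter here.

```python
from collections import Counter
from math import sqrt


def cosine(a: str, b: str) -> float:
    """Cosine similarity between unweighted bag-of-token vectors (assumed weighting)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def f1_measure(true_pos: int, n_pred: int, n_gold: int) -> float:
    """Harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


def evaluate(pred, gold, threshold=None):
    """Score tagger output against gold annotations.

    pred, gold: lists of (start, end, text) spans.
    threshold=None -> exact boundary matching.
    threshold=x    -> relaxed matching: spans must overlap and have cosine >= x.
    """
    matched = set()  # indices of gold spans already paired with a prediction
    true_pos = 0
    for p_start, p_end, p_text in pred:
        for i, (g_start, g_end, g_text) in enumerate(gold):
            if i in matched:
                continue
            if threshold is None:
                hit = (p_start, p_end) == (g_start, g_end)
            else:
                overlaps = p_start < g_end and g_start < p_end
                hit = overlaps and cosine(p_text, g_text) >= threshold
            if hit:
                matched.add(i)
                true_pos += 1
                break
    return f1_measure(true_pos, len(pred), len(gold))


# Hypothetical spans over the sentence "the IL-2 receptor binds p53 protein".
gold = [(4, 17, "IL-2 receptor"), (24, 27, "p53")]
pred = [(0, 17, "the IL-2 receptor"), (24, 27, "p53")]  # first span includes "the"

exact_f1 = evaluate(pred, gold)                   # boundary mismatch is penalised
relaxed_f1 = evaluate(pred, gold, threshold=0.8)  # illustrative threshold, not cos98
```

In this toy case the prediction with the extra leading token fails exact matching but passes the relaxed test, which is the kind of effect that would raise F1-measures in the right diagram. With unweighted token counts a 0.98 threshold would reject the same span, so the actual cos98 measure presumably relies on a different weighting or tokenisation than the one assumed here.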
