Table 1 Detailed information about the datasets used in the methodology

From: SIENA: Semi-automatic semantic enhancement of datasets using concept recognition

Data | About | Subset | Size | Number of Columns | Column Names
---- | ----- | ------ | ---- | ----------------- | ------------
HGNC | Standardized nomenclature for human genes | Subset of complete HGNC | 382 KB | 49 | symbol, locus type, ena
CTD | Manually curated information | Chemical–gene interactions set | 326 MB | 11 | Chemical ID, Gene Forms, PubMed IDs
PGKB | Information about how human genetic variation affects response to medications | Summary of the gene information | 13.6 MB | 17 | Ensembl Id, Chromosome, Cross-references