
Table 1 Source contributions of drug-indication data

From: Toward a comprehensive drug ontology: extraction of drug-indication relations from diverse information sources

| source abbrev | source name or description | subset if any | version/date | drug-indication pairs (initial) | drug-indication pairs (filtered) | drug-indication pairs (parsed) |
|---|---|---|---|---|---|---|
| ChEBI | Chemicals of Biological Interest Ontology | has_role relations | 104/June 1, 2013 | 16,415 | 8,598 | 8,598 |
| CTD | Comparative Toxicogenomics Database | Chemicals-Diseases Associations, “direct evidence” subset | May 2, 2014 | 82,000 | 81,214 | 81,214 |
| DailyMed | NLM’s database of FDA package inserts | single component title (product name) & Indications sections with tractable text length (<540) | March 20, 2011 | 15,834 | 1,612 | 3,840 |
| DrugBank | U. Alberta open access DB of drug target and other info | title (drug name) & Indications sections | 3.0/2011 | 1,599 | 1,595 | 6,004 |
| MeSH PA | Medical Subject Headings Pharmacologic Action relations | | 2013/Dec. 3, 2012 | 26,293 | 25,847 | 25,908 |
| NDFRT | National Drug Formulary Reference Terminology | may_treat & may_prevent relations | 2009AA (UMLS) | 50,775 | 5,294 | 5,294 |
| PDR | Physicians’ Desk Reference | Section 3 - Product Category Index | 2006 | 3,150 | 1,204 | 2,169 |
| USAN_TC | United States Adopted Names Therapeutic Claims | | March 31, 2014 (eVOC) | 6,569 | 5,954 | 7,234 |
| WHO_ATC | World Health Organization Anatomic-Therapeutic-Chemical, Defined Daily Dose index | | 2005 | 16,276 | 7,807 | 9,004 |
| WHO_DD | World Health Organization Drug Dictionary | single generic compounds with ATC codes (minus 2005 WHO-ATC overlap and herbals BNA = “9…”) | Sept. 2013 | 40,736 | 21,764 | 25,674 |
| evoc_ATC | WHO-ATC codes in Merck’s eVOC generic names dictionary | single generic compounds with ATC codes (minus WHO-ATC & WHO-DD overlap) | May 6, 2014 | 65,552 | 16,269 | 19,093 |

  1. The numbers refer to candidate drug-indication pairs in the initial raw data extract (initial), after filtering for internal redundancy, relevance, and/or tractability (filtered), and after parsing of free text into single concepts (parsed), as described in the main text. The “filtered” count is the number of unique pairs of raw drug name (DID column D) and indication “entire value/string” (column AQ), while the “parsed” count is the number of unique pairs of raw drug name and indication “target/substring” (column AR). evoc_eProj data are not shown.
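
The counting rule in the footnote can be illustrated with a minimal sketch. It assumes each record is a dictionary; the field names `raw_drug_name`, `indication_string`, and `indication_target` are hypothetical stand-ins for columns D, AQ, and AR, and the rows are invented toy data, not values from any of the sources. It is not the authors' code; it only shows why the "parsed" count can exceed the "filtered" count when one free-text indication string is split into several single concepts.

```python
from typing import Iterable, Mapping


def count_unique_pairs(
    rows: Iterable[Mapping[str, str]], drug_key: str, indication_key: str
) -> int:
    """Count distinct (drug, indication) pairs, skipping rows with an empty field."""
    pairs = set()
    for row in rows:
        drug = row.get(drug_key, "").strip()
        indication = row.get(indication_key, "").strip()
        if drug and indication:
            pairs.add((drug, indication))
    return len(pairs)


# Invented toy rows (hypothetical field names, not real source data).
rows = [
    {"raw_drug_name": "aspirin", "indication_string": "pain and fever",
     "indication_target": "pain"},
    {"raw_drug_name": "aspirin", "indication_string": "pain and fever",
     "indication_target": "fever"},
    {"raw_drug_name": "ibuprofen", "indication_string": "pain",
     "indication_target": "pain"},
    {"raw_drug_name": "ibuprofen", "indication_string": "pain",
     "indication_target": "pain"},  # exact duplicate, counted once
]

# "filtered": unique pairs of drug name and the entire indication string.
filtered_count = count_unique_pairs(rows, "raw_drug_name", "indication_string")
# "parsed": unique pairs of drug name and the parsed single-concept substring.
parsed_count = count_unique_pairs(rows, "raw_drug_name", "indication_target")

print(filtered_count, parsed_count)
```

In this toy input, one string ("pain and fever") yields two parsed targets, so the parsed count is larger than the filtered count, the same pattern seen in the DailyMed and DrugBank rows of the table.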