Extraction of potential adverse drug events from medical case reports

The sheer amount of information about potential adverse drug events published in medical case reports poses major challenges for drug safety experts who need to perform timely monitoring. Efficient strategies for identification and extraction of information about potential adverse drug events from free‐text resources are needed to support pharmacovigilance research and pharmaceutical decision making. Therefore, this work focusses on the adaptation of a machine learning‐based system for the identification and extraction of potential adverse drug event relations from MEDLINE case reports. It relies on a high quality corpus that was manually annotated using an ontology‐driven methodology. Qualitative evaluation of the system showed robust results. An experiment with large scale relation extraction from MEDLINE delivered under‐identified potential adverse drug events not reported in drug monographs. Overall, this approach provides a scalable auto‐assistance platform for drug safety professionals to automatically collect potential adverse drug events communicated as free‐text data.


Background
Adverse drug effects are a very serious issue that confronts patients, healthcare providers, regulatory authorities and drug manufacturers. While clinical trials are a stringent measure for detecting risks associated with drug usage, wide use in the field can reveal additional risks that are not detectable in clinical trials because of the limited number of patients involved. After marketing approval, undesired effects of drugs are reported to the authorities through so-called Spontaneous Adverse Event Reporting Systems (SAERS), and these reports are analyzed in a timely manner to ensure the safe use of drugs [1]. A well-known problem of pharmacovigilance, however, is under-reporting, namely the low number of reports that the authorities receive. Case reports published in the scientific biomedical literature represent an important resource complementary to SAERS due to their abundance, rapid rate of generation, and the valuable information they contain [2]. Because case reports are unstructured, manual analysis of the scientific literature is challenging, cumbersome, and labor intensive.
In recent years, the development of automatic natural language processing (NLP) and information extraction (IE) techniques has gained considerable popularity. These techniques include the identification of biomedical named entities, relations between the entities, or events associated with them. Notable efforts have been invested in mining potential adverse drug events from different forms of free-text data. For example, Wang et al. [3] applied the MedLEE system to discharge summaries to identify medication events and entities that could be potential adverse drug effects; these were detected using the strength of statistical association based on their co-occurrences. Leaman et al. [4] proposed a lenient NLP model for extracting adverse effects of drugs from social media such as blogs. Gurulingappa et al. [5] developed a machine learning-based system for classifying the sentences in MEDLINE case reports that assert potential adverse drug events. However, to the authors' knowledge, there has been limited focus on the identification of semantic relationships between drugs and adverse events in text. This is partly due to the unavailability of suitable open access corpora that could be used for technology development and benchmarking. Extracting relations between drugs and adverse effects can facilitate appropriate indexing, precise searching, visualization, and faster information tracing, and can improve the sensitivity of signal detection in pharmacovigilance.
The use of an ontology of adverse drug events for automated signal generation in pharmacovigilance has already been proposed [6], and its application to information retrieval was explored by the same group a few years later in the VIGITERMES project [7]. There, the OntoEIM adverse event ontology was used to extend the dictionary of adverse event entities, normalize queries, and consolidate annotations, achieving 29% precision and 67% recall on MEDLINE abstracts. Automatic extraction of potential adverse drug events from clinical records is an active area of research [8]. Mining of internet message boards to identify potential adverse drug events has also been reported [9]; in that work, drug-event pairs were determined only from the co-occurrence of terms within a window of 20 tokens, and machine learning was used only for de-identification for privacy protection. This work reports on the adaptation of a machine learning-based system for identifying the relations between drugs and adverse effects in MEDLINE case reports; it relies on an ontology-driven manually annotated corpus that strictly follows semantic annotation guidelines developed for clinical text [10]. The system has been qualitatively evaluated and studied for its ability to support real-time pharmacovigilance studies.

Corpus preparation
The data set used for training and validation of the relation extraction system is the ADE corpus [11]. The ADE corpus contains 2972 MEDLINE case reports that were manually annotated in duplicate and harmonized by three annotators. The corpus contains annotations of 5063 drugs, 5776 conditions (e.g. diseases, signs, symptoms), and 6821 relations between drugs and conditions representing clear adverse events. All annotations are confined to the sentence level, i.e. only drugs and conditions that represent adverse events and co-occur within an individual sentence are annotated. Drugs and conditions that are not part of a potential adverse event relation are not annotated, in accordance with the annotation guidelines. The ADE corpus therefore contains only relations between drugs and conditions that represent True relations, which makes it a sparsely annotated dataset.
For training a supervised classifier, it was essential to generate False relations, i.e. pairs of drugs and conditions that occur within the same sentence but are not in an adverse effect relation. For this purpose, ProMiner, a dictionary-based named entity recognition system [12], was employed. ProMiner was equipped with the DrugBank [13] and MedDRA [14] dictionaries to identify, respectively, drugs and conditions in the ADE corpus that had not been annotated by the human annotators. As a result of named entity recognition, new instances encompassing 2269 drugs and 3437 conditions were automatically annotated. Drug-condition pairs that co-occur within a sentence and had not been annotated by humans as relations formed the False relations. Altogether, 5968 False relations were automatically generated. The corpus enriched with machine-annotated drugs, conditions, and relations between them is referred to as ADE-EXT (indicating extended ADE corpus). Figure 1 shows an illustration of True and False relations between drugs and conditions co-occurring within a sentence.
In the ADE-EXT corpus, 120 manually annotated True relations were not suitable for the NLP task. Examples include overlapping inter-related entities such as acute lithium toxicity, where lithium is related to acute toxicity. After removal of these nested annotations, the ADE-EXT corpus was split into a training set (ADE-EXT-TRAIN) and a test set (ADE-EXT-TEST). Counts of entities and relations in the subsets of the ADE-EXT corpus are shown in Table 1.
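The generation of False relations described in this section can be illustrated with a minimal Python sketch. It assumes that the drug and condition mentions of one sentence are already available (from manual annotation or dictionary-based recognition) and that the manually annotated True pairs are known. The function name build_relations and the placeholder entity names are hypothetical; they are not part of the ADE corpus tooling or of ProMiner.

```python
from itertools import product


def build_relations(drugs, conditions, true_pairs):
    """Label every drug-condition pair that co-occurs in one sentence.

    Pairs found in true_pairs (manual annotations) are labelled "True";
    all remaining co-occurring pairs are labelled "False".
    """
    labelled = []
    for drug, condition in product(drugs, conditions):
        label = "True" if (drug, condition) in true_pairs else "False"
        labelled.append((drug, condition, label))
    return labelled


# Hypothetical mentions from a single sentence (placeholder names, not corpus data).
drugs = ["drug_a", "drug_b"]
conditions = ["condition_x", "condition_y"]
true_pairs = {("drug_a", "condition_x")}

relations = build_relations(drugs, conditions, true_pairs)
for relation in relations:
    print(relation)
```

Restricting the pairing to a single sentence mirrors the sentence-level scope of the annotations; pairs that span sentences are never generated in this sketch.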

Relation extraction workflow
For the identification and extraction of drug-condition entity pairs that constitute a potential adverse event relation, the Java Simple Relation Extraction (JSRE) system [15] was employed. JSRE provides a re-trainable and scalable supervised classification platform that uses Support Vector Machines (SVMs) [16] with different kernels specially designed for NLP and relation extraction. All sentences in ADE-EXT-TRAIN and ADE-EXT-TEST containing drug-condition pairs labelled as either True or False were transformed into the SRE format before being subjected to relation extraction. The SRE format is the data representation of the JSRE platform, in which the tokens of a sentence are enriched with their part-of-speech tags, lemmas, and flags indicating whether a token is part of a named entity. Among the different kernels available, the shallow linguistic kernel was used throughout, since it has been widely applied and has shown success in similar relation extraction tasks [17]. ADE-EXT-TRAIN was used for training and cross-validation of JSRE, whereas ADE-EXT-TEST was used as an independent test set.
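The kind of token enrichment described for the SRE format can be sketched in Python as follows. This is an illustration only and does not reproduce the actual JSRE file syntax: the Token class, the enrich function, the entity labels, and the hand-assigned part-of-speech tags are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    lemma: str
    pos: str
    entity: str  # "DRUG", "CONDITION", or "O" for tokens outside any entity


def enrich(tokens, lemmas, pos_tags, entity_by_index):
    """Attach lemma, part-of-speech tag, and entity flag to each token.

    entity_by_index maps a token position to its entity label; positions
    that are not listed receive the label "O".
    """
    return [
        Token(text, lemma, pos, entity_by_index.get(i, "O"))
        for i, (text, lemma, pos) in enumerate(zip(tokens, lemmas, pos_tags))
    ]


# Case report title quoted in this article; tags are assigned by hand for illustration.
tokens = ["Niacin", "maculopathy"]
lemmas = ["niacin", "maculopathy"]
pos_tags = ["NN", "NN"]
example = enrich(tokens, lemmas, pos_tags, {0: "DRUG", 1: "CONDITION"})
print(example)
```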

Mapping annotation ontology against ontology of adverse events
The Clinical E-Science Framework (CLEF) initiative [18] investigated how to generate semantically annotated medical corpora for information extraction. As described by Gurulingappa et al. [11], we adopted the standard established by the CLEF framework for the annotation workflow [10]. However, we reshaped the annotation schema by using only two of the original entities (condition, drug) and extending it with a third one (dosage). None of the relationships used by the CLEF annotation schema could be reused for our work, since the CLEF annotation schema did not consider adverse drug events; instead, we created two relations: drug-cause-condition and drug-has-dosage. In this work we focused only on automating the detection of drug-cause-condition, thus dosage will not be mentioned further. The ADE corpus was created using the Knowtator plugin for Protégé [19], an ontology-driven corpus annotation tool also used for the creation of the CLEF corpus. Although we adopted the same tool and annotation workflow as CLEF, we could not adopt the same annotation ontology, since the latter was not able to capture drug-adverse event and drug-dosage relations. The annotation ontology described above was therefore used to create the ADE corpus.
Subsequent to the corpus creation, the realism-based biomedical ontology for the representation of adverse events (OAE) was published [20]. OAE has been developed following the principles of Ontological Realism; it is thus aligned with the Basic Formal Ontology and the Relation Ontology, and with the Open Biological and Biomedical Ontologies (OBO) Foundry principles of openness, collaboration and use of a common shared syntax. OAE has 484 representational units, consisting of 369 terms with specific identifiers and 115 terms imported from existing ontologies.
The use of ontologies has proven of great value in biomedicine, not least because it enables machine reasoning, abstraction and automatic hypothesis generation. We were therefore interested in investigating whether the knowledge encoded in the annotations of the ADE corpus could be semantically connected to the OAE. To do this, we manually compared the definitions of the entities of OAE and of the ADE annotation ontology. Figure 2 shows the basic design patterns of OAE, ADE and CLEF as given in the original papers, emphasizing shared entities using green and red colors.

Performance evaluation criteria
The performance of relation extraction was evaluated by 10-fold cross-validation on the training data. During cross-validation on the training data and the final evaluation on the test set, classification performance was assessed using the F-score over True-labelled relations, since these represent potential adverse event relations between drugs and conditions, which is the relation class of interest in this study.
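As a minimal sketch of this evaluation criterion, the following Python function computes the F-score restricted to the True class. It assumes the balanced F-score (the harmonic mean of precision and recall), which the text does not state explicitly; the function name f_score_true and the toy label lists are hypothetical and do not come from the reported experiments.

```python
def f_score_true(gold, predicted, positive="True"):
    """Balanced F-score computed over the positive ("True") class only."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration: 2 true positives, 1 false positive, 1 false negative.
gold = ["True", "True", "False", "True", "False"]
predicted = ["True", "False", "False", "True", "True"]
score = f_score_true(gold, predicted)
print(round(score, 3))
```

Correctly rejected False relations do not enter the score, so a classifier cannot obtain a high value simply by labelling most pairs as False.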

Assessment of relation extraction
Baseline experiments began with training and cross-validation of JSRE on the ADE-EXT-TRAIN corpus. The system's performance is shown in Table 2. The system achieved an overall F-score of 0.87 in cross-validation. In the final test on ADE-EXT-TEST, the system attained an F-score of 0.87, indicating consistent classification. A subset of the instances misclassified during cross-validation and testing was manually investigated to understand the common sources of error. Limited context appeared to be one reason for misclassification. For example, the title "Niacin maculopathy" (PMID:3174043) implies that maculopathy is a potential adverse event of niacin, but it lacks the contextual description needed to support machine classification. Distantly co-occurring inter-related entities accounted for a couple of errors. For example, in the sentence "CASE SUMMARY: A 65-year-old patient chronically treated with the selective serotonin reuptake inhibitor (SSRI) citalopram developed confusion, agitation, tachycardia, tremors, myoclonic jerks and unsteady gait, consistent with serotonin syndrome, following initiation of fentanyl, and all symptoms and signs resolved following discontinuation of fentanyl" (PMID:17381671), the relation between confusion and the last appearing drug name fentanyl was incorrectly classified. Case reports often contain the frequencies at which potential adverse events were observed. For instance, in the sentence "The toxic effects of methotrexate included elevated liver transaminases (3/4), nausea (2/4), abdominal pain (2/4), bone pain (2/4), mild neutropenia (1/4), and mild pruritus (1/4)" (PMID:433855), the system had difficulties in identifying the correct relations.
Potential adverse drug events are categorized according to their severity: serious suspected adverse drug reactions require immediate action by medical professionals. Manual investigation of the predicted results showed that the system was able to capture most of the serious potential adverse events. These findings demonstrate the potential of this approach to facilitate the identification of potential signals from case reports, which is of great interest to drug safety experts.

Impact of size of the training set on the performance
In order to study the impact of the size of the training data on classification performance, ADE-EXT-TRAIN was decomposed into random subsets containing 10, 20, 50, 100, 200, 500, 1000, and 2000 documents. JSRE was trained on these subsets independently in different rounds and evaluated by 10-fold cross-validation. Table 3 shows that with as few as 200 documents one can already achieve performances over the 80% range. However, obtaining a classifier with a standard deviation of 1% requires a substantially larger training set.
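The subset experiment can be sketched in Python as below. This is an assumption-laden outline, not the procedure actually used: the functions k_fold_indices and learning_curve are hypothetical, and placeholder_evaluate merely stands in for training JSRE on the training folds and returning the F-score on the held-out fold.

```python
import random
import statistics


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item positions and deal them into k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def learning_curve(doc_ids, sizes, evaluate_fold, k=10, seed=0):
    """For each subset size, return (mean, standard deviation) over k folds."""
    rng = random.Random(seed)
    results = {}
    for size in sizes:
        subset = rng.sample(doc_ids, size)  # random subset of documents
        scores = []
        for held_out in k_fold_indices(len(subset), k, seed):
            held = set(held_out)
            test_docs = [subset[i] for i in held_out]
            train_docs = [d for i, d in enumerate(subset) if i not in held]
            scores.append(evaluate_fold(train_docs, test_docs))
        results[size] = (statistics.mean(scores), statistics.stdev(scores))
    return results


def placeholder_evaluate(train_docs, test_docs):
    # Placeholder only: a real run would train the classifier on train_docs
    # and return its F-score on test_docs. Here the training fraction is
    # returned so that the sketch executes without any external system.
    return len(train_docs) / (len(train_docs) + len(test_docs))


doc_ids = list(range(100))  # hypothetical document identifiers
results = learning_curve(doc_ids, [10, 20, 50], placeholder_evaluate)
print(results)
```

Reporting the standard deviation across folds alongside the mean is what makes it possible to judge how stable a classifier trained on a given subset size is.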

Mapping the ADE annotation ontology to the ontology of adverse events
As shown in Figure 2, both the ADE annotation ontology and OAE represent adverse drug reactions using formal ontological methods. In spite of this common goal, the two ontologies use different names for the two core entities: a Condition in the ADE annotation ontology coincides with a drug adverse event in OAE, and a Drug in the ADE annotation ontology coincides with a drug-administration in OAE. The ADE ontology additionally introduces the entity dosage, which was not specified in OAE at the time of its development, since OAE originally focused on vaccines, for which dosing is not an essential medical concept. Both ADE and OAE model a causal relationship between Condition or Adverse event and Drug or Medical intervention, with the latter being the causal source. The only entity that the CLEF annotation ontology shares with OAE and ADE is the Drug-or-device, which coincides with a Drug or Medical intervention.

Use case study: large scale relation extraction
An experiment was conducted in order to understand real-world use case scenarios for the extraction of potential adverse drug events from text. This was performed by applying the trained extraction tool to the whole of MEDLINE and then comparing the extracted events with the information in the drug leaflets contained in the SIDER [21] database. Some of the automatically extracted potential adverse drug events, not present

Adverse effect extraction from SIDER
Side Effect Resource (SIDER) is a database of adverse drug effects that links 888 drugs to 1450 adverse effects. It has been constructed manually from the summary of product leaflets of each drug. Drugs and their adverse effects were extracted from SIDER version 1.01 that contains drug leaflets published before 2009.

MHRA drug label changes
In 2009, the MHRA proposed safety label updates for 26 drugs. These were not all the safety label updates that the MHRA identified in 2009, but those to which the MHRA decided to give particular visibility through its web site. These new adverse drug effects were manually extracted, and they serve as a standard reference for the validation of potential adverse drug events automatically extracted from Medline-2009 using the trained JSRE method.
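The roles of the two reference sources in this use case can be illustrated as a simple set comparison. The article describes the check against the MHRA label changes as manual, so the Python sketch below only illustrates the logic under the assumption that drug and event names have already been normalized to identical strings; the function names and all drug-event pairs are hypothetical placeholders, not extracted data.

```python
def novel_candidates(predicted, known):
    """Drug-event pairs extracted from text that the reference does not list."""
    return predicted - known


def confirmed_by_label_changes(candidates, label_changes):
    """Candidates that also appear among later safety label changes."""
    return candidates & label_changes


# Placeholder pairs for illustration only.
predicted = {("drug_a", "event_x"), ("drug_a", "event_y"), ("drug_b", "event_z")}
sider_pairs = {("drug_a", "event_x")}
mhra_changes = {("drug_a", "event_y")}

candidates = novel_candidates(predicted, sider_pairs)
confirmed = confirmed_by_label_changes(candidates, mhra_changes)
print(sorted(candidates))
print(sorted(confirmed))
```

In practice, exact string matching would miss synonyms and spelling variants, which is one reason a manual check or a mapping to a controlled vocabulary is needed.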

Validation of large scale relation extraction
From the MHRA label change dataset, three drugs were arbitrarily chosen for deeper investigation. They are Rituximab, Efalizumab, and Natalizumab: three anti-neoplastic and immunomodulatory monoclonal antibodies. For the three drugs of interest, potential adverse drug events were selected from the Medline-2009 predictions and SIDER. Potential adverse drug events extracted from Medline-2009 that are not reported in SIDER were manually checked against the label changes of MHRA. Manual investigation of machine predicted potential adverse events showed that the system was able to capture valid potential adverse events from free-text that were not yet reported in product leaflets (Table 4). These adverse effects were later updated on drug labels by the UK regulatory authorities. This instance provides a good example for

Conclusions
This work reports on the adaptation of the machine learning-based JSRE system for the identification and extraction of potential adverse events of drugs in scientific case reports. A methodology for enriching a sparsely annotated corpus and subsequently using it to build classification models has been discussed. Evaluation of the system's performance showed promising results. A use-case study of relation extraction from large scale literature showed the system's ability to capture valid, under-reported, and novel potential adverse events not yet present in product leaflets. The performance of the system can be improved in several ways. In the current experiments, only the default features accepted by JSRE were used. Optimizing the feature representation to include additional features, for instance from syntactic parse trees of sentences, may further improve the results. Additional strategies, such as post-processing to classify relations with missing contextual descriptions, could help to recover more relations. Furthermore, extending the system to handle inter-sentence relations needs to be considered in order to further increase coverage.
The reported experimental results reflect the current state of research on the identification of potential adverse drug events from text. Several strategies are being pursued. The authors plan to benchmark the performance of several named entity taggers against the ADE corpus for the identification of drug and condition mentions in text. The current experiments were performed on the ADE corpus, since it was the only one available when this work was done; however, while this report was being written, a new corpus was published, namely the EU-ADR corpus [22]. It will be interesting to see whether the performance of JSRE on the EU-ADR corpus differs from that on the ADE corpus.
Similarly, public and commercial relation extraction systems will be benchmarked [23], and the practical impact of the information extracted from text on predicting drug label changes will be studied in detail.
The use of ontologies for driving information extraction has been reported [24,25]. We plan to explore the use of various available tools (e.g. ODIE, OBCIE, semantixs) with the OAE ontology and to compare the performance of ontology-driven or ontology-based methods for information extraction against the method presented here.
The current work has demonstrated promising results. It has the potential to reduce manual reading time, improve the quality of the signal detection process, and thereby contribute to safer use of drugs to the benefit of patients and society. We speculate that this work could also pave the way for pharmacovigilance applications on social media and multimedia sources. http://www.jbiomedsem.com/content/3/1/15