
Extraction of potential adverse drug events from medical case reports



The sheer amount of information about potential adverse drug events published in medical case reports poses major challenges for drug safety experts who must perform timely monitoring. Efficient strategies for identifying and extracting information about potential adverse drug events from free-text resources are needed to support pharmacovigilance research and pharmaceutical decision making. This work therefore focuses on the adaptation of a machine learning-based system for the identification and extraction of potential adverse drug event relations from MEDLINE case reports. It relies on a high-quality corpus that was manually annotated using an ontology-driven methodology. Qualitative evaluation of the system showed robust results. An experiment with large-scale relation extraction from MEDLINE delivered under-identified potential adverse drug events not reported in drug monographs. Overall, this approach provides a scalable auto-assistance platform for drug safety professionals to automatically collect potential adverse drug events communicated as free-text data.


Adverse drug effects are a very serious issue that confronts patients, healthcare providers, regulatory authorities and drug manufacturers. While clinical trials are stringent measures for detecting risks associated with drug usage, widespread use in the field may reveal additional risks that were not detectable in the clinical trials because of the limited number of patients involved. After marketing approval, undesired effects of drugs are reported to the authorities through so-called Spontaneous Adverse Event Reporting Systems (SAERS), and these reports are then analyzed in a timely manner to ensure the safe use of drugs [1]. A well-known problem of pharmacovigilance, however, is underreporting, namely the low number of reports that the authorities receive. Case reports published in the scientific biomedical literature represent an important resource complementary to the SAERS because of their abundance, rapid rate of generation, and the valuable information they contain [2]. Because of its unstructured nature, manual analysis of the scientific literature is challenging, cumbersome, and labor intensive.

In recent years, the development of automatic natural language processing (NLP) and information extraction (IE) techniques has gained considerable popularity. These techniques include the identification of biomedical named entities, relations between the entities, or events associated with them. Noticeable effort has been invested in mining potential adverse drug events from different forms of free-text data. Examples include Wang et al. [3], who applied the MedLEE system to discharge summaries to identify medication events and entities that could be potential adverse drug effects; these were detected using the strength of statistical association based on their co-occurrences. Leaman et al. [4] proposed a lenient NLP model for extracting adverse effects of drugs from social media such as blogs. Gurulingappa et al. [5] developed a machine learning-based system for classifying the sentences in MEDLINE case reports that assert potential adverse drug events. However, to the authors’ knowledge, there has been limited focus on the identification of semantic relationships between drugs and adverse events in text. This is partly due to the unavailability of suitable open access corpora that could be used for technology development and benchmarking. Extracting relations between drugs and adverse effects can facilitate appropriate indexing, precise searching, visualization, and faster information tracing, and can improve the sensitivity of signal detection in pharmacovigilance.

The use of an ontology of adverse drug events for automated signal generation in pharmacovigilance has already been proposed [6], and its application to information retrieval was exploited by the same group a few years later in the VIGITERMES project [7]. There, the OntoEIM adverse event ontology was used to extend the dictionary of adverse event entities, normalize queries, and consolidate annotations, achieving 29% precision and 67% recall on MEDLINE abstracts. Automatic extraction of potential adverse drug events from clinical records is an active area of research [8]. Mining of social internet message boards to identify potential adverse drug events has also been reported [9]; in that work, drug-event pairs were extracted using only the co-occurrence of terms within a window of 20 tokens, and machine learning was used only for de-identification for privacy protection. This work reports on the adaptation of a machine learning-based system for identifying the relations between drugs and adverse effects in MEDLINE case reports; it relies on an ontology-driven, manually annotated corpus that strictly follows semantic annotation guidelines developed for clinical text [10]. The system has been qualitatively evaluated and studied for its ability to support real-time pharmacovigilance studies.


Corpus preparation

The data set used for training and validation of the relation extraction system is the ADE corpus [11]. The ADE corpus contains 2972 MEDLINE case reports that were manually annotated in duplicate and harmonized by three annotators. The corpus contains annotations of 5063 drugs, 5776 conditions (e.g. diseases, signs, symptoms), and 6821 relations between drugs and conditions representing clear adverse events. All annotations are confined to the sentence level, i.e. only drugs and conditions representing adverse events that co-occur within individual sentences are annotated. Drugs and conditions that are not part of a potential adverse event relation are not annotated. This was done in accordance with the annotation guidelines. The ADE corpus therefore contains only annotations of relations between drugs and conditions that represent True relations, which makes it a sparsely annotated dataset. For training a supervised classifier, it was essential to generate False relations, i.e. drugs and conditions that do not fall into adverse effect relations but that still occur within the same sentence. For this purpose, ProMiner, a dictionary-based named entity recognition system [12], was employed. ProMiner was equipped with the DrugBank [13] and MedDRA [14] dictionaries to identify drugs and conditions, respectively, that had not previously been annotated in the ADE corpus by human annotators. As a result of named entity recognition, new instances encompassing 2269 drugs and 3437 conditions were automatically annotated. Drug-condition pairs co-occurring within sentences that had not previously been annotated by humans formed False relations. Altogether, 5968 False relations were automatically generated. The corpus enriched with machine-annotated drugs, conditions, and relations between them is referred to as ADE-EXT (indicating the extended ADE corpus). Figure 1 shows an illustration of True and False relations between drugs and conditions co-occurring within a sentence.
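The generation of False relations described above can be illustrated with a minimal sketch. It assumes that entity mentions are already available per sentence as plain strings (in the actual work they come from human annotation and ProMiner tagging); the function name `generate_false_relations` and the example sentence data are hypothetical and serve only to show the pairing logic.

```python
from itertools import product


def generate_false_relations(drugs, conditions, true_pairs):
    """Return drug-condition pairs of one sentence that are NOT annotated as True.

    drugs, conditions: entity mentions found in the sentence (assumed strings).
    true_pairs: (drug, condition) pairs annotated by humans as adverse events.
    """
    true_set = set(true_pairs)
    # Every co-occurring pair that humans did not mark becomes a False relation.
    return [(d, c) for d, c in product(drugs, conditions) if (d, c) not in true_set]


# Hypothetical sentence-level annotations, for illustration only.
sentence = {
    "drugs": ["methotrexate", "folic acid"],
    "conditions": ["nausea", "neutropenia"],
    "true_pairs": [("methotrexate", "nausea"), ("methotrexate", "neutropenia")],
}
false_pairs = generate_false_relations(**sentence)
print(false_pairs)
```

In this toy sentence the two pairs involving the second drug are the only ones left unannotated, so they are the ones emitted as negative training instances.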

Figure 1

Example of an annotated sentence in the ADE corpus. The sentence is annotated with drugs, conditions, and the relations between them. True indicates the presence of an adverse effect relation and False indicates its absence.

In the ADE-EXT corpus, 120 manually annotated True relations were not suitable for the NLP task. Examples include overlapping inter-related entities such as acute lithium toxicity, where lithium is related to acute toxicity. After removal of nested annotations, the ADE-EXT corpus was decomposed into a training set (ADE-EXT-TRAIN) and a test set (ADE-EXT-TEST). Counts of entities and relations in the subsets of the ADE-EXT corpus are shown in Table 1.

Table 1 Counts of entities and relations in ADE‐EXT corpus subsets

Relation extraction workflow

For the identification and extraction of drug-condition entity pairs that constitute a potential adverse event relation, the Java Simple Relation Extraction (JSRE) system [15] was employed. JSRE provides a re-trainable and scalable supervised classification platform that uses Support Vector Machines (SVMs) [16] with different kernels specially designed for NLP and relation extraction. All sentences in ADE-EXT-TRAIN and ADE-EXT-TEST containing drug-condition pairs labelled as either True or False were transformed into the SRE format before being subjected to relation extraction. The SRE format is the data representation used within the JSRE platform, in which tokens appearing in sentences are enriched with their part-of-speech tags, lemmas, and flags indicating whether a token is part of a named entity. Among the different kernels available, the shallow linguistic kernel was used throughout, since it has been widely applied and has shown success in similar relation extraction tasks [17]. ADE-EXT-TRAIN was used as data for training and cross-validation of JSRE, whereas ADE-EXT-TEST was used as an independent test set.
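To make the token-level enrichment more concrete, the following sketch shows one way a candidate sentence could be represented before classification. It is not the actual SRE file format; the `Token` class, the tag values, and the `split_contexts` helper are assumptions introduced purely for illustration. The split into contexts before, between, and after the two candidate entities reflects the kind of shallow context that kernels of this family are generally described as using.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    lemma: str
    pos: str      # part-of-speech tag
    entity: str   # "DRUG", "CONDITION", or "O" for tokens outside any entity


def split_contexts(tokens, first_idx, second_idx):
    """Split a sentence into the contexts before, between, and after two entities."""
    lo, hi = sorted((first_idx, second_idx))
    return tokens[:lo], tokens[lo + 1:hi], tokens[hi + 1:]


# Hypothetical example sentence; tags are assigned by hand for illustration.
tokens = [
    Token("Methotrexate", "methotrexate", "NNP", "DRUG"),
    Token("induced", "induce", "VBD", "O"),
    Token("severe", "severe", "JJ", "O"),
    Token("nausea", "nausea", "NN", "CONDITION"),
]
before, between, after = split_contexts(tokens, 0, 3)
print([t.lemma for t in between])
```

A classifier working on such a representation sees the lemmas and tags around the candidate pair rather than a full syntactic parse, which is what keeps this style of feature extraction inexpensive.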

Mapping annotation ontology against ontology of adverse events

The Clinical E-Science Framework (CLEF) initiative [18] investigated how to generate semantically annotated medical corpora for information extraction. As described by Gurulingappa et al. [11], we adopted the standard established by the CLEF framework for the annotation workflow [10]; however, we reshaped the annotation schema by using only two of the original entities (condition, drug) and extending it with a third one (dosage). None of the relationships used by the CLEF annotation schema could be reused for our work, since the CLEF annotation schema did not consider adverse drug events. Instead, we created two relations: drug-cause-condition and drug-has-dosage. In this work we focused only on automating the detection of drug-cause-condition, so dosage will not be mentioned further. The ADE corpus was created using the Knowtator plugin for Protégé [19], an ontology-driven corpus annotation tool also used for the creation of the CLEF corpus. Although we adopted the same tool and the same annotation workflow standard as CLEF, we could not adopt the same annotation ontology, since it was not able to capture drug-adverse event and drug-dosage relations. The annotation ontology described above was therefore used to create the ADE corpus.

Subsequent to the corpus creation, the realism-based biomedical ontology for the representation of adverse events (OAE) was published [20]. OAE has been developed following the principles of Ontological Realism; it is thus aligned with the Basic Formal Ontology and the Relation Ontology, and with the Open Biological and Biomedical Ontologies (OBO) Foundry principles of openness, collaboration and use of a common shared syntax. OAE has 484 representational units, annotated by means of 369 terms with specific identifiers and 115 terms imported from existing ontologies. The use of ontologies has proven to be of great value in biomedicine, not least because it enables machine reasoning, abstraction and automatic hypothesis generation. We were therefore interested in investigating whether the knowledge encoded in the annotations of the ADE corpus could be semantically connected to the OAE. To do this, we manually compared the definitions of the entities of OAE and of the ADE annotation ontology. Figure 2 shows the basic design patterns of OAE, ADE and CLEF as given in the original papers, emphasizing shared entities using green and red colors.

Figure 2

Ontologies discussed in this work. Mappings between the ADE, OAE, and CLEF ontologies are shown. Identical entities are in boxes with the same colours. Condition in the CLEF ontology is mapped to Process in the OAE.

Results and discussion

Performance evaluation criteria

The performance of relation extraction was evaluated by 10-fold cross-validation on the training data. During cross-validation on the training data and final evaluation on the test set, classification performance was assessed using the F-score over True-labelled relations, since these represent potential adverse event relations between drugs and conditions and are the relation class of interest in this study.
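As an illustration of scoring restricted to the True class, the sketch below computes precision, recall, and F-score from gold and predicted labels. It assumes the balanced F-score (the harmonic mean of precision and recall), which the text does not state explicitly; the function name `f_score_true` and the label lists are hypothetical.

```python
def f_score_true(gold, predicted):
    """Precision, recall and balanced F-score computed over the True class only."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # True predicted as True
    fp = sum(1 for g, p in pairs if not g and p)    # False predicted as True
    fn = sum(1 for g, p in pairs if g and not p)    # True predicted as False
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical labels for five candidate relations.
gold = [True, True, True, False, False]
predicted = [True, True, False, True, False]
precision, recall, f1 = f_score_true(gold, predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Correctly rejected False relations do not enter any of the three counts, so a classifier cannot raise this score simply by labelling most pairs as False.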

Assessment of relation extraction

Baseline experiments began with training and cross-validation of JSRE over the ADE-EXT-TRAIN corpus. The system’s performance is shown in Table 2. The system achieved an overall F-score of 0.87 after cross-validation. On the final test over ADE-EXT-TEST, the system attained an F-score of 0.87, indicating consistent classification. A subset of the instances misclassified during cross-validation and testing was manually investigated to understand the common sources of error. Limited context appeared to be one reason for misclassification. For example, the title "Niacin maculopathy" (PMID:3174043) implies maculopathy as a potential adverse event of niacin but lacks the contextual description needed to support machine classification. Distantly co-occurring inter-related entities accounted for a few errors. For example, in the sentence "CASE SUMMARY: A 65-year-old patient chronically treated with the selective serotonin reuptake inhibitor (SSRI) citalopram developed confusion, agitation, tachycardia, tremors, myoclonic jerks and unsteady gait, consistent with serotonin syndrome, following initiation of fentanyl, and all symptoms and signs resolved following discontinuation of fentanyl" (PMID:17381671), the relation between confusion and the last-appearing drug name fentanyl was incorrectly classified. Case reports often contain the frequencies at which potential adverse events were observed. For instance, in the sentence "The toxic effects of methotrexate included elevated liver transaminases (3/4), nausea (2/4), abdominal pain (2/4), bone pain (2/4), mild neutropenia (1/4), and mild pruritus (1/4)" (PMID:433855), the system had difficulty identifying the correct relations. Potential adverse drug events are categorized according to their severity: serious suspected adverse drug reactions require immediate action by medical professionals. Manual investigation of the predicted results showed that the system was able to capture most of the serious potential adverse events. These findings demonstrate the potential of this approach to facilitate the identification of potential signals from case reports, which is of great interest to drug safety experts.

Table 2 Assessment of results of relation extraction

Impact of size of the training set on the performance

In order to study the impact of the size of the training data on classification performance, ADE-EXT-TRAIN was decomposed into random subsets containing 10, 20, 50, 100, 200, 500, 1000, and 2000 documents. JSRE was trained on these subsets independently in separate rounds and evaluated by 10-fold cross-validation. Table 3 shows that performance above 80% can already be achieved with as few as 200 documents. However, obtaining a classifier with a standard deviation of 1% requires a substantially larger training set.
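The subset experiment can be sketched as follows. The sketch assumes that each subset is drawn independently at random from the training documents and that folds are formed by simple striding; the source does not specify either detail. The helper names, the seed, and the pool of 2500 document identifiers are hypothetical.

```python
import random

SUBSET_SIZES = [10, 20, 50, 100, 200, 500, 1000, 2000]


def sample_subsets(doc_ids, sizes, seed=0):
    """Draw one random subset of documents per requested size."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {n: rng.sample(doc_ids, n) for n in sizes if n <= len(doc_ids)}


def k_folds(items, k=10):
    """Partition items into k roughly equal folds for cross-validation."""
    return [items[i::k] for i in range(k)]


# Hypothetical pool of training document identifiers.
doc_ids = [f"doc{i}" for i in range(2500)]
subsets = sample_subsets(doc_ids, SUBSET_SIZES)
folds = k_folds(subsets[200])
print(len(subsets), len(folds), sum(len(f) for f in folds))
```

Each fold would serve once as the held-out part while a model is trained on the remaining nine, and the spread of the ten scores gives the standard deviation referred to above.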

Table 3 Impact of size of the training set on relation extraction

Mapping the ADE annotation ontology to the ontology of adverse events

As shown in Figure 2, both the ADE annotation ontology and OAE represent adverse drug reactions using formal ontological methods. In spite of this common goal, the two ontologies use different names for the two core entities: a Condition in the ADE annotation ontology coincides with a drug adverse event in OAE, and a Drug in the ADE annotation ontology coincides with a drug-administration in OAE. The ADE ontology additionally introduces the entity dosage, which was not specified in OAE at the time of its development, since OAE originally focused on vaccines, for which dosing is not an essential medical concept. Both ADE and OAE model a causal relationship between Condition or Adverse event and Drug or Medical intervention, with the latter being the causal source. The only entity that the CLEF annotation ontology shares with OAE and ADE is Drug-or-device, which coincides with a Drug or Medical intervention.

Use case study: large scale relation extraction

An experiment was conducted in order to understand real-world use case scenarios for the extraction of potential adverse drug events from text. The trained extraction tool was applied to the whole of MEDLINE, and the extracted relations were then compared with the information from drug leaflets available in the SIDER [21] database. Some of the automatically extracted potential adverse drug events that were not present in SIDER were manually investigated for their validity by comparison with the Medicines and Healthcare products Regulatory Agency (MHRA) drug label changes reported in 2009.

Relation extraction from MEDLINE

MEDLINE articles published before 2009 were gathered to form a Medline-2009 corpus. ProMiner was equipped with the DrugBank and MedDRA dictionaries for tagging drugs and conditions occurring in the sentences of Medline-2009. A JSRE model trained on the ADE-EXT-TRAIN corpus was applied to classify relations between drugs and conditions as True or False, where a True relation indicates a potential drug-related adverse event. As a result, 165680 relations were extracted between 1611 drugs and 5079 adverse effects, with drugs and adverse effects normalized to DrugBank and MedDRA, respectively.

Adverse effect extraction from SIDER

The Side Effect Resource (SIDER) is a database of adverse drug effects that links 888 drugs to 1450 adverse effects. It has been constructed manually from the summary of product leaflets of each drug. Drugs and their adverse effects were extracted from SIDER version 1.01, which contains drug leaflets published before 2009.

MHRA drug label changes

In 2009, the MHRA proposed safety label updates for 26 drugs. These were of course not all of the safety label updates that the MHRA identified in 2009, but those to which the MHRA decided to give particular visibility through its web site. These new adverse drug effects were manually extracted and serve as a reference standard for the validation of potential adverse drug events automatically extracted from Medline-2009 using the trained JSRE model.

Validation of large scale relation extraction

From the MHRA label change dataset, three drugs were arbitrarily chosen for deeper investigation: Rituximab, Efalizumab, and Natalizumab, three anti-neoplastic and immunomodulatory monoclonal antibodies. For these three drugs of interest, potential adverse drug events were selected from the Medline-2009 predictions and from SIDER. Potential adverse drug events extracted from Medline-2009 that are not reported in SIDER were manually checked against the MHRA label changes.
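The comparison logic can be summarized as a pair of set operations. The sketch below assumes that drugs and adverse effects have already been normalized to shared identifiers so that pairs can be compared directly; in the study itself the final check against the label changes was performed manually. The function names and the placeholder drug and effect names are hypothetical and carry no medical meaning.

```python
def unlisted_events(predicted, labelled, drugs):
    """Predicted (drug, effect) pairs for the chosen drugs that the leaflets do not list."""
    chosen = set(drugs)
    return {(d, e) for d, e in predicted if d in chosen} - set(labelled)


def confirmed_by_label_change(candidates, label_changes):
    """Candidates that also appear among the later label changes."""
    return sorted(set(candidates) & set(label_changes))


# Placeholder data only; these are not real drugs or adverse effects.
predicted = {("drug_a", "effect_x"), ("drug_a", "effect_y"), ("drug_b", "effect_z")}
sider = {("drug_a", "effect_x")}
mhra = {("drug_a", "effect_y")}

candidates = unlisted_events(predicted, sider, ["drug_a"])
confirmed = confirmed_by_label_change(candidates, mhra)
print(confirmed)
```

Only the pair that is predicted, missing from the leaflet data, and present among the label changes survives both steps.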

Manual investigation of the machine-predicted potential adverse events showed that the system was able to capture valid potential adverse events from free text that had not yet been reported in product leaflets (Table 4). These adverse effects were later added to the drug labels by the UK regulatory authorities. This provides a good example of how the developed framework can help capture potential adverse drug events from the literature and thereby support pharmacovigilance.

Table 4 Potential adverse drug events extracted from MEDLINE not reported in drug leaflets until 2009 and later introduced in package leaflets


This work reports on the adaptation of the machine learning-based JSRE system for the identification and extraction of potential adverse events of drugs in scientific case reports. A methodology has been discussed for enriching a sparsely annotated corpus and subsequently using it to build classification models. Evaluation of the system’s performance showed promising results. A use-case study on relation extraction from large-scale literature showed the system’s ability to capture valid, under-reported, and novel potential adverse events not yet present in product leaflets.

The performance of the system can be improved in several ways. In the current experiments, only the default features accepted by JSRE were used. Optimizing the feature representation to include additional features, for instance from syntactic sentence parse trees, may further improve the results. Additional strategies such as post-processing to classify relations with missing contextual descriptions could help recover more relations. Furthermore, extending the approach to handle inter-sentence relations needs to be considered in order to further increase coverage.

The reported experimental results reflect the current state of research on identifying potential adverse drug events in text. Several strategies are being pursued. The authors plan to benchmark the performance of several named entity taggers against the ADE corpus for the identification of drug and condition mentions in text. The current experiments were performed on the ADE corpus, since it was the only one available when this work was done; however, while this report was being written a new corpus was published, namely the EU-ADR corpus [22]. It will be interesting to see whether the performance of JSRE on the ADE corpus differs from its performance on the EU-ADR corpus.

Similarly, public and commercial relation extraction systems will be benchmarked [23], and the practical impact of the information extracted from text on predicting drug label changes will be studied in detail.

The use of ontologies for driving information extraction has been reported [24, 25]. We plan to explore the use of various available tools (e.g. ODIE, OBCIE, semantixs) with the OAE ontology and to compare the performance of ontology-driven or ontology-based methods for information extraction against the method presented here.

The current work has demonstrated promising results. It has the potential to reduce manual reading time, improve the quality of the signal detection process, and thereby contribute positively to the safer use of drugs, to the benefit of patients and society. We speculate that this work could also pave the way for pharmacovigilance applications on social media and multimedia sources.


  1. Hauben M, Bate A: Decision support methods for the detection of adverse events in post-marketing data. Drug Discov Today. 2009, 14 (7-8): 343-357. 10.1016/j.drudis.2008.12.012.

  2. Vandenbroucke JP: In defense of case reports and case series. Ann Intern Med. 2001, 134 (4): 330-334.

  3. Wang X, Hripcsak G, Markatou M, Friedman C: Active computerized pharmacovigilance using natural language processing, statistics, and electronic health records: a feasibility study. J Am Med Inform Assoc. 2009, 16 (3): 328-337. 10.1197/jamia.M3028.

  4. Leaman R, Wojtulewicz L, Sullivan R, Skariah A, Yang J, Gonzalez G: Towards internet-age pharmacovigilance: extracting adverse drug reactions from user posts to health-related social networks. Proceedings of the 2010 Workshop on Biomedical Natural Language Processing. Edited by: Demner-Fushman D, Cohen KB, Ananiadou S, Pestian J, Tsujii J, Webber B. 2010, Uppsala, Sweden, 117-125.

  5. Gurulingappa H, Fluck J, Hofmann-Apitius M, Toldo L: Identification of Adverse Drug Event Assertive Sentences in Medical Case Reports. First International Workshop on Knowledge Discovery and Health Care Management (KD-HCM), European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD). Edited by: Rangwala H, Tagarelli A, Wale N, Karypis G. 2011, Athens, Greece, 16-27.

  6. Henegar C, Bousquet C, Lillo-Le Louet A, Degoulet P, Jaulent MC: Building an ontology of adverse drug reactions for automated signal generation in pharmacovigilance. Comput Biol Med. 2006, 36: 748-767. 10.1016/j.compbiomed.2005.04.009.

  7. Delamarre D, Lillo-Le Louët A, Guillot L, Jamet A, Sadou E, Ouazine T, Burgun A, Jaulent MC: Documentation in pharmacovigilance: using an ontology to extend and normalize Pubmed queries. Stud Health Technol Inform. 2010, 160 (Pt 1): 518-522.

  8. Aramaki E, Miura Y, Tonoike M, Ohkuma T, Masuichi H, Waki K, Ohe K: Extraction of adverse drug effects from clinical records. MEDINFO 2010 - Proceedings of the 13th World Congress on Medical Informatics, Series: Studies in Health Technology and Informatics, Volume 160. Edited by: Safran C. 2010, Cape Town, South Africa: IOS Press, 739-743. 10.3233/978-1-60750-588-4-739.

  9. Benton A, Ungar L, Hill S, Hennessy S, Mao J, Chung A, Leonard C, Holmes J: Identifying potential adverse effects using the web: A new approach to medical hypothesis generation. J Biomed Informatics. 2011, 44: 989-996.

  10. Roberts A, Gaizauskas R, Hepple M, Demetriou G, Guo Y, Roberts I, Setzer A: Building a semantically annotated corpus of clinical texts. J Biomed Informatics. 2009, 42: 950-966. 10.1016/j.jbi.2008.12.013.

  11. Gurulingappa H, Mateen-Rajput A, Roberts A, Fluck J, Hofmann-Apitius M, Toldo L: Development of a Benchmark Corpus to Support the Automatic Extraction of Drug-related Adverse Effects from Medical Case Reports. J Biomed Informatics. 2012, 45: 885-892. 10.1016/j.jbi.2012.04.008.

  12. Hanisch D, Fundel K, Mevissen HT, Zimmer R, Fluck J: ProMiner: rule-based protein and gene entity recognition. BMC Bioinformatics. 2005, 6 (Suppl 1): S14. 10.1186/1471-2105-6-S1-S14.

  13. Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A, Banco K, Mak C, Neveu V, Djoumbou Y, Eisner R, Guo AC, Wishart DS: DrugBank 3.0: a comprehensive resource for ’omics’ research on drugs. Nucleic Acids Res. 2011, 39 (Database issue): D1035-D1041. 10.1093/nar/gkq1126.

  14. Merrill GH: The MedDRA paradox. Proceedings of the AMIA 2008 Annual Symposium. 2008, Washington, DC, USA, 470-474.

  15. Giuliano C, Lavelli A, Pighin D, Romano L: FBK-IRST: Kernel Methods for Semantic Relation Extraction. Proceedings of the Fourth International Workshop on Semantic Evaluations. Edited by: Agirre E, Màrquez L, Wicentowski R. 2007, Prague, Czech Republic, 141-144.

  16. Burges C: A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery. 1998, 2: 121-167.

  17. Tikk D, Thomas P, Palaga P, Hakenberg J, Leser U: A comprehensive benchmark of kernel methods to extract protein-protein interactions from literature. PLoS Comput Biol. 2010, 6: e1000837. 10.1371/journal.pcbi.1000837.

  18. Roberts A, Gaizauskas R, Hepple M, Demetriou G, Guo Y, Roberts I, Setzer A: The CLEF corpus: semantic annotation of clinical text. Proceedings of the AMIA Symposium. 2007, Chicago, IL, USA, 625-629.

  19. Ogren P: Knowtator: a Protégé plug-in for annotated corpus construction. Proceedings of the 2006 conference of the North American chapter of the association for computational linguistics on human language technology. Edited by: Moore RC, Bilmes J, Chu-Carroll J, Sanderson M. 2006, New York, NY, USA, 273-275.

  20. Yongqun H, Zuoshuang X, Sarntivijai S, Toldo L, Ceusters W: AEO: A Realism-Based Biomedical Ontology for the Representation of Adverse Events. "Representing Adverse Events" at the International Conference on Biomedical Ontology. Edited by: Courtot M, Goldfain A, Yongqun He O, Ruttenberg A. 2011, Buffalo, NY, USA.

  21. Kuhn M, Campillos M, Letunic I, Jensen LJ, Bork P: A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol. 2010, 6: 343. 10.1038/msb.2009.98.

  22. van Mulligen E, Fourrier-Reglat A, Gurwitz D, Molokhia M, Nieto A, Trifiro G, Kors J, Furlong L: The EU-ADR Corpus: Annotated Drugs, Diseases, Targets, and their Relationships. J Biomed Informatics. 2012, 45: 879-884. 10.1016/j.jbi.2012.04.004.

  23. Toldo L, Gurulingappa H, Mateen-Rajput A, Kors J, Suri S, Tayrouz Y: Impact of Automatic Detection of Adverse Events on Prediction of Drug Label Changes. J Pharmacoepidemiology and Drug Saf. 2012, [Submitted].

  24. Wimalasuriya D, Dou D: Ontology-based information extraction: an introduction and a survey of current approaches. J Information Sci. 2010, 36: 306-323. 10.1177/0165551509360123.

  25. Pandit S, Honavar V: Ontology-guided extraction of complex nested relationships. 22nd IEEE International Conference on tools with artificial intelligence (ICTAI). Edited by: Pierre M. 2010, Arras, France, 173-178.



Harsha Gurulingappa would like to thank his PhD guide Prof. Dr. Martin Hofmann-Apitius and former colleagues at Fraunhofer Institute SCAI for supporting the foundational aspects of this work.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Luca Toldo.

Additional information

Competing interests

LT is an employee of Merck KGaA. AM-R was funded by Merck KGaA. HG has no conflicts of interest to declare.

Authors’ contributions

HG and LT contributed equally to the experimental settings. HG had a main role in the editing of the manuscript. AMR was involved in the quality control and document reviewing. All authors read and approved the final manuscript.


Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Gurulingappa, H., Mateen-Rajput, A. & Toldo, L. Extraction of potential adverse drug events from medical case reports. J Biomed Semant 3, 15 (2012).
