NCBO Technology: Powering semantically aware applications
© Whetzel and NCBO Team; licensee BioMed Central Ltd. 2013
Published: 15 April 2013
As new biomedical technologies are developed, the amount of publicly available biomedical data continues to increase. To help manage these vast and disparate data sources, researchers have turned to the Semantic Web. Specifically, ontologies are used in data annotation, natural language processing, information retrieval, clinical decision support, and data integration tasks. The development of software applications to perform these tasks requires the integration of Web services to incorporate the wide variety of ontologies used in the health care and life sciences. The National Center for Biomedical Ontology (NCBO), a National Center for Biomedical Computing created under the NIH Roadmap, developed BioPortal, which provides access to one of the largest repositories of biomedical ontologies. The NCBO Web services provide programmatic access to these ontologies and can be grouped into four categories: Ontology, Mapping, Annotation, and Data Access. The Ontology Web services provide access to ontologies, their metadata, ontology versions, downloads, navigation of the class hierarchy (parents, children, siblings), and details of each term. The Mapping Web services provide access to the millions of ontology mappings published in BioPortal. The NCBO Annotator Web service “tags” text automatically with terms from ontologies in BioPortal, and the NCBO Resource Index Web services provide access to an ontology-based index of public, online data resources. The NCBO Widgets package the Ontology Web services for use directly in Web sites. The functionality of the NCBO Web services and widgets is incorporated into semantically aware applications for ontology development and visualization, data annotation, and data integration. This overview describes these classes of applications, discusses a few examples of each type, and identifies which NCBO Web services these applications use.
NCBO Technology overview
BioPortal is an open repository of biomedical ontologies that stores ontologies developed in various formats, such as OWL, OBO format, Protégé frames, and the Rich Release Format, and provides access to this content via Web browsers and Web services [1, 2]. The BioPortal Web interface allows users to browse the list of ontologies, search and comment on terms in ontologies, annotate text with ontology terms, and search an ontology-based index of biomedical resources. The BioPortal architecture currently includes both LexEVS (http://informatics.mayo.edu/LexGrid) and the Protégé database (http://protege.stanford.edu); however, work is underway to replace the dual database backend with an RDF database. A beta version of the BioPortal RDF database is available at: http://sparql.bioontology.org.
Classes of applications incorporating ontologies via NCBO Technology
Ontology development and visualization
With the growing interest in the use of ontologies in the health care and life sciences, additional tools are being developed to support the development of ontologies within new biomedical domains and the re-use of existing ontologies to build application ontologies. To this end, new plugins for ontology editing tools such as Protégé and OBO-Edit have been developed. These plugins use the NCBO Web services to aid in term re-use, automatically generate ontology terms from text, provide an infrastructure for collaborative ontology development, and provide ways to visualize ontologies.
The OLS2OWL plugin is designed to aid ontology developers during the knowledge elicitation stage. It allows them to search for terms from a repository of ontologies and compare similar classes, properties, and instances. The plugin was developed as part of the Open architecture for Accessible Services Integration and Standardization project, which facilitates interoperability across service providers, mobile devices (wearable devices, phones, palm, etc.), smart home technology, and medical care providers for elderly and disabled populations. The Dresden Ontology Generator for Directed Acyclic Graphs plugin for Protégé 4 and OBO-Edit generates ontology terms, definitions, and relationships based on natural language text found in PubMed, the Web, or PDF documents, and therefore supports the extension of existing ontologies with terms from resources commonly used in biocuration. These tools use the “List all Ontologies”, “Search”, and “Get Term” Web services.
In addition to tools for ontology re-use, infrastructure now exists for collaborative ontology development, a methodology commonly used in biomedical ontology development. WebProtégé is a web-based ontology-editing environment that supports collaboration by enabling users to edit an ontology simultaneously, carry out discussions, and add comments to the terms. These comments and new term proposals can be submitted and viewed in BioPortal using the “Notes” Web services.
Ontologies are also commonly incorporated into data annotation applications. BioPortal contains over 400 ontologies, and the Ontology Recommender Web service can be used to identify those that best cover the text to be annotated. The input to this Web service is either a list of terms or a corpus of text, and the output is a ranked list of the ontologies that best cover that input. The resulting ontologies can then be selected for use in data annotation applications, and their terms presented to the user in various ways. Data annotation applications represent the most widely used category of applications using the NCBO Web services.
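The idea of ranking ontologies by how well they cover a list of input terms can be illustrated with a minimal sketch. This is not the Ontology Recommender's actual scoring method, which is not described here; it simply assumes that coverage is the fraction of input terms found among an ontology's labels. The ontology names, labels, and the function rank_ontologies are all hypothetical.

```python
# Toy illustration of coverage-based ontology ranking.
# Assumption: "coverage" is the fraction of input terms that appear as labels
# in an ontology. All names and data below are made up for illustration.

TOY_ONTOLOGIES = {
    "ToyAnatomy": {"heart", "lung", "liver"},
    "ToyDisease": {"melanoma", "asthma"},
    "ToyChemical": {"aspirin"},
}


def rank_ontologies(input_terms, ontologies):
    """Return (name, coverage, covered_terms) tuples, best coverage first."""
    wanted = {t.strip().lower() for t in input_terms if t.strip()}
    if not wanted:
        return []
    scored = []
    for name, labels in ontologies.items():
        covered = wanted & labels
        if covered:  # ontologies covering nothing are left out of the ranking
            scored.append((name, len(covered) / len(wanted), sorted(covered)))
    # Highest coverage first; ties are broken alphabetically by ontology name.
    scored.sort(key=lambda row: (-row[1], row[0]))
    return scored


ranking = rank_ontologies(["Heart", "lung", "asthma", "fever"], TOY_ONTOLOGIES)
for name, coverage, covered in ranking:
    print(f"{name}: {coverage:.2f} {covered}")
```

In this toy input, two of the four terms appear in ToyAnatomy and one in ToyDisease, so ToyAnatomy is ranked first and ToyChemical is omitted.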
The NCBO Ontology Web services are also used in applications to harmonize data elements. For example, openMDR uses the “List all Ontologies”, “Search”, and “Get Term” Web services to let curators select terms from ontologies such as the NCI Thesaurus, Ontology for Clinical Research, or SNOMED-CT. eleMap, a tool developed by the eMERGE Network, follows a similar workflow. The tool provides a mechanism for researchers to harmonize their local phenotype data dictionaries to existing metadata and terminology standards such as the Cancer Data Standards Registry and Repository, the NCI Thesaurus, and SNOMED-CT.
While many projects aim to collect annotated data upon submission of new data sets, unstructured text also accompanies data sets. The Annotator Web service can be used in these cases to identify ontology terms within a corpus of text and the data sets can be linked via these ontology annotations.
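A minimal sketch of dictionary-based term recognition may help make this concrete. It is a toy, not the Annotator Web service or its algorithm: it assumes a small hypothetical lexicon mapping labels to made-up term identifiers and prefers the longest matching label when matches overlap.

```python
import re

# Hypothetical lexicon: lowercase label -> made-up term identifier.
TOY_LEXICON = {
    "melanoma": "TOY:0001",
    "skin": "TOY:0002",
    "skin cancer": "TOY:0003",
}


def annotate(text, lexicon):
    """Return sorted (start, end, matched_text, term_id) tuples for labels found in text."""
    annotations = []
    # Try longer labels first so that "skin cancer" wins over "skin".
    for label in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = match.start(), match.end()
            # Skip a match that overlaps an already accepted (longer) match.
            if any(start < e and s < end for s, e, _, _ in annotations):
                continue
            annotations.append((start, end, match.group(0), lexicon[label]))
    return sorted(annotations)


annotations = annotate("Melanoma is a skin cancer; skin biopsies were taken.", TOY_LEXICON)
for start, end, matched, term_id in annotations:
    print(start, end, matched, term_id)
```

Two data sets whose free-text descriptions both yield the same term identifier could then be treated as linked through that shared annotation.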
The NCBO Resource Index is an ontology-based index of publicly available biomedical databases. The text descriptions of database entries are processed using the Annotator Web service to identify ontology terms and then the results are stored in the Resource Index. The ontology-based index links the data records within a database and across disparate databases, providing a functional linkage based on the content of the data field as opposed to schema matching. These annotations and linkages are useful to more precisely identify data records of interest.
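The linkage described above can be sketched as an inverted index from ontology terms to annotated records. This is only an illustration of the general technique under simplifying assumptions, not the Resource Index implementation; the database names, record identifiers, term identifiers, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical annotated records: (database, record id) -> set of made-up term ids.
TOY_ANNOTATIONS = {
    ("ToyExpressionDB", "E1"): {"TOY:0001", "TOY:0002"},
    ("ToyTrialsDB", "T7"): {"TOY:0001"},
    ("ToyTrialsDB", "T9"): {"TOY:0003"},
}


def build_index(annotations):
    """Map each term id to the set of records annotated with it."""
    index = defaultdict(set)
    for record, term_ids in annotations.items():
        for term_id in term_ids:
            index[term_id].add(record)
    return index


def linked_records(record, annotations, index):
    """Return records sharing at least one term with `record`, excluding itself."""
    related = set()
    for term_id in annotations[record]:
        related |= index[term_id]
    related.discard(record)
    return sorted(related)


index = build_index(TOY_ANNOTATIONS)
links = linked_records(("ToyExpressionDB", "E1"), TOY_ANNOTATIONS, index)
print(links)
```

Because the link is made through the shared term rather than through column names, the two toy databases do not need compatible schemas for their records to be connected.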
The suite of NCBO Web services powers a variety of semantically aware software applications (see additional file 1). The Web services are used in various combinations to enable workflows for ontology development, data annotation, and data analysis. Future work will include expansion of the Web services to enhance selection of terms by ontology sub-setting, to build lexicons for use with the Annotator Web service, and to support ontology enrichment analysis.
PLW is the Outreach Coordinator for the National Center for Biomedical Ontology.
The National Center for Biomedical Ontology is supported by the NHGRI, the NHLBI, and the NIH Common Fund under grant U54-HG004028. We thank Alex Skrenchuk from Stanford University for computer support.
The publication costs for this article were funded by the corresponding author's institution.
This article has been published as part of Journal of Biomedical Semantics Volume 4 Supplement 1, 2013: Proceedings of the Bio-Ontologies Special Interest Group 2012. The full contents of the supplement are available online at http://www.jbiomedsem.com/supplements/4/S1
- Noy NF, Shah NH, Whetzel PL, Dai B, Dorf M, Griffith N, Jonquet C, Rubin DL, Storey MA, Chute CG, Musen MA: BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Res. 2009, 37 (Web Server issue): W170-3.
- Whetzel PL, Noy NF, Shah NH, Alexander PR, Nyulas C, Tudorache T, Musen MA: BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications. Nucleic Acids Res. 2011, 39 (Web Server issue): W541-5.
- Ghazvinian A, Noy NF, Musen MA: Creating mappings for ontologies in biomedicine: simple methods work. AMIA Annu Symp Proc. 2009, 2009: 198-202.
- Jonquet C, Shah NH, Musen MA: The open biomedical annotator. Summit on Translat Bioinforma. 2009, 2009: 56-60.
- Dai M, et al.: An Efficient Solution for Mapping Free Text to Ontology Terms. AMIA Summit on Translational Bioinformatics. 2008, San Francisco, CA.
- Shah NH, Bhatia N, Jonquet C, Rubin D, Chiang AP, Musen MA: Comparison of concept recognizers for building the Open Biomedical Annotator. BMC Bioinformatics. 2009, 10 (Suppl 9): S14.
- Shah NH, Jonquet C, Chiang AP, Butte AJ, Chen R, Musen MA: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics. 2009, 10 (Suppl 2): S1.
- Jonquet C, LePendu P, Falconer S, Coulet A, Noy NF, Musen MA, Shah NH: NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources. Web Semant. 2011, 9: 316-324.
- BioPortal Import plugin. [http://protegewiki.stanford.edu/wiki/BioPortal_Import_Plugin]
- BioPortal Reference plugin. [http://protegewiki.stanford.edu/wiki/BioPortal_Reference_Plugin]
- OLS2OWL plugin. [http://ols2owl.sourceforge.net]
- Ontology Generation plugin. [http://protegewiki.stanford.edu/wiki/Ontology_Generation_Plugin_%28DOG4DAG%29]
- WebProtégé. [http://protegewiki.stanford.edu/wiki/WebProtege]
- RadLex Term Browser. [http://www.radlex.org]
- ISAcreator. [http://isatab.sourceforge.net/index.html]
- Maguire E, González-Beltrán A, Whetzel PL, Sansone SA, Rocca-Serra P: OntoMaton: a BioPortal powered ontology widget for Google Spreadsheets. Bioinformatics. 2012, [Epub ahead of print]
- Rightfield. [http://www.sysmo-db.org/rightfield]
- ECG Gadget. [http://cvrgrid.org/features/ecgrid-toolkit]
- openMDR. [http://www.cagrid.org/display/MDR/Overview]
- eleMap. [https://victr.vanderbilt.edu/eleMAP]
- McCarty CA, Chisholm RL, Chute CG, Kullo IJ, Jarvik GP, Larson EB, Li R, Masys DR, Ritchie MD, Roden DM, Struewing JP, Wolf WA, eMERGE Team: The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics. 2011, 4: 13.
- RedFly. [http://redfly.ccr.buffalo.edu]
- Knowledge Egg. [http://sites.google.com/site/evidencebasedsupport/kunnskapsegget]
- GeneWiki. [http://en.wikipedia.org/wiki/Portal:Gene_Wiki]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.