What’s new and what’s changing in ChEBI in 2011

 

INTRODUCTION

ChEBI (Chemical Entities of Biological Interest) is an ontology of chemical entities, such as molecules and ions, and of their roles in biological contexts (de Matos et al., 2010). As of April 2011, it contains around 25,000 classes. Here, we report on recent developments and changes in the ontology, and briefly outline ongoing work that will lead to further changes.

Authors

Janna Hastings*, Paula de Matos, Adriano Dekker, Marcus Ennis, Kenneth Haug,
Zara Josephs, Gareth Owen, Steve Turner and Christoph Steinbeck

European Bioinformatics Institute, Hinxton, UK CB10 1SD

ONTOLOGY RESTRUCTURING

Role-structure disentanglement.

Prior to 2009, the ‘is a’ relationship in ChEBI was overloaded: it both linked molecular entities to chemical classes and specified the ‘roles’ that chemical entities can enact in various contexts. To address this, the relationship ‘has role’ was introduced to link molecular entities to roles. For example, the molecular entity acetylsalicylic acid (CHEBI:15365) ‘has role’ non-narcotic analgesic (CHEBI:35481). The initial disentanglement was performed programmatically, and subsequent manual curation was required to correct errors, which arose, for instance, where a chemical entity lacked a structure and was classified only under a role parent. Curation efforts are underway to fully define classes that are specified by both structural and role-based features, such as tricyclic antidepressant (CHEBI:36809), which is defined as ‘is a’ organic tricyclic compound and ‘has role’ antidepressant.
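The separation of the two relationships can be illustrated with a minimal Python sketch. It is not ChEBI's data model or curation tooling: the Entity class, the lacks_structural_parent helper and the role-only example entry are hypothetical names introduced here, and identifiers are filled in only where the text above gives them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """Hypothetical record for one ontology class (illustration only)."""
    name: str
    chebi_id: Optional[str] = None  # set only where the text gives an ID
    is_a: List[str] = field(default_factory=list)      # structural parents
    has_role: List[str] = field(default_factory=list)  # roles it can enact


def lacks_structural_parent(entity: Entity) -> bool:
    """True if the entity is classified by role only.

    This mirrors the error case described above: an entity with no
    structure that ended up with only a role parent.
    """
    return bool(entity.has_role) and not entity.is_a


# A class defined by both a structural and a role-based feature.
tricyclic_antidepressant = Entity(
    name="tricyclic antidepressant",
    chebi_id="CHEBI:36809",
    is_a=["organic tricyclic compound"],
    has_role=["antidepressant"],
)

# Hypothetical entry that would need manual curation.
role_only_entry = Entity(
    name="example entry without structure",  # made-up placeholder
    has_role=["non-narcotic analgesic"],
)

for entity in (tricyclic_antidepressant, role_only_entry):
    status = "needs review" if lacks_structural_parent(entity) else "ok"
    print(f"{entity.name}: {status}")
```

Keeping ‘is a’ and ‘has role’ as separate fields makes the role-only case a simple check rather than something to be inferred from a mixed parent list.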

Mapping to upper level ontologies.

In line with our goal of increasing interoperability with other ontologies in the biomedical domain, ChEBI has undertaken to provide a mapping to the upper level ontology BFO (Basic Formal Ontology). Placing multiple ontologies beneath a common upper level makes it easier to link them, since a clear ontological commitment reduces ambiguity in interpretation. The ChEBI-BFO mapping is provided as a bridge OWL file, downloadable alongside the ChEBI ontology OBO and OWL exports at ftp://ftp.ebi.ac.uk/pub/databases/chebi/ontology/.

AUTOMATED SUBMISSIONS

ChEBI's growth is driven entirely by community and user requests. To better meet our users' needs, ChEBI provides a community curation platform in the form of a web-based submission tool. The submission tool is accessible at:
https://www.ebi.ac.uk/chebi/submissions.

The key advantage of the submission tool is that users can deposit new chemicals or roles directly into the ChEBI production database and retrieve an identifier that they can start to use immediately. The identifier is maintained, although the entry only becomes publicly available via the ChEBI public interface at the next monthly release.

CURATION EFFORTS

To deal adequately with user-requested mixtures and polymers within the ontology, ChEBI has expanded its ‘chemical substance’ hierarchy, differentiating between pure and mixed substances. A pure substance is a macroscopic homogeneous collection of molecular entities, while a mixture is a non-homogeneous collection containing at least two different types of molecular entity. In particular, this allows us to model racemic mixtures, which are crucial in the representation of drugs, since many active substances found in drugs are formulated as racemic mixtures.
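The distinction between pure and mixed substances can be expressed as a small Python sketch. It only restates the definition given above (one type of molecular entity versus at least two); the function name classify_substance and the enantiomer labels are assumptions made for illustration and do not correspond to ChEBI classes or software.

```python
from typing import Iterable


def classify_substance(component_types: Iterable[str]) -> str:
    """Classify a substance by the distinct types of molecular entity in it.

    One distinct type -> 'pure substance'; two or more -> 'mixture'.
    """
    distinct = set(component_types)
    if not distinct:
        raise ValueError("a substance needs at least one type of molecular entity")
    return "pure substance" if len(distinct) == 1 else "mixture"


# A racemic mixture contains both enantiomers of a chiral compound
# (placeholder labels, not ChEBI entries).
racemate = ["(R)-enantiomer", "(S)-enantiomer"]
single_enantiomer = ["(S)-enantiomer", "(S)-enantiomer"]

print(classify_substance(racemate))
print(classify_substance(single_enantiomer))
```

Under this reading a racemate is classified as a mixture because its two enantiomers count as different types of molecular entity, even though they share a molecular formula.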

A large-scale ongoing effort is focused on annotating compounds relevant for immunology. ChEBI is also refactoring the representation of natural products in the ontology, which is currently inconsistent. Natural products will be given the role ‘secondary metabolite’.

Relationship evaluation

ChEBI is moving towards including full logical definitions for structure-based classes in the ontology where possible, and is continuing the alignment with BFO and with the Relation Ontology (RO). These efforts include a thorough evaluation of the relationships currently used in the ontology. Some relationships will be added, such as ‘disjoint from’, while others will be deprecated if they prove resistant to being assigned a logical definition.
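To indicate what a relationship such as ‘disjoint from’ makes possible, the following Python sketch checks a toy ‘is a’ hierarchy for classes that fall under both members of a disjoint pair. It is an illustration under stated assumptions, not ChEBI's validation procedure: the function names, the toy hierarchy, the inconsistent entry, and the choice of ‘pure substance’ and ‘mixture’ as a disjoint pair are all made up for the example.

```python
from typing import Dict, List, Set, Tuple


def ancestors(cls: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect all transitive 'is a' ancestors of a class."""
    seen: Set[str] = set()
    stack = list(parents.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents.get(parent, []))
    return seen


def disjointness_violations(
    parents: Dict[str, List[str]],
    disjoint_pairs: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (class, a, b) for every class that sits under both a and b."""
    violations = []
    for cls in parents:
        lineage = ancestors(cls, parents) | {cls}
        for a, b in disjoint_pairs:
            if a in lineage and b in lineage:
                violations.append((cls, a, b))
    return violations


# Toy hierarchy; 'inconsistent entry' is a deliberately wrong placeholder.
toy_parents = {
    "pure substance": ["chemical substance"],
    "mixture": ["chemical substance"],
    "racemic mixture": ["mixture"],
    "inconsistent entry": ["pure substance", "mixture"],
}
# Assumed disjointness axiom, chosen only for this example.
toy_disjoint = [("pure substance", "mixture")]

print(disjointness_violations(toy_parents, toy_disjoint))
```

In practice a check of this kind would normally be delegated to an OWL reasoner operating on the ontology's logical definitions; the sketch only shows the shape of the inconsistency that such axioms expose.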

References

de Matos, P., Alcántara, R., Dekker, A., Ennis, M., Hastings, J., Haug, K., Spiteri, I., Turner, S., and Steinbeck, C. (2010). Chemical entities of biological interest: an update. Nucleic Acids Res, 38, D249–D254.