Abstract

Chemogenomics data generally refers to the activity data of chemical compounds on an array of protein targets and represents an important source of information for building in silico target prediction models. The increasing volume of chemogenomics data offers exciting opportunities to build models based on Big Data. Preparing a high-quality dataset is a vital step in realizing this goal, and this work aims to compile such a comprehensive chemogenomics dataset. The dataset comprises over 70 million SAR data points from publicly available databases (PubChem and ChEMBL), including structure, target information and activity annotations. Our aspiration is to create a useful chemogenomics resource reflecting industry-scale data, not only for building predictive models of in silico polypharmacology and off-target effects but also for the validation of cheminformatics approaches in general.

Details

Title
ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics
Author
Sun, Jiangming 1; Jeliazkova, Nina 2; Chupakin, Vladimir 3; Golib-dzib, Jose-felipe 4; Engkvist, Ola 1; Carlsson, Lars 1; Wegner, Jörg 3; Ceulemans, Hugo 3; Georgiev, Ivan 2; Jeliazkov, Vedrin 2; Kochev, Nikolay 5; Ashby, Thomas J 6; Chen, Hongming 1

1 Discovery Sciences, Innovative Medicines and Early Development Biotech Unit, AstraZeneca R&D Gothenburg, Mölndal, Sweden
2 Ideaconsult Ltd., Sofia, Bulgaria
3 Computational Biology, Discovery Sciences, Janssen Pharmaceutica NV, Beerse, Belgium
4 Computational Biology, Discovery Sciences, Janssen Cilag SA, Toledo, Spain
5 Ideaconsult Ltd., Sofia, Bulgaria; Department of Analytical Chemistry and Computer Chemistry, University of Plovdiv, Plovdiv, Bulgaria
6 Imec vzw, Louvain, Belgium
Pages
1-9
Publication year
2017
Publication date
Mar 2017
Publisher
Springer Nature B.V.
e-ISSN
1758-2946
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
1953626163
Copyright
Journal of Cheminformatics is a copyright of Springer, 2017.