Abstract

Background

Biomolecular interactions that modulate biological processes occur mainly in cavities throughout the surface of biomolecular structures. In the data science era, structural biology has benefited from the increasing availability of biostructural data due to advances in structural determination and computational methods. In this scenario, data-intensive cavity analysis demands efficient scripting routines built on easily manipulated data structures. To fulfill this need, we developed pyKVFinder, a Python package to detect and characterize cavities in biomolecular structures for data science and automated pipelines.

Results

pyKVFinder efficiently detects cavities in biomolecular structures and computes their volume, area, depth and hydropathy, storing these cavity properties in NumPy arrays. Benefiting from the interoperability and data structures of the Python ecosystem, pyKVFinder can be integrated with third-party scientific packages and libraries for mathematical calculations, machine learning and 3D visualization in automated workflows. As proof of pyKVFinder’s capabilities, we successfully identified and compared the ADRP substrate-binding site of SARS-CoV-2 with those of a set of homologous proteins, showing its integrability with data science packages such as matplotlib, NGL Viewer, SciPy and Jupyter notebook.
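To illustrate why storing cavities in NumPy arrays is convenient for scripting, the following is a minimal sketch that uses only NumPy and not the pyKVFinder API itself. It assumes a 3D integer grid in which each cavity is marked by an integer label of 2 or higher, and estimates each cavity's volume by counting its voxels. The function name `cavity_volumes`, the `first_label` convention and the grid spacing `step` are hypothetical choices made for this example, and this is not necessarily how pyKVFinder computes volume internally.

```python
import numpy as np


def cavity_volumes(grid, step=0.6, first_label=2):
    """Estimate per-cavity volume by voxel counting.

    Assumptions (illustrative only): `grid` is a 3D integer array in which
    values >= `first_label` identify individual cavities, and `step` is the
    grid spacing in angstroms, so each voxel has volume step**3.
    """
    # Keep only voxels that belong to a cavity, then count voxels per label.
    labels, counts = np.unique(grid[grid >= first_label], return_counts=True)
    voxel_volume = step ** 3
    return {
        int(label): float(count * voxel_volume)
        for label, count in zip(labels, counts)
    }


# Toy 4x4x4 grid with two hypothetical cavities labeled 2 and 3.
grid = np.zeros((4, 4, 4), dtype=int)
grid[0, 0, :2] = 2       # cavity 2 occupies 2 voxels
grid[2:4, 2:4, 3] = 3    # cavity 3 occupies 4 voxels

volumes = cavity_volumes(grid, step=1.0)
print(volumes)
```

Because the result is a plain dictionary derived from an array, it can be passed directly to packages such as SciPy or matplotlib for further analysis or plotting.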

Conclusions

We introduce efficient, highly versatile and easily integrable software for detecting and characterizing biomolecular cavities in data science applications and automated protocols. pyKVFinder facilitates biostructural data analysis with scripting routines in the Python ecosystem and can serve as a building block for data science and drug design applications.

Details

Title
pyKVFinder: an efficient and integrable Python package for biomolecular cavity detection and characterization in data science
Author
João Victor da Silva Guerra; Helder Veras Ribeiro-Filho; Gabriel Ernesto Jara; Leandro Oliveira Bortot; José Geraldo de Carvalho Pereira; Paulo Sérgio Lopes-de-Oliveira
Pages
1-13
Section
Software
Publication year
2021
Publication date
2021
Publisher
BioMed Central
e-ISSN
14712105
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2620938762
Copyright
© 2021. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.