Abstract

Proteins drive biochemical transformations by transitioning through distinct conformational states. Understanding these states is essential for modulating protein function. Although X-ray crystallography has enabled revolutionary advances in protein structure prediction by machine learning, this connection was made at the level of atomic models, not the underlying data. This missing link to crystallographic data limits further advances in both the accuracy of protein structure prediction and the application of machine learning to experimental structure determination. Here, we present SFCalculator, a differentiable pipeline that generates crystallographic observables from atomistic molecular structures with bulk solvent correction, bridging crystallographic data and neural network-based molecular modeling. We validate SFCalculator against conventional methods and demonstrate its utility through three proof-of-concept applications. First, SFCalculator enables accurate placement of molecular models relative to crystal lattices (known as phasing). Second, it enables searching the latent space of generative models for conformations that fit crystallographic data and that, being drawn from the model, are also implicitly constrained by the information the model encodes. Finally, it allows crystallographic data to be used during training of generative models, so that these models generate an ensemble of conformations consistent with the data. SFCalculator, therefore, enables a new generation of analytical paradigms integrating crystallographic data and machine learning.
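
As a rough illustration of the kind of forward calculation the abstract describes (atomic model in, crystallographic observables out), the sketch below computes structure factors by direct summation, F(hkl) = sum_j f_j exp(2*pi*i h.x_j), and takes their amplitudes. It is not SFCalculator's API: the function name, the toy coordinates, and the constant scattering factors are assumptions made for illustration, and symmetry, atomic displacement factors, and the bulk solvent correction are all omitted. It is written in NumPy for brevity; expressing the same operations in an automatic-differentiation framework is what would make the amplitudes differentiable with respect to the atomic coordinates.

```python
import numpy as np

def structure_factors(frac_xyz, f_atom, hkl):
    """Direct-summation structure factors for point atoms in space group P1.

    frac_xyz: (n_atoms, 3) fractional coordinates
    f_atom:   (n_atoms,) scattering factors, assumed resolution-independent
    hkl:      (n_refl, 3) integer Miller indices
    """
    # Phase angle 2*pi*(h*x + k*y + l*z) for every reflection/atom pair.
    phase = 2.0 * np.pi * (hkl @ frac_xyz.T)          # shape (n_refl, n_atoms)
    # Sum the per-atom contributions for each reflection.
    return (f_atom * np.exp(1j * phase)).sum(axis=1)  # shape (n_refl,)

# Hypothetical toy model: two atoms separated by half a cell edge along a.
frac_xyz = np.array([[0.0, 0.0, 0.0],
                     [0.5, 0.0, 0.0]])
f_atom = np.array([6.0, 8.0])
hkl = np.array([[0, 0, 0],
                [1, 0, 0],
                [2, 0, 0]])

F = structure_factors(frac_xyz, f_atom, hkl)
amplitudes = np.abs(F)  # the experimentally observable quantity
```

In this toy case the two atoms scatter out of phase for the (1 0 0) reflection and in phase for (2 0 0), so the amplitudes differ sharply between reflections even though the model is unchanged.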

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* Improved typesetting in figures and text; corrected hyperlinked figure references (which pointed to Figure 1 rather than to Figure 2); added a few missing references; made small typo corrections and language edits.

* https://github.com/Hekstra-Lab/SFcalculator

* https://github.com/minhuanli/deeprefine

Details

Title
SFCalculator: connecting deep generative models and crystallography
Publication title
bioRxiv; Cold Spring Harbor
Publication year
2025
Publication date
Jan 19, 2025
Section
New Results
Publisher
Cold Spring Harbor Laboratory Press
Source
BioRxiv
Place of publication
Cold Spring Harbor
Country of publication
United States
University/institution
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
Milestone dates
2025-01-15 (Version 1)
ProQuest document ID
3155862052
Document URL
https://www.proquest.com/working-papers/sfcalculator-connecting-deep-generative-models/docview/3155862052/se-2?accountid=208611
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-01-20
Database
ProQuest One Academic