Jue Wang ([email protected])
Doug Tischer ([email protected])
Sidney Lisanza ([email protected])
David Juergens ([email protected])
Joe Watson ([email protected])
This repository contains code for protein hallucination and inpainting, as described in our preprint. Postprocessing and analysis scripts are included in scripts/.
All code is released under the MIT license.
All weights for neural networks are released for non-commercial use only under the Rosetta-DL license.
- Clone the repository:
git clone https://github.com/RosettaCommons/RFDesign.git
cd RFDesign
- Create environment and install dependencies:
cd envs
conda env create -f SE3.yml
- Download model weights (see license info above).
wget https://files.ipd.uw.edu/pub/rfdesign/weights.tar.gz
tar xzf weights.tar.gz
- Configure path to weights. Put a file called config.json in hallucination/ and inpainting/ with the path to the weights directory. An example file is in each folder to copy from.
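As a quick sanity check for the configuration step, the Python sketch below reports whether each of the two folders has a config.json and whether that file parses as JSON. The function name check_configs and its status strings are hypothetical and not part of this repository. The keys expected inside config.json are defined by the example file in each folder, so the sketch deliberately does not inspect them.

```python
import json
from pathlib import Path

# Folders that need a config.json, per the installation steps above.
CONFIG_DIRS = ("hallucination", "inpainting")


def check_configs(repo_root):
    """Return {folder: status} for each folder that needs a config.json.

    Status is "missing", "invalid JSON", or "ok". File contents are not
    validated, because the expected keys come from the example files.
    """
    status = {}
    for folder in CONFIG_DIRS:
        config_path = Path(repo_root) / folder / "config.json"
        if not config_path.is_file():
            status[folder] = "missing"
            continue
        try:
            json.loads(config_path.read_text())
        except json.JSONDecodeError:
            status[folder] = "invalid JSON"
        else:
            status[folder] = "ok"
    return status


if __name__ == "__main__":
    # Assumes the script is run from the top of the cloned repository.
    for folder, state in check_configs(".").items():
        print(f"{folder}/config.json: {state}")
```

A "missing" result means the example file in that folder still needs to be copied to config.json and edited to point at the weights directory.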
If you want/need to configure your environment manually, here are the packages in our environment:
- python 3.8
- pytorch 1.10.1
- cudatoolkit 11.3.1
- numpy
- scipy
- requests
- packaging
- pytorch-geometric (installation instructions)
- dgl (installation instructions)
- se3-transformer (install from github)
- lie_learn
- icecream (for inpainting.py)
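When setting up the environment by hand, it can help to confirm that the packages listed above are importable before running anything. The following is a minimal sketch using only the Python standard library. The import names in ASSUMED_MODULES are assumptions (for example, pytorch is imported as torch, and se3_transformer is a guess at the module name for se3-transformer), and the helper missing_modules is hypothetical. It checks neither package versions nor cudatoolkit.

```python
import importlib.util

# Import names are assumptions: they can differ from the package names in
# the list above (e.g. pytorch is imported as "torch").
ASSUMED_MODULES = [
    "torch", "numpy", "scipy", "requests", "packaging",
    "torch_geometric", "dgl", "se3_transformer", "lie_learn", "icecream",
]


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

The check only locates each module without importing it, so it runs quickly and does not load CUDA.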
- If you are running this on digs at the IPD, you can skip steps 3-4 above (downloading the model weights and configuring the path to them).
- If you are getting output pdbs that are a ball of disconnected segments (as viewed in pymol), this may be due to a problem with the spherical harmonics cached by SE3-transformer. A workaround is to copy the hallucination/cache/ folder (a correct, clean copy of the cache) to your working directory before running hallucinate.py or inpaint.py.
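The cache workaround above can be scripted. This is a minimal sketch, assuming the copy should be named cache inside the working directory. The helper copy_cache is hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path


def copy_cache(repo_root, workdir):
    """Copy <repo_root>/hallucination/cache into <workdir>/cache.

    Returns the path of the copied folder. The target name "cache" is an
    assumption about what the scripts expect in the working directory.
    """
    source = Path(repo_root) / "hallucination" / "cache"
    if not source.is_dir():
        raise FileNotFoundError(f"No cache folder at {source}")
    target = Path(workdir) / "cache"
    # dirs_exist_ok (Python 3.8+) overwrites same-named files in an
    # existing copy; it does not delete extra files already in the target.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For example, copy_cache("/path/to/RFDesign", ".") run from the working directory places the clean cache next to where hallucinate.py or inpaint.py will be launched. If a corrupted cache already exists there, deleting it first is safer than relying on overwriting.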
See the READMEs in the hallucination/ and inpainting/ subfolders.
J. Wang, S. Lisanza, D. Juergens, D. Tischer, et al. Deep learning methods for designing proteins scaffolding functional sites. bioRxiv (2021). link
M. Baek, et al., Accurate prediction of protein structures and interactions using a three-track neural network, Science (2021). link
An earlier version of our hallucination method can be found at the trdesign-motif repo and published at:
D. Tischer, S. Lisanza, J. Wang, R. Dong, I. Anishchenko, L. F. Milles, S. Ovchinnikov, D. Baker. Design of proteins presenting discontinuous functional sites using deep learning. (2020) bioRxiv link
Our work is based on previous hallucination methods for unconstrained protein generation and fixed-backbone sequence design (trDesign repo):
I Anishchenko, SJ Pellock, TM Chidyausiku, ..., S Ovchinnikov, D Baker. De novo protein design by deep network hallucination. (2021) Nature link
C Norn, B Wicky, D Juergens, S Liu, D Kim, B Koepnick, I Anishchenko, Foldit Players, D Baker, S Ovchinnikov. Protein sequence design by conformational landscape optimization. (2021) PNAS link