What does it mean when there is no output data in 'final_design_stats.csv'? #141

Open
nduan1 opened this issue Dec 19, 2024 · 6 comments

Comments

@nduan1

nduan1 commented Dec 19, 2024

Hi, the pipeline is amazing! I am trying to run a prediction on a ~300-aa peptide. Everything seems to be running smoothly, but there is no output in the 'final_design_stats.csv' file or the 'Accepted' folder after 6 days of processing. It is still running, though.

There is some data in the 'mpnn_design_stats.csv' and 'trajectory_stats.csv' files.

I used the default settings on my command line.
Here is the content of my JSON file:

{
  "design_path": "/data/EmiolaLab/duann2/bin_proj/bindcraft/bcraft/BindCraft/candidate4-2/",
  "binder_name": "can4",
  "starting_pdb": "/data/EmiolaLab/duann2/bin_proj/bindcraft/bcraft/BindCraft/candidate4-2/can4.pdb",
  "chains": "A",
  "target_hotspot_residues": null,
  "lengths": [1, 289],
  "number_of_final_designs": 100
}

What does it mean if there is no final output in the 'final_design_stats.csv' file or the 'Accepted' folder? Should I change any settings? I would greatly appreciate any instructions you can provide.

Thanks!

@LennartNickel
Collaborator

This means that no designs have made it through the filters. Have a look at the trajectories being sampled and see if they are reasonable, or whether you can find a reason for the failure in the failure.csv file. This can help you adjust the settings. Sometimes a lot of sampling is necessary, and in rare cases it just doesn't find a solution. However, I would advise trying a few more inputs (AF2 prediction, different crystal structures) and different hotspots.

@linuxfold

linuxfold commented Dec 29, 2024

I am having a similar issue. From the preprint:

We allow only 2 MPNNsol generated sequences per individual AF2 trajectory to
pass filters to promote interface diversity amongst selected binders.

If the overall success rate is very low, would it make sense to increase the number of MPNNsol sequences per trajectory?

After about 1000 trajectories, there are no accepted designs, but there are 2 different trajectories (one with 5 _mpnn versions and one with 2) in mpnn_design_stats.csv.

Is there a way to re-run the mpnn pipeline on these trajectories for more than 2 examples until some of them get accepted designs? Would this be beneficial or am I misunderstanding something?

@Mikukub

Mikukub commented Dec 30, 2024

Looking at the code, the MPNN designs come from the trajectories, which means the filters are too strict for your target. I think the pLDDT and i_pTM thresholds might be too high for your target. It should be possible to redo the MPNN step from existing trajectories, but that requires some Python skill. I think he might add this feature soon.

@LennartNickel
Collaborator

The number of MPNN designs tried per trajectory is 20 by default and can be increased in the advanced settings. The quote you mention refers to the number of accepted designs allowed per trajectory. 20 MPNN sequences are generated, and the pipeline checks them one after the other against both the AF2 and Rosetta filters. As soon as 2 have passed all filters, the remaining ones are not checked and a new trajectory is started.
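
The early-stopping behaviour described above can be illustrated with a minimal Python sketch. It is not BindCraft's actual code: the function name `select_passing_sequences`, the `passes_filters` callback, and the `max_accepted` parameter are hypothetical names chosen for illustration, and the real pipeline applies AF2 and Rosetta filters rather than a single callback.

```python
from typing import Callable, Iterable, List


def select_passing_sequences(
    candidates: Iterable[str],
    passes_filters: Callable[[str], bool],
    max_accepted: int = 2,
) -> List[str]:
    """Check candidates in order and stop once max_accepted have passed.

    `passes_filters` is a stand-in (assumption) for the combined
    AF2 + Rosetta filter check applied to one MPNN sequence.
    """
    accepted: List[str] = []
    for seq in candidates:
        if passes_filters(seq):
            accepted.append(seq)
            if len(accepted) >= max_accepted:
                break  # remaining candidates are never evaluated
    return accepted


# Toy usage: 20 hypothetical candidates, every third one "passes".
candidates = [f"seq{i}" for i in range(20)]
accepted = select_passing_sequences(
    candidates, lambda s: int(s[3:]) % 3 == 0
)
print(accepted)
```

Under this reading, raising the number of generated sequences gives each trajectory more chances to pass, while the cap on accepted designs only limits how many are kept per trajectory.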

@Mikukub

Mikukub commented Dec 31, 2024

I want to reuse trajectories but don't want the headache of dealing with Python. I would love to see this implemented in the program, since many people who have tried BindCraft have generated a lot of trajectories.

@nduan1
Author

nduan1 commented Jan 2, 2025

> This means that no designs have made it through the filters. Have a look at the trajectories being sampled and see if they are reasonable, or whether you can find a reason for the failure in the failure.csv file. This can help you adjust the settings. Sometimes a lot of sampling is necessary, and in rare cases it just doesn't find a solution. However, I would advise trying a few more inputs (AF2 prediction, different crystal structures) and different hotspots.

Hi @LennartNickel, thanks for your reply. I tried using AF2 to pick the hotspot, but there is still no final output. I copied my failure.csv file. Could you guide me on how to interpret it? Thanks!

Trajectory_logits_pLDDT,Trajectory_softmax_pLDDT,Trajectory_one-hot_pLDDT,Trajectory_final_pLDDT,Trajectory_Contacts,Trajectory_Clashes,Trajectory_WrongHotspot,MPNN_score,MPNN_seq_recovery,pLDDT,pTM,i_pTM,pAE,i_pAE,i_pLDDT,ss_pLDDT,Unrelaxed_Clashes,Relaxed_Clashes,Binder_Energy_Score,Surface_Hydrophobicity,ShapeComplementarity,PackStat,dG,dSASA,dG/dSASA,Interface_SASA_%,Interface_Hydrophobicity,n_InterfaceResidues,n_InterfaceHbonds,InterfaceHbondsPercentage,n_InterfaceUnsatHbonds,InterfaceUnsatHbondsPercentage,Interface_Helix%,Interface_BetaSheet%,Interface_Loop%,Binder_Helix%,Binder_BetaSheet%,Binder_Loop%,InterfaceAAs_A,InterfaceAAs_C,InterfaceAAs_D,InterfaceAAs_E,InterfaceAAs_F,InterfaceAAs_G,InterfaceAAs_H,InterfaceAAs_I,InterfaceAAs_K,InterfaceAAs_L,InterfaceAAs_M,InterfaceAAs_N,InterfaceAAs_P,InterfaceAAs_Q,InterfaceAAs_R,InterfaceAAs_S,InterfaceAAs_T,InterfaceAAs_V,InterfaceAAs_W,InterfaceAAs_Y,HotspotRMSD,Target_RMSD,Binder_pLDDT,Binder_pTM,Binder_pAE,Binder_RMSD
60,7,17,43,0,58,0,0,0,573,117,998,0,1136,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
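
One way to read a two-line file like the one above is to pair each header column with its value and sort by size. The sketch below assumes that each value counts how often the corresponding filter rejected a trajectory or design; that interpretation is not confirmed anywhere in this thread. The helper name `rank_failures` is hypothetical, and the short sample reuses five adjacent columns from the posted row rather than all of them.

```python
import csv
import io
from typing import List, Tuple


def rank_failures(csv_text: str) -> List[Tuple[str, int]]:
    """Return (column, count) pairs with non-zero counts, largest first.

    Expects a header line followed by one line of integer counts.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    counts = next(reader)
    pairs = [(name, int(value)) for name, value in zip(header, counts)]
    nonzero = [pair for pair in pairs if pair[1] > 0]
    return sorted(nonzero, key=lambda pair: pair[1], reverse=True)


# Five adjacent columns taken from the row posted above.
sample = "pLDDT,pTM,i_pTM,pAE,i_pAE\n573,117,998,0,1136\n"
for name, count in rank_failures(sample):
    print(f"{name}: {count}")
# i_pAE: 1136
# i_pTM: 998
# pLDDT: 573
# pTM: 117
```

If the posted values line up with the header as shown, the largest counts fall under i_pAE and i_pTM, which would point at the interface-confidence filters as the main source of rejections for this target.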


4 participants