CaBLAM fixing and interpretation in comprehensive validation for cryo-EM
Hello, I would appreciate some insight into how to interpret the validation results under the CaBLAM tab and how to use them to fix my structure in Coot. What are the CaBLAM score, the CA geometry score, the helix score and the 3-10 score? How should I use the information under the secondary structure tab? In my protein, for example, I see a lot of "try beta sheets" recommendations. Should I try to refine those segments using beta sheet restraints? Best regards.
[I forgot to copy my reply to the bulletin board, so here it is, reproduced
for the record.]
The official publication for CaBLAM is the 2017 MolProbity paper in Protein
Science: https://doi.org/10.1002/pro.3330
Further technical documentation on CaBLAM can be found in Phenix's
Computational Crystallography Newsletter:
https://www.phenix-online.org/newsletter/CCN_2018_07.pdf#page=7
I recommend the newsletter article as a fast read.
To identify outliers, CaBLAM looks at a structure's CA trace, which is
generally well-modeled. For each residue, it compares the local peptide
plane orientations of the model to the observed distribution of peptide
plane orientations for high quality residues with matching CA trace
geometry. The CaBLAM score is a percentile score that rates how well the
model matches the expected distribution. The lower the score, the
rarer the observed conformation is in our database of quality structures.
A conformation falling in the bottom 5% of observed behavior is potentially
suspicious ("Disfavored") and a conformation falling in the bottom 1% is
considered an outlier.
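
To make the cutoffs concrete, here is a minimal Python sketch of the
classification described above. It is not CaBLAM's actual code. It assumes
the score is available as a fraction between 0 and 1; the function name, the
"ok" label, the example residues, and the treatment of scores lying exactly
on a cutoff are all assumptions made for illustration.

```python
# Cutoffs taken from the text: bottom 1% = outlier, bottom 5% = disfavored.
OUTLIER_CUTOFF = 0.01
DISFAVORED_CUTOFF = 0.05


def classify_cablam(score):
    """Label a percentile score given as a fraction between 0 and 1.

    Hypothetical helper: how a score exactly on a cutoff is treated
    is an assumption, not something the source specifies.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a fraction between 0 and 1")
    if score < OUTLIER_CUTOFF:
        return "outlier"
    if score < DISFAVORED_CUTOFF:
        return "disfavored"
    return "ok"  # label invented for this sketch


# Made-up per-residue scores, for illustration only.
example_scores = {"A 41": 0.004, "A 42": 0.03, "A 43": 0.62}
labels = {res: classify_cablam(s) for res, s in example_scores.items()}
print(labels)
```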
This percentile-based scoring is fundamentally the same scoring used in
MolProbity's description of Ramachandran and rotamer outliers, though of
course CaBLAM puts its cutoffs in different places.
As a matter of interpretation, loop/coil regions tend to be highly varied.
CaBLAM "Disfavored" conformations in loops can largely be ignored.
However, disfavored conformations in regions expected to be highly regular
(repeating secondary structure) should be taken seriously. CaBLAM outliers
should be inspected wherever they occur.
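
The interpretation rule above can be written down as a small decision
function. This is only a sketch of the rule of thumb in this message, not
anything Phenix or MolProbity provides. The label strings, the return
values, and the in_regular_ss flag (whether you expect the residue to sit in
a helix or strand) are hypothetical names chosen for illustration.

```python
def triage(label, in_regular_ss):
    """Suggest how much attention a residue deserves.

    label: "outlier", "disfavored", or anything else (treated as fine).
    in_regular_ss: True if the residue is expected to be part of
        repeating secondary structure, False for loop/coil.
    """
    if label == "outlier":
        # Outliers should be inspected wherever they occur.
        return "inspect"
    if label == "disfavored":
        # Disfavored matters in regular structure, much less in loops.
        return "inspect" if in_regular_ss else "low priority"
    return "no action"


print(triage("disfavored", in_regular_ss=False))
print(triage("disfavored", in_regular_ss=True))
```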
The CA geometry score looks at just the CA trace, and takes the CA virtual
angle into account (defined by CA(i-1), CA(i), CA(i+1)). Outliers in this space
reflect some sort of severe problem with CA geometry, often involving an
over-extended or over-compressed CA virtual angle.
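
For readers who want to check a CA virtual angle by hand, the sketch below
computes the angle at CA(i) from three CA positions using plain vector
geometry. It is not taken from the CaBLAM source; the function name and the
coordinates in the example are made up, and no particular "good" or "bad"
angle range is implied.

```python
import math


def ca_virtual_angle(ca_prev, ca_curr, ca_next):
    """Angle in degrees at ca_curr between the vectors to ca_prev and ca_next.

    Each argument is an (x, y, z) tuple of CA coordinates in angstroms.
    """
    v1 = [p - c for p, c in zip(ca_prev, ca_curr)]
    v2 = [n - c for n, c in zip(ca_next, ca_curr)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("coincident CA atoms")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_angle))


# Made-up coordinates forming a right angle at the middle atom.
angle = ca_virtual_angle((3.8, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 3.8, 0.0))
print(round(angle, 1))
```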
The secondary structure scores are based on how well a residue's local CA
trace matches the expected CA trace of each major secondary structure type:
alpha, 3-10, and beta. You can see the contours used for this assessment
in Figure 3 of the newsletter article. Each residue receives individual
secondary structure scores. Then regions of residues that all pass a
scoring threshold are assembled into probable secondary structure
elements. This is where the "try beta sheet" recommendations come from.
That recommendation indicates that the residue in question *and its
neighboring residues* all have CA traces that look like beta sheet.
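
The assembly step can be pictured as finding runs of consecutive residues
whose score passes a threshold. The sketch below shows that idea only. It
is not the CaBLAM implementation, and the threshold, the minimum run length,
and the scores in the example are hypothetical values, not the ones CaBLAM
uses.

```python
def find_segments(scores, threshold, min_length=3):
    """Return inclusive (start, end) index pairs for runs of consecutive
    residues whose score is at least `threshold`.

    Runs shorter than `min_length` are dropped. Both parameters are
    illustrative assumptions.
    """
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the chain.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores) - 1))
    return segments


# Made-up per-residue beta scores for a seven-residue stretch.
beta_scores = [0.0, 0.4, 0.5, 0.6, 0.1, 0.7, 0.0]
print(find_segments(beta_scores, threshold=0.3))
```

In this toy input, residues 1 to 3 form a run, while the isolated passing
residue at index 5 does not, which mirrors the point that a recommendation
reflects a residue together with its neighbors.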
I wish I had a simple recommendation for you, but fixing CaBLAM outliers
systematically has proven to be a challenge. Take a look at your structure
and see if you believe that the outlier residues really are intended to be
part of beta sheets. If so, beta sheets have distinctive hydrogen bonding
patterns that tend to be disrupted by the kind of problems that CaBLAM
identifies. Ideally, you will be able to use Coot's tools to restore the
proper hydrogen bonding. Then, applying hydrogen bonding restraints during
refinement may help keep your work in place.
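
When checking whether a rebuilt strand pairing has plausible hydrogen
bonds, one simple thing to look at is the distance between a backbone N on
one strand and the backbone O it should pair with. The sketch below is a
rough geometric check only; it is not how Coot or Phenix define hydrogen
bonds. The 3.5 angstrom cutoff is an assumed, commonly used rule of thumb,
and the function name and coordinates are hypothetical.

```python
import math


def looks_hydrogen_bonded(n_xyz, o_xyz, max_distance=3.5):
    """Return True if a backbone N and O are within `max_distance` angstroms.

    Distance-only check; it ignores angles and hydrogen positions.
    The default cutoff is an assumption for illustration.
    """
    return math.dist(n_xyz, o_xyz) <= max_distance


# Made-up coordinates: one close N...O pair and one that is too far apart.
print(looks_hydrogen_bonded((0.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
print(looks_hydrogen_bonded((0.0, 0.0, 0.0), (4.5, 0.0, 0.0)))
```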
If you have large regions of outliers, it may instead be more practical to
strip out the existing model and replace it with idealized beta sheet
structure, then rerefine. Again, hydrogen bonding restraints may be
helpful.
As a general rule, CaBLAM outliers usually indicate a problem with the
orientations for one or more peptide planes. Look for a way to reorient
the peptide either to remove clashes or establish hydrogen bonds. Make
sure you build good regular secondary structure, don't sweat about the
loops too much, and trust your judgement and experience to identify the
real and justified outliers.
We of the Richardson Lab generally dislike torsion-based Ramachandran
restraints/secondary structure restraints. It's very easy to accidentally
generate a model that looks better than it really is using these methods.
However, we recognize that these are powerful tools for a difficult problem
and you may get good results out of careful use.
Hope that helps, and good luck.
-Christopher Williams
---Richardson Lab, Duke University
On Wed, Jun 26, 2019 at 2:21 PM Ahmad Khalifa wrote:
[Original question, quoted in full at the top of this thread.]
Hi Christopher,
this is a great summary, thanks! Personally, I have a lot of trouble interpreting CaBLAM output. I've seen many CaBLAM outliers and disfavored residues that looked just fine to me, leaving me confused as to what I should do. Misplaced carbonyl groups are among the rare cases where the fix is obvious: rotate the group to satisfy hydrogen bonding. So I'd say a set of concrete and clear fixing instructions would be very helpful to have. And if these instructions can be encoded in software, that's even better!
We of the Richardson Lab generally dislike torsion-based Ramachandran restraints/secondary structure restraints.
I can see your point. But the fact of the matter is that these restraints are absolutely necessary to make low-resolution refinement possible. A helix unfolding in a 5 Å resolution map as a result of refinement without these restraints is among my favorite examples, one that I've been showing at workshops for years now. The key point here is that one should not use these restraints to fix outliers (because of the limited convergence radius of refinement) but only to keep a good model good during refinement. And, surely, we count on validation to stay on the safe side!
All the best,
Pavel
Participants (3): Ahmad Khalifa, Christopher Williams, Pavel Afonine