[phenixbb] RSRZ in Phenix?

Sun Dec 27 12:40:11 PST 2015

Hi Randy,

> The argument for RSRZ last time we had the PDB X-ray validation task force was that it’s meant to be independent of resolution and residue type, i.e. if you’re trying to judge how well someone did in fitting a structure, you expect a much better CC for a Trp in a 1.8A resolution structure than you do for a Lys in a 2.8A resolution structure.  The RSRZ score compares the RSR for the residue type at its resolution with other residues of the same type at similar resolution.  (We used RSR because Gerard K had generated the statistics for that one and hadn’t done the same for RSCC, which may or may not have had advantages.)
>
> However, it definitely has its problems.  For example, the RSRZ for a side chain in a 6A structure makes no sense whatsoever because, in order to get enough representatives of each residue type to generate statistics, you have to include all the structures from something like 3.5 to 8A resolution!  So we’re looking for better alternatives this time around in the VTF.

Thanks a lot for the explanation, it is very helpful!

> Suggestions with evidence are welcome!

Assuming we are talking about local model-to-map fit, so that we focus 
on the goodness of fit of one atom or a few atoms (such as a residue or 
a ligand), I still think that a triplet of numbers {2mFo-DFc map value, 
mFo-DFc map value, CC} calculated per group of atoms in question is 
something to consider.

In the case of a group that contains just one atom, we would look at 
the 2mFo-DFc and mFo-DFc map values at the atom center, and at the CC 
between the map in question and the Fmodel map, calculated in a sphere 
of radius R around the atom. The Fmodel map is the Fourier transform of 
the total model structure factor 
Fmodel = ktotal * (Fcalc_atoms + Fbulk_solvent).

In the case of a group that contains several atoms, we would look at 
the average of the 2mFo-DFc and mFo-DFc map values at the atom centers, 
and the CC is computed as above using map values inside the atom spheres.
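
To make the triplet concrete, here is a minimal sketch in plain Python. 
It is not cctbx/Phenix code: the function names (pearson_cc, 
sphere_points, group_scores, gaussian_map), the toy orthogonal grid with 
unit spacing, the nearest-grid-point lookup at atom centers, and the 
crude stand-in for a difference map are all assumptions made for 
illustration. A real implementation would interpolate map values and 
handle the unit cell and symmetry.

```python
import math

def pearson_cc(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # flat region: CC is undefined, report 0 (assumption)
    return sxy / math.sqrt(sxx * syy)

def sphere_points(shape, centers, radius):
    """Grid points (i, j, k) lying within `radius` of any atom center."""
    nx, ny, nz = shape
    r2 = radius ** 2
    return [(i, j, k)
            for i in range(nx) for j in range(ny) for k in range(nz)
            if any((i - cx) ** 2 + (j - cy) ** 2 + (k - cz) ** 2 <= r2
                   for cx, cy, cz in centers)]

def group_scores(map_2fofc, map_fofc, map_model, shape, centers, radius):
    """Return (mean 2mFo-DFc, mean mFo-DFc, CC) for one group of atoms.

    Maps are dicts keyed by grid index (i, j, k); centers are in grid units.
    """
    # Map values at atom centers, taken at the nearest grid point.
    nearest = [tuple(int(round(c)) for c in center) for center in centers]
    mean_2fofc = sum(map_2fofc[p] for p in nearest) / len(nearest)
    mean_fofc = sum(map_fofc[p] for p in nearest) / len(nearest)
    # CC between the map in question and the model map inside atom spheres.
    pts = sphere_points(shape, centers, radius)
    cc = pearson_cc([map_2fofc[p] for p in pts], [map_model[p] for p in pts])
    return mean_2fofc, mean_fofc, cc

def gaussian_map(shape, center, height=1.0):
    """Toy density: a single Gaussian 'atom' sampled on the grid."""
    nx, ny, nz = shape
    cx, cy, cz = center
    return {(i, j, k): height * math.exp(
                -((i - cx) ** 2 + (j - cy) ** 2 + (k - cz) ** 2) / 2.0)
            for i in range(nx) for j in range(ny) for k in range(nz)}

shape = (9, 9, 9)
observed = gaussian_map(shape, (4, 4, 4), height=2.0)  # toy "2mFo-DFc" map

# Well-placed atom: model has the same shape as the map, different scale.
model = gaussian_map(shape, (4, 4, 4))
flat = {p: 0.0 for p in observed}                      # flat difference map
good = group_scores(observed, flat, model, shape, [(4, 4, 4)], 2.0)

# Misplaced atom: model shifted by two grid steps along x.
shifted = gaussian_map(shape, (6, 4, 4))
# Crude stand-in for a difference map (NOT a real mFo-DFc map).
diff = {p: observed[p] - 2.0 * shifted[p] for p in observed}
bad = group_scores(observed, diff, shifted, shape, [(6, 4, 4)], 2.0)

print("good:", good)
print("bad: ", bad)
```

In this toy case the well-placed atom gets a strong map value, a flat 
difference value and a CC of essentially 1 even though the two maps 
differ in scale by a factor of two. The misplaced atom gets a weak map 
value, a negative difference value at its center and a much lower CC.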

Why do I think {2mFo-DFc map value, mFo-DFc map value, CC} is a good choice?

1. If the 2mFo-DFc map is scaled by its standard deviation, we know, as 
a rule of thumb, what is a good value and what is a poor one. With 
cctbx/Phenix tools it is a day-long exercise to collect any desired 
statistics for 2mFo-DFc map values over the entire PDB: we can bin them 
by resolution, various types of completeness, residue/atom type, you 
name it. Thus we could turn the rule of thumb above into something more 
scientific.

2. Even if the 2mFo-DFc map value is good, an atom may still be slightly 
misplaced, or its parameters (xyz, q, B) may not be well refined. This 
will be immediately highlighted by the mFo-DFc map value. In fact, it is 
a good target in the sense that we don't even need rule-of-thumb values 
for it: the flatter the mFo-DFc map, the better.

3. The 2mFo-DFc and mFo-DFc map values as suggested above are still 
insufficient because the map shape is not taken into account. One may 
envision incorrect atoms filling the map such that 2mFo-DFc is strong 
enough and mFo-DFc is rather flat, but the shape of the construct does 
not resemble the actual map well. This is what the map CC will help us 
with, as it is scale-insensitive but shape-sensitive! Actually, CCr from

http://journals.iucr.org/d/issues/2014/10/00/kw5094/kw5094.pdf
Acta Cryst. (2014). D70, 2593–2606

would be even better than the regular CC.

4. An advantage of using the above-mentioned metrics (all together!) is 
that they are old, well-established quality measures that the majority 
of crystallographers have a more or less clear idea about, both their 
meaning and what to expect as a numerical value. No need to re-invent 
the wheel! (except for gathering some statistics on expected map values).
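
The statistics gathering mentioned in points 1 and 4 could be sketched 
roughly as below, in plain Python rather than cctbx. The function names 
(sigma_scale, bin_by_resolution), the 0.5 A bin width and the three 
example records are made up for illustration only; they are not real PDB 
statistics and not an existing Phenix API.

```python
import math
from collections import defaultdict

def sigma_scale(values):
    """Rescale map values to zero mean and unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("map is flat; cannot scale by standard deviation")
    return [(v - mean) / sd for v in values]

def bin_by_resolution(records, bin_width=0.5):
    """Group per-residue map values by (resolution bin, residue type).

    `records` holds tuples (resolution in A, residue name, sigma-scaled
    2mFo-DFc value). Returns {(bin lower edge, residue name): (mean, count)}.
    """
    bins = defaultdict(list)
    for d_min, resname, value in records:
        lower_edge = math.floor(d_min / bin_width) * bin_width
        bins[(lower_edge, resname)].append(value)
    return {key: (sum(vals) / len(vals), len(vals))
            for key, vals in bins.items()}

# Sigma-scaling a tiny toy "map".
scaled = sigma_scale([1.0, 2.0, 3.0])

# Hypothetical records, invented purely to exercise the binning.
records = [
    (1.8, "TRP", 2.0),
    (1.9, "TRP", 1.0),
    (2.8, "LYS", 0.5),
]
stats = bin_by_resolution(records)
for key in sorted(stats):
    print(key, stats[key])
```

The same grouping key could be extended with completeness or atom type 
if those turn out to matter for the expected values.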

All the best,
Pavel