[phenixbb] questions related to Phenix refinement
pafonine at lbl.gov
Sun Jan 18 09:58:14 PST 2015
Thanks for your email and for bringing up this topic!
>>>> In the X-ray statistics by resolution bin of the Phenix.refine result,
>>>> there is a column "%complete". For my refinement data, I find the
>>>> better the resolution (from lower resolution to the higher
>>>> resolution), the lower the completeness (for example for 40-6 A,
>>>> %complete is 98, for 3.1-3.0 A, %complete is 60%, for 2.2-2.1 A,
>>>> %complete is 6%).
>>>> Will you please tell me what does this "%complete" mean? why it
>>>> decreases in the better diffraction bin?
>> Completeness is how many reflections you have compared to theoretically
>> possible. So the higher completeness the better. Ideally (and it's not
>> that uncommon these days) you should have 100% complete data set in
>> d_min-inf resolution. Anything below say 80 in any resolution bin is
>> bad, and the numbers you quote (6-60%) mean something is wrong with the dataset.
> Given your standing in the community, the last sentence will lead many
> inexperienced people to believe that they should cut their data at the
> resolution where the completeness falls below "say 80"%.
> But that would be wrong. There is no reason to consider a completeness
> as "too low in a high-resolution shell" as long as the data in that
> shell are good. Particularly in refinement any reflection helps to
> improve the model, and to reduce overfitting.
Clearly, email is not the best means of communication, especially when
written without a lawyer's help and then read between the lines!
No, I was not suggesting cutting the data, particularly if the cutoff is
judged by completeness alone. What I was really saying is that a data
set this incomplete should raise an alarm and prompt a review of the
data collection and processing steps (rather than spending months
struggling with a poor data set!).
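To make the definition quoted above concrete (observed reflections divided by theoretically possible ones, per resolution bin), here is a minimal sketch in plain Python. It is not how phenix.refine computes its "%complete" column; the function names, the shell edges and the toy d-spacings are all made up for illustration, and in practice the list of possible reflections would come from the unit cell and space group.

```python
def count_in_shell(d_spacings, d_max, d_min):
    """Count reflections with d_min <= d < d_max (d in Angstrom)."""
    return sum(1 for d in d_spacings if d_min <= d < d_max)


def completeness_by_shell(d_observed, d_possible, shell_edges):
    """Return (d_max, d_min, n_observed, n_possible, percent) per shell.

    shell_edges must run from low to high resolution, i.e. decreasing d,
    for example [40.0, 6.0, 3.0, 2.1].
    """
    rows = []
    for d_max, d_min in zip(shell_edges, shell_edges[1:]):
        n_obs = count_in_shell(d_observed, d_max, d_min)
        n_all = count_in_shell(d_possible, d_max, d_min)
        # An empty shell has no defined completeness.
        percent = 100.0 * n_obs / n_all if n_all else float("nan")
        rows.append((d_max, d_min, n_obs, n_all, percent))
    return rows


# Toy data (hypothetical): d-spacings of all possible reflections,
# and the subset that was actually measured.
d_possible = [10.0, 8.0, 5.0, 4.0, 3.5, 2.5, 2.4, 2.3, 2.2, 2.15]
d_observed = [10.0, 8.0, 5.0, 4.0, 2.5]

rows = completeness_by_shell(d_observed, d_possible, [40.0, 6.0, 3.0, 2.1])
for d_max, d_min, n_obs, n_all, percent in rows:
    print(f"{d_max:5.1f} - {d_min:4.1f} A  {n_obs}/{n_all}  {percent:5.1f}%")
```

In this toy case the completeness falls from the low-resolution shell to the high-resolution one, which is the pattern described in the original question.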
Also, I think that extremes such as routine data cutoffs by "sigma"
and/or resolution (as was common in the past) and a panicked fear of
throwing away a single reflection (as is the modern trend) may both be
counterproductive. Indeed, for example, non-permanent data cutoffs by
resolution (or by other criteria, such as those derived from Fobs vs
Fmodel differences) may be essential for the success of refinement and
of phasing by Molecular Replacement:
J. Appl. Cryst. (2008). 41, 491-522
Structure refinement: some background theory and practical
Acta Cryst. (1999). D55, 1759-1764
Detecting outliers in non-redundant diffraction data
R. J. Read
J. Appl. Cryst. (2009). 42, 607-615
Automatic multiple-zone rigid-body refinement with a large
P. V. Afonine, R. W. Grosse-Kunstleve, A. Urzhumtsev and P.
STIR option in SHELX.
Also, incomplete data can distort maps. As few as 1% of missing
reflections may be sufficient to destroy the image of the molecule in
Fourier maps:
Acta Cryst. (1991). A47, 794-801
Low-resolution phases: influence on SIR syntheses and
retrieval with double-step filtration
A. G. Urzhumtsev
Acta Cryst. (2014). D70, 2593-2606
Metrics for comparison of crystallographic maps
A. Urzhumtsev, P. V. Afonine, V. Y. Lunin, T. C. Terwilliger
and P. D. Adams
Retrieval of lost reflections in high resolution Fourier
syntheses by 'soft' solvent flattening.
Natalia L. Lunina, Vladimir Y. Lunin and Alberto D. Podjarny
Finally, it is a poor idea to report the resolution of the
highest-resolution reflection as the resolution of the data set unless
the data set is 100% complete. Instead, the effective resolution (which
has a strict mathematical definition and meaning) should be used:
Acta Cryst. (2013). D69, 1921-1934
On effective and optical resolutions of diffraction data sets
L. Urzhumtseva, B. Klaholz and A. Urzhumtsev
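Why the highest-resolution reflection overstates an incomplete data set can be illustrated with a crude counting argument. The number of reciprocal-lattice points inside a sphere of radius 1/d grows as 1/d^3, so a data set with overall completeness C out to d_high contains as many reflections as a 100% complete one out to d_high * C^(-1/3). The sketch below only evaluates that back-of-the-envelope estimate. It is not the effective resolution defined in the paper cited above, which is defined differently, and the function name is hypothetical.

```python
def equivalent_complete_d_min(d_high, completeness):
    """d_min (Angstrom) of a fully complete data set that would hold the
    same number of reflections.

    completeness is the overall completeness as a fraction in (0, 1].
    Counting heuristic only: N(d) is proportional to 1/d**3.
    """
    if not 0.0 < completeness <= 1.0:
        raise ValueError("completeness must be a fraction in (0, 1]")
    return d_high * completeness ** (-1.0 / 3.0)


# A complete data set keeps its nominal resolution ...
print(equivalent_complete_d_min(2.0, 1.0))
# ... while a 50% complete one matches a noticeably lower resolution.
print(round(equivalent_complete_d_min(2.0, 0.5), 2))
```

The estimate ignores which reflections are missing, so it should be read only as a rough indication of how quickly the information content falls with completeness.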
Summarizing, a severely incomplete data set should trigger suspicion. If
that's the only data set available, then correct expectations should be
set about (possible difficulty of) structure solution and quality of final
All the best,