[phenixbb] questions related to Phenix refinement
Kay.Diederichs at uni-konstanz.de
Sun Jan 18 10:30:45 PST 2015
Thanks for your thoughtful email! I am not going to try to comment on every specific point you make, mostly because I don't see any fundamental disagreement. People should really go and read the papers you cite; reading mailing lists is not a substitute for that.
Let me just make one remark: crystallography, although it rests on firm ground, is complex enough that there are no simple rules for everything. Oversimplification has done our science a disservice; just think of the "religious" Rsym cutoffs, in use until not long ago, which caused people to discard a lot of valuable data. This is why I am against seemingly innocent rules of thumb like "the high-resolution cutoff should be set at X I/sigma or where the completeness falls below Y %" (and no, I'm not implying that you said this). There are better ways, but they are not as simple.
On Sunday, January 18, 2015 18:58 CET, Pavel Afonine <pafonine at lbl.gov> wrote:
> Dear Kay,
> thanks for your email and for bringing up this topic!
> >>>> In the X-ray statistics by resolution bin of the Phenix.refine result,
> >>>> there is a column "%complete". For my refinement data, I find the
> >>>> better the resolution (from lower resolution to the higher
> >>>> resolution), the lower the completeness (for example for 40-6 A,
> >>>> %complete is 98, for 3.1-3.0 A, %complete is 60%, for 2.2-2.1 A,
> >>>> %complete is 6%).
> >>>> Will you please tell me what this "%complete" means? Why does it
> >>>> decrease in the higher-resolution bins?
> >> Completeness is the number of reflections you have compared to the
> >> number theoretically possible. So the higher the completeness, the
> >> better. Ideally (and it's not that uncommon these days) you should have
> >> a 100% complete data set over the d_min-inf resolution range. Anything
> >> below say 80 in any resolution bin is bad, and the numbers you quote
> >> (6-60%) mean something is wrong with the dataset.
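The definition quoted above (reflections measured divided by reflections theoretically possible, per resolution bin) can be sketched in a few lines of Python. This is only an illustration, not what phenix.refine does internally: the function names, the bin edges and the toy d-spacings are all hypothetical, and the list of possible reflections is assumed to have been generated elsewhere from the unit cell and space group.

```python
def count_in_bins(d_values, edges):
    """Count d-spacings per bin; bin i holds edges[i] > d >= edges[i + 1]."""
    counts = [0] * (len(edges) - 1)
    for d in d_values:
        for i in range(len(edges) - 1):
            if edges[i] > d >= edges[i + 1]:
                counts[i] += 1
                break
    return counts


def completeness_by_bin(observed_d, possible_d, edges):
    """Return one (d_max, d_min, n_obs, n_possible, percent) row per bin.

    edges are resolution limits in Angstrom, ordered from low resolution
    (large d) to high resolution (small d).
    """
    n_obs = count_in_bins(observed_d, edges)
    n_pos = count_in_bins(possible_d, edges)
    rows = []
    for i in range(len(edges) - 1):
        # Guard against empty bins rather than dividing by zero.
        pct = 100.0 * n_obs[i] / n_pos[i] if n_pos[i] else float("nan")
        rows.append((edges[i], edges[i + 1], n_obs[i], n_pos[i], pct))
    return rows


# Toy data (hypothetical d-spacings in Angstrom, not from a real crystal).
possible = [10.0, 8.0, 5.0, 4.0, 3.5, 3.2]
observed = [10.0, 8.0, 5.0, 3.2]
table = completeness_by_bin(observed, possible, [40.0, 6.0, 3.0])
for d_max, d_min, n_o, n_p, pct in table:
    print(f"{d_max:5.1f} - {d_min:4.1f} A  {n_o}/{n_p}  {pct:5.1f}% complete")
```

A falling value in the outer bins only says that fewer of the possible reflections were recorded there; it says nothing by itself about the quality of those that were.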
> > Given your standing in the community, the last sentence will lead many
> > inexperienced people to believe that they should cut their data at the
> > resolution where the completeness falls below "say 80"%.
> > But that would be wrong. There is no reason to consider a completeness
> > as "too low in a high-resolution shell" as long as the data in that
> > shell are good. Particularly in refinement any reflection helps to
> > improve the model, and to reduce overfitting.
> Clearly, email is not the best means of communication, especially when
> written without a lawyer's help and read between the lines!
> No, I was not suggesting cutting the data, particularly if the cut is
> judged by completeness alone. What I was really saying is that if
> the data set is so incomplete, that should raise an alarm and prompt a
> review of the data collection and processing steps (rather than spending
> months struggling with a poor data set!).
> Also, I think extremes such as routine data cutoffs by "sigma" and/or
> resolution (as was common in the past) and a panicked fear of throwing
> away a single reflection (the modern trend) may both be
> counterproductive. Indeed, non-permanent data cutoffs by resolution (or
> by other criteria, such as those derived from Fobs vs Fmodel
> differences) may be essential for the success of refinement and of
> phasing by Molecular Replacement:
> J. Appl. Cryst. (2008). 41, 491-522
> Structure refinement: some background theory and practical
> D. Watkin
> Acta Cryst. (1999). D55, 1759-1764
> Detecting outliers in non-redundant diffraction data
> R. J. Read
> J. Appl. Cryst. (2009). 42, 607-615
> Automatic multiple-zone rigid-body refinement with a large
> convergence radius
> P. V. Afonine, R. W. Grosse-Kunstleve, A. Urzhumtsev and P.
> D. Adams
> STIR option in SHELX.
> Also, incomplete data can distort maps. As few as 1% missing
> reflections may be sufficient to destroy the image of the molecule in
> Fourier maps:
> Acta Cryst. (1991). A47, 794-801
> Low-resolution phases: influence on SIR syntheses and
> retrieval with double-step filtration
> A. G. Urzhumtsev
> Acta Cryst. (2014). D70, 2593-2606
> Metrics for comparison of crystallographic maps
> A. Urzhumtsev, P. V. Afonine, V. Y. Lunin, T. C. Terwilliger
> and P. D. Adams
> Retrieval of lost reflections in high resolution Fourier
> syntheses by 'soft' solvent flattening.
> Natalia L. Lunina, Vladimir Y. Lunin and Alberto D. Podjarny
> Finally, it is a poor idea to quote the resolution of the
> highest-resolution reflection as the resolution of the data set unless
> the data set is 100% complete. Instead, the effective resolution (which
> has a strict mathematical definition and meaning) should be used:
> Acta Cryst. (2013). D69, 1921-1934
> On effective and optical resolutions of diffraction data sets
> L. Urzhumtseva, B. Klaholz and A. Urzhumtsev
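As a rough illustration of why an incomplete data set should not be labelled by its highest-resolution reflection, the sketch below uses a simple counting argument: the number of reflections out to spacing d grows as 1/d^3, so a set that is a fraction C complete to d_high holds about as many reflections as a complete set to d_high * C^(-1/3). This approximation and the function name are assumptions made for illustration; the strict definition is the one given in the paper cited above.

```python
def effective_resolution_estimate(d_high, completeness):
    """Counting-based estimate of effective resolution, in Angstrom.

    d_high: spacing of the highest-resolution reflection (Angstrom).
    completeness: overall fraction of possible reflections measured, 0 < C <= 1.
    Assumes the reflection count scales as 1/d**3 (illustrative approximation).
    """
    if d_high <= 0:
        raise ValueError("d_high must be positive")
    if not 0.0 < completeness <= 1.0:
        raise ValueError("completeness must be in (0, 1]")
    return d_high * completeness ** (-1.0 / 3.0)


# A complete data set keeps its nominal resolution ...
d_complete = effective_resolution_estimate(2.1, 1.0)
# ... while a 60% complete one corresponds to a coarser effective value.
d_partial = effective_resolution_estimate(2.1, 0.6)
print(f"complete: {d_complete:.2f} A, 60% complete: {d_partial:.2f} A")
```

The estimate uses only the overall completeness; it ignores where in reciprocal space the missing reflections lie, which also matters for map quality.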
> To summarize, a severely incomplete data set should arouse suspicion. If
> that's the only dataset available, then correct expectations should be set
> about (the possible difficulty of) structure solution and quality of final
> All the best,
Kay Diederichs http://strucbio.biologie.uni-konstanz.de
email: Kay.Diederichs at uni-konstanz.de Tel +49 7531 88 4049 Fax 3183
Fachbereich Biologie, Universität Konstanz, Box 647, D-78457 Konstanz