[phenixbb] phenix and weak data

Douglas Theobald dtheobald at brandeis.edu
Fri Dec 7 13:44:48 PST 2012


Hi Pavel,

Thanks for the clarification, comments, and esp. the refs.  I was aware of your Acta Cryst 2002 paper.  Am I correct in thinking that in phenix you are basically maximizing eqn 4 (a type of Rice distribution)?  I had always assumed that experimental sigmas were somehow lumped into the alpha and beta parameters (esp. given your discussion in section 2.3).  In principle they could be, right?
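
To make sure I'm asking the right question, here is a toy sketch of what I have in mind (my own back-of-the-envelope code, certainly not how phenix implements it): the acentric Rice-type term with alpha/beta parameters, plus one purely hypothetical way experimental sigmas could be folded in, namely by inflating beta with a term proportional to sigma(Fobs)^2.

    # Toy sketch only -- not the phenix.refine implementation.
    # Acentric Rice-type term:
    #   P(Fo|Fc) = (2 Fo / beta) exp(-(Fo^2 + alpha^2 Fc^2)/beta) I0(2 alpha Fo Fc / beta)
    # The k_sigma term is a *hypothetical* way to fold experimental sigmas
    # into beta, shown only to make the question concrete.
    import numpy as np
    from scipy.special import i0e  # exponentially scaled I0, avoids overflow

    def minus_log_rice(fobs, fcalc, alpha, beta, sigma_fobs=0.0, k_sigma=0.0):
        b = beta + k_sigma * sigma_fobs**2
        x = 2.0 * alpha * fobs * fcalc / b
        log_i0 = np.log(i0e(x)) + x  # log I0(x), numerically stable for large x
        return -(np.log(2.0 * fobs / b)
                 - (fobs**2 + (alpha * fcalc)**2) / b
                 + log_i0)

With k_sigma = 0 this is just the usual alpha/beta target; the question is whether a nonzero k_sigma (or something smarter) buys anything in practice.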

In any case, I wonder if your example of rigid-body refinement actually argues for incorporating experimental sigmas --- since high-resolution data are on average the most uncertain, incorporating sigmas would downweight the high-resolution data the most, and cutting high-resolution data is really just a crude way of downweighting it.
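
Just to illustrate what I mean by "crude downweighting" (made-up numbers, and a per-reflection weight of 1/(beta + sigma^2) assumed purely for illustration, not anything phenix does):

    # Toy illustration of the argument above (numbers are invented).
    import numpy as np

    d_spacing = np.array([4.0, 3.0, 2.5, 2.0, 1.8, 1.6])  # resolution (A)
    sigma_f   = np.array([1.0, 1.5, 2.0, 3.0, 5.0, 8.0])  # noisier at high res
    beta      = 4.0                                        # taken constant here

    w_smooth = 1.0 / (beta + sigma_f**2)         # sigma-based downweighting
    w_cutoff = (d_spacing >= 2.0).astype(float)  # hard cutoff: 0/1 "weights"

    # w_smooth tapers off gradually toward high resolution;
    # w_cutoff is the all-or-nothing limiting case of the same idea.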

Anyway, thanks for your work on phenix, it's my current favorite refinement software.

Cheers,

Douglas




On Dec 6, 2012, at 9:19 PM, Pavel Afonine <pafonine at LBL.GOV> wrote:

> Hi Douglas,
> 
> That's the point: there are many hand-waving arguments supported by weak or inconclusive data (or none at all), and little rock-solid evidence!
> 
> I'm not aware of a paper *clearly* demonstrating what kind of improvement using weak data in refinement brings. I mean not an R-factor improvement by a fraction of a percent or "cosmetics" like that, but a case where it is demonstrated that using the weak data allowed more of the model to be built, or two maps shown side by side, obtained without and with the weak data, where the latter is significantly more useful (not just made to look more pleasant by tweaking the contouring thresholds to present the case favourably).
> 
> Regarding refinement itself: consider rigid-body refinement. One might think that with today's technology you could just dump all the data into the refinement machinery and the maximum-likelihood target would do the trick (i.e. weight the high-resolution data properly). No. For rigid-body refinement to actually work you still need to cut the high-resolution end. See the discussion in:
> Afonine, P.V., Grosse-Kunstleve, R.W., Urzhumtsev, A. & Adams, P.D. (2009). J. Appl. Cryst. 42, 607-615. "Automatic multiple-zone rigid-body refinement with a large convergence radius"
> 
> The same logic might apply to weak data. Its amount and weakness may be just sufficient to make the refinement target profile complex enough to get refinement stuck or impede its convergence. On the other hand, it may be just good enough to make refinement behave better and yield a better model. Using it may harm refinement at the beginning but help towards the end (remember the arguments behind the STIR option in SHELX?!), so the question may not be just "whether or not?", but also "when?".
> 
> The maps that are mostly used (2mFo-DFc and mFo-DFc) are calculated without using experimental sigmas, unless you modify them with techniques such as maximum-entropy methods, where sigmas may be used somehow (but don't have to be). So even if one weights the weak data smartly for refinement and uses it at the right moment, one still needs to think about how to use it in map calculation so that it brings signal rather than noise into the maps.
> 
> All in all, yes, *conceptually* it is good to use weak data in refinement and map calculation, but two questions - 1) how and when? and 2) whether it's going to change anything significantly? - are yet to be answered. Answering them is on the to-do list.
> 
> Finally, FYI: the refinement targets that phenix.refine uses are described here (they are coded exactly as discussed in these papers):
> 
> MLHL:
> Pannu, N.S., Murshudov, G.N., Dodson, E.J. & Read, R.J. (1998). Acta Cryst. D54, 1285-1294. "Incorporation of Prior Phase Information Strengthens Maximum-Likelihood Structure Refinement"
> 
> ML:
> Lunin, V.Y., Afonine, P.V. & Urzhumtsev, A.G. (2002). Acta Cryst. A58, 270-282. "Likelihood-based refinement. I. Irremovable model errors"
> 
> Flavor of LS and accounting for scales in ML and MLHL functions:
> Afonine, P.V., Grosse-Kunstleve, R.W. & Adams, P.D. (2005). Acta Cryst. D61, 850-855. "A robust bulk-solvent correction and anisotropic scaling procedure"
> 
> and they have worked fine in phenix.refine since 2004.
> 
> All the best,
> Pavel
> 
> 
> On 12/6/12 2:35 PM, Douglas Theobald wrote:
>> Hi all,
>> 
>> Many have argued that we should include weak data in refinement --- e.g., reflections much weaker than I/sigI=2 --- in order to take advantage of the useful information contained in large numbers of uncertain data points (as argued in the recent Karplus and Diederichs Science paper on CC1/2).  This makes sense to me as long as the uncertainty attached to each HKL is properly accounted for.  However, I was surprised to hear rumors that with phenix "the data are not properly weighted in refinement by incorporating observed sigmas" and the like.  I was wondering if the phenix developers could comment on the sanity of including weak data in phenix refinement, and on how phenix handles it.
>> 
>> Douglas
>> 
>> 
>> 
>> 
>> ^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`
>> Douglas L. Theobald
>> Assistant Professor
>> Department of Biochemistry
>> Brandeis University
>> Waltham, MA  02454-9110
>> 
>> dtheobald at brandeis.edu
>> http://theobald.brandeis.edu/
>> 
>> 
>> 
>> 
>> 
> 


