[phenixbb] phenix and weak data

Wed Dec 12 06:45:37 PST 2012

On Dec 12, 2012, at 1:46 AM, Ed Pozharski <epozh001 at UMARYLAND.EDU> wrote:

> On Tue, 2012-12-11 at 11:27 -0500, Douglas Theobald wrote:
> 
>> What is the evidence, if any, that the exptl sigmas are actually negligible compared to fit beta (is it alluded to in Lunin 2002)?  Is there somewhere in phenix output I can verify this myself?
> 
> Essentially, equation 4 in Lunin (2002) is the same as equation 14 in
> Murshudov (1997) or equation 1 in Cowtan (2005) or 12-79 in Rupp (2010).
> The difference is that instead of combination of sigf^2 and sigma_wc you
> have a single parameter, beta.  One can do that assuming that
> sigf<<sqrt(beta).  Phenix log files list optimized beta parameter in
> each resolution shell.  

Thanks, I see that now. 

> It does not list sigf though, but trust me - I
> checked and it is indeed true that sqrt(beta)>sigf. I just pulled up a
> random dataset refined with phenix and here is what I see
> 
> min(sigf/sqrt(beta)) = 0.012
> max(sigf/sqrt(beta)) = 0.851
> mean(sigf/sqrt(beta)) = 0.144
> std(sigf/sqrt(beta)) = 0.118
> 
> But there are two problems.  First, in the highest resolution shell
> 
> min(sigf/sqrt(beta)) = 0.116
> max(sigf/sqrt(beta)) = 0.851
> mean(sigf/sqrt(beta)) = 0.339
> std(sigf/sqrt(beta)) = 0.110
> 
> This is a bit more troubling.  Notice that for acentrics it's
> 2*sigf**2 + sigma_wc, thus the actual ratio should be increased by sqrt(2), getting
> uncomfortably close to 1/2.  Still, given that one adds variances, this
> is at most 25% correction, and this *is* the high resolution shell.
> 
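
A quick check of that arithmetic, as a sketch only. It uses the shell mean quoted above (not the max) and assumes beta stands in for sigma_wc; the variable names are mine.

```python
import math

# Mean of sigf/sqrt(beta) quoted above for the highest resolution shell.
mean_ratio = 0.339

# For acentrics the experimental term is 2*sigf**2, so on the amplitude
# scale the ratio grows by sqrt(2).
acentric_ratio = math.sqrt(2.0) * mean_ratio

# Variances add, so the relative size of the experimental term is the
# square of that ratio: 2*sigf**2 / beta.
variance_fraction = acentric_ratio ** 2

print(f"sqrt(2)*sigf/sqrt(beta) = {acentric_ratio:.3f}")   # 0.479
print(f"2*sigf**2/beta          = {variance_fraction:.3f}")  # 0.230
```

So the ratio is indeed close to 1/2, and the experimental term is a bit under a quarter of beta on the variance scale.
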
> Second, if one tries to interpret sqrt(beta) as a measure of model error
> in reciprocal space, one runs into trouble.  This dataset was refined to
> R~18%.  Assuming that sqrt(beta) should roughly predict discrepancy
> between Fo and Fc, it corresponds to R~30%.  

Given the form of the likelihood (a Rice distribution, obtained by integrating the phase out of a 2D Gaussian), beta should be equivalent to 2*variance, where the variance is that of each component of the original Gaussian.  So I think the reciprocal-space error should be sigma = sqrt(beta/2).
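
To spell out the reasoning: as I read it, the acentric likelihood has the form (2Fo/beta) exp(-(Fo^2 + Fc^2)/beta) I0(2 Fo Fc/beta), taking alpha = 1 and epsilon = 1 for simplicity. The standard Rice density is (x/s^2) exp(-(x^2 + v^2)/(2 s^2)) I0(x v/s^2), so matching terms gives beta = 2 s^2. A small simulation, with made-up values for sigma and Fc, is consistent with that:

```python
import math
import random

random.seed(0)

sigma = 2.0   # per-component std. dev. of the 2D Gaussian (illustrative value)
fc = 10.0     # calculated amplitude, placed on the real axis (illustrative value)
n = 200_000

# Draw complex "observed" structure factors around Fc and accumulate |F|^2.
sum_sq = 0.0
for _ in range(n):
    re = fc + random.gauss(0.0, sigma)
    im = random.gauss(0.0, sigma)
    sum_sq += re * re + im * im

# For a Rice-distributed amplitude, E[|F|^2] = Fc^2 + 2*sigma^2,
# so the excess over Fc^2 plays the role of beta.
beta_est = sum_sq / n - fc * fc
sigma_est = math.sqrt(beta_est / 2.0)

print(f"beta estimate = {beta_est:.2f}  (2*sigma^2 = {2 * sigma**2:.2f})")
print(f"sqrt(beta/2)  = {sigma_est:.2f}  (sigma     = {sigma:.2f})")
```

This only checks the parameterization; it says nothing about how phenix actually estimates beta.
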

> This suggests that for
> reasons I don't yet quite understand beta overestimates model variance.
> If it is simply doubled, then it becomes comparable to experimental
> error, at least in higher resolution shells.
> 
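
If the error scale is taken as sqrt(beta/2) instead of sqrt(beta), the projected R shrinks accordingly. A rough sketch, assuming the projected R is simply proportional to the error scale (reasonable when errors are small relative to the amplitudes, so that the mean |Fo - Fc| goes as sigma*sqrt(2/pi)). I don't know how the 30% figure was actually computed, so this is only indicative.

```python
import math

r_from_sqrt_beta = 0.30   # R projected from sqrt(beta), as quoted above
r_observed = 0.18         # R the dataset was actually refined to

# Assumption: projected R is proportional to the error scale, so replacing
# sqrt(beta) with sqrt(beta/2) divides it by sqrt(2).
r_from_sigma = r_from_sqrt_beta / math.sqrt(2.0)

print(f"projected R using sqrt(beta/2): {r_from_sigma:.3f}")  # 0.212
print(f"observed R:                     {r_observed:.3f}")
```

That gets to about 21%, much closer to the observed 18%, though still somewhat above it.
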
>> And, in comparison, how does refmac handle the exptl sigmas?  Maybe this last question is more appropriate for ccp4bb, but contrasting with phenix would be helpful for me.  I know there's a box, checked by default, "Use exptl sigmas to weight Xray terms".
> 
> Refmac fits sigmaA to a certain resolution dependence and then adds experimental sigmas (or not, as you noticed).  I was told that the actual formulation is different from what is described in the original manuscript.  But what's important is that if one pulls out the sigma_wc as calculated by refmac, it has all the same characteristics as sqrt(beta) - it is generally >>sigf and suggests model error in reciprocal space that is incompatible with (too large for) the observed R-values.  

Sigma_wc should also be (roughly) twice the structure factor variance (at least as parameterized in Murshudov 1997, Acta Cryst. D53:240).

> Kevin Cowtan's spline approximation implemented in clipper libraries behaves much better, meaning that R-value expectations projected from sigma_wc are much closer to observed R-value.
> 
> Curiously, it does not make much difference in practice, i.e. refined model is not affected as much.  For instance, with refmac there are no significant changes whether one uses experimental errors or not.  I could think of several reasons for this, but haven't verified any.

Thanks again for the comments/discussion, it's been very interesting.

> 
> Cheers,
> 
> Ed.
> 
> -- 
> "I'd jump in myself, if I weren't so good at whistling."
>                               Julian, King of Lemurs