# [cctbxbb] Model based outlier calculation in mmtbx.scaling.outlier_rejection

Pavel Afonine pafonine at lbl.gov
Thu Sep 5 08:12:09 PDT 2013

```
Hi Keitaro,

Peter Zwart wrote the code, so I hope he comments on this more. My
understanding is that this is essentially based on Randy Read's paper:

Read, R. J. (1999). Acta Cryst. D55, 1759-1764.

Pavel

On 9/5/13 2:31 AM, Keitaro Yamashita wrote:
> Dear cctbx developers,
>
> I am interested in the implementation of model-based reflection
> outlier rejection. As I read the code
> mmtbx/scaling/outlier_rejection.py (lines 244-351), I noticed that
> maybe there was a discrepancy between what log_message explained and
> the actual code. The log_message in the code says:
>
>> Outliers are rejected on the basis of the assumption that a scaled
>> log likelihood difference 2(log[P(Fobs)]-log[P(Fmode)])/Q\" is distributed
>> according to a Chi-square distribution (Q\" is equal to the second
>> derivative of the log likelihood function of the mode of the
>> distribution).
>> The outlier threshold of the p-value relates to the p-value of the
>> extreme value distribution of the chi-square distribution.
> while the actual p_value is calculated for each hkl as
> p_value = 1 - erf(sqrt(LLG))**N,
> where
> LLG = log p(F=Fbar | Fmodel) - log p(F=Fobs | Fmodel),
> and N is the number of reflections. Here, Fbar is the value of F that
> maximizes p(F | Fmodel). At least, Q (the second derivative of
> p(F=Fbar | Fmodel)) is not used in the actual calculation.
>
> Could someone please explain the meaning of the actual calculation?
> Why take the square root and raise the erf() result to the power of N?
>
> Thank you very much,
> Keitaro
```
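
One possible reading of the quoted formula, not confirmed anywhere in this thread: if 2*LLG is assumed to follow a chi-square distribution with one degree of freedom, then its cumulative distribution at the observed value is erf(sqrt(2*LLG/2)) = erf(sqrt(LLG)), which would account for the square root. If the N reflections are further assumed to be independent, the probability that the largest of N such values stays below the observed one is erf(sqrt(LLG))**N, so 1 - erf(sqrt(LLG))**N would be the extreme-value p-value the log message mentions. The sketch below only illustrates this arithmetic with the Python standard library. The function names are hypothetical and are not taken from mmtbx.

```python
import math


def chi2_one_dof_cdf(x):
    """CDF of a chi-square distribution with one degree of freedom."""
    return math.erf(math.sqrt(x / 2.0))


def extreme_value_p_value(llg, n_reflections):
    """p = 1 - erf(sqrt(LLG))**N, exactly as quoted in the message above.

    Assumption (not from the source): 2*LLG ~ chi-square with 1 d.o.f.
    and the N reflections are independent.
    """
    if llg < 0:
        raise ValueError("llg must be non-negative")
    return 1.0 - math.erf(math.sqrt(llg)) ** n_reflections


def extreme_value_p_value_stable(llg, n_reflections):
    """Same quantity, written with erfc/log1p/expm1 to limit round-off
    when erf(sqrt(LLG)) is very close to 1."""
    if llg < 0:
        raise ValueError("llg must be non-negative")
    if llg == 0:
        return 1.0  # erf(0) = 0, so 1 - 0**N = 1 for N >= 1
    tail = math.erfc(math.sqrt(llg))  # 1 - erf(sqrt(LLG))
    return -math.expm1(n_reflections * math.log1p(-tail))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the effect of N.
    for n in (1, 1000, 100000):
        print(n, extreme_value_p_value_stable(8.0, n))
```

Under these assumptions, the same LLG becomes less surprising as N grows, because a large data set is expected to contain a few large deviations by chance. That is the usual motivation for an extreme-value correction.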