On 5 September 2013 02:31, Keitaro Yamashita <k.yamashita@spring8.or.jp> wrote:

Dear cctbx developers,

I am interested in the implementation of model-based reflection
outlier rejection. While reading the code in
mmtbx/scaling/outlier_rejection.py (lines 244-351), I noticed what
may be a discrepancy between what log_message explains and what the
code actually does. The log_message in the code says:

> Outliers are rejected on the basis of the assumption that a scaled
> log likelihood difference 2(log[P(Fobs)]-log[P(Fmode)])/Q\" is distributed
> according to a Chi-square distribution (Q\" is equal to the second
> derivative of the log likelihood function of the mode of the
> distribution).
> The outlier threshold of the p-value relates to the p-value of the
> extreme value distribution of the chi-square distribution.

whereas the actual p_value is calculated for each hkl as
p_value = 1 - erf(sqrt(LLG))**N,
where
LLG = log p(F=Fbar | Fmodel) - log p(F=Fobs | Fmodel),
and N is the number of reflections. Here, Fbar is the value of F that
maximizes p(F | Fmodel). At the very least, Q (the second derivative
of the log likelihood at F=Fbar) is not used in the actual
calculation.
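
To make the question concrete, here is a minimal Python sketch of the
calculation as I read it. It only restates the formula above; the
function name and the example values are hypothetical and are not taken
from mmtbx.

```python
import math


def outlier_p_value(llg, n_reflections):
    """Return 1 - erf(sqrt(LLG))**N for one reflection.

    llg           : log p(F=Fbar | Fmodel) - log p(F=Fobs | Fmodel), >= 0
    n_reflections : N, the total number of reflections
    (Hypothetical helper for illustration; not the mmtbx implementation.)
    """
    if llg < 0:
        raise ValueError("LLG must be non-negative, since Fbar maximizes p")
    return 1.0 - math.erf(math.sqrt(llg)) ** n_reflections


if __name__ == "__main__":
    # Illustrative values only: the p_value shrinks as LLG grows.
    for llg in (1.0, 5.0, 10.0, 20.0):
        print(llg, outlier_p_value(llg, 10000))
```

A reflection with LLG = 0 (Fobs equal to Fbar) gets p_value = 1, and the
p_value decreases monotonically as Fobs moves away from Fbar.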

Could someone please explain the meaning of the actual calculation?
Why is the square root of LLG taken, and why is the erf() result raised
to the power of N?

Thank you very much,
Keitaro
_______________________________________________
cctbxbb mailing list
cctbxbb@phenix-online.org
http://phenix-online.org/mailman/listinfo/cctbxbb

--
-----------------------------------------------------------------
P.H. Zwart
Research Scientist
Berkeley Center for Structural Biology
Lawrence Berkeley National Laboratories
1 Cyclotron Road, Berkeley, CA-94703, USA
Cell: 510 289 9246
BCSB:   http://bcsb.als.lbl.gov
PHENIX: http://www.phenix-online.org
SASTBX: http://sastbx.als.lbl.gov
-----------------------------------------------------------------