What about the effect of missing data on maps used in real-space refinement?

-Nat


On Fri, Nov 1, 2013 at 11:29 PM, Pavel Afonine <pafonine@lbl.gov> wrote:
Hi Rongjin,

It's not really a waste: test reflections are used not only in the R-free calculation but also to estimate the parameters of the ML refinement target, so the more test reflections you use, the better these estimates are, and the better the refinement target is. I still believe that 10% is a good number, and capping the number of test reflections is NOT necessary: it's not about getting a statistically meaningful number of them overall, but rather about getting the right spread of them across the whole resolution range, and enough of them in each thin resolution shell. I imagine that by capping them at a particular number you may end up with only a couple of test reflections in some bins, which isn't great.

Pavel
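
[Illustration] Pavel's point about thin resolution slices can be checked with a little arithmetic. The sketch below is only an illustration, not anything Phenix computes: it assumes reflections fill reciprocal space uniformly (so the count in a shell scales with the difference of 1/d cubed at its edges), uses shells of equal width in 1/d, and takes hypothetical values for the total number of reflections (45000), the low-resolution limit (50 A), and the number of shells (20). The function name expected_test_counts is made up for this example.

```python
def expected_test_counts(n_total, fraction, d_max, d_min, n_shells):
    """Expected number of test reflections in each resolution shell.

    Shells have equal width in s = 1/d. Assumes reflections are spread
    uniformly in reciprocal space, so the number in a shell between s_lo
    and s_hi is proportional to s_hi**3 - s_lo**3.
    """
    s_min, s_max = 1.0 / d_max, 1.0 / d_min
    width = (s_max - s_min) / n_shells
    total_volume = s_max**3 - s_min**3
    counts = []
    for i in range(n_shells):
        s_lo = s_min + i * width
        s_hi = s_lo + width
        shell_share = (s_hi**3 - s_lo**3) / total_volume
        counts.append(n_total * shell_share * fraction)
    return counts


# Hypothetical data set: 45000 reflections between 50 A and 2.0 A.
n_total = 45000
ten_percent = expected_test_counts(n_total, 0.10, 50.0, 2.0, 20)
capped = expected_test_counts(n_total, 2000 / n_total, 50.0, 2.0, 20)

# Shell 1 is the lowest-resolution shell, shell 20 the highest.
for i, (a, b) in enumerate(zip(ten_percent, capped), start=1):
    print(f"shell {i:2d}: 10% -> {a:7.1f}   capped at 2000 -> {b:7.1f}")
```

Under these assumptions the low-resolution shells hold far fewer test reflections than the overall average, and a cap on the total shrinks them further by the same factor in every shell.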


On 11/1/13 9:18 PM, Rongjin Guan wrote:
Thank you, Nat and Pavel. I just came back and saw the replies from both of
you. Thanks a lot.
My impression was that the free_R set needs at most 2000 reflections and,
more importantly, at least 50 reflections in each resolution shell, so having
more than 4500 in the free_R set (150 per shell) feels like a luxury
("waste"). If I had more reflections in the refinement set, I could have a
better data/parameter ratio. But this may not make much difference.
Best, and have a great weekend.
Rongjin


On Fri, Nov 1, 2013 at 3:18 PM, Nathaniel Echols <nechols@lbl.gov> wrote:

    On Fri, Nov 1, 2013 at 12:13 PM, rjguan <rjguan@gmail.com> wrote:
    > I solved a structure with a 2.7 A Se-Met data set, with 10% of the
    > reflections in the free_R set.
    > Now I have a 2.0 A native data set, and extended the free_R set to 2.0 A.
    > Now I have >4500 reflections in the free_R set, and each resolution
    > shell has >150 reflections. Kind of too much, right?

    I don't think it's necessarily too much, but it is probably more
    than you need.


    > What is the best way to reduce the number of reflections in the
    > free_R set, say, to 5%?
    > I already built and refined the model at 2.7 A, but do not want to
    > redo autobuild.

    Use the reflection file editor in the GUI - click "More options" in
    the section for R-free flags, and check the box "Adjust test set
    size to specified fraction".

    > Another question: now that I have the 2.0 A data set, shall I use the
    > phases from the 2.7 A data in refinement?
    > I compared the two, and without the phases I got lower R/R_free (about
    > 1% lower).
    > Is this because the 2.0 A data are more accurate than the 2.7 A
    > phases, and should I continue to refine at 2.0 A without the phases
    > from the 2.7 A data?

    It's difficult to reach a conclusion by comparing R-factors alone;
    maybe you simply need to run more cycles of refinement with phases
    to get the same result. Alternatively, if the crystals aren't truly
    isomorphous, the phases may be inappropriate for the native
    structure. But I think with 2.0 Å data and a complete model, it is
    perfectly valid to use the amplitudes-only ML refinement target.

    -Nat
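
[Illustration] The idea behind adjusting a test set to a smaller fraction, as discussed in the thread, can be sketched in plain Python. This is not the reflection file editor's implementation, which the thread does not describe; it is a minimal sketch that assumes the flags are a simple list of booleans (True marks a test reflection) and that the new test set should be a random subset of the old one, so that no reflection previously used in refinement ends up in the test set. The function name shrink_test_set and the example data are hypothetical.

```python
import random


def shrink_test_set(flags, target_fraction, seed=0):
    """Return new flags whose test set is a random subset of the old one.

    flags: list of bools, True marks a test (free) reflection.
    target_fraction: desired test fraction of ALL reflections.
    Only reflections already in the test set may remain in it; the rest
    are moved to the working set.
    """
    test_indices = [i for i, is_test in enumerate(flags) if is_test]
    n_target = round(target_fraction * len(flags))
    if n_target > len(test_indices):
        raise ValueError("target fraction exceeds the current test set size")
    rng = random.Random(seed)  # fixed seed so the choice is reproducible
    keep = set(rng.sample(test_indices, n_target))
    return [i in keep for i in range(len(flags))]


# Hypothetical data: 45000 reflections, every 10th one flagged as test.
old_flags = [i % 10 == 0 for i in range(45000)]
new_flags = shrink_test_set(old_flags, 0.05)
print(sum(old_flags), "->", sum(new_flags))
```

A plain random draw like this keeps the per-shell distribution only approximately; a per-shell selection would be needed to control the count in each resolution bin exactly.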