Hi Rongjin,
It's not really a waste: test reflections are used not only in Rfree
calculations, but also to estimate the parameters of the ML refinement
target, so the more test reflections you use, the better these estimates
are, and so the better the refinement target is. I still believe that 10%
is a good number, and capping the number of test reflections is NOT
necessary: it's not about getting a statistically meaningful number of
them overall, but rather about getting the right spread of them across
the whole resolution range, and enough of them in each thin slice of
resolution. I imagine that by capping them at a particular number you may
end up with only a couple of test reflections in some bins, which isn't
great.
Pavel
On 11/1/13 9:18 PM, Rongjin Guan wrote:
> Thank you, Nat and Pavel. I just came back and saw replies from both of
> you. Thanks a lot.
> I have an impression that free_R set has max 2000 reflections, and more
> importantly, each resolution shell has at least
> 50 reflections, so I feel >4500 in the free_R set / 150 each shell kind
> of luxury ("waste"), and if I have more reflections in the refinement
> set, I could have better data/parameters ratio. But this may not make
> much difference.
> Best, and have a great weekend.
> Rongjin
>
>
> On Fri, Nov 1, 2013 at 3:18 PM, Nathaniel Echols <nechols(a)lbl.gov> wrote:
>
> On Fri, Nov 1, 2013 at 12:13 PM, rjguan <rjguan(a)gmail.com> wrote:
> > I solved a structure with 2.7 A Se-Met data set, with 10%
> > reflections in free_R set.
> > Now I have 2.0 A native data set, and extended free_R to 2.0A.
> > Now I have >4500 reflections in free_R set, each resolution shell
> > has >150 reflections. Kind of too much, right?
>
> I don't think it's necessarily too much, but it is probably more
> than you need.
>
>
> > What is the best way to reduce the number of reflections in the
> > free_R set, say, to 5%?
> > I already built and refined model at 2.7 A, but do not want to
> > redo autobuild.
>
> Use the reflection file editor in the GUI - click "More options" in
> the section for R-free flags, and check the box "Adjust test set
> size to specified fraction".
>
> > Another question: now I have 2.0A data set, shall I use phases
> > from 2.7 A data in refinement?
> > I compared, without using the phases I got lower R/R_free (about
> > 1% lower).
> > is this because the 2.0 A data is more accurate than the 2.7 A
> > phases, and I should continue to refine at 2.0A without the phases
> > from 2.7 A data?
>
> It's difficult to reach a conclusion by comparing R-factors alone;
> maybe you simply need to run more cycles of refinement with phases
> to get the same result. Alternately, if the crystals aren't truly
> isomorphous, the phases may be inappropriate for the native
> structure. But I think with 2.0Å data and a complete model, it is
> perfectly valid to use the amplitudes-only ML refinement target.
>
> -Nat
>
>