Hi Maia,

I would add a couple of comments too:

1)
> I am wondering why the ncs refinement gives me a better Rfree (21.0%)

Because by using NCS you added some "observations", and therefore you improved the data-to-parameters ratio, which in turn reduced the degree of overfitting. The fact that the R-factor dropped (rather than increased) probably suggests that you selected the NCS groups correctly.

2)
> the ncs refinement gives me a better Rfree (21.0%) than without ncs (21.7%).
Here is another stream of thought...

The target for restrained coordinate (and similarly for B-factor) refinement looks like this (in phenix.refine):

  T_total = wxc_scale * wxc * T_xray + wc * T_geometry

where the relative target weight wxc is determined as

  wxc ~ ratio of the gradients' norms:

Brünger, A.T., Karplus, M. & Petsko, G.A. (1989). Acta Cryst. A45, 50-61. "Crystallographic refinement by simulated annealing: application to crambin"

Brünger, A.T. (1992). Nature (London), 355, 472-474. "The free R value: a novel statistical quantity for assessing the accuracy of crystal structures"

Adams, P.D., Pannu, N.S., Read, R.J. & Brünger, A.T. (1997). Proc. Natl. Acad. Sci. 94, 5018-5023. "Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement"

Before wxc is computed, the structure is subjected to a short molecular dynamics run; this is where the random component comes into play.

Now, having said this, we know that if you do, for example, 100 identical phenix.refine runs, where the only difference between the runs is the random seed, you will get 100 slightly different refinement results. The spread in R-factors depends on resolution, and if I remember correctly, for a structure at ~2 Å resolution I was getting delta_R from about 0.1 to 2%. It can be higher at lower resolution, and smaller at higher resolution.

The NCS term goes into T_geometry, which in turn means that it changes (somehow) the weight. This may explain the difference in R-factors, and I would consider a difference of less than 1% insignificant (unless it is a highly systematic and consistent observation that has been made independent of the weight).

To make it less arbitrary, I would suggest running two refinement jobs using "optimize_wxc=true optimize_wxu=true", one with NCS and the other one without NCS. I'm sure I suggested this a month or two ago.

Pavel.
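
P.S. To make the point about seed-to-seed spread concrete, here is a minimal sketch (plain Python, not part of phenix.refine) of how one might check whether an Rfree difference between two protocols exceeds the spread seen across repeated runs. The function names, the 2-sigma threshold, and the numbers in the example are all made up for illustration; in practice the values would come from your own repeated refinements with different random seeds.

```python
import statistics


def seed_spread(r_free_values):
    """Return (mean, sample standard deviation) of Rfree over repeated runs."""
    return statistics.mean(r_free_values), statistics.stdev(r_free_values)


def is_significant(r_free_a, r_free_b, n_sigma=2.0):
    """Crude check: is the difference in mean Rfree larger than n_sigma times
    the combined seed-to-seed spread of the two sets of runs?

    The 2-sigma default is an arbitrary illustrative threshold, not a
    recommendation from the phenix.refine authors.
    """
    mean_a, sd_a = seed_spread(r_free_a)
    mean_b, sd_b = seed_spread(r_free_b)
    combined_sd = (sd_a ** 2 + sd_b ** 2) ** 0.5
    return abs(mean_a - mean_b) > n_sigma * combined_sd


if __name__ == "__main__":
    # Hypothetical Rfree values (%) from five runs each, differing only in seed.
    with_ncs = [21.0, 21.3, 20.8, 21.4, 21.1]
    without_ncs = [21.7, 21.2, 21.9, 21.5, 21.4]
    print("with NCS:    mean=%.2f sd=%.2f" % seed_spread(with_ncs))
    print("without NCS: mean=%.2f sd=%.2f" % seed_spread(without_ncs))
    print("difference significant:", is_significant(with_ncs, without_ncs))
```

A single pair of runs (one with NCS, one without) cannot separate the effect of NCS from the effect of the random seed and the weight, which is why repeated runs, or weight-optimized runs, give a fairer comparison.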