On Mon, Oct 20, 2014 at 2:32 PM, Rich Grey <rich_grey@yahoo.com> wrote:
I'm preparing Table 1 for a structure I have recently solved.  I chose to compare the Table 1 Phenix produces to the data found in the PDB file and the log file from my last refinement in Phenix.  I see discrepancies in the R-factors for the highest resolution bin, including the resolution range that bin covers.  In addition, the number of reflections used in refinement is different (although only by about 5, so not a large difference).  The PDB file appears to match the log file, and the overall Rwork and Rfree agree for all three.  Does anyone know where the Table 1 program in Phenix pulls these values from, and why I may be seeing these differences?  Thank you.

The Table 1 statistics are recalculated from scratch given the input data.  Because phenix.refine uses a different binning convention than what is normally done for publication, and because I prefer not to rely on parsing headers or any other text file, I ignore the numbers in the header and divide the data into 10 bins of equal size (based on theoretically complete data).  You can adjust the number of bins (at least in the nightly builds), but there is no guarantee that the result will match phenix.refine.  Alternatively, you can report the numbers from the phenix.refine header if you prefer, but then they will not match the binning of the merging statistics if you provided unmerged data.
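
To illustrate why the limits of the highest resolution bin depend on the binning convention, here is a minimal sketch in plain Python.  It is not the Phenix code.  It assumes that "equal size" means shells holding roughly equal numbers of theoretically possible reflections, which it approximates as equal volumes of reciprocal space (equal steps in 1/d^3).  The function name and the example resolution limits are made up for illustration.

```python
def equal_volume_shells(d_max, d_min, n_bins=10):
    """Split the resolution range [d_max, d_min] (in angstroms) into
    n_bins shells of equal reciprocal-space volume.

    Assumption: the count of theoretically possible reflections out to
    resolution d scales as 1/d**3, so equal steps in 1/d**3 give shells
    with approximately equal numbers of possible reflections.
    """
    if not (d_max > d_min > 0):
        raise ValueError("need d_max > d_min > 0")
    if n_bins < 1:
        raise ValueError("need at least one bin")
    low = 1.0 / d_max ** 3    # 1/d^3 at the low-resolution limit
    high = 1.0 / d_min ** 3   # 1/d^3 at the high-resolution limit
    step = (high - low) / n_bins
    # Convert each boundary in 1/d^3 back to a d-spacing.
    edges = [(low + i * step) ** (-1.0 / 3.0) for i in range(n_bins + 1)]
    # Each shell is (low-resolution limit, high-resolution limit).
    return list(zip(edges[:-1], edges[1:]))


if __name__ == "__main__":
    # Hypothetical data set extending from 50 to 2.0 angstroms.
    for d_low, d_high in equal_volume_shells(50.0, 2.0, n_bins=10):
        print("%7.3f - %6.3f" % (d_low, d_high))
```

Changing n_bins, or using a different rule for placing the boundaries, moves the limits of the outermost shell, and the R-factors reported for that shell move with them.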

I suspect the difference in reflections used for refinement may have something to do with outlier rejection - the number in Table 1 reflects what was used as input for phenix.refine, but phenix.refine may internally discard some reflections.  (Never permanently, though, so you will always start with the same number for the next round of refinement.)  Just to make sure I'm not misinterpreting this, would you mind sending me the input files?

thanks,
Nat