Possibly a silly question - but did you use the same R-free
reflection subset in both refinements?
"Same MTZ file" doesn't necessarily mean the program was using the
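same R-free flags. CCP4 and PHENIX use different flag conventions
(phenix.refine automatically recognizes both). In the PHENIX
convention, test reflections are flagged 1 and working reflections 0,
whereas CCP4 usually marks the test set with flag 0 and gives the
working reflections nonzero values.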
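
To illustrate why the convention matters, here is a minimal Python
sketch, not taken from either package. The function name
free_set_mask and the example flag values are hypothetical, and the
CCP4 test value of 0 is an assumption based on the usual default. It
shows that one and the same flag column selects complementary test
sets depending on which convention is applied.

```python
def free_set_mask(flags, convention):
    """Return a list of booleans: True where a reflection is in the test set.

    'phenix' treats flag 1 as test (work = 0), as stated above.
    'ccp4' treats flag 0 as test (assumed usual default; work = nonzero).
    """
    if convention == "phenix":
        test_value = 1
    elif convention == "ccp4":
        test_value = 0
    else:
        raise ValueError(f"unknown convention: {convention!r}")
    return [flag == test_value for flag in flags]


# Hypothetical flag column for six reflections.
flags = [0, 1, 0, 0, 1, 0]

phenix_test = free_set_mask(flags, "phenix")
ccp4_test = free_set_mask(flags, "ccp4")

# Count the test reflections under each interpretation.
print("PHENIX-style test set size:", sum(phenix_test))
print("CCP4-style test set size:  ", sum(ccp4_test))
```

If the two refinements read this column under different conventions,
the reflections one run holds out as the test set are exactly the
ones the other run refines against, so the R-free values are not
comparable.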