Dear Pavel and Peter,
Here is the typical distribution of Rfree, Rwork and Rfree-Rwork for structures in PDB refined at 2.5A resolution:
Do these statistics apply to twinned cases? I think such statistics should be (slightly) different from normal cases, shouldn't they?
Did you use PHENIX to select free-R flags? It is important.
Yes, I used phenix to select the R-free flags with use_lattice_symmetry=true. However, my data also show pseudo-translation (a peak of ~20% of the origin height in the Patterson map). I'm afraid I should have taken pseudo-translation into account as well as twinning when selecting the R-free flags, e.g. with use_dataman_shells=true. Is there any way to tell whether the refinement is biased by a wrong R-free flag selection?
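
To illustrate why the flag selection matters under twinning, here is a minimal Python sketch that assigns free-R flags so that twin-related reflections always land in the same set. It is only an illustration, not how PHENIX implements use_lattice_symmetry: the function names (twin_mate, assign_free_flags) are hypothetical, the twin law -k,-h,-l is the one reported for this data set, and crystallographic symmetry, Friedel mates and pseudo-translation are all ignored.

```python
import random


def twin_mate(hkl):
    """Apply the twin law -k,-h,-l to a Miller index (h, k, l)."""
    h, k, l = hkl
    return (-k, -h, -l)


def assign_free_flags(indices, fraction=0.05, seed=0):
    """Return {hkl: bool}, where True marks a free (test-set) reflection.

    Twin-related reflections always receive the same flag, so a test
    reflection never has its twin mate in the working set.
    Assumption: this twin law is its own inverse, so mates form pairs.
    """
    rng = random.Random(seed)
    flags = {}
    for hkl in indices:
        if hkl in flags:
            continue  # already flagged as the mate of an earlier reflection
        is_free = rng.random() < fraction
        flags[hkl] = is_free
        flags[twin_mate(hkl)] = is_free
    # Report only the reflections that were actually supplied.
    return {hkl: flags[hkl] for hkl in indices}


if __name__ == "__main__":
    reflections = [(1, 2, 3), (-2, -1, -3), (0, 0, 1), (0, 0, -1), (1, -1, 0)]
    for hkl, is_free in assign_free_flags(reflections, fraction=0.5).items():
        print(hkl, "free" if is_free else "work")
```

If the flags were instead drawn independently for each reflection, a test reflection could have its twin mate in the working set, and Rfree would then be coupled to the data used in refinement.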
ML is better than LS because ML better accounts for model errors and incompleteness, taking the latter into account statistically.
Do they come from sigma-A estimation?
phenix.model_vs_data model.pdb data.mtz
Does it suggest that you have twinning?
Yes, it says: Data: twinned : -k,-h,-l
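
For readers unfamiliar with the notation, the twin law -k,-h,-l means that each measured intensity is a mixture of the intensity at (h, k, l) and the one at (-k, -h, -l). The Python sketch below shows this standard hemihedral-twinning relation; the function name, the twin fraction alpha and the example intensities are made up for illustration and do not come from the data discussed here.

```python
def twinned_intensity(intensities, hkl, alpha):
    """Observed intensity of hkl for a hemihedral twin with fraction alpha.

    I_obs(h,k,l) = (1 - alpha) * I(h,k,l) + alpha * I(-k,-h,-l)
    `intensities` maps Miller indices to untwinned intensities (hypothetical).
    """
    h, k, l = hkl
    mate = (-k, -h, -l)  # twin law -k,-h,-l
    return (1.0 - alpha) * intensities[hkl] + alpha * intensities[mate]


if __name__ == "__main__":
    untwinned = {(1, 2, 3): 100.0, (-2, -1, -3): 20.0}
    for alpha in (0.0, 0.25, 0.5):
        pair = [twinned_intensity(untwinned, hkl, alpha) for hkl in untwinned]
        print(alpha, pair)
```

As alpha approaches 0.5 the two twin-related intensities become equal, which is why the twin-related pair cannot be treated as independent observations.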
I do not know what's implemented in Refmac - I'm not aware of a corresponding publication.
FYI, I think slide No. 13 of this presentation describes the likelihood function for the twinned case: http://www.ysbl.york.ac.uk/refmac/Presentations/refmac_Osaka.pdf
Typically, when people send us a "reproducer" (all the inputs needed to reproduce the problem), we can work much more efficiently; otherwise it takes a lot of emails before one can start to have a clue about the problem.
I fully understand, but I'm sorry that I couldn't send one. I will do my best to give you sufficient information. Thank you for giving me the solution! Cheers, Keitaro