Phenix reflection statistics
What does the correlation coefficient that is reported at the end of phenix.reflection_statistics represent? (I would assume the CC of F1 vs F2 for all reflections with an observation in datasets 1 and 2). The documentation "...the output will also show the correlations between the datasets and, if applicable, between the anomalous differences, both as overall values and in bins." is a bit ambiguous.

Also in the below comparison, how would the overall correlation be 0.901 when all the bins have correlations between 0.330-0.858?

CC Obs 2 3 0.901 h,k,l
Correlation of:
  2: 1.mtz:F-obs,SIGF-obs
  3: 2.mtz:F-obs,SIGF-obs
Overall correlation: 0.901
unused:         - 16.7088 [   0/9   ]
bin  1: 16.7088 -  2.0456 [5236/5928] 0.858
bin  2:  2.0456 -  1.6241 [5610/5946] 0.744
bin  3:  1.6241 -  1.4189 [5539/5925] 0.641
bin  4:  1.4189 -  1.2892 [5357/5873] 0.516
bin  5:  1.2892 -  1.1969 [5405/5972] 0.497
bin  6:  1.1969 -  1.1263 [5241/5939] 0.464
bin  7:  1.1263 -  1.0699 [5067/5903] 0.408
bin  8:  1.0699 -  1.0234 [4960/5934] 0.416
bin  9:  1.0234 -  0.9840 [4808/5941] 0.376
bin 10:  0.9840 -  0.9500 [4700/5931] 0.348
unused:  0.9500 -         [3339/6874] 0.330

Thanks,
James

--
James Fraser
[email protected]
Alber Lab
356 Stanley Hall, QB3
UC Berkeley
Berkeley, CA 94720
http://ucxray.berkeley.edu/~jfraser/
On Fri, Aug 6, 2010 at 8:05 AM, James Fraser wrote:
What does the correlation coefficient that is reported at the end of phenix.reflection_statistics represent? (I would assume the CC of F1 vs F2 for all reflections with an observation in datasets 1 and 2).
Correct.
Also in the below comparison, how would the overall correlation be 0.901 when all the bins have correlations between 0.330-0.858?
This appears to be a property of correlation coefficients - I'm still trying to figure out the precise mathematical explanation, but it is expected.
http://mathworld.wolfram.com/CorrelationCoefficient.html
-Nat
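One way to make this precise, offered as a sketch and not as a description of what phenix.reflection_statistics computes internally, is the law of total covariance. The bin label s and the conditional expectations below are notation introduced here for illustration. Splitting the reflections into resolution bins s gives:

```latex
\begin{aligned}
\operatorname{Cov}(F_1, F_2)
  &= \underbrace{\mathbb{E}_s\!\left[\operatorname{Cov}(F_1, F_2 \mid s)\right]}_{\text{within bins}}
   + \underbrace{\operatorname{Cov}_s\!\left(\mathbb{E}[F_1 \mid s],\ \mathbb{E}[F_2 \mid s]\right)}_{\text{between bin means}}, \\
\operatorname{Var}(F_i)
  &= \mathbb{E}_s\!\left[\operatorname{Var}(F_i \mid s)\right]
   + \operatorname{Var}_s\!\left(\mathbb{E}[F_i \mid s]\right), \qquad i = 1, 2, \\
\mathrm{CC}_{\text{overall}}
  &= \frac{\operatorname{Cov}(F_1, F_2)}{\sqrt{\operatorname{Var}(F_1)\,\operatorname{Var}(F_2)}} .
\end{aligned}
```

The per-bin CCs only see the within-bin terms. The overall CC also contains the between-bin terms, and because the mean amplitude in both datasets falls off with resolution in the same way, those terms are large and almost perfectly correlated. When they dominate, the overall CC can exceed every per-bin CC.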
Correlation coefficients assume that the data being correlated all arise from the same underlying distribution. This is not true for Fs, since low-resolution Fs tend to be larger than high-resolution ones. CCs are probably valid within resolution shells. Over all resolution ranges you just see a correlation due to the average F effect. This is why in my program Pointless I use CC on E^2, which removes (most of) this effect.

The correlation coefficient is just the slope of the LSQ line through a scatter plot of (say) F1 vs. F2, once both are scaled to unit variance, so you can see that the large low-resolution Fs will be at the top right of the scatter plot even if they're not the same, and the small high-resolution ones at the bottom left.

Phil
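The effect described above can be reproduced with a toy simulation. This is a minimal sketch under stated assumptions, not the calculation done by phenix.reflection_statistics or Pointless: the exponential falloff of the mean amplitude, the noise model, and the function names (`pearson`, `simulate_bins`, `normalize_by_mean`) are all invented for illustration. Dividing each bin by its mean amplitude is only a crude stand-in for the E^2 normalization Phil mentions.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_bins(n_bins=10, per_bin=2000, seed=0):
    """Two noisy 'measurements' of the same amplitudes, bin by bin.

    Hypothetical model: the mean amplitude falls off with bin index
    (bin 0 = lowest resolution) and the relative noise grows.
    """
    rng = random.Random(seed)
    bins = []
    for b in range(n_bins):
        mean_f = 100.0 * math.exp(-0.3 * b)   # assumed falloff, not fitted to real data
        rel_noise = 0.12 + 0.03 * b           # assumed noise level per bin
        f1, f2 = [], []
        for _ in range(per_bin):
            true_f = mean_f * rng.uniform(0.5, 1.5)
            f1.append(true_f + rng.gauss(0.0, rel_noise * mean_f))
            f2.append(true_f + rng.gauss(0.0, rel_noise * mean_f))
        bins.append((f1, f2))
    return bins


def normalize_by_mean(values):
    """Divide by the bin mean so every bin has mean 1 (crude E-like scaling)."""
    m = sum(values) / len(values)
    return [v / m for v in values]


bins = simulate_bins()

# CC within each resolution bin
per_bin_cc = [pearson(f1, f2) for f1, f2 in bins]

# CC over all reflections pooled together, on the raw amplitudes
all_f1 = [f for f1, _ in bins for f in f1]
all_f2 = [f for _, f2 in bins for f in f2]
overall_cc = pearson(all_f1, all_f2)

# CC over all reflections after removing the per-bin average amplitude
norm_f1 = [e for f1, _ in bins for e in normalize_by_mean(f1)]
norm_f2 = [e for _, f2 in bins for e in normalize_by_mean(f2)]
normalized_cc = pearson(norm_f1, norm_f2)

print("per-bin CC:   ", [round(cc, 3) for cc in per_bin_cc])
print("overall CC:   ", round(overall_cc, 3))
print("normalized CC:", round(normalized_cc, 3))
```

In this toy model the pooled CC on raw amplitudes comes out higher than the CC in any single bin, while the pooled CC on the normalized values falls back inside the range of the per-bin values.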
participants (3): James Fraser, Nathaniel Echols, Phil Evans