On 10/9/2011 10:38 AM, Nathaniel Echols wrote:
On Sat, Oct 8, 2011 at 11:27 PM, Dale Tronrud wrote:

"more precise than is actually the case"? I don't understand this. A map has precision to infinitely fine spacing. The sampling grids we choose are the artifacts - the coarser the grid the worse the representation. The prismatic points and spikes of a coarsely sampled map are aliasing artifacts. A smoothly varying surface is an accurate representation of the continuous density function.
The spacing between grid points is telling you something about how well each of those grid points is resolved. Even if the electron density is continuous, it still comes from an incomplete Fourier series and is full of artifacts and ambiguity. Spacing your grid points every d_min/6 implies (to my eyes, anyway) that the optical resolution allows you to accurately distinguish the values at those points, which isn't actually the case. It's not necessarily mathematically inaccurate, but since most of us are trained to model-build using a grid spacing of d_min/3 or d_min/4 (or whatever the default is in Coot), we "know" what a 3 A map looks like, and a 1.5 A map, etc.
I'm not sure what "artifacts and ambiguity" you are worried about. Surely Phenix has worked hard to come up with the best mtz file it can, and adding additional artifacts by coarsely sampling the map is not likely to improve the situation. Here is my analysis of the problem of contouring a coarsely sampled map. Since this is not the CCP4 BB, maybe I can be forgiven a few figures.

When you calculate a map on a grid, the FFT produces values at the location of the sampling points with quite good accuracy; the precision is limited only by the round-off error of the computer (and the FFT is pretty robust with respect to round-off error). Where you get in trouble is when you start making assumptions about what the map function does between the sampled points. To the best of my knowledge, Coot and Pymol still use the simplex contouring method that has been used since computers were first asked to perform this task. In this algorithm the location of the contour is found by linear interpolation between the nearest two grid points. Linear interpolation is the second worst method I've seen for doing this.

In the first figure below (Three Fold Sampling.png) I show a sine wave along with three sample points and the linear interpolation between them. I propose to contour at 0.5. (Since this is one dimension, the contour is a set of single points where the function crosses the contour level.) I've marked the locations of the contour "points" with red dots. The true sine wave is also plotted, and the true locations of these contour points are marked with black dots. You can see that the red dot on the left is considerably displaced from where it should be. When interpreting a map one tends to assume that the peak of the density will be centrally located between the two contour points, but clearly that is not true here: the peak of the sine wave is considerably to the left of the average of the two red dots. You will be trying to center an atom in the density but find it always drifts to the edge of the density upon refinement. Have you ever noticed that happening when building a water molecule into a "ball" of contours?

The exact amount of displacement of the contour will change as the phase of the sine wave shifts relative to the sampling grid, but even though, in this case, a phase shift to the right will, for a time, improve the placement of the left contour point, it will degrade the placement of the one on the right. There is no good compromise.

The next plot (Four Fold Sampling.png) shows that adding one more sampling point gives a good improvement in the placement of the contour point, and going to six (Six Fold Sampling.png) does quite well. Actually, the background sine wave in each plot is just a 100-point sampling, so clearly even sampling at 100 times the resolution does not create any artifacts, misrepresentations, or overstatement of the smoothness of a sine wave. Increasing the number of sampling points simply allows the linear interpolation to work better. With computer memory and speed what they are today I'd be happy to sample even more finely, but the line width gets in the way and I start having problems seeing the lines representing the bonds inside the density.

Six-fold oversampling is a good compromise. Three-fold is not. Why can people interpret a map with three-fold oversampling? Because most of the information about peptide chain placement is in the data around 4 A resolution.
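If you would rather check this numerically than squint at my figures, here is a minimal one-dimensional Python sketch (NumPy, with names of my own invention - an illustration, not code from Coot or Pymol) that locates the 0.5 contour crossings of a sine wave by linear interpolation between sample points, the same operation simplex contouring performs along each grid edge:

import numpy as np

def contour_crossings(n_samples, level=0.5):
    """Linearly interpolated crossings of sin(x) sampled n_samples
    times over one period - simplex contouring in one dimension."""
    x = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    y = np.sin(x)
    found = []
    for i in range(n_samples - 1):
        y0, y1 = y[i], y[i + 1]
        if (y0 - level) * (y1 - level) < 0:   # contour lies in this interval
            t = (level - y0) / (y1 - y0)      # linear interpolation
            found.append(x[i] + t * (x[i + 1] - x[i]))
    return found

exact = [np.pi / 6.0, 5.0 * np.pi / 6.0]      # true sin(x) = 0.5 crossings
for n in (3, 4, 6, 100):
    found = contour_crossings(n)
    errors = [abs(f - e) for f, e in zip(found, exact)]
    print(n, "samples/period:", ["%.3f" % f for f in found],
          "errors:", ["%.3f" % e for e in errors])

With three samples per period the left crossing lands at x = 1.209 instead of the true 0.524 - that is the badly displaced red dot - while six samples per period put both crossings within about 0.08 of the truth.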
If you have a 2 A data set and sample with three-fold oversampling, the 4 A reflections will be sampled with six-fold oversampling and the contours will be roughly correct. You will not be "seeing" your 2 A data very well, but you will get the fold of your protein right. If you have 3 A data and sample your map with a 1 A grid, you will have a much more difficult time visualizing what is going on, because the placement of the contours will have quite large errors.

I find it odd when someone says that the smooth surface of a finely sampled, low-resolution map is "unnatural". How else should a low-resolution map appear? It has to be smooth - it's low resolution! You have to have high-resolution Fourier coefficients to observe lumps and bumps and edges and points.

Dale Tronrud

P.S. The parameter in Coot that sets the sampling of its maps is 1/2 the numbers used in this letter. The default in Coot is 1.5, which is a three-fold oversample. I set Coot to 3 for all my work.
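P.P.S. For anyone who wants to check the arithmetic above, a few lines of Python (the helper function is my own shorthand, not anything from Coot or Phenix):

def samples_per_period(d_min, oversample, d):
    """Samples per period of a d-Angstrom Fourier component on a
    grid of spacing d_min / oversample."""
    grid_spacing = d_min / oversample
    return d / grid_spacing

# 2 A data on a d_min/3 grid: the 4 A terms are sampled six-fold ...
print(samples_per_period(d_min=2.0, oversample=3, d=4.0))  # 6.0
# ... but the 2 A terms only three-fold.
print(samples_per_period(d_min=2.0, oversample=3, d=2.0))  # 3.0
# 3 A data on a 1 A grid: still only three-fold at the resolution limit.
print(samples_per_period(d_min=3.0, oversample=3, d=3.0))  # 3.0
# Coot's map sampling parameter is half these oversampling numbers:
# the default 1.5 gives a d_min/3 grid; setting it to 3 gives d_min/6.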
(I know this is all nit-picking, but I have in mind a specific figure in a methods paper where the authors compared a 2mFo-DFc map before and after their magical map improvement procedure, with much more detail visible in the "after" maps. I had to read it twice to realize that the "after" map had a much finer grid spacing - of course it looked much better!)
-Nat