Appropriate model_vs_data usage over part of a structure

Perhaps a necessary preface: I am largely learning as I go, and large parts of Phenix (and the details of refinement data file formats) are still black boxes for me.

Suppose I have a model of one chain of a complex and an .mtz describing the entire complex. I have performed some refinement procedure that alters the model of that single chain. My aim is to determine how this refinement procedure has affected the fit to the experimental data--or, really, to the subset of the experimental data expected to be relevant for my model. For obvious reasons,

    phenix.model_vs_data chain_A.pdb whole_complex.mtz
    phenix.model_vs_data chain_A_refined.pdb whole_complex.mtz

are undesirable: they evaluate the model of a single chain against the whole complex's data, and there is obviously a lot of unsatisfied electron density. So, while these commands *work*, they report an unrepresentative fit. (In theory, one solution to my problem would be to recombine the refined versions of each chain and then run model_vs_data on the recombined, refined complex. I'm interested nonetheless in how it would work on the chains separately.)

Now, during the refinement procedure we of course generate .ccp4 density maps for the individual chain models:

    phenix.maps chain_A.pdb whole_complex.mtz

which produces the .ccp4 file as well as, crucially, chain_A_map_coeffs.mtz. Attempting to employ that resulting .mtz file instead, i.e.

    phenix.model_vs_data chain_A.pdb chain_A_map_coeffs.mtz

(and the analogous command for the refined chain), leads to an evocative error:

    Sorry_No_array_of_the_required_type: No reflection arrays available.

My highest suspicion is that I need to alter maps.params in a particular way so that reflection arrays are also output to the *_map_coeffs.mtz file. I could also imagine that I need to be using another program entirely! Thanks in advance for whatever help you can provide; unfortunately, I can't provide any input files.
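
(A hedged note for anyone hitting the same error: that Sorry most likely means model_vs_data could not find the experimental arrays it needs--observed amplitudes or intensities plus R-free flags--in the file it was given. One quick way to see what a given .mtz actually contains is to dump its column labels; the file names below are just the hypothetical ones from this message:

    phenix.mtz.dump whole_complex.mtz
    phenix.mtz.dump chain_A_map_coeffs.mtz

The data-reduction .mtz should list observed amplitudes or intensities and a free-R flag, while a map-coefficients .mtz typically lists only calculated amplitude/phase pairs such as 2FOFCWT/PH2FOFCWT.)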

You might look at the real-space correlation coefficient to see whether your model fits that particular map better than before, but that's probably it. There is no subset of the experimental data influenced by just one chain: the whole complex contributes to every measurement and, conversely, your chain A model contributes to every measurement you used to generate the maps. Sorry, but that's it.

Christian

On 25.05.2016 at 17:31, Andy Watkins wrote:
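
(One way to compute that real-space correlation, sketched with the hypothetical file names from the original question; phenix.real_space_correlation is the relevant tool, and its exact options and defaults may differ between Phenix versions:

    phenix.real_space_correlation chain_A.pdb whole_complex.mtz
    phenix.real_space_correlation chain_A_refined.pdb whole_complex.mtz

Comparing the per-residue correlation values from the two runs indicates whether the refined chain fits its local density better, even though the underlying Fobs are still those of the whole complex.)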

Thanks--that certainly makes sense. In that case, I suppose I have a
different question: what distinguishes the .mtz file suffixed "map_coeffs,"
which is output by phenix.maps, from the input .mtz?
On Wed, May 25, 2016 at 1:28 PM, Christian Roth wrote:

An mtz file is a generic container for information that is represented as Fourier coefficients: amplitudes and/or phases. The mtz you get from data reduction contains your observed diffraction pattern. There are quite a variety of things that might be in there, e.g. intensities, amplitudes, indications of anomalous scattering.

The "map_coeffs" mtz probably echoes all of those data and includes the amplitudes/phases which can be used to calculate several maps. These coefficients are derived from both your observations and the coordinates of the model that matches the "map_coeffs" file. The program could simply write a map file directly, but the Fourier representation takes less disk space and is quicker to read and write than a map.

The "map_coeffs" file is a "throw away" file. You use it to look at your map, but when you continue refinement you should always use the original mtz. You don't want to refine your model against an mtz that contains data calculated from your model. While it might be okay, it is safer to stick with the original.

Dale Tronrud

On 5/25/2016 10:38 AM, Andy Watkins wrote:
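
(To make that division of labor concrete, a minimal sketch using the file names from earlier in the thread; the exact invocations are assumptions, not taken from the original posts:

    # reciprocal-space refinement: always against the original observed data
    phenix.refine chain_A.pdb whole_complex.mtz

    # the *_map_coeffs.mtz from phenix.maps is only for map display, e.g. in Coot
    coot --pdb chain_A.pdb --auto chain_A_map_coeffs.mtz

The calculated coefficients are for looking at maps; the observations remain the sole experimental input to refinement.)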

Thanks for clarifying its purpose! I guess I do still have one question,
though: are ccp4 files computed via phenix.maps (from, e.g., a .pdb and an
.mtz) unsuitable for real-space refinement (e.g., some kind of structural
optimization with a scoring function that includes an electron density
term) because such a ccp4 file is itself influenced by the starting model?
On Wed, May 25, 2016 at 1:57 PM, Dale Tronrud wrote:

You can do real-space refinement against such maps, and we regularly do so inside Coot, for example. Such refinement should only be used for big changes to the model, where reciprocal-space refinement is of limited utility. The final word is to perform restrained, reciprocal-space refinement because, as you say, there is a clear distinction between the information from outside sources (the restraints and the diffraction pattern) and the model. The situation is, mathematically, pretty squishy when you are using as "observations" a map that is partially derived from the model you are trying to improve.

When you have an electron density map from microscopy the situation is quite different. Then the map is an independent source of information and you don't have a dog chasing its own tail.

Dale Tronrud

On 5/25/2016 11:14 AM, Andy Watkins wrote:
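
(For completeness, a sketch of what such a real-space step could look like within Phenix itself; the map file name and resolution below are placeholders rather than values from this thread:

    phenix.real_space_refine chain_A.pdb chain_A_map.ccp4 resolution=3.0

As noted above, this is most defensible when the map is independent of the model, as in cryo-EM; against a model-derived 2mFo-DFc map it is best treated as a large-move tool and followed by restrained reciprocal-space refinement against the original data.)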

Hi Andy,
Suppose I have a model of one chain of a complex and an .mtz describing the entire complex.
I'm struggling to make sense of this statement. An MTZ file is just a file format that records some data, with labels associated with it, in binary form to make it easier for machines to handle. So what precisely do you mean by ".mtz describing the entire complex"?
Okay. This is what refinement is for.
Could you please define "unrepresentative fit" in this context?
I guess Fobs would not like this, as they include contributions from all atoms, as far as I remember from the theory.
Well, this is what it seems *you* do as part of *your* protocol, but this is *not* what the typical workflow includes, as phenix.refine always reports maps upon completion of refinement; so running phenix.maps seems to be an unnecessary step in this scenario.
Cheers, Pavel
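
(In other words, a single refinement run already produces the map outputs in question; a sketch with the same hypothetical file names, where the output naming follows phenix.refine's usual prefix/serial convention and may vary by version:

    phenix.refine chain_A.pdb whole_complex.mtz
    # writes a refined model plus an MTZ of map coefficients
    # (something like chain_A_refined_001.pdb and chain_A_refined_001.mtz),
    # so no separate phenix.maps step is needed just to get maps
)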

To a large extent this may have been resolved by prior replies to the list, but I figure I may as well clarify for your sake if no one else's!
Precisely: I mean an .mtz that contains experimental structure factors. (The use of "describing the entire complex" may be redundant, and my use of that phrase may rest on a misunderstanding that I'll be able to refer to later.)
Well, suppose that the structure factors in whole_complex.mtz reflect a tetramer. I would imagine that *part* of the reason the resulting Rwork and Rfree are poor is that the model contains only one chain; with all four chains, the fit would be considerably better.
Yeah, I think that was the key conceptual issue I was having. Rationally, I had *no* idea how there could possibly be a subset of Fobs that could be associated with a subset of the atoms. But phenix.maps, when handed a PDB file and an MTZ file of structure factors, hands you--in addition to a .ccp4 map--the rather generically named map_coeffs MTZ file. Looking more into the documentation for maps.params, it's probably just separate data:
mFo-DFc and anomalous difference maps will be output in MTZ format
So it's possible that this entire line of not-quite-logically-consistent questioning arose from an incorrect assumption about the meaning of the output map_coeffs MTZ.
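
(For reference, the part of maps.params that controls which coefficient sets go into the output MTZ looks roughly like the block below. This is reconstructed from memory of the phenix.maps defaults, so parameter names and column labels may differ slightly between versions:

    maps {
      map_coefficients {
        map_type = 2mFo-DFc
        mtz_label_amplitudes = 2FOFCWT
        mtz_label_phases = PH2FOFCWT
      }
      map_coefficients {
        map_type = mFo-DFc
        mtz_label_amplitudes = FOFCWT
        mtz_label_phases = PHFOFCWT
      }
    }

Each map_coefficients block adds one calculated amplitude/phase pair to the *_map_coeffs.mtz; none of them copies the observed Fobs or R-free arrays, which would explain why model_vs_data refuses the file.)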
Well, I intentionally didn't mention a particular refinement protocol: I'm not using phenix.refine, but--in what I'm testing at the moment--an external real-space refinement that uses the .ccp4 map to provide restraints.
Right: this was due to the aforementioned conceptual error re: what the "map_coeffs" output really represented.
participants (4):
- Andy Watkins
- Christian Roth
- Dale Tronrud
- Pavel Afonine