[phenixbb] Phenix.ensemble_refinment questions

Mark Wilson mwilson13 at unl.edu
Fri Nov 14 02:10:46 PST 2014


Hi Tim,
That is a fair point, although the observed (very modest) improvement with
phases was for the final ensemble. My motivation for looking into this is
that some of our unpublished work has indicated that even weak phase
information that is properly handled with a maximum likelihood target
(hence my use of HL coeffs with MLHL) can improve the modeling of disorder
using "CNS-style" non-interacting multicopy ensemble models.  These are
distinct from the ensembles implemented in PHENIX in that they are not
time-averaged MD models but rather a set of n full replicates of the model
that do not interact with each other and are allowed to collectively fit
the data.  I wanted to see how time-averaged ensembles compared to this
approach when phase information was included.
	The overarching rationale for these tests is that phase information
provides a powerful set of additional experimental observations that may
counteract the intrinsic tendency of ensemble approaches to overfit data,
even if the phase information is noisy.  In any event, I certainly did not
see a detrimental effect of added phase information on the R values of the
ensemble models with optimum choices of pTLS and using the MLHL target.
Your broader point, however, is a good one: perhaps the true value of
including phase data isn't evident unless unusually high quality
experimental phases are being used.
Best regards,
Mark

Mark A. Wilson
Associate Professor
Department of Biochemistry/Redox Biology Center
University of Nebraska
N118 Beadle Center
1901 Vine Street
Lincoln, NE 68588
(402) 472-3626
mwilson13 at unl.edu 





On 11/14/14 2:50 AM, "Tim Gruene" <tg at shelx.uni-ac.gwdg.de> wrote:

>Dear Mark,
>
>for a 1A structure, at least when it is near completeness, I would
>expect the phase information from the model to be much, much more accurate
>than from a SAD experiment, i.e. I'd rather expect the R value to get
>worse when you include SAD phasing information. For example, when
>developers show the phase errors of their methods for phasing
>experiments, the error is generally calculated with respect to the final
>model.
>
>Did you observe the drop at an early stage of refinement, with a model
>of low completeness?
>
>Regards,
>Tim
>
>On 11/13/2014 05:03 PM, Mark Wilson wrote:
>> Hi All,
>> As I am (I think) the other user that Nat is referring to, I'll comment.
>> I requested support for experimental phase information in the PHENIX
>> ensemble refinement target and can verify that it is accepted (in my case
>> as HL coeffs) and will run.  The result in my test case was not
>> dramatically different than an amplitude-based target, but obviously a
>> great many factors could affect this.  What differences I saw were minor
>> improvements in R/Rfree (~1%) with Se-Met SAD phases in a 1.05 Å
>> resolution structure of a flexible protein.  I've not dug too much more
>> into this, but I can verify that phases are accepted and do influence the
>> final ensemble.
>> Best regards,
>> Mark
>> 
>> Mark A. Wilson
>> Associate Professor
>> Department of Biochemistry/Redox Biology Center
>> University of Nebraska
>> N118 Beadle Center
>> 1901 Vine Street
>> Lincoln, NE 68588
>> (402) 472-3626
>> mwilson13 at unl.edu
>> 
>> 
>> 
>> 
>> 
>> On 11/13/14 9:53 AM, "Nathaniel Echols" <nechols at lbl.gov> wrote:
>> 
>>> On Thu, Nov 13, 2014 at 6:46 AM, Joseph Brock <joseph.brock at ki.se> wrote:
>>>
>>> 1. In the associated publication (Burnley et al. eLife 2012), the
>>> ensemble refinement is validated by comparing the correlation of the
>>> ensemble-generated map with the map generated from the experimental
>>> phases for PDB entry 1YTT... I am confused how one computes an
>>> experimentally phased map from structure factors deposited in the PDB
>>> that contain only anomalous intensities/amplitudes and not
>>> Hendrickson-Lattman coefficients. Is there a program within the phenix
>>> package that can do this?
>>>
>>> AutoSol can be used to re-solve such datasets, although in the case of
>>> 1YTT it requires additional information that wasn't deposited.
>>>
>>>
>>> 2. Is it possible to include experimental phases during the rolling
>>> average refinement process, and could this be beneficial (if the phases
>>> were of sufficient quality)?
>>>
>>> It is possible, but completely untested aside from verifying that it
>>> doesn't crash.  I added this a year ago at the request of another user
>>> but haven't looked into it since.
>>>
>>>
>>>
>>> 3. What is the function of the "nproc" keyword? If this is the number of
>>> CPU cores that can be used in parallel, what is the most efficient way of
>>> using phenix.ensemble_refinement on a cluster?
>>>
>>> The only parallelization is in the optimization of the ptls parameter -
>>> i.e. if you try N values for ptls, you can run N jobs at once.  For a
>>> single ptls it will run in serial.  So on a cluster, you are better off
>>> running N different jobs separately.
>>>
>>>
>>>
>>> Finally, I noticed that I cannot run phenix.ensemble_refinement using a
>>> "my_parameters.eff" file; it is necessary to type the parameters on the
>>> command line.
>>>
>>> That sounds like a bug...
>>>
>>>
>>> -Nat
>>>
>>>
>>>
>> 
>> 
>> _______________________________________________
>> phenixbb mailing list
>> phenixbb at phenix-online.org
>> http://phenix-online.org/mailman/listinfo/phenixbb
>> 
>
>-- 
>Dr Tim Gruene
>Institut fuer anorganische Chemie
>Tammannstr. 4
>D-37077 Goettingen
>
>GPG Key ID = A46BEE1A
>
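
The cluster advice quoted above (one separate job per ptls value, since a single ptls runs in serial) could be scripted roughly as follows. This is only a sketch under assumptions: the file names model.pdb and data.mtz, the helper name build_ptls_commands, and the "ens" prefix are hypothetical, and the ptls= and output_file_prefix= command-line parameters should be checked against the installed PHENIX version before use. The script only builds the command lines; submitting them to a scheduler is left to the site's own tools.

```python
import shlex


def build_ptls_commands(model, data, ptls_values, prefix="ens"):
    """Return one phenix.ensemble_refinement command line per ptls value.

    Assumptions (not confirmed by the thread): the program accepts
    'ptls=<value>' and 'output_file_prefix=<name>' as command-line
    parameters. Verify both against your PHENIX installation.
    """
    commands = []
    for ptls in ptls_values:
        args = [
            "phenix.ensemble_refinement",
            model,
            data,
            f"ptls={ptls}",
            # A distinct prefix per job keeps the outputs from colliding.
            f"output_file_prefix={prefix}_ptls_{ptls}",
        ]
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    # Hypothetical inputs; each printed line is one independent serial job.
    for cmd in build_ptls_commands("model.pdb", "data.mtz", [0.6, 0.8, 0.9, 1.0]):
        print(cmd)
```

Each printed line can then be handed to the cluster's queueing system as its own job, and the resulting ensembles compared by Rfree afterwards.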



