[phenixbb] automating setting up parallel refinement jobs

Kendall Nettles knettles at scripps.edu
Thu Feb 17 17:12:14 PST 2011


Hi Nat, 
that would be a great help. One thing you could do would be to have one tab of parameters that are distributed to all jobs, and another tab on which each item checked generates a separate run. 

We usually start with just individual xyz and ADP refinement, as long as the structure is at 3 angstroms resolution or better. We then test the additional parameters one per run, and those that improve R/Rfree are added together for the final run. 

I generally have a folder called refine1 that contains all the subjobs, including the throwaway ones and the final combined one, then after rebuilding I would start the next one as refine2. Ideally you could name the folders to identify parameters, and maybe add a check box on the GUI tab to indicate the final keeper run, which would go into the file name. 
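To make the folder layout concrete, here is a minimal sketch of how one round could be planned: a baseline job with the shared parameters, one subfolder per option being tested, and a combined keeper run whose name records what went into it. The function name `plan_jobs`, the folder names, and the file `ncs.eff` are hypothetical and only for illustration; nothing here is an existing Phenix feature.

```python
from pathlib import Path


def plan_jobs(round_dir, common, options, keepers=()):
    """Return (folder, parameter_files) pairs for one refinement round.

    common  -- parameter files shared by every job
    options -- maps a short label to the extra parameter file it adds
    keepers -- labels to combine into the final "keeper" run, if any
    """
    round_dir = Path(round_dir)
    # Baseline job: shared parameters only.
    jobs = [(round_dir / "baseline", list(common))]
    # One throwaway job per option, named after the option it tests.
    for label, eff in sorted(options.items()):
        jobs.append((round_dir / label, list(common) + [eff]))
    # Final combined run; "keeper" goes into the folder name.
    if keepers:
        name = "keeper_" + "_".join(sorted(keepers))
        extra = [options[label] for label in sorted(keepers)]
        jobs.append((round_dir / name, list(common) + extra))
    return jobs


if __name__ == "__main__":
    # File names below are placeholders, not real inputs.
    for folder, effs in plan_jobs(
        "refine1",
        common=["common.eff"],
        options={"tls": "tls.eff", "ncs": "ncs.eff"},
        keepers=["tls"],
    ):
        print(folder, effs)
```

After rebuilding, the same call with "refine2" as the first argument would lay out the next round.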

So far we have just looked at Rfree to pick which ones to keep, and we find a spread of 1-3% between parameters, depending on the structure. 

For outputs, it would be great to have a table of R/Rfree for each run. 
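As a rough illustration of the kind of summary I mean, the sketch below formats R/Rfree per job as a plain-text table sorted by Rfree, and lists the options whose Rfree beats the baseline. It assumes the R values have already been collected by some other means; the function names are hypothetical and the numbers in the usage example are made-up placeholders, not results from any structure.

```python
def improving_options(baseline_rfree, rfree_by_option):
    """Return labels whose Rfree is lower than the baseline, best first."""
    better = [
        (rfree, label)
        for label, rfree in rfree_by_option.items()
        if rfree < baseline_rfree
    ]
    return [label for rfree, label in sorted(better)]


def format_table(results):
    """Format {job: (r_work, r_free)} as a text table sorted by Rfree."""
    width = max([len("job")] + [len(job) for job in results])
    lines = [f"{'job':<{width}}  {'R-work':>6}  {'R-free':>6}"]
    by_rfree = sorted(results.items(), key=lambda item: item[1][1])
    for job, (r_work, r_free) in by_rfree:
        lines.append(f"{job:<{width}}  {r_work:>6.4f}  {r_free:>6.4f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    results = {
        "baseline": (0.2300, 0.2800),
        "tls": (0.2200, 0.2650),
        "ncs": (0.2310, 0.2820),
    }
    print(format_table(results))
    options = {job: r[1] for job, r in results.items() if job != "baseline"}
    print(improving_options(results["baseline"][1], options))
```

Here only Rfree is used to decide what to keep, as described above; a stricter rule (for example a minimum improvement) would be a small change to `improving_options`.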

Have you compared the phenix TLS grouping with various groupings from TLSMD? I will do so this week on our 20 structures and can provide some feedback. Their resolutions range from 2 to 3 angstroms, so they should provide a good test set. 


best regards,
Kendall

On Feb 17, 2011, at 6:42 PM, Nathaniel Echols wrote:

> On Thu, Feb 17, 2011 at 12:02 PM, Kendall Nettles <knettles at scripps.edu> wrote:
>> So here is my feature request: I would love to have a GUI to generate the folders and parameter files, where you would select a list of common parameters, and a list of parameters to populate one per job, instead of having to set up each one as a separate job.
>> 
>>  I think this would really speed things up for your industrial users. I'm working on 20 structures of the same protein with different ligands, and expect to spend maybe 8 hours generating TLS groups and editing the 240 parameter files. A GUI interface would make it 10 or 20 minutes!
> 
> I did something like this for Phaser as a proof-of-concept for simple
> parallelization of tasks:
> 
> http://cci.lbl.gov/~nat/img/phenix/phaser_mp_config.png
> http://cci.lbl.gov/~nat/img/phenix/phaser_mp_results.png
> 
> It runs all search models in parallel, and can sample multiple
> expected RMSDs too.  The calculations can be parallelized over
> multiple cores (I never tried more than 12, I think, but there's no
> limit that I'm aware of) or across a cluster.  It only uses one
> dataset with many models, but I could have just as easily done the
> reverse, or both model and data parallel.  This isn't a very
> sophisticated program (it was maybe 2 days effort), but eventually
> we'll have a new MR frontend that does something similar, with lots
> more pre-processing of search models.
> 
> So, from a technical standpoint, it's fairly easy to set up, and
> distributing the jobs and displaying results is relatively easy.  The
> main reason I haven't done anything like this yet is that it isn't
> obvious to me which parameters need to be sampled and which would be
> in common.  (Also, I'm already at the limit of my multitasking
> ability.)  I like the idea of making the user choose; since almost all
> of the controls in the GUI can be generated automatically, a dynamic
> interface is not difficult to set up.  There is a separate problem of
> how to group inputs, but this may not be as hard as I'm imagining.
> (From my perspective, there is yet another issue with how to organize
> and save results - should those 20 structures be one project or 20,
> etc.)
> 
> All that said, I think the immediate problem is actually not too bad -
> phenix.refine will take as many parameter files as you want, so for
> TLS, for example, you just need to make one file that looks like this
> (for example):
> 
> refinement.refine.adp {
>  tls = "chain A"
>  tls = "chain B"
> }
> 
> ... and call it "tls.eff", then run "phenix.refine model1.pdb
> data1.mtz tls.eff other_params.eff", and so on with each dataset.  You
> can have additional parameter files for other settings that you want
> to vary.  It doesn't solve the organizational problem, however, nor
> does it display the results conveniently, but at least it's less time
> spent in a text editor.
> 
> -Nat
> _______________________________________________
> phenixbb mailing list
> phenixbb at phenix-online.org
> http://phenix-online.org/mailman/listinfo/phenixbb
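
Nat's suggestion quoted above (one shared tls.eff passed to phenix.refine together with each dataset's model and data files) could be scripted roughly as follows. This is only a sketch: the helper names `tls_params` and `refine_command` are hypothetical, the file names model2.pdb and data2.mtz are placeholders, and the script only prints the commands rather than running them.

```python
import shlex


def tls_params(selections):
    """Build the text of a tls.eff file from a list of atom selections."""
    lines = ["refinement.refine.adp {"]
    lines += [f'  tls = "{selection}"' for selection in selections]
    lines.append("}")
    return "\n".join(lines) + "\n"


def refine_command(model, data, param_files):
    """Build the phenix.refine argument list for one dataset."""
    return ["phenix.refine", model, data, *param_files]


if __name__ == "__main__":
    # Contents that would be saved as tls.eff.
    print(tls_params(["chain A", "chain B"]))
    # Placeholder dataset names; one command per (model, data) pair.
    datasets = [("model1.pdb", "data1.mtz"), ("model2.pdb", "data2.mtz")]
    for model, data in datasets:
        command = refine_command(model, data, ["tls.eff", "other_params.eff"])
        print(shlex.join(command))
```

Each argument list could be handed to `subprocess.run` from inside the matching job folder once the parameter files have been written there.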



