On Thu, Feb 17, 2011 at 12:02 PM, Kendall Nettles wrote:
So here is my feature request: I would love to have a GUI interface for generating the folders and parameter files, where you would select a list of common parameters and a list of parameters to vary one per job, instead of having to set up each one as a separate job.
I think this would really speed things up for your industrial users. I'm working on 20 structures of the same protein with different ligands, and expect to spend maybe 8 hours generating TLS groups and editing the 240 parameter files. A GUI interface would make it 10 or 20 minutes!
I did something like this for Phaser as a proof-of-concept for simple parallelization of tasks:

http://cci.lbl.gov/~nat/img/phenix/phaser_mp_config.png
http://cci.lbl.gov/~nat/img/phenix/phaser_mp_results.png

It runs all search models in parallel, and can sample multiple expected RMSDs too. The calculations can be parallelized over multiple cores (I never tried more than 12, I think, but there's no limit that I'm aware of) or across a cluster. It only uses one dataset with many models, but I could just as easily have done the reverse, or parallelized over both models and data. This isn't a very sophisticated program (it was maybe two days of effort), but eventually we'll have a new MR frontend that does something similar, with much more pre-processing of search models.

So, from a technical standpoint, it's fairly easy to set up, and distributing the jobs and displaying results is relatively easy. The main reason I haven't done anything like this yet is that it isn't obvious to me which parameters need to be sampled and which would be held in common. (Also, I'm already at the limit of my multitasking ability.) I like the idea of making the user choose; since almost all of the controls in the GUI can be generated automatically, a dynamic interface is not difficult to set up. There is a separate problem of how to group inputs, but this may not be as hard as I'm imagining. (From my perspective, there is yet another issue of how to organize and save results - should those 20 structures be one project or 20, etc.)

All that said, I think the immediate problem is actually not too bad - phenix.refine will take as many parameter files as you want. For TLS, for example, you just need to make one file that looks like this:

refinement.refine.adp {
  tls = "chain A"
  tls = "chain B"
}

Call it "tls.eff", then run:

phenix.refine model1.pdb data1.mtz tls.eff other_params.eff

and so on with each dataset.
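As a concrete sketch of that batch workflow (the filenames model1.pdb, data1.mtz, etc. are hypothetical placeholders; this only writes the shared tls.eff and composes the command lines, rather than actually launching Phenix):

```python
# Sketch: write the shared tls.eff once, then build one phenix.refine
# command line per dataset. Filenames are hypothetical placeholders.
TLS_EFF = """refinement.refine.adp {
  tls = "chain A"
  tls = "chain B"
}
"""

def build_commands(n_datasets, eff_files=("tls.eff", "other_params.eff")):
    """Return one phenix.refine command line per model/data pair."""
    commands = []
    for i in range(1, n_datasets + 1):
        args = ["phenix.refine", "model%d.pdb" % i, "data%d.mtz" % i]
        args.extend(eff_files)
        commands.append(" ".join(args))
    return commands

if __name__ == "__main__":
    with open("tls.eff", "w") as f:
        f.write(TLS_EFF)
    for cmd in build_commands(20):
        print(cmd)
```

Each printed line can then be run directly, or handed to a queueing system to run the 20 refinements in parallel.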
You can have additional parameter files for other settings that you want to vary. This doesn't solve the organizational problem, nor does it display the results conveniently, but at least it means less time spent in a text editor.

-Nat
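The organizational side can also be scripted. A minimal sketch of the folder scaffolding the requested GUI might generate - the directory layout, file names, and the make_job_dirs helper are all my own invention, not part of Phenix:

```python
import os

def make_job_dirs(root, datasets, shared_eff):
    """Create one subdirectory per job, each holding its own copy of the
    shared parameter file so per-job edits stay isolated.

    `datasets` maps a job name to its (pdb, mtz) pair; the layout and
    file names here are hypothetical, not a Phenix convention.
    """
    for name, (pdb, mtz) in datasets.items():
        job_dir = os.path.join(root, name)
        os.makedirs(job_dir, exist_ok=True)
        # Per-job copy of the shared settings, free to edit independently.
        with open(os.path.join(job_dir, "params.eff"), "w") as f:
            f.write(shared_eff)
        # Record the inputs this job should be launched with.
        with open(os.path.join(job_dir, "inputs.txt"), "w") as f:
            f.write("%s %s params.eff\n" % (pdb, mtz))

if __name__ == "__main__":
    shared = 'refinement.refine.adp {\n  tls = "chain A"\n}\n'
    make_job_dirs("runs", {"ligand1": ("model1.pdb", "data1.mtz")}, shared)
```

Whether those 20 jobs should then live in one project or 20 is exactly the open question above; a scheme like this at least keeps each job's inputs and edits in one place.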