Lionel,

It's not quite as easy as one would hope, and it has nothing to do with Phenix or Phaser. We make use of extensive multiprocessing across our cluster in our RAPD software used at the beamline to speed up the results for the user.

I set this up a while ago, but from what I remember, you first have to set up a 'parallel environment' (PE) in SGE. The options may be different depending on your version of SGE; we use 6.2u4. There are probably default PEs set up already, but they might not have the correct parameters. I created a new one called 'smp' with the number of 'slots' set to the number of cores in your cluster, or some lower limit. (If you set 'slots' to 12 and submit 5 jobs requiring 4 slots each, only three will run until one finishes and its slots are freed.) The 'allocation_rule' is set to '$pe_slots' so that all of a job's slots are on a single node. There are other rules that might be better for what you want to do. In our case, I set up different queues with different priorities that have access to specific PEs depending on the jobs that are being submitted at the beamline. Your setup may not need this complexity. I would read through the huge SGE manual for details or search Oracle's website.
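
To make the slot arithmetic in the parenthetical concrete, here is a minimal Python sketch. It is only a toy model of slot accounting (jobs start in submission order if enough slots are free); the real SGE scheduler also weighs priorities, queues and per-node limits, and the function name is my own invention.

```python
def schedule(total_slots, requests):
    """Toy slot accounting: start each job, in order, if enough slots remain.

    This is NOT how SGE schedules in general; it only illustrates why
    a PE with 12 slots can run three 4-slot jobs at once.
    """
    free = total_slots
    running, waiting = [], []
    for job_id, need in enumerate(requests):
        if need <= free:
            free -= need          # reserve the slots for this job
            running.append(job_id)
        else:
            waiting.append(job_id)  # stays queued until slots are freed
    return running, waiting


# PE with 12 slots, five jobs asking for 4 slots each
running, waiting = schedule(12, [4, 4, 4, 4, 4])
print("running:", running)  # jobs 0, 1, 2
print("waiting:", waiting)  # jobs 3, 4
```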

When you submit the job, make sure you add the PE request, as in 'qsub ... -pe smp 1-4 ...', which tells SGE that your job needs 1-4 cores on a single node. You could also specify a single integer (4 instead of 1-4) to request exactly 4 slots. Obviously, you can modify these to your needs. After you submit the job, run 'qstat' and look at the column labeled 'slots' to see how many slots have been allocated to the job.
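
If you submit from a script, the command line can be assembled along these lines. This is just a sketch: the helper name and the script name 'run_phaser.sh' are hypothetical, and the PE name 'smp' assumes a PE set up as described above. It only builds and prints the command; nothing is submitted.

```python
def qsub_command(script, pe="smp", min_slots=1, max_slots=4):
    """Build a qsub argument list requesting a slot range from a PE.

    'smp' is the PE name used in this message; the script name passed
    in is whatever job script you would normally submit.
    """
    if min_slots == max_slots:
        slots = str(max_slots)               # fixed request, e.g. "4"
    else:
        slots = f"{min_slots}-{max_slots}"   # range request, e.g. "1-4"
    return ["qsub", "-pe", pe, slots, script]


# Hypothetical job script name, for illustration only
cmd = qsub_command("run_phaser.sh")
print(" ".join(cmd))  # qsub -pe smp 1-4 run_phaser.sh
```

The list form can be passed straight to subprocess.run() on a submit host if you want the script to do the submission itself.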

In your Phaser command include 'JOBS 4' to match your requested number of slots. I am not sure how much this speeds up a single Phaser job because there isn't a whole lot of code to parallelize in MR. Randy Read mentioned (either to me or the BB, I don't remember) that everything that could be parallelized in Phaser is done.
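
If you request a range such as 1-4, you will not know in advance how many slots you get. SGE exports the granted number to a parallel job as the NSLOTS environment variable, so the JOBS line can be generated at run time. A minimal sketch (the helper names are mine, and how you feed the keyword to Phaser depends on your own script):

```python
import os


def granted_slots(default=1):
    """Slots SGE granted to this job, read from NSLOTS.

    Falls back to `default` when NSLOTS is unset or malformed,
    for example when running outside the cluster.
    """
    try:
        return max(1, int(os.environ.get("NSLOTS", default)))
    except ValueError:
        return default


def phaser_jobs_keyword(n=None):
    """Return the 'JOBS n' keyword line matching the granted slots."""
    if n is None:
        n = granted_slots()
    return f"JOBS {n}"


print(phaser_jobs_keyword())
```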

On a side note, if you write code in Python, each new multiprocessing.Process() you start runs as a separate process, which the operating system will normally place on another core of the same node. You have to account for this when you request a specific number of slots during job submission, otherwise you could overload your cluster pretty quickly. Many programs have an optimal number of slots, and requesting more will not make them run any faster, but it will limit the resources available for other jobs on the cluster. I assume Phaser is one of these programs.
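
One way to keep a Python job within its request is to size the worker pool from the slots SGE granted (the NSLOTS environment variable) rather than from the number of cores on the node. This is a sketch under that assumption, not what RAPD actually does; 'square' is a stand-in for real work.

```python
import multiprocessing
import os


def worker_count():
    """Number of worker processes: granted slots, capped at the node's cores.

    Defaults to 1 when NSLOTS is not set (e.g. outside SGE).
    """
    slots = int(os.environ.get("NSLOTS", "1"))
    return max(1, min(slots, multiprocessing.cpu_count()))


def square(x):
    # Placeholder task standing in for real per-item work
    return x * x


if __name__ == "__main__":
    # The pool never starts more workers than the job was granted
    with multiprocessing.Pool(processes=worker_count()) as pool:
        print(pool.map(square, range(8)))
```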

Jon
-- 
Jonathan P. Schuermann, Ph. D.
Beamline Scientist, NE-CAT
Argonne National Laboratory, 436E
9700 S. Cass Ave.
Argonne, IL 60439

Email: [email protected]
Tel: (630) 252-0682


On 07/10/2013 09:16 AM, L. Costenaro (IBB) wrote:
Hello,

I am trying to run phaser on an SGE cluster (qsub) using multiple processors (either phenix.phaser from the command line or phaser-MR from the GUI), but the jobs do not parallelize. When I run phaser-MR locally (same executable), it does parallelize (multiple python threads).

Any help or advice would be welcome.

Best regards,
Lionel


_______________________________________________
phenixbb mailing list
[email protected]
http://phenix-online.org/mailman/listinfo/phenixbb