Hi,

What is the best way to run AutoBuild from the command line if I need to run it on hundreds of datasets? I have a single machine with 24 cores and access to a cluster with 128 cores (in parallel; accessible only through a PBS script).

Each AutoBuild job with nproc=4 on a single core machine takes ~12 hours to run (would nproc > 4 not make any difference to the speed of the computation?). However, I read that I could set nproc=10 (or more) and run in parallel ( https://www.phenix-online.org/documentation/reference/autobuild.html#paralle...).

When I tried this through a PBS script, I got the following error:

XXXXXXXXXXXXXX
Running up to 1 jobs in parallel... with total of 3 jobs
Splitting work into 3 jobs and running with 1 processors using qsub background=False nproc=20 background=False in ~phenixWorkingDir/1yya/part1/8tim/AutoBuild_run_5_/TEMP0
Final job will be run with sh with background=True
Starting job 1...Log will be: ~phenixWorkingDir/1yya/part1/8tim/AutoBuild_run_5_/TEMP0/RUN_FILE_1.log
Traceback (most recent call last):
  File "~opt/phenix-1.9-1692/phenix/phenix/autosol/AutoBaseExtend.py", line 989, in DoNextMethod
    self.CarryOutBest() # to be obtained after it is finished
  File "~opt/phenix-1.9-1692/phenix/phenix/autosol/AutoBaseExtend.py", line 2147, in CarryOutBest
    getattr(self,str(self.application_method))() # call this fn
  File "~opt/phenix-1.9-1692/phenix/phenix/wizards/AutoBuild.py", line 2364, in AutoBuild_build_cycle
    self.AutoBuild_rebuild_cycle_run()
  File "~opt/phenix-1.9-1692/phenix/phenix/wizards/AutoBuild.py", line 4647, in AutoBuild_rebuild_cycle_run
    always_reuse_model=always_reuse_model)
  File "~opt/phenix-1.9-1692/phenix/phenix/wizards/AutoBuild.py", line 7452, in AutoBuild_build_refine
    mtz_file=mtz_file,mtz_ref_file=mtz_ref_file)
  File "~opt/phenix-1.9-1692/phenix/phenix/wizards/AutoBuild.py", line 7696, in run_standard_build_in_parallel
    r.run(out=sys.stdout)
  File "~opt/phenix-1.9-1692/phenix/phenix/autosol/run_group_of_wizards.py", line 294, in run
    self.start_run(run_file,last=is_last)
  File "~opt/phenix-1.9-1692/phenix/phenix/autosol/run_group_of_wizards.py", line 604, in start_run
    cmd+" "+self.add_double_quote(run_file,escape_space=False)).raise_if_errors()
  File "~opt/phenix-1.9-1692/cctbx_project/libtbx/easy_run.py", line 37, in raise_if_errors
    raise Error(msg)
RuntimeError: child process stderr output:
command: 'qsub background=False nproc=20 "~phenixWorkingDir/1yya/part1/8tim/AutoBuild_run_5_/TEMP0/RUN_FILE_1"'
/bin/sh: qsub: command not found
XXXXXXXXXXXXXX

Please let me know if I can share the eff and log files of the run.

Please advise,
Kaushik

--
People living deeply have no fear of death - Anais Nin
Caution: I am still the dumbest person I have ever known :-)
Dear Kaushik,

as you have many jobs to run, it will be faster to run each job on a single core and use the available cores for different runs.

qsub is a program used to submit jobs to a cluster. Maybe when you submitted your job, you modified the PATH variable and the path to the qsub executable was removed.

Some programs require an immense amount of RAM, so you probably want to make sure that there is enough RAM per core for autobuilding.

Best,
Tim
--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
phone: +49 (0)551 39 22149
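Tim's suggestion of one core per job, with the available cores shared among different datasets, could be sketched roughly as follows. This is only an illustration under stated assumptions: the one-directory-per-dataset layout, the file names data.mtz, seq.dat and model.pdb, and the helper names build_command, run_one and run_all are hypothetical and do not come from the thread. Apart from nproc, the phenix.autobuild arguments shown should be checked against the AutoBuild documentation for your installation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(dataset_dir):
    """Command line for one single-core AutoBuild job.

    Assumption: every dataset directory holds data.mtz, seq.dat and
    model.pdb. These file names are hypothetical; adapt them to your data.
    """
    return [
        "phenix.autobuild",
        "data=data.mtz",
        "seq_file=seq.dat",
        "model=model.pdb",
        "nproc=1",  # one core per job, as suggested in the thread
    ]


def run_one(dataset_dir, make_cmd=build_command):
    """Run one job inside its dataset directory; return (name, exit code)."""
    dataset_dir = Path(dataset_dir)
    # Each job writes its own log next to its input files.
    with open(dataset_dir / "autobuild.log", "w") as log:
        proc = subprocess.run(
            make_cmd(dataset_dir),
            cwd=dataset_dir,
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return dataset_dir.name, proc.returncode


def run_all(dataset_dirs, max_workers=24, make_cmd=build_command):
    """Keep up to max_workers single-core jobs running at the same time."""
    # Threads are enough here: each one only waits for its child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda d: run_one(d, make_cmd), dataset_dirs)
        return dict(results)
```

For example, run_all(sorted(Path("datasets").iterdir())) would work through every subdirectory of a (hypothetical) datasets folder, 24 jobs at a time. In line with Tim's remark about memory, max_workers may need to be lowered if the machine does not have enough RAM for that many simultaneous jobs.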
Hi Kaushik,

I'm sorry for the problem with parallel autobuilding!

I think that parallel_autobuild is unfortunately not going to do what you want. It is for running many parallel jobs with the same starting model and combining all the results. It doesn't work with multiple starting models, as I think you want it to.

On the error you got, I'm not quite sure what happened, but it seems that perhaps the command "qsub" was not found:

  RuntimeError: child process stderr output:
  command: 'qsub background=False nproc=20 "~phenixWorkingDir/1yya/part1/8tim/AutoBuild_run_5_/TEMP0/RUN_FILE_1"'
  /bin/sh: qsub: command not found

It is a little surprising that this would happen in parallel_autobuild and not in autobuild. However, perhaps in autobuild you specified that the multiple jobs are to be run with "sh" (run_command=sh) and in parallel_autobuild it was with qsub (run_command=qsub).

Let me know if that does not help,

All the best,
Tom T
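The "qsub: command not found" line means that the shell which launched the job could not find a qsub executable on its PATH. A quick check like the one below, run in the same environment as the AutoBuild job (for example at the top of the PBS script), would reveal that before a long run starts. It is a minimal sketch: the helper name pick_run_command is hypothetical, and the only Phenix-specific detail it relies on is the run_command=sh / run_command=qsub setting mentioned above.

```python
import shutil


def pick_run_command(preferred="qsub", fallback="sh"):
    """Return `preferred` if it can be found on PATH, otherwise `fallback`."""
    if shutil.which(preferred) is not None:
        return preferred
    return fallback


# Report which value could safely be passed as run_command=... here.
run_command = pick_run_command()
print(f"run_command={run_command}")
```

If this prints run_command=sh on a node where qsub was expected, the PATH seen by the job differs from the one in the interactive login shell, which would be consistent with Tim's guess about a modified PATH.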
Dear Tim and Tom,

Thanks for the response. After attempting multiple parallel runs through PBS array options, I have decided to go with Tim's suggestion of running each job with 1 core. The time difference between nproc=4 and nproc=1 is negligible for my dataset, and it's a lot easier to run without PBS and queues.

Thanks so much,
Kaushik
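The throughput argument behind this choice can be made concrete with some rough arithmetic. The sketch below assumes 300 datasets (a hypothetical figure; the thread only says "hundreds"), the 24-core machine, ~12 hours per job and, following the observation above, the same run time for nproc=1 as for nproc=4. The function name wall_clock_hours is made up for the illustration.

```python
import math


def wall_clock_hours(n_datasets, hours_per_job, total_cores, nproc):
    """Estimate total wall-clock time when jobs run in full batches."""
    concurrent_jobs = total_cores // nproc  # jobs that fit on the machine at once
    batches = math.ceil(n_datasets / concurrent_jobs)
    return batches * hours_per_job


# Assumed: 300 datasets, 12 h per job regardless of nproc, 24 cores.
print(wall_clock_hours(300, 12, 24, nproc=4))  # 6 jobs at a time  -> 600
print(wall_clock_hours(300, 12, 24, nproc=1))  # 24 jobs at a time -> 156
```

Under these assumptions, giving each job a single core cuts the total time to roughly a quarter, because the extra cores per job were not buying any speed.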