phenix.refine command line: what to change on the subsequent def file to point to the desired input pdb
Dear phenix users/developers,

Some time ago there were several modifications on the names/specifications of important parameters/options (such as miller_array for structure factor files, etc.). Running phenix.refine by command lines (no gui) is my favorite, and I am now running into several difficulties:

1- what’s the difference between miller_array and default_miller_array; or model and default_model?
2- what’s the precise definition of phil_files and default_phil? What to change if I manually edit my .eff?
3- how should I use the new automatically created .def file for subsequent refinement cycles after I’ve manually rebuilt (I need to point to the new pdb input file: I usually screw this up, can’t seem to use the newly created .def as my starting instructions file anymore, it complains that I have problems in the right definition of the input pdb, or too many models...)
4- I often fall into a loop where by default newly created .eff files include a dry_run=true option (e.g. when I deliberately tell phenix.refine to overwrite previous .eff files)

Thanks a lot for clarifying on these issues: and sorry, for I couldn’t find obvious answers to them in the documentation (probably my fault!!)

Also, I am using phenix v 1.21.1-5286 (on a Sonoma MacOS laptop)

With best wishes
Alejandro
Hi Alejandro,
I had the same issues with command line batch refinement. The script below works with the new syntax. Replace PREFIX, SERIAL, MTZFILE, and MODEL.PDB with your file names. Of course, change the SG, restraint files, and tls definitions to match your system. You may also want to change the refinement strategy, as the one below is not the default. For the default files in the .eff or .def files, use None as follows:
default_model = None
phil_files = None
default_phil = None
---------------------------begin refine.com---------------------------------------------
phenix.refine \
output.prefix=PREFIX \
output.serial=SERIAL \
MTZFILE \
data_manager.miller_array.labels.name=F,SIGF \
MODEL.PDB \
/titan/tanner/elbow/FAD/elbow.FAD_ideal_pdb.004.cif \
/titan/tanner/elbow/NAD/NAD.cif \
/titan/tanner/elbow/FAD/FDA4.cif \
refinement.crystal_symmetry.space_group="P 1 21 1" \
refine.strategy=tls+individual_sites+individual_sites_real_space+individual_adp+rigid_body \
refine.adp.tls="chain A and peptide" \
refine.adp.tls="chain B and peptide" \
refine.sites.rigid_body="chain A" \
refine.sites.rigid_body="chain B" \
main.simulated_annealing=False \
main.ordered_solvent=False \
ordered_solvent.secondary_map_and_map_cc_filter.poor_cc_threshold=0.75 \
ordered_solvent.h_bond_min_mac=2.4 \
ordered_solvent.h_bond_min_sol=2.4 \
main.number_of_macro_cycles=3 \
output.write_model_cif_file=False \
output.write_geo_file=False
------------------end refine.com------------------------------------------------
Best Regards,
Jack Tanner
-----------------------
John J. Tanner
Professor of Biochemistry and Chemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: [email protected]
https://cafnrfaculty.missouri.edu/tannerlab/
Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A
Dear Jack,

Thank you very much for your message!!

I see that the way you worked around the problem was to have a detailed, “fully argumented” command for each and every cycle, actually avoiding the use of the .def file that an n-th round of refinement generates (a file that is supposedly useful as the initial script for the n+1-th round, just by updating the desired input pdb model).

In any case, it does the trick (in my case having to add several more arguments that I personally need).

(I still don’t quite get what the default phil and model are intended for, nor the phil_files: I see you just indicate “None” for all those, which I will do.)

Thanks again
Alejandro

--
Alejandro Buschiazzo, PhD
Associate Professor
Laboratory of Molecular & Structural Microbiology
Institut Pasteur de Montevideo
Mataojo 2020
Montevideo 11400
URUGUAY
Phone: +598 25220910 ext. 120
Fax: +598 25224185
https://pasteur.uy/en/laboratories/molecular-and-structural-microbiology/
Hi Alejandro,
The data_manager scope is a bookkeeping mechanism for tracking files in
programs. We recently refactored many of our tools to follow this approach
so that they have a consistent interface. Basically, each kind of data you
would provide is mapped to specific Phenix/CCTBX data structures (e.g.
model files to our model object) and the DataManager class manages the file
I/O.
1) Each "default" parameter points to the first file of that type that is
read. It simplifies how to get the data within our programs without having
to specify file names. For more complicated situations, you may need to
provide a PHIL file for specifying models for different uses (e.g. X-ray or
neutron models for joint X-ray/neutron refinement), or for selecting
labels across different data files.
2) The "phil_files" are any parameter files that you specify as an argument
(e.g. selections for restraints, or some common parameters). The parameters
in these files are applied in the order the files are given, so the
settings in the last file are the final settings. Moreover, if the same
parameter is specified as a command-line argument, that setting will
override previous settings.
3) If you only have one model file, like the one you rebuilt, you can
delete the "model" section in the .def file first, then when you run
phenix.refine, provide the model and the .def file. The multiple models
error occurs if you still have a model file specified in the .def file and
you provide your rebuilt model to phenix.refine. The program expects
exactly one model file for refinement. Other models can be used for
reference restraints or joint refinement, but they will be marked as such
in the .def file.
4) You can try adding "dry_run=False" to your phenix.refine command after
providing the .def file as an argument. The command-line argument will
override the .def setting as described in 2.
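Points 3 and 4 can be sketched as a short shell script. This is only an illustration under stated assumptions: the helper name strip_model_scope and the file names (PREFIX_002.def, next_round.def, rebuilt.pdb) are hypothetical, and the .def layout used in the demo (a single "model { ... }" block plus a "default_model = ..." line inside data_manager) is assumed rather than taken from a real file, so check it against your own .def before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print a .def file without its "model { ... }" block
# and without the "default_model = ..." line.
# Assumes the block opens on a line of the form "model {", that braces are
# balanced, and that no other scope in the file is named exactly "model".
strip_model_scope() {
  awk '
    skip == 0 && /^[ \t]*model[ \t]*[{][ \t]*$/ { skip = 1; depth = 1; next }
    skip == 1 {
      depth += gsub(/[{]/, "{")   # count opening braces on this line
      depth -= gsub(/[}]/, "}")   # count closing braces on this line
      if (depth == 0) skip = 0    # the block is closed; resume printing
      next
    }
    /^[ \t]*default_model[ \t]*=/ { next }
    { print }
  ' "$1"
}

# Demo on a made-up .def fragment (the layout is an assumption).
workdir=$(mktemp -d)
cat > "$workdir/PREFIX_002.def" <<'EOF'
data_manager {
  model {
    file = "old_model.pdb"
  }
  default_model = "old_model.pdb"
}
refinement {
  main {
    number_of_macro_cycles = 3
  }
}
EOF
strip_model_scope "$workdir/PREFIX_002.def" > "$workdir/next_round.def"
cat "$workdir/next_round.def"

# Next round, with the rebuilt model as the only model given and the
# command-line argument placed after the .def file so that it overrides it:
#   phenix.refine next_round.def rebuilt.pdb dry_run=False
```

Writing the stripped parameters to a new file leaves the original .def untouched, so it can be compared or restored if the next round fails to start.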
The behavior of the data_manager scope is still a work in progress, but I
can add more documentation to the output of
phenix.refine --show-defaults --attributes-level=1
That will show the help string for each parameter once I add more
information.
It also looks like we should remove the phil_files and default_phil
sections when outputting the .def file.
Let us know if you have any other questions. Thanks!
--
Billy K. Poon
Research Scientist, Molecular Biophysics and Integrated Bioimaging
Lawrence Berkeley National Laboratory
1 Cyclotron Road, M/S 33R0345
Berkeley, CA 94720
Fax: (510) 486-5909
Web: https://phenix-online.org
Thank you Billy! Everything is now clear.

Best,
Alejandro

PS: also thanks again to Jack Tanner, for his last e-mail (your practical advice allowed me to move forward quickly!): understood, and everything consistent with what Billy described here as well.
Hi Alejandro,
I use this script only for the first round of refinement. The script generates PREFIX_001.eff early in the refinement, and then writes the PREFIX_002.def file at the end of refinement. For subsequent rounds, I edit the .def file. Often, I control-C to stop phenix.refine after PREFIX_001.eff is created, delete PREFIX_001.log to avoid an overwrite error in the next step, move PREFIX_001.eff to “start.eff”, edit start.eff to modify options not included in the script, and then run phenix.refine start.eff.
Best Regards,
Jack
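The restart routine described in this message can be written as a few shell commands. A minimal sketch, assuming the output prefix is passed as an argument and that phenix.refine has already been interrupted after writing PREFIX_001.eff; the function name prepare_restart is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for the restart routine: run it in the refinement
# directory after interrupting phenix.refine (Ctrl-C) once <prefix>_001.eff
# has been written.
prepare_restart() {
  prefix=$1
  # Remove the log of the interrupted run to avoid the overwrite error.
  rm -f "${prefix}_001.log"
  # Keep the generated parameters under a neutral name for hand editing.
  mv "${prefix}_001.eff" start.eff
}

# Usage:
#   prepare_restart PREFIX
#   ${EDITOR:-vi} start.eff    # modify options not included in the script
#   phenix.refine start.eff
```

Renaming the .eff before editing keeps the hand-edited parameters separate from the numbered files that phenix.refine writes itself.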