Maximum-likelihood score calculation - phenixbb - phenix-online.org

Maximum-likelihood score calculation

Jouko Virtanen

11 Jul 2014

9:48 p.m.

Hi all, I would like to calculate the maximum-likelihood score given an input structure and an mtz file. How can I do that? Thanks in advance. Best regards, Jouko

Nathaniel Echols

11 Jul

10:03 p.m.

On Fri, Jul 11, 2014 at 2:48 PM, Jouko Virtanen wrote:

I would like to calculate the maximum-likelihood score given an input structure and an mtz file. How can I do that? Thanks in advance.

Do you just want the ML target function value, i.e. what phenix.refine prints out at regular intervals? In that case, you can use the script appended below (run with phenix.python score.py), which is derived from $PHENIX/cctbx_project/mmtbx/examples/simple_command_line_cc.py.

    from __future__ import division
    import mmtbx.command_line
    import sys

    def master_phil () :
      return mmtbx.command_line.generic_simple_input_phil()

    def run (args, out=sys.stdout) :
      cmdline = mmtbx.command_line.load_model_and_data(  # read model + data
        update_f_part1_for="map",
        args=args,
        master_phil=master_phil(),
        out=out,
        process_pdb_file=False,
        create_fmodel=True)
      fmodel = cmdline.fmodel
      tf = fmodel.target_functor()  # functor for the current target
      score = tf(compute_gradients=False).target_work()  # work-set value
      print(score)

    if (__name__ == "__main__") :
      run(sys.argv[1:])

Jouko Virtanen

10:28 p.m.

Thanks, Nathaniel. I am still confused. Do you intend for me to copy and paste the script that you included into a file called score.py? When I did that I got the following error message:

Syntax error: expected "=", found "__future__" (file "score.py", line 1)
Sorry: score.py is not a valid parameter file.

What are the inputs? I would think that at a minimum I would have to input a pdb file, mtz file, fasta file, and an estimate for the rmsd.

Thanks again,
Jouko

Nathaniel Echols

10:41 p.m.

On Fri, Jul 11, 2014 at 3:28 PM, Jouko Virtanen wrote:

Thanks, Nathaniel. I am still confused. Do you intend for me to copy and paste the script that you included into a file called score.py? When I did that I got the following error message:

Syntax error: expected "=", found "__future__" (file "score.py", line 1)
Sorry: score.py is not a valid parameter file.

That's the kind of error message I'd expect if you ran "phenix.refine score.py" instead of "phenix.python score.py".

What are the inputs? I would think that at a minimum I would have to input a pdb file, mtz file, fasta file, and an estimate for the rmsd.

You would definitely need a PDB file and MTZ file - those will just be additional command-line arguments. The script I sent you does not require a sequence or estimated RMSD; it also assumes that the model is already placed correctly in the unit cell. Phaser does require the sequence and RMSD, however; is that what you really wanted to use? If you actually need to place the model you might as well just run the full Phaser MR_AUTO mode; otherwise you can use the MR_RNP mode. I had expected to be able to do it like this:

phenix.phaser model.pdb data.mtz seq.fa model_rmsd=1.0 phaser.mode=MR_RNP

although that crashes, which I guess may be a bug. There are other ways to get the same information but it is not clear what exactly you're trying to do.

-Nat
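
The invocation described here (the script name followed by the PDB and MTZ files as extra command-line arguments) can also be driven from another Python program. The following is only a sketch under stated assumptions: the helper names build_command, parse_score and run_score are hypothetical, the file names are placeholders, and it assumes the target value is the last non-empty line the script prints (the loader writes its own log to the same stream first).

```python
import subprocess


def build_command(script, pdb_file, mtz_file, interpreter="phenix.python"):
    """Return the argument list for: phenix.python score.py model.pdb data.mtz"""
    return [interpreter, script, pdb_file, mtz_file]


def parse_score(output_text):
    """Assumption: the score is the last non-empty line of the script output."""
    lines = [line.strip() for line in output_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    return float(lines[-1])


def run_score(script, pdb_file, mtz_file):
    """Run the script and return the printed value (requires a Phenix install)."""
    cmd = build_command(script, pdb_file, mtz_file)
    output = subprocess.check_output(cmd)
    return parse_score(output.decode("utf-8", "replace"))
```

With Phenix on the PATH, run_score("score.py", "model.pdb", "data.mtz") would return the printed value as a float. If the script's output format differs, parse_score is the only part that needs to change.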

Nathaniel Echols

10:51 p.m.

On Fri, Jul 11, 2014 at 3:41 PM, Nathaniel Echols wrote:

If you actually need to place the model you might as well just run the full Phaser MR_AUTO mode; otherwise you can use the MR_RNP mode. I had expected to be able to do it like this:

phenix.phaser model.pdb data.mtz seq.fa model_rmsd=1.0 phaser.mode=MR_RNP

although that crashes, which I guess may be a bug.

[answering my own question yet again] My bug, apparently. It'll work in the next nightly build. -Nat

Jouko Virtanen

11:17 p.m.

Hi Nathaniel,

I am developing my own MR code. I want to use the maximum-likelihood score in phenix as part of an assessment score for the final placed models. Soon I will write my own code to calculate the maximum-likelihood score. Not having to input a fasta file or rmsd is OK; I just thought that it was mandatory, since there are terms for estimated coordinate error and incompleteness of the structure in the maximum-likelihood score.

You were right, I did accidentally use phenix.phaser instead of phenix.python. When I use

phenix.python score.py file.pdb file.mtz

(not the actual file paths) I get the following error message:

Traceback (most recent call last):
  File "score.py", line 22, in <module>
    run(sys.argv[1:])
  File "score.py", line 9, in run
    cmdline = mmtbx.command_line.load_model_and_data(
AttributeError: 'module' object has no attribute 'load_model_and_data'

Thanks,
Jouko

Nathaniel Echols

11:25 p.m.

On Fri, Jul 11, 2014 at 4:17 PM, Jouko Virtanen wrote:

When I use phenix.python score.py file.pdb file.mtz (not the actual file paths) I get the following error message

AttributeError: 'module' object has no attribute 'load_model_and_data'

This means you're using an outdated version of Phenix; you will need at least version 1.9 to run the script. But to repeat, this code snippet is *not* directly comparable to the scores output by Phaser; phenix.refine uses a completely different likelihood target. -Nat
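
One way to check an installation before running the script is to probe the module for the function named in the traceback. This is a small sketch, not part of Phenix: the helper name module_has_attribute is hypothetical, and only the module and function names are taken from the script and error message in this thread.

```python
import importlib


def module_has_attribute(module_name, attribute):
    """Return True if the module imports cleanly and defines the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is missing entirely (e.g. not running under phenix.python).
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # Run with phenix.python so that mmtbx is importable.
    print(module_has_attribute("mmtbx.command_line", "load_model_and_data"))
```

A result of False under phenix.python would point to the situation described above, an installation too old to provide the function.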

Pavel Afonine

11:57 p.m.

Hi Jouko,

a few points to be aware of:

1) ML targets used in phenix.refine and Phaser are essentially the same but parametrized differently (alpha/beta vs sigmaA). You can see this if you check the corresponding publications.

2) Parameters of the ML targets used in phenix.refine (alpha/beta) and Phaser (sigmaA) are estimated differently, mainly because these programs work at very different ranges of model-data discrepancy. At the MR stage (Phaser) model errors are usually gross, and deriving alpha/beta or sigmaA from a comparison of Fcalc and Fobs may not be reliable; this is why Phaser asks the user to provide approximate information about how large the model errors are (as rmsd or sequence identity). This information is then used to estimate sigmaA, and in turn to calculate the ML target. At the refinement stage (phenix.refine) the model-data discrepancy is smaller, and sigmaA or alpha/beta can be reliably estimated from a comparison of Fcalc and Fobs. This is what phenix.refine does, and this is the target that you get from the fmodel object that Nat mentioned earlier.

Pavel
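
To make point 1 concrete, here is a minimal sketch of the textbook Rice-distribution likelihood for a single acentric reflection, written in both parametrizations. It is not the code used by phenix.refine or Phaser: centric reflections, measurement errors, and the estimation of alpha/beta or sigmaA are all left out, and the function names and the series-based Bessel routine are illustrative choices. Under the extra assumption that both amplitudes are normalized (E-values), the two forms coincide when alpha = sigmaA and beta = 1 - sigmaA^2.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function I0 from its power series (moderate x only)."""
    total, term = 1.0, 1.0
    q = (x * x) / 4.0
    for k in range(1, terms):
        term *= q / (k * k)  # term_k = q**k / (k!)**2
        total += term
    return total


def nll_alpha_beta(f_obs, f_calc, alpha, beta, epsilon=1.0):
    """-log p(Fobs; Fcalc) for one acentric reflection, alpha/beta form."""
    eb = epsilon * beta
    x = 2.0 * alpha * f_obs * f_calc / eb
    log_p = (math.log(2.0 * f_obs / eb)
             - (f_obs ** 2 + alpha ** 2 * f_calc ** 2) / eb
             + math.log(bessel_i0(x)))
    return -log_p


def nll_sigma_a(e_obs, e_calc, sigma_a):
    """-log p(Eobs; Ecalc) for one acentric reflection, sigmaA form."""
    variance = 1.0 - sigma_a ** 2
    x = 2.0 * sigma_a * e_obs * e_calc / variance
    log_p = (math.log(2.0 * e_obs / variance)
             - (e_obs ** 2 + sigma_a ** 2 * e_calc ** 2) / variance
             + math.log(bessel_i0(x)))
    return -log_p


# Illustrative numbers only: normalized amplitudes and sigmaA = 0.8.
sigma_a = 0.8
a = nll_alpha_beta(1.2, 0.9, alpha=sigma_a, beta=1.0 - sigma_a ** 2)
b = nll_sigma_a(1.2, 0.9, sigma_a)
print(abs(a - b) < 1e-9)
```

The practical difference between the programs, as described above, lies in where the parameter values come from (user-supplied error estimates versus a comparison of Fcalc and Fobs), which this sketch does not attempt to model.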

Nathaniel Echols

12 Jul

12:07 a.m.

On Fri, Jul 11, 2014 at 4:57 PM, Pavel Afonine wrote:

1) ML targets used in phenix.refine and Phaser are essentially the same but parametrized differently (alpha/beta vs sigmaA).

... but they are not reported the same way - phenix.refine shows a raw target value (normalized?), while Phaser outputs TFZ and LLG. Presumably these are related somehow but it is not obvious to me. -Nat

Pavel Afonine

12:54 a.m.

ML used in phenix.refine does not contain a constant term (to save computation time; perhaps there is no huge gain!) and is normalized by the number of reflections, so the value is between 0 and ~10 most of the time (that makes it predictable, so it can be calibrated for various purposes).

Pavel
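
The two choices described here (leaving out a constant term and dividing by the number of reflections) can be illustrated with plain numbers. In the sketch below the per-reflection values are made up and normalized_target is a hypothetical helper, not a Phenix function. It assumes the omitted constant depends only on the data, so it is the same for every candidate model.

```python
def normalized_target(per_reflection_terms, constant_per_reflection=0.0):
    """Mean of per-reflection terms, optionally adding a model-independent constant."""
    n = len(per_reflection_terms)
    if n == 0:
        raise ValueError("need at least one reflection")
    total = sum(t + constant_per_reflection for t in per_reflection_terms)
    return total / n


# Made-up per-reflection terms for two candidate models scored on the same data.
model_a = [2.5, 3.0, 2.0, 2.5]
model_b = [3.5, 3.0, 2.5, 3.0]

score_a = normalized_target(model_a)
score_b = normalized_target(model_b)

# Adding the same constant to every reflection shifts both scores equally,
# so the difference between the models (and their ranking) is unchanged.
shifted_a = normalized_target(model_a, constant_per_reflection=1.5)
shifted_b = normalized_target(model_b, constant_per_reflection=1.5)
print(score_b - score_a == shifted_b - shifted_a)
```

Because of the division by the number of reflections, the value stays on a similar scale for data sets of different sizes, which is what makes a fixed calibration range possible.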

Jouko Virtanen

8 p.m.

I got this working. Thanks for also explaining the difference between this score and the one used in phenix.phaser.
