How should we estimate the "uncertainty" of the occupancy of an atom?
Hello,

We determined the structure of an enzyme-substrate complex. The substrate of the enzyme contains some partially occupied atoms. We were able to calculate the occupancies of these atoms using PHENIX_refine.

I would also like to know the uncertainties of the occupancies of these atoms. How can I estimate them? For example, if the occupancy of an atom was calculated as 0.65, can we write it as 0.65 (+/- 0.05) or 0.65 (+/- 0.1)? I know that the uncertainty is coupled to the B-factor and depends on the resolution (the number of diffraction data per parameter), but I would like to know how to estimate (calculate?) it. Could someone advise me on this issue?

Best regards

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Masaki UNNO
Graduate School of Science and Engineering
Ibaraki University, JAPAN
4-12-1 Nakanarusawa, Hitachi, Ibaraki 316-8511, Japan
Tel&Fax +81-294-38-5041
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dear Masaki UNNO,

Esds (estimated standard deviations) are calculated by full-matrix least-squares refinement, as we did to estimate the uncertainty of the occupancy values of the ions in potassium channels. This is described in the Supplemental Material of http://www.sciencemag.org/content/346/6207/352. SHELXL will also provide you with the correlation coefficient between the B-values and the occupancy; if you refine the atom in question anisotropically, the CC drops a little. With dynamic memory allocation, the only limit to full-matrix least-squares refinement is the RAM of your computer (and patience, as it may take some time even with only a single refinement cycle). It has been done at 1 A resolution (PDB ID 4KGD).

Best, Tim
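For readers unfamiliar with where such esds come from: in linearized least squares they are read off the inverse of the normal matrix. One standard textbook form (a general statement, not a claim about SHELXL's exact implementation) is

    \sigma^2(p_j) \approx (A^{-1})_{jj}\,
        \frac{\sum_i w_i\,(|F_{o,i}| - |F_{c,i}|)^2}{n_{\mathrm{obs}} - n_{\mathrm{param}}},
    \qquad
    A_{jk} = \sum_i w_i\,\frac{\partial |F_{c,i}|}{\partial p_j}\,
                          \frac{\partial |F_{c,i}|}{\partial p_k},

and the occupancy/B correlation coefficient Tim mentions corresponds to (A^{-1})_{jk} / \sqrt{(A^{-1})_{jj}\,(A^{-1})_{kk}} for the occupancy and B-factor parameters of the atom.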
Hello Masaki,
An approximation to what you want can be obtained as follows:

1) Create 10-100 structures in which the occupancies in question are perturbed, varying from, say, 0.1 to 0.9. Also slightly perturb the B-factors of the atoms that surround these atoms.

2) Refine all these perturbed structures using the refinement protocol you usually use (the one used to obtain your final structure). Make sure to use a sufficient number of refinement macro-cycles so that refinement reaches near-convergence.

3) Extract the occupancies in question. They will form a distribution, and the spread of that distribution will hint at the uncertainty. A minimal scripting sketch of this protocol is given below.

Pavel
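As an illustration of the protocol above, here is a rough Python sketch (not Pavel's script). The file names (final.pdb, data.mtz), the phenix.refine parameter names on the command line, the default refined-model file name, and the atom identifiers in TARGET are all assumptions to be checked against your own installation and model; perturbation of the neighbouring B-factors is left out for brevity.

    import random
    import statistics
    import subprocess

    # Chain, residue name, residue number and atom name of the atom of interest
    # (hypothetical values - edit to match your model).
    TARGET = ("A", "LIG", "401", "C1")

    def is_target(line):
        return (line.startswith(("ATOM", "HETATM"))
                and line[21] == TARGET[0]
                and line[17:20].strip() == TARGET[1]
                and line[22:26].strip() == TARGET[2]
                and line[12:16].strip() == TARGET[3])

    def set_occupancy(line, occ):
        # PDB occupancy field: columns 55-60 (1-based).
        return line[:54] + "%6.2f" % occ + line[60:]

    def get_occupancy(line):
        return float(line[54:60])

    refined = []
    for i in range(20):                          # 10-100 perturbed copies
        start = random.uniform(0.1, 0.9)         # perturbed starting occupancy
        with open("final.pdb") as fin:           # your final refined model
            lines = [set_occupancy(l, start) if is_target(l) else l for l in fin]
        pdb_in = "perturbed_%02d.pdb" % i
        with open(pdb_in, "w") as fout:
            fout.writelines(lines)

        # Run your usual protocol; the parameter names below are assumptions -
        # check them against your Phenix version.
        subprocess.run(["phenix.refine", pdb_in, "data.mtz",
                        "main.number_of_macro_cycles=10",
                        "output.prefix=perturbed_%02d" % i], check=True)

        # Refined-model file name assumed to follow the <prefix>_001.pdb pattern.
        with open("perturbed_%02d_001.pdb" % i) as fref:
            refined.extend(get_occupancy(l) for l in fref if is_target(l))

    print("occupancy %.2f, spread (st. dev.) %.2f"
          % (statistics.mean(refined), statistics.stdev(refined)))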
On 02/11/2015 07:21 PM, Pavel Afonine wrote:

[...] 3) Extract the occupancies in question. They will form a distribution, and the spread of that distribution will hint at the uncertainty.

... tainted by any B-factor restraints in use.

Best, Tim
The sphere B-factor restraints used in phenix.refine are the least restrictive among those I've seen so far, so perhaps this is fine.

Pavel
Pavel's test does not even come close to estimating the uncertainty in occupancy. Imagine you are optimizing a parameter with a broad minimum (which is true when you are refining both the occupancy and the B-factor of an atom). A number of refinements with differing starting positions (within the radius of convergence) will always result in a cluster of answers near the minimum. The breadth, or uncertainty, of the parameter is unimportant in Pavel's test because the purpose of refinement is to find the minimum.

The breadth of the minimum is described by the second derivative of the target function; you can't obtain the uncertainty without calculating it. In the absence of a program that does the calculus for you (as SHELXL will do at high resolution), I suppose you could create a number of models in which the occupancy and B-factor of the atom are varied jointly (there is no use wandering off the line), calculate the value of the target function for each, and fit a parabola to the results to get the curvature (a sketch of this follows below). The differences in the target function for such coordinated alterations to one atom will be very small, so one hopes the program uses high-precision mathematics.

I hope you are now beginning to understand why people don't routinely quote uncertainties for occupancies and therefore don't take their exact values very seriously. If your interpretation of your model depends on whether the occupancy of this atom is 0.65 or 0.75, crystallography is probably not the tool you need.

Dale
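To make the parabola suggestion concrete, here is a minimal sketch (not Dale's code). It assumes you have already tabulated, by whatever means your refinement program offers, the target-function value at a few occupancy settings along the coupled occupancy/B line; the numbers shown are made up, and converting the curvature to a standard deviation assumes the target behaves like a negative log-likelihood near the minimum.

    import numpy as np

    # Occupancies sampled around the refined value, and the corresponding
    # target-function values reported by the program (made-up numbers).
    occ    = np.array([0.55, 0.60, 0.65, 0.70, 0.75])
    target = np.array([1234.60, 1234.31, 1234.20, 1234.27, 1234.52])

    # Fit T(q) ~ a*q^2 + b*q + c; the curvature is T'' = 2a.
    a, b, c = np.polyfit(occ, target, 2)
    q_min = -b / (2.0 * a)          # occupancy at the parabola minimum
    curvature = 2.0 * a

    # If T is a negative log-likelihood, exp(-T) is locally Gaussian with
    # sigma^2 = 1 / T'' near the minimum.
    sigma = 1.0 / np.sqrt(curvature)
    print("occupancy %.3f +/- %.3f" % (q_min, sigma))

Note how small the differences between the target values are, which is Dale's point about needing high-precision arithmetic.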
What I suggested is a quick and simple work-around that will provide an *estimate* of the uncertainty. How this estimate differs from the uncertainty derived from second derivatives is difficult to say (and whether the difference is significant!); you need to try both and compare - that's the only way to get THE answer. Also, I'm not entirely sure how well uncertainties obtained from second derivatives would account for the uncertainties due to refinement-result variations described in Acta Cryst. D63, 597-610 (2007) (I tend to think they won't at all, though again I would not bet on that until I try).

Pavel
I might be incorrect, but Pavel's suggested test done with the B-factors kept fixed may yield the uncertainties of the occupancy values. The widening of the function that Dale mentions would be minimized in this case.

Vaheh Oganesyan
www.medimmune.com
If I draw a Gaussian curve on a piece of paper and then repeatedly try to estimate the location of its highest point, I will get a scatter of points near the maximum. The width of this scatter might be related to the standard deviation of the curve, but it is certainly not an estimate of it; it will be much, much smaller. If the two are linearly related, I could not guess what the multiplicative constant would be, but it would be very large.

Dale Tronrud

P.S. I'll look up the paper you reference, but my university does not subscribe to Acta Cryst and getting those papers takes time.
Hi Dale,
P.S. I'll look up the paper you reference, but my university does not subscribe to Acta Cryst and getting those papers takes time.
It is open access: http://phenix-online.org/papers/wd5073_reprint.pdf

All the best,
Pavel
Dear all,

Thank you very much for your suggestions. I will try making a number of models in which the atom has different occupancies (e.g. 0.1-1.0) and then refine them with the B-factors restrained. Actually, our structure contains some reaction intermediates, not only the substrate, so I would like to estimate their ratio.

Best regards
Masaki
Dear All,

How meaningful are the second-derivative-based estimates obtained via full matrix inversion when the gradient is not 0 (i.e. when not in the minimum)? I can understand that when you are working with high-resolution data and your R-value is close to 0, things could work, but what happens around a more challenging 2 A?

If you are interested in the uncertainty of the occupancy, I recommend not doing any refinement, but just generating a list of occupancies and B-values for the atom of interest and computing the (free) likelihood for each model. Subsequent normalisation of the neg-exponent of these values should provide you with an answer that could be just as believable as any other method around. A little bit of Python scripting should do the trick quite easily (a sketch is given below).

Both the full matrix inversion and the suggestion above probe the steepness of the data-agreement hole the structure is sitting in. Pavel's suggestion explores the spread of local minima around the starting configuration. I am not sure which method is more appropriate; perhaps it is instructive to know what problem you are trying to solve.

HTH
P
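A rough sketch of the grid scan described above. The function free_neg_log_likelihood() is a hypothetical placeholder for whatever your refinement package provides to evaluate its (free-set) likelihood target for a model with the atom's occupancy and B-value set to given numbers; a toy quadratic surface stands in for it here so the script runs, and only the bookkeeping around it is meant literally.

    import numpy as np

    def free_neg_log_likelihood(occ, b):
        """Hypothetical placeholder. In practice this would set the atom's
        occupancy and B-value to (occ, b), recompute structure factors, and
        return the maximum-likelihood refinement target evaluated over the
        free/test reflections only. A toy quadratic surface is used here."""
        return 0.5 * ((occ - 0.65) / 0.07) ** 2 + 0.5 * ((b - 30.0) / 8.0) ** 2

    occ_grid = np.linspace(0.1, 1.0, 19)      # occupancies to scan
    b_grid   = np.linspace(10.0, 60.0, 26)    # B-values to scan

    # Evaluate the target over the whole (occupancy, B) grid - no refinement.
    nll = np.array([[free_neg_log_likelihood(q, b) for b in b_grid] for q in occ_grid])

    # exp(-target), normalised to 1, gives relative probabilities.
    # Subtracting the minimum first avoids numerical underflow.
    prob = np.exp(-(nll - nll.min()))
    prob /= prob.sum()

    # Marginalise over B to get a distribution over the occupancy alone.
    p_occ = prob.sum(axis=1)
    mean  = float((occ_grid * p_occ).sum())
    sigma = float(np.sqrt(((occ_grid - mean) ** 2 * p_occ).sum()))
    print("occupancy %.2f +/- %.2f" % (mean, sigma))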
Hi Peter,

You suggest calculating "the (free) likelihood". May I ask: specifically, the likelihood of what are you suggesting to calculate, and what is "free" in this context? I guess I'm just lost in jargon, sorry!

Thanks,
Pavel
I actually mean the log-likelihood, which is the target function typically optimized in phenix.refine. If you compute the log-likelihood for the test/free set instead, one might overcome bias issues, if any are present.

P
P.S.:
How meaningful are the second-derivative-based estimates obtained via full matrix inversion when the gradient is not 0 (i.e. when not in the minimum)?
I think this is one of the key questions to ask! Very good one!

Pavel
On 2/11/2015 10:08 PM, Pavel Afonine wrote:

How meaningful are the second-derivative-based estimates obtained via full matrix inversion when the gradient is not 0 (i.e. when not in the minimum)?
But the whole point of phenix.refine is to give you a model where the gradient is as close to zero as possible. Are you saying that when you refine a model you end up with significant shifts but don't apply them? Of course, how fast the second derivatives change when you move away from the optimum model is described by the third derivatives. Now the fun really begins!

Dale Tronrud
On 2/11/2015 8:08 PM, Peter Zwart wrote:
How meaningful are the second-derivative-based estimates obtained via full matrix inversion when the gradient is not 0 (i.e. when not in the minimum)? I can understand that when you are working with high-resolution data and your R-value is close to 0, things could work, but what happens around a more challenging 2 A?
The problem at 2 A resolution isn't that the analysis breaks down; it's that the second-derivative matrix has a singular subspace, which causes the implementations we have to fail. A singular value decomposition would do the trick, but at the cost of much more CPU time (a small numerical illustration follows below). The singular subspace would tell you which aspects of your model are not defined by your data (or your restraints) and where you could use additional restraints. The non-singular part would tell you the SUs of those parts that are determined, at least to some extent.
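A toy numerical illustration of that point (not Dale's code, and not a real normal matrix): the small singular values of a nearly singular normal matrix pick out the parameter combinations the data do not determine, while the well-conditioned part can still be pseudo-inverted to give approximate uncertainties.

    import numpy as np

    # Toy 3x3 "normal matrix" for parameters (occupancy, B, x): the occupancy
    # and B entries are almost perfectly anti-correlated, mimicking a poorly
    # determined occupancy/B pair.
    A = np.array([[ 4.00, -3.98, 0.10],
                  [-3.98,  4.00, 0.05],
                  [ 0.10,  0.05, 9.00]])

    U, s, Vt = np.linalg.svd(A)
    print("singular values:", np.round(s, 3))

    # Directions with tiny singular values are combinations of parameters the
    # data cannot distinguish (here: raising the occupancy and B together).
    tol = 0.1 * s.max()
    for value, direction in zip(s, Vt):
        if value < tol:
            print("undetermined combination:", np.round(direction, 2))

    # Pseudo-inverse: invert only the well-determined subspace, then read
    # approximate variances from the diagonal.
    A_pinv = np.linalg.pinv(A, rcond=0.1)
    print("approx. variances:", np.round(np.diag(A_pinv), 3))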
If you are interested in the uncertainty of the occupancy, I recommend not doing any refinement, but just generating a list of occupancies and B-values for the atom of interest and computing the (free) likelihood for each model. Subsequent normalisation of the neg-exponent of these values should provide you with an answer that could be just as believable as any other method around. A little bit of Python scripting should do the trick quite easily.
This is pretty much what I recommended. If I recall, the variance of the parameter is equal to -1/(2nd derivative of the log-likelihood) evaluated near the optimal parameter (see the short derivation below). This will be an underestimate because it ignores the correlation of this parameter with all the others, but the correlation between the B-factor and the occupancy of a particular atom will dominate.

Dale Tronrud
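For reference, that relation follows from approximating the log-likelihood near its maximum by a Gaussian (the Laplace approximation):

    \ell(q) \approx \ell(\hat q) - \frac{(q - \hat q)^2}{2\sigma^2}
    \quad\Longrightarrow\quad
    \left.\frac{d^2\ell}{dq^2}\right|_{\hat q} = -\frac{1}{\sigma^2}
    \quad\Longrightarrow\quad
    \sigma^2 = -\left(\left.\frac{d^2\ell}{dq^2}\right|_{\hat q}\right)^{-1}.

Programs that minimise the negative log-likelihood give the same result with the sign dropped: the variance is the inverse curvature of the target at the minimum.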
On 2/11/2015 1:35 PM, Pavel Afonine wrote:

It is open access: http://phenix-online.org/papers/wd5073_reprint.pdf
Thanks. I recall this paper. It is just an example of what you have been describing, repeated refinements, and what I have been (slightly) criticizing.

Repeated refinements are very useful in probing the confidence that you can place in the details of a model if you can't perform a proper error analysis. This tool's main usefulness is to inform you about the aspects of your model which are not nailed down by your data and restraints. Basically it gives you a set of models which your target function cannot distinguish between. It does not tell you how fast the target function rises when you move away from the minimum, but it does tell you where the parameters can change without changing the value of the target function, i.e. where the second derivative is zero. This is very useful information, because you shouldn't pay much attention to a side chain that forms a bush in the repeated refinements but should pay attention to a side chain that always comes up with the same conformation. The degree of clustering of that single conformer is not equal to the SU of those parameters, however.

Since the paper used simulated data, it is unfortunate that the analysis was not extended to a higher resolution where the distribution of the repeated refinements could be compared to traditionally calculated SUs. Maybe a follow-up paper is in order.

Dale Tronrud
participants (6)
- Dale Tronrud
- Masaki UNNO
- Oganesyan, Vaheh
- Pavel Afonine
- Peter Zwart
- Tim Gruene