on Phenix's "Generate "Table 1" for journal"
Dear All, With the Phenix refine PDB and mtz files, I just prepared a "Table 1" for a journal with Phenix. I noticed a few things on which I hope to get your advice. First, the "Total reflections" and "Multiplicity" entries are empty (no data given). Will you please tell me why? Second, the table says, "Statistics for the highest-resolution shell are shown in parentheses". Suppose my crystal resolution is 2.5-45.6; then the highest-resolution value should be 2.5, rather than 45.6, right? In addition, what is the difference between the highest-resolution shell in Table 1 and the highest-resolution bin in the "Results" of Phenix refine? What are R-merge, R-meas, CC1/2 and CC*? I can get these values from the MOSFLM output mtz file. But if my images were processed with HKL, with which I have no experience, how can I get R-merge, R-meas, CC1/2 and CC*? I am looking forward to your reply. Smith
Dear Smith, most of the questions you asked are normally answered as part of basic crystallography training. If that was not the case for you, I suggest you dedicate a few months to reading this book (in its entirety!): Rupp, B. (2009), Biomolecular Crystallography: Principles, Practice, and Application to Structural Biology. Then follow the references therein and read the corresponding papers. Altogether, I'm sure this will give you answers to most of your questions and far more. As to your more specific questions: the content of the Phenix-generated Table 1 depends on the inputs you provide. If it lacks Multiplicity, then perhaps you did not give the tool the information from the data processing steps. Good luck, Pavel (in reply to Smith Liu's message of 3/21/15 5:25 AM)
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb Unsubscribe: [email protected]
Hi Smith, I thought that I might quickly answer your questions, but I think Pavel is correct that some crystallographic reading will be instructive and will help you answer these questions. Is there someone in your laboratory or at your institution who is more experienced and could assist you? In addition to the Rupp text, I'd suggest Principles of Protein X-Ray Crystallography by J. Drenth, which is also quite good. The short answer is that much or all of these data can be gathered from the log files generated during processing and refinement, or from the header of the PDB file. My responses follow your questions.
--First, for the "Total reflections" and "Multiplicity", they are empty (no data given).
For the processing section of Table 1, you can get the total reflections from the HKL2000 scaling log or, if you used aimless, from a chart that appears after the initial creation of the mtz file. I'm guessing that you didn't include the logs when running the Generate Table 1 program (as Pavel has suggested), but you should be able to locate these numbers in the log files of the programs even if they didn't propagate to the final table.
--Secondly, it writes, "Statistics for the highest-resolution shell are shown in parentheses". Suppose my crystal resolution is 2.5-45.6, then the highest-resolution value should be 2.5, rather than 45.6, right?
The high-resolution limit is indeed 2.5 A rather than 45.6 A, but the highest-resolution shell is a range, not a single value: it would be something like all reflections between 2.6 and 2.5 A.
--In addition, what is the difference between highest-resolution shell in the Table 1 and highest-resolution bin in the "Results" of Phenix refine?
I don't believe that the shells necessarily cover the same ranges, and in some cases you might have slightly different resolution limits for processing and refinement. Depending on the program you use, you can bin the data into any number of bins. For high-resolution data, Phenix has many more "bins" in the results section, but when you processed in ...
--What is R-merge, R-meas, CC1/2 and CC*? I can get these data from MOSFLM output mtz file. But if my images were processed by HKL, which I have no experience, then how can I get the R-merge, R-meas, CC1/2 and CC*?
If you processed the data, these numbers are diagnostic of the quality of the dataset (R-merge/R-meas/R-pim), and CC* and CC1/2 (or the more traditional I/sigma = 2) indicate where the data should be cut off. On what basis did you decide where to cut off your data for refinement? When you run the scalepack portion of HKL2000, it writes a log file with the appropriate tables at the end, detailing all of these numbers. Newer versions of HKL2000 also output CC1/2 along with I/sigma values to help you decide where the useful range of the data ends; these are again found in the log file. Hope this helps, Dave Lodowski
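To make the idea of a highest-resolution shell concrete, here is a minimal Python sketch that splits a resolution range into shells of equal volume in reciprocal space, so that each shell holds roughly the same number of reflections. This is only one common binning convention and is an assumption here, not necessarily what Phenix, aimless or scalepack do; the function name shell_edges is made up for illustration.

```python
def shell_edges(d_max, d_min, n_bins):
    """Return n_bins + 1 resolution limits (in A), from low to high resolution.

    Shells have equal volume in reciprocal space, i.e. the edges are
    equally spaced in 1/d^3 (an assumed convention, see the note above).
    """
    s3_low = 1.0 / d_max ** 3
    s3_high = 1.0 / d_min ** 3
    step = (s3_high - s3_low) / n_bins
    return [(s3_low + i * step) ** (-1.0 / 3.0) for i in range(n_bins + 1)]


# Resolution range from the question (45.6-2.5 A), split into 10 shells.
edges = shell_edges(45.6, 2.5, 10)
for low, high in zip(edges[:-1], edges[1:]):
    print(f"{low:6.2f} - {high:5.2f} A")
```

Under these assumptions the last shell printed spans roughly 2.59 to 2.50 A. The numbers in parentheses in Table 1 refer to a narrow shell like this one at the high-resolution end, not to the single value 2.5 A. Changing n_bins changes the shell limits, which is one reason the last shell from data processing and the last bin from refinement need not agree.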
Hi everyone, I am working on a ~2.8 A dataset. The Matthews coefficient suggests there are three copies of the monomer (194 aa) in the asymmetric unit. An MR solution (~12% identity with the template) was found by phenix.mr_rosetta in space group P64. This MR solution contains two copies of the model (only 80 aa each), but further autobuild cycles on this solution only give R~0.48. As phenix.xtriage suggested, I added the twin law in the refinement; R became ~0.42, but the subsequent autobuild cycles still did not succeed. I then trimmed the MR solution to poly-glycine and ran another MR search, which found two copies of the previous MR solution. The new MR solution therefore contains four copies of the search model (80 aa each). With the twin law, this new MR solution gives R~0.37. Now I am stuck at this step: autobuild with rebuild_in_place=False or True could not further improve the model or the R factors. Any suggestions for further model building? Or might the MR solution I got from phenix.mr_rosetta still not be the correct one? Thanks in advance! Fengyun
Hi Fengyun, First, it would be good to check the twinning and the space group very carefully. What are the values of the statistics listed at the end of the Xtriage output ("Statistics independent of twin laws" and "Multivariate Z score L-test")? Are you positive about the space group? Then, what is the solvent content for 2 or 4 copies? Yes, it is entirely possible for an incorrect solution to come from mr_rosetta. It will report the best solution it can find, but that might not be correct. I would look very carefully at a composite omit map and a 2mFo-DFc map for your structure, with and without twin laws. If you can see clear things to change, then change them. If it is all very unclear, then you may be stuck for now. All the best, Tom T (in reply to Ni, Fengyun, Sunday, March 22, 2015 3:33 PM; subject: [phenixbb] suggestions on poly-glycine model)
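The solvent-content check Tom asks about can be sketched in a few lines of Python. The unit cell below is HYPOTHETICAL (the thread does not give the real one), the mean residue mass of 110 Da is a rough rule of thumb, and the solvent fraction uses the usual approximation 1 - 1.23/V_M; the function names are made up for illustration. P64 is taken to have 6 symmetry operators.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in A^3 from cell lengths (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews(volume, n_symops, n_copies, mol_weight):
    """Return (V_M in A^3/Da, approximate solvent fraction)."""
    v_m = volume / (n_symops * n_copies * mol_weight)
    return v_m, 1.0 - 1.23 / v_m


# HYPOTHETICAL hexagonal cell; replace with the real cell parameters.
volume = cell_volume(110.0, 110.0, 90.0, 90.0, 90.0, 120.0)
# Full-length monomer of 194 residues at an assumed ~110 Da per residue.
mol_weight = 194 * 110.0

for n_copies in (2, 3, 4):
    v_m, solvent = matthews(volume, 6, n_copies, mol_weight)
    print(f"{n_copies} copies: V_M = {v_m:.2f} A^3/Da, solvent ~ {solvent:.0%}")
```

Note that the molecular weight should describe what is actually in the crystal (the 194-residue chain), not the 80-residue search model. A number of copies that implies an implausibly low or high solvent content is a hint that the copy count, the space group or the solution itself deserves a second look.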
If you have any weak anomalous scatterers, even sulfurs, you can use these to improve the phases (use partial-model phases in Phaser to find the sites, make a map based on these, then iterate with autobuilding). This worked well for me recently in a twinned case. With twinning, you have to be very careful about the space group as well, as Tom mentioned. Best option: get better data/crystals, either untwinned or with high multiplicity. Read up on twinning as well! JPK (in reply to Terwilliger, Thomas Charles, Monday, March 23, 2015 11:54 AM; subject: Re: [phenixbb] suggestions on poly-glycine model)
I completely agree with Pavel Afonine's advice. Here is an answer to one question that you might not find in Rupp's BMC or another textbook:
then how can I get the R-merge, R-meas,CC1/2 and CC*?
If you have a recent version of HKL, these values are in the scalepack log file. If you are using an older version of HKL, you need to re-run the scaling step with:

    NO MERGE ORIGINAL INDEX

and then use phenix.merging_statistics to print these statistics from the unmerged .sca file. The unmerged .sca file does not contain the cell parameters, so supply them like this:

    phenix.merging_statistics /path/to/your/unmerged.sca \
      cell="72.394 85.188 294.019 90.000 90.000 90.000" \
      n_bins=30 high_resolution=2.2

If you have refined with phenix.refine, you can also calculate correlation statistics between model and data (CC_free, CC_work) with phenix.cc_star:

    phenix.cc_star /path/to/your/unmerged.sca \
      /path/to/your/refinement/prefix_066.mtz n_bins=30

For the meaning of these statistics, see Acta Cryst. (2013). D69, 1215–1222, doi:10.1107/S0907444913001121, and references therein.

eab (in reply to Smith Liu's message of 03/21/2015 08:25 AM)
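As a rough illustration of what R-merge, R-meas, R-pim, CC1/2 and CC* measure, the following Python sketch computes them from a list of unmerged observations using the commonly quoted definitions. It is a toy under stated assumptions, not the code used by phenix.merging_statistics or scalepack: the function names are invented, the input is assumed to be already reduced to unique (symmetry-equivalent) indices, reflections measured only once are skipped, and CC1/2 comes from a single random split into half-datasets.

```python
import math
import random
from collections import defaultdict


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    if n < 2:
        return float("nan")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def merging_statistics(observations, seed=0):
    """observations: iterable of (hkl, intensity), hkl already symmetry-reduced."""
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    rng = random.Random(seed)
    total = num_merge = num_meas = num_pim = 0.0
    half1, half2 = [], []
    for values in groups.values():
        n = len(values)
        if n < 2:
            continue  # a single measurement says nothing about agreement
        mean = sum(values) / n
        dev = sum(abs(v - mean) for v in values)
        total += sum(values)
        num_merge += dev                           # R-merge: no multiplicity weight
        num_meas += math.sqrt(n / (n - 1)) * dev   # R-meas: multiplicity-corrected
        num_pim += math.sqrt(1 / (n - 1)) * dev    # R-pim: precision of the mean
        # Random half-dataset split for CC1/2.
        shuffled = values[:]
        rng.shuffle(shuffled)
        first, second = shuffled[: n // 2], shuffled[n // 2:]
        half1.append(sum(first) / len(first))
        half2.append(sum(second) / len(second))

    cc_half = pearson(half1, half2)
    # CC* estimates the correlation with the (unknown) true signal.
    cc_star = math.sqrt(2 * cc_half / (1 + cc_half)) if cc_half > 0 else float("nan")
    return {
        "r_merge": num_merge / total,
        "r_meas": num_meas / total,
        "r_pim": num_pim / total,
        "cc_half": cc_half,
        "cc_star": cc_star,
    }


# Made-up observations: three unique reflections, each measured twice.
toy = [((1, 0, 0), 90.0), ((1, 0, 0), 110.0),
       ((0, 1, 0), 45.0), ((0, 1, 0), 55.0),
       ((0, 0, 1), 9.0), ((0, 0, 1), 11.0)]
print(merging_statistics(toy))
```

In practice these values are reported per resolution shell as well as overall, and the trend of CC1/2 (or I/sigma) across the shells is what guides the choice of resolution cutoff.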
participants (7)
- Dave Lodowski
- Edward A. Berry
- Keller, Jacob
- Ni, Fengyun
- Pavel Afonine
- Smith Liu
- Terwilliger, Thomas Charles