From: Ni, Fengyun
Sent: Aug 25, 2014, 10:07 PM
Subject: [phenixbb] help on MIR dataset

Hi everyone,

I am working on a MIR dataset: one native dataset, three EMTS-soaked datasets at different concentrations, and one PbAc2 dataset. My initial trial with phenix.autosol did not give a clear solution, and I hope to get some advice from all of you.

I indexed and integrated the data in Mosflm. The likely space group is P3, although Pointless ranks P3121 as the most probable. For now I have only tried P3 at different resolutions, because a potential MR solution (search model below 20% identity, TFZ=7.1) was found in space group P31. If P3121 is the true space group, the asymmetric unit is too crowded to fit my molecules unless they have some unexpected symmetry.

For the differences among the datasets in space group P3, SCALEIT (CCP4) gives the following:

    RMS differences | RMSiso   RMSano
    ----------------------------------
    FP-hg           | 172.30   22.50
    FP-hg2          | 159.20   24.07
    FP-hg3          |  85.08   27.92
    FP-pb           |  31.15   24.55

The resolution cutoff is 3.2 A for the native data, 3.2/3.8/3.5 A for the three EMTS datasets, and 3.5 A for the Pb data. I did notice some radiation damage in the EMTS (Hg) data (large negative B-factors in SCALA), but not in the Pb data. The anomalous signal extends only to 5.6 A for the EMTS data and 4.5 A for the Pb data, as indicated by phenix.xtriage. These are all the heavy-atom data we could obtain, so I tried all of them in phenix.autosol. The input is as follows:

    autosol {
      seq_file = ../seq.dat
      crystal_info {
        solvent_fraction = 0.62  # determined from the Matthews coefficient to ensure a "reasonable" content in the asymmetric unit
      }
      native {
        data = nat.mtz
      }
      deriv {
        data = hg1.mtz
        atom_type = Hg
        inano = noinano *inano anoonly
        lambda = 1.00394
      }
      deriv {
        data = hg2.mtz
        atom_type = Hg
        inano = noinano *inano anoonly
        lambda = 1.00394
      }
      deriv {
        data = hg3.mtz
        atom_type = Hg
        inano = noinano *inano anoonly
        lambda = 1.00394
      }
      deriv {
        data = pb.mtz
        atom_type = Pb
        inano = noinano *inano anoonly
        lambda = 0.94640
      }
      model_building {
        build = False  # my structure is a DNA/protein complex, so I chose to skip model building
      }
    }

I tried different resolutions: 3.6/4.0/4.4/4.8/5.2/5.6/6.0 A. One solution was found from the Pb data at 4.8 A, with the following statistics:

    Solution # 6  BAYES-CC: 59.7 +/- 17.2 (2SD)  Dataset #4  FOM: 0.24
    Score type:      SKEW   CORR_RMS  NCS_OVERLAP
    Raw scores:      0.31     0.69       0.00
    100x EST OF CC:  59.74   17.55      31.25

    Refined heavy atom sites (fractional):
              X      Y      Z
    xyz     0.033  0.020  0.538
    xyz     0.417  0.217  0.113
    xyz     0.041  0.010  0.721

The statistics are almost the same for the inverse solution, and I read from the output that the statistics for phasing the Hg datasets from this solution are very poor (BAYES-CC below 20). The resulting map does not make much sense yet.

Sorry for the long post. Any suggestions are welcome. Thank you very much!

Fengyun
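The solvent_fraction = 0.62 in the input above comes from a Matthews-coefficient estimate. A minimal Python sketch of that arithmetic follows, assuming a trigonal cell on hexagonal axes. The cell dimensions, molecular weight, and copy numbers are made-up placeholders (the post does not give them), and the 1.23 constant is the usual protein-only value, so the result is only approximate for a DNA/protein complex.

```python
import math


def trigonal_cell_volume(a, c):
    """Volume (A^3) of a trigonal/hexagonal cell with a = b and gamma = 120 deg."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(a, c, n_symops, n_copies, mw_da):
    """Return (Vm in A^3/Da, approximate solvent fraction)."""
    vm = trigonal_cell_volume(a, c) / (n_symops * n_copies * mw_da)
    solvent = 1.0 - 1.23 / vm  # protein-only approximation (assumption)
    return vm, solvent


# Hypothetical cell and molecular weight, for illustration only.
A, C, MW = 100.0, 150.0, 50000.0

# P3 has 3 symmetry operators, P3121 has 6.
for name, n_symops in (("P3", 3), ("P3121", 6)):
    for n_copies in (1, 2, 3):
        vm, solvent = matthews(A, C, n_symops, n_copies, MW)
        print(f"{name:6s} copies={n_copies}  Vm={vm:5.2f}  solvent={solvent:5.2f}")
```

Going from P3 to P3121 doubles the number of symmetry operators, so the same number of copies per asymmetric unit halves Vm. That is the crowding argument made in the post.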
________________________________________
From: Terwilliger, Thomas Charles
Sent: Tuesday, August 26, 2014 10:13 AM
To: Ni, Fengyun
Cc: PHENIX user mailing list; Terwilliger, Thomas Charles
Subject: Re: [phenixbb] help on MIR dataset

Hi Fengyun,

Some things to try:

1. Are your data twinned (look at the intensity moments in the xtriage output)?
2. Try space groups P3121 (and P3221 and P321) in addition to P3.
3. Take your best MR solution (or any MR solutions) and calculate phases. Use those phases as input phases to autosol and it will try to find sites using those phases. If you then get back a map that correlates with these phases (use get_cc_mtz_mtz to check), then you probably have a solution (autosol will only use those input phases to find the sites, not in phasing).
4. Try each of your datasets separately as SAD datasets. You can use MR-SAD in autosol (similar to #3 above but using phaser to calculate phases).

All the best,
Tom T

_______________________________________________
phenixbb mailing list
http://phenix-online.org/mailman/listinfo/phenixbb
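The twinning check in suggestion 1 refers to the second moment of the acentric intensities, <I^2>/<I>^2, which is expected to be about 2.0 for untwinned data and about 1.5 for a perfect twin. A minimal Python sketch of that statistic follows, using simulated intensities rather than real data. xtriage computes this and other twinning tests itself, so this only illustrates the quantity and is not xtriage's implementation.

```python
import random


def second_moment(intensities):
    """<I^2> / <I>^2 for a list of acentric intensities."""
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_i2 = sum(i * i for i in intensities) / n
    return mean_i2 / (mean_i * mean_i)


# Simulated data (assumption): untwinned acentric intensities follow an
# exponential (Wilson) distribution; a perfect hemihedral twin averages two
# independent twin-related intensities.
rng = random.Random(0)
untwinned = [rng.expovariate(1.0) for _ in range(20000)]
twinned = [0.5 * (rng.expovariate(1.0) + rng.expovariate(1.0)) for _ in range(20000)]

print(f"untwinned    <I^2>/<I>^2 = {second_moment(untwinned):.2f}")
print(f"perfect twin <I^2>/<I>^2 = {second_moment(twinned):.2f}")
```

Values falling between the two limits would suggest partial twinning, which is relevant here because P3 data can mimic higher symmetry such as P3121 when twinned.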
________________________________________
From: Ni, Fengyun
Sent: Friday, October 24, 2014 9:51 AM
To: Terwilliger, Thomas Charles
Cc: PHENIX user mailing list
Subject: RE: [phenixbb] help on MIR dataset

Hi Tom,

Thank you very much for the suggestions! It took me some time to try all of them.

I can now get a solution for an Hg-derivative dataset at very low resolution (5 A), with the following statistics:

                     SITE  ATOM   OCCUP     X       Y       Z        B
    CURRENT VALUES:    1    Hg   0.8672  0.0160  0.7559  0.2200  48.4721
    CURRENT VALUES:    2    Hg   1.0700  0.4392  0.3669  0.2884  60.0000
    CURRENT VALUES:    3    Hg   0.0196  0.5968  0.9263  0.1392   1.0000
    CURRENT VALUES:    4    Hg   0.1662  0.1408  0.6124  0.2366   6.4178
    CURRENT VALUES:    5    Hg   0.3109  0.1818  0.3807  0.2923  56.7720

    Solution # 1  BAYES-CC: 52.8 +/- 20.8 (2SD)  Dataset #1  FOM: 0.62
    Score type:      SKEW   CORR_RMS  NCS_OVERLAP
    Raw scores:      0.23     0.85       0.00
    100x EST OF CC:  52.84   49.93      31.25

The resulting map looks reasonable: several long continuous stretches of density can be located, and it roughly resembles what I expect my protein to look like. My native data only diffract to 3.1 A, and model building based on the above solution (after density modification) could only build some fragments without alpha/beta secondary structure, though R/Rfree is around 0.38/0.47.

I also have a Pb derivative, for which the following solution was found at 4.4 A:

                     SITE  ATOM   OCCUP     X       Y       Z        B
    CURRENT VALUES:    1    Pb   0.2317  0.0643  0.6622  0.1180  60.0000
    CURRENT VALUES:    2    Pb   0.0534  0.9454  0.5850  0.2217   1.0000

    Solution # 2  BAYES-CC: 54.3 +/- 19.2 (2SD)  Dataset #1  FOM: 0.39
    Score type:      SKEW   CORR_RMS  NCS_OVERLAP
    Raw scores:      0.24     0.69       0.00
    100x EST OF CC:  54.29   18.27      31.25

The map from the Pb derivative is not as good as that from the Hg derivative, and the two could not be combined when running autosol with both datasets together. I read in Phenix that there is an autobuild_parallel version for building from bad maps, with options such as building with BUCCANEER. Does anyone have suggestions on running this? Any suggestions for what I could try next are welcome. Thank you very much!

Fengyun
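The Hg site list in the follow-up message includes sites refined to extreme values (site 3 has an occupancy of 0.0196 and a B-factor of 1.0). A small Python sketch for pulling such sites out of the quoted "CURRENT VALUES" lines is given below. The 0.1 occupancy cutoff is an arbitrary assumption chosen for illustration, not a Phenix criterion, and the dictionary keys are hypothetical names.

```python
# Site lines copied from the Hg solution quoted in the thread.
LOG = """\
CURRENT VALUES: 1 Hg 0.8672 0.0160 0.7559 0.2200 48.4721
CURRENT VALUES: 2 Hg 1.0700 0.4392 0.3669 0.2884 60.0000
CURRENT VALUES: 3 Hg 0.0196 0.5968 0.9263 0.1392 1.0000
CURRENT VALUES: 4 Hg 0.1662 0.1408 0.6124 0.2366 6.4178
CURRENT VALUES: 5 Hg 0.3109 0.1818 0.3807 0.2923 56.7720
"""


def parse_sites(text):
    """Parse 'CURRENT VALUES:' lines into a list of site dictionaries."""
    sites = []
    for line in text.splitlines():
        if not line.startswith("CURRENT VALUES:"):
            continue
        fields = line.split(":", 1)[1].split()
        occ, x, y, z, b = (float(v) for v in fields[2:7])
        sites.append({"site": int(fields[0]), "atom": fields[1],
                      "occ": occ, "xyz": (x, y, z), "b": b})
    return sites


def weak_sites(sites, min_occupancy=0.1):
    """Site numbers whose occupancy is below the (assumed) cutoff."""
    return [s["site"] for s in sites if s["occ"] < min_occupancy]


sites = parse_sites(LOG)
print("sites parsed:", len(sites))
print("low-occupancy sites:", weak_sites(sites))
```

A site flagged this way is not necessarily wrong, but it is a natural candidate to leave out when testing whether the phasing statistics hold up.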
________________________________________
From: Terwilliger, Thomas Charles
Sent: Friday, October 24, 2014 12:59 PM
To: Ni, Fengyun
Cc: PHENIX user mailing list; Terwilliger, Thomas Charles
Subject: RE: [phenixbb] help on MIR dataset

Hi Fengyun,

It sounds as though your Hg dataset is very promising. Here's what I would try:

1. Take the results from your Hg autosol run (overall_best_denmod_map_coeffs.mtz; overall_best_ha.pdb; overall_best_refine_data.mtz) and put them into autobuild along with hires_file=my_native_3A_data.mtz and your sequence file. (Perhaps you already tried this one.)

2. Alternative: find the sites in the Pb using the Hg solution. Run autosol with the Pb data and the keywords:

    input_part_map_coeffs_file=AutoSol_run_xxxx_/overall_best_denmod_map_coeffs.mtz
        (xxxx = your autosol run with Hg)
    phaser_sites_then_phase=True
        (find sites using that map file, then phase only using the Pb phases)

   Then compare the map from the Hg with the Pb map with:

    phenix.get_cc_mtz_mtz AutoSol_run_xxxx_/overall_best_denmod_map_coeffs.mtz AutoSol_run_yyyy_/overall_best_denmod_map_coeffs.mtz

   If they are similar (cc>0.2 or so) then you are on the right track with both maps. You can combine them by running autosol one more time (with a parameters file something like this):

    autosol {
      seq_file = seq.dat
      wavelength {
        group = 1
        data = Hg.sca
        lambda = 1.1
        atom_type = Hg
        f_prime = -7.
        f_double_prime = 4.5
        sites_file = hg.pdb
      }
      wavelength {
        group = 2
        data = Pb.sca
        lambda = 1.1
        atom_type = Pb
        f_prime = -7.
        f_double_prime = 4.5
        sites_file = pb.pdb
      }
    }

All the best,
Tom T
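The Hg-versus-Pb comparison suggested in this thread rests on a correlation coefficient between two maps, with cc>0.2 or so taken as a sign that both are on the right track. As a rough illustration of that quantity only, here is a minimal Python sketch of a Pearson correlation over two density grids flattened to lists. phenix.get_cc_mtz_mtz works from map coefficients and can account for an origin offset between the maps, which this toy does not attempt, and the sample density values are invented.

```python
import math


def map_cc(rho_a, rho_b):
    """Pearson correlation between two equally sampled density maps."""
    if len(rho_a) != len(rho_b):
        raise ValueError("maps must have the same number of grid points")
    n = len(rho_a)
    mean_a = sum(rho_a) / n
    mean_b = sum(rho_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rho_a, rho_b))
    var_a = sum((a - mean_a) ** 2 for a in rho_a)
    var_b = sum((b - mean_b) ** 2 for b in rho_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a flat map has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Invented density values standing in for the Hg- and Pb-phased maps.
hg_map = [0.1, 0.9, 0.4, -0.2, 0.7, 0.0]
pb_map = [0.0, 0.6, 0.5, -0.1, 0.3, 0.2]

cc = map_cc(hg_map, pb_map)
verdict = "maps agree" if cc > 0.2 else "maps look unrelated"
print(f"cc = {cc:.2f} ({verdict})")
```

The 0.2 threshold in the sketch is the rule of thumb quoted in the thread, not a value derived here.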
________________________________________
From: Ni, Fengyun
Sent: Friday, October 24, 2014 2:51 PM
To: Terwilliger, Thomas Charles
Cc: PHENIX user mailing list
Subject: RE: [phenixbb] help on MIR dataset

Hi Tom,

Thank you for your suggestions! I did try your suggestion #1 with the older Phenix 1.8, but the resulting model does not make much sense. I have just installed the official Phenix 1.9 to try your suggestion #2 with the following parameters, but I get the error message shown at the end. Please help me. Thanks again!

    autosol {
      seq_file = ../seq-prot.dat
      crystal_info {
        solvent_fraction = 0.62
      }
      native {
        data = ../nat.mtz
      }
      deriv {
        data = ../pb1-all-4.4.mtz
        atom_type = Pb
        lambda = 0.9464
      }
      phasing {
        input_part_map_coeffs_file = overall_best_denmod_map_coeffs_from6.mtz
        input_part_map_coeffs_labels = "FWT,PHWT"
        phaser_sites_then_phase = True
      }
    }

    ......
    Set try_orig_sad_data_in_hyss=False as thoroughness=quick and extreme_dm=False
    Setting ha_iteration to True by default as this is mir/mad
    Not truncating ha sites at start of resolve as this is not PHASER SAD phasing
    Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing
    Sorry: Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing
    ------end of error message
Hi Fengyun, Oh right...I forgot you can only use the density from a map as phase information for SAD phasing. Instead try these keywords: input_phase_file=overall_best_denmod_map_coeffs_from6.mtz input_phase_labels="FWT,PHWT" Let me know if that doesn't do it! All the best, Tom T ________________________________________ From: Ni, Fengyun [[email protected]] Sent: Friday, October 24, 2014 2:51 PM To: Terwilliger, Thomas Charles Cc: PHENIX user mailing list Subject: RE: [phenixbb] help on MIR dataset Hi Tom, Thank you for your suggestions! I did try your #1 advice with the old version phenix1.8. The resulting model does not make much sense. I just install the official phenix 1.9 to run you #2 advice with the following commands but getting error message at the end of file. Please help me. Thanks again! autosol { seq_file = ../seq-prot.dat crystal_info { solvent_fraction = 0.62 } native { data = ../nat.mtz } deriv { data = ../pb1-all-4.4.mtz atom_type = Pb lambda = 0.9464 } phasing { input_part_map_coeffs_file = overall_best_denmod_map_coeffs_from6.mtz input_part_map_coeffs_labels = "FWT,PHWT" phaser_sites_then_phase = True } } ...... Set try_orig_sad_data_in_hyss=False as thoroughness=quick and extreme_dm=False Setting ha_iteration to True by default as this is mir/mad Not truncating ha sites at start of resolve as this is not PHASER SAD phasing Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing Sorry: Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing ------end of error message ________________________________________ From: Terwilliger, Thomas Charles [[email protected]] Sent: Friday, October 24, 2014 12:59 PM To: Ni, Fengyun Cc: PHENIX user mailing list; Terwilliger, Thomas Charles Subject: RE: [phenixbb] help on MIR dataset Hi Fengyun, It sounds as though your Hg dataset is very promising. Here's what I would try: 1. 
take the results from your Hg autosol run (overall_best_denmod_map_coeffs.mtz; overall_best_ha.pdb; overall_best_refine_data.mtz) and put them into autobuild along with hires_file=my_native_3A_data.mtz and your sequence file. (Perhaps you already tried this one).

2. Alternative: find the sites in the Pb using the Hg solution. Run autosol with the Pb data and the keywords:

    input_part_map_coeffs_file=AutoSol_run_xxxx_/overall_best_denmod_map_coeffs.mtz
        (xxxx=your autosol run with Hg)
    phaser_sites_then_phase=True
        (find sites using that map file, then phase only using the Pb phases)

Then compare the map from the Hg with the Pb map with:

    phenix.get_cc_mtz_mtz AutoSol_run_xxxx_/overall_best_denmod_map_coeffs.mtz AutoSol_run_yyyy_/overall_best_denmod_map_coeffs.mtz

If they are similar (cc>0.2 or so) then you are on the right track with both maps. You can combine them by running autosol one more time (with a parameters file something like this):

    autosol {
      seq_file = seq.dat
      wavelength {
        group = 1
        data = Hg.sca
        lambda = 1.1
        atom_type = Hg
        f_prime = -7.
        f_double_prime = 4.5
        sites_file = hg.pdb
      }
      wavelength {
        group = 2
        data = Pb.sca
        lambda = 1.1
        atom_type = Pb
        f_prime = -7.
        f_double_prime = 4.5
        sites_file = pb.pdb
      }
    }

All the best,
Tom T

________________________________________
From: Ni, Fengyun
Sent: Friday, October 24, 2014 9:51 AM
To: Terwilliger, Thomas Charles
Cc: PHENIX user mailing list
Subject: RE: [phenixbb] help on MIR dataset

Hi Tom,

Thank you very much for the suggestions! It took some time for me to try all of them. Now I can get a solution for an Hg-derivative dataset at very low resolution (5 A) with the following statistics,

                     SITE  ATOM   OCCUP       X        Y        Z         B
    CURRENT VALUES:    1    Hg   0.8672   0.0160   0.7559   0.2200   48.4721
    CURRENT VALUES:    2    Hg   1.0700   0.4392   0.3669   0.2884   60.0000
    CURRENT VALUES:    3    Hg   0.0196   0.5968   0.9263   0.1392    1.0000
    CURRENT VALUES:    4    Hg   0.1662   0.1408   0.6124   0.2366    6.4178
    CURRENT VALUES:    5    Hg   0.3109   0.1818   0.3807   0.2923   56.7720

    Solution # 1  BAYES-CC: 52.8 +/- 20.8 (2SD)  Dataset #1  FOM: 0.62
    Score type:        SKEW   CORR_RMS   NCS_OVERLAP
    Raw scores:        0.23     0.85        0.00
    100x EST OF CC:   52.84    49.93       31.25

The resulting map looks reasonable, in that several long continuous stretches of density can be located and it looks somewhat like what my protein should look like. My native data only diffract to 3.1 A, and model building based on the above solution (after density modification) could only build some fragments without alpha/beta secondary structure, though R/Rfree is around 0.38/0.47.

I have another derivative, Pb, for which the following solution was found at 4.4 A,

                     SITE  ATOM   OCCUP       X        Y        Z         B
    CURRENT VALUES:    1    Pb   0.2317   0.0643   0.6622   0.1180   60.0000
    CURRENT VALUES:    2    Pb   0.0534   0.9454   0.5850   0.2217    1.0000

    Solution # 2  BAYES-CC: 54.3 +/- 19.2 (2SD)  Dataset #1  FOM: 0.39
    Score type:        SKEW   CORR_RMS   NCS_OVERLAP
    Raw scores:        0.24     0.69        0.00
    100x EST OF CC:   54.29    18.27       31.25

The map from the Pb derivative is not as good as that from the Hg derivative, and it could not be combined with the Hg solution when running autosol with both datasets together.

I read in phenix that there is an autobuild_parallel version for building from bad maps, with options such as building with BUCCANEER. Does anyone have suggestions on running this? Any suggestions for what I could try next are welcome. Thank you very much!

Fengyun

________________________________________
From: Terwilliger, Thomas Charles
Sent: Tuesday, August 26, 2014 10:13 AM
To: Ni, Fengyun
Cc: PHENIX user mailing list; Terwilliger, Thomas Charles
Subject: Re: [phenixbb] help on MIR dataset

Hi Fengyun,

Some things to try:

1. Are your data twinned (look at intensity moments in xtriage output)?
2. Try space groups P3121 (and P3221 and P321) in addition to P3.
3. Take your best MR solution (or any MR solutions) and calculate phases. Use those phases as input phases to autosol and it will try to find sites using those phases. If you then get back a map that has correlation with these phases (use get_cc_mtz_mtz to check) then you probably have a solution (autosol will only use those input phases to find the sites, not in phasing).
4. Try each of your datasets separately as SAD datasets. You can use MR SAD in autosol (similar to #2 above but using phaser to calculate phases).

All the best,
Tom T

On Aug 25, 2014, at 10:07 PM, Ni, Fengyun wrote:
_______________________________________________
phenixbb mailing list
http://phenix-online.org/mailman/listinfo/phenixbb
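A note on the twinning check suggested in the Aug 26 reply above: for acentric reflections, the second moment of the intensities, <I^2>/<I>^2, is expected to be about 2.0 for untwinned data and about 1.5 for a perfect hemihedral twin. The sketch below only illustrates where those two numbers come from, using simulated intensities. It is not how xtriage is implemented; the function name `second_moment` and the simulated data are assumptions made for illustration, and real data would first need normalization in resolution shells, which this sketch skips.

```python
import random


def second_moment(intensities):
    """Return <I^2>/<I>^2 for a list of intensities."""
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_i2 = sum(i * i for i in intensities) / n
    return mean_i2 / (mean_i * mean_i)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
n_refl = 20000          # hypothetical number of simulated reflections

# Untwinned acentric intensities follow an exponential (Wilson) distribution.
untwinned = [rng.expovariate(1.0) for _ in range(n_refl)]

# A perfect hemihedral twin averages two independent twin-related intensities.
twinned = [0.5 * (rng.expovariate(1.0) + rng.expovariate(1.0))
           for _ in range(n_refl)]

print("untwinned    <I^2>/<I>^2: %.2f" % second_moment(untwinned))
print("perfect twin <I^2>/<I>^2: %.2f" % second_moment(twinned))
```

A value drifting from 2.0 toward 1.5 in the intensity statistics would be a reason to look at twinning more closely before trusting any one space-group choice.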
Hi Tom,

Thanks so much! It works this time. However, I searched the Pb dataset at different resolution cutoffs, and the highest correlation with the Hg map is only 0.13. Maybe I am not lucky enough to combine these two datasets. Any further suggestions on model building at such low resolution (3.1 A for native)? Thanks again!

Regards,
Fengyun

________________________________________
From: Terwilliger, Thomas Charles
Sent: Friday, October 24, 2014 5:05 PM
To: Ni, Fengyun
Cc: PHENIX user mailing list; Terwilliger, Thomas Charles
Subject: RE: [phenixbb] help on MIR dataset

Hi Fengyun,

Oh right...I forgot you can only use the density from a map as phase information for SAD phasing. Instead try these keywords:

    input_phase_file=overall_best_denmod_map_coeffs_from6.mtz
    input_phase_labels="FWT,PHWT"

Let me know if that doesn't do it!

All the best,
Tom T

________________________________________
From: Ni, Fengyun
Sent: Friday, October 24, 2014 2:51 PM
To: Terwilliger, Thomas Charles
Cc: PHENIX user mailing list
Subject: RE: [phenixbb] help on MIR dataset

Hi Tom,

Thank you for your suggestions! I did try your #1 advice with the old version, phenix 1.8. The resulting model does not make much sense. I just installed the official phenix 1.9 to run your #2 advice with the following parameters, but I get the error message shown at the end. Please help me. Thanks again!

    autosol {
      seq_file = ../seq-prot.dat
      crystal_info {
        solvent_fraction = 0.62
      }
      native {
        data = ../nat.mtz
      }
      deriv {
        data = ../pb1-all-4.4.mtz
        atom_type = Pb
        lambda = 0.9464
      }
      phasing {
        input_part_map_coeffs_file = overall_best_denmod_map_coeffs_from6.mtz
        input_part_map_coeffs_labels = "FWT,PHWT"
        phaser_sites_then_phase = True
      }
    }

    ......
    Set try_orig_sad_data_in_hyss=False as thoroughness=quick and extreme_dm=False
    Setting ha_iteration to True by default as this is mir/mad
    Not truncating ha sites at start of resolve as this is not PHASER SAD phasing
    Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing
    Sorry: Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing
    ------end of error message
Hi Fengyun,

That's a pretty low CC between maps of 0.13...but it might be worth a try at combining the two datasets anyhow...

Meanwhile I'd suggest going ahead with your Hg solution. If you have a lot of computing power you might try phenix.parallel_autobuild, or if not, just run autobuild a few times, taking the map (and not the model) at the end of each run and feeding it back into the next run.

All the best,
Tom T

________________________________________
From: Ni, Fengyun
Sent: Friday, October 24, 2014 7:34 PM
To: Terwilliger, Thomas Charles
Cc: PHENIX user mailing list
Subject: RE: [phenixbb] help on MIR dataset

Hi Tom,

Thanks so much! It works this time. However, I searched the Pb dataset at different resolution cutoffs, and the highest correlation with the Hg map is only 0.13. Maybe I am not lucky enough to combine these two datasets. Any further suggestions on model building at such low resolution (3.1 A for native)? Thanks again!

Regards,
Fengyun
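For readers wondering how to read the 0.13 figure against the "cc>0.2 or so" rule of thumb quoted earlier in the thread: a map correlation can be thought of as an ordinary Pearson correlation between two density maps sampled on the same grid. The sketch below is only an illustration under that assumption, using simulated values. It does not read MTZ files and does not try alternative origins, so it is not a stand-in for phenix.get_cc_mtz_mtz; the function `map_cc` and the variables `map_hg`, `map_pb` and `unrelated` are hypothetical names.

```python
import math
import random


def map_cc(rho_a, rho_b):
    """Pearson correlation between two equal-length lists of density values."""
    if len(rho_a) != len(rho_b):
        raise ValueError("maps must be sampled on the same grid")
    n = len(rho_a)
    mean_a = sum(rho_a) / n
    mean_b = sum(rho_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rho_a, rho_b))
    var_a = sum((a - mean_a) ** 2 for a in rho_a)
    var_b = sum((b - mean_b) ** 2 for b in rho_b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a flat map has nothing to correlate
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(1)  # fixed seed so repeated runs give the same numbers
n_points = 5000         # hypothetical number of grid points

# Density shared by both maps, plus independent noise twice as large in each.
shared = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
map_hg = [s + 2.0 * rng.gauss(0.0, 1.0) for s in shared]
map_pb = [s + 2.0 * rng.gauss(0.0, 1.0) for s in shared]

# A map with no shared density at all, for comparison.
unrelated = [rng.gauss(0.0, 1.0) for _ in range(n_points)]

print("related maps:   %.2f" % map_cc(map_hg, map_pb))
print("unrelated maps: %.2f" % map_cc(map_hg, unrelated))
```

With noise at twice the signal level in both maps, the expected correlation is 1/(1+4) = 0.2, so two maps of the same structure can correlate weakly when both are noisy, while unrelated maps scatter around zero.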
Hi Tom, A quick question, for parallel autobuild, if i have three iteration, the final map would be denmod_average_3.mtz? Is this map i used for next run? Thanks! Fengyun ________________________________________ From: Terwilliger, Thomas Charles [[email protected]] Sent: Friday, October 24, 2014 8:58 PM To: Ni, Fengyun Cc: PHENIX user mailing list; Terwilliger, Thomas Charles Subject: RE: [phenixbb] help on MIR dataset Hi Fengyun, That's a pretty low CC between maps of 0.13...but it might be worth a try at combining the two datasets anyhow... Meanwhile I'd suggest going ahead with your Hg solution. If you have a lot of computer you might try phenix.parallel_autobuild, or if not, just run autobuild a few times, taking the map (and not the model) at the end of each run and feeding it back into the next run. All the best, Tom T ________________________________________ From: Ni, Fengyun [[email protected]] Sent: Friday, October 24, 2014 7:34 PM To: Terwilliger, Thomas Charles Cc: PHENIX user mailing list Subject: RE: [phenixbb] help on MIR dataset Hi Tom, Thanks so much! It works this time. However, I search the Pb dataset at different resolution cutoff, the highest correlation with Hg-map is only 0.13. Maybe I am not lucky enough to combine these two datasets. Any further suggestions on the model building on such low resolution (3.1 for native)? Thanks again! Regards, Fengyun ________________________________________ From: Terwilliger, Thomas Charles [[email protected]] Sent: Friday, October 24, 2014 5:05 PM To: Ni, Fengyun Cc: PHENIX user mailing list; Terwilliger, Thomas Charles Subject: RE: [phenixbb] help on MIR dataset Hi Fengyun, Oh right...I forgot you can only use the density from a map as phase information for SAD phasing. Instead try these keywords: input_phase_file=overall_best_denmod_map_coeffs_from6.mtz input_phase_labels="FWT,PHWT" Let me know if that doesn't do it! 
All the best, Tom T ________________________________________ From: Ni, Fengyun [[email protected]] Sent: Friday, October 24, 2014 2:51 PM To: Terwilliger, Thomas Charles Cc: PHENIX user mailing list Subject: RE: [phenixbb] help on MIR dataset Hi Tom, Thank you for your suggestions! I did try your #1 advice with the old version phenix1.8. The resulting model does not make much sense. I just install the official phenix 1.9 to run you #2 advice with the following commands but getting error message at the end of file. Please help me. Thanks again! autosol { seq_file = ../seq-prot.dat crystal_info { solvent_fraction = 0.62 } native { data = ../nat.mtz } deriv { data = ../pb1-all-4.4.mtz atom_type = Pb lambda = 0.9464 } phasing { input_part_map_coeffs_file = overall_best_denmod_map_coeffs_from6.mtz input_part_map_coeffs_labels = "FWT,PHWT" phaser_sites_then_phase = True } } ...... Set try_orig_sad_data_in_hyss=False as thoroughness=quick and extreme_dm=False Setting ha_iteration to True by default as this is mir/mad Not truncating ha sites at start of resolve as this is not PHASER SAD phasing Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing Sorry: Sorry, you can only use an MR model to find sites with a single dataset and Phaser SAD phasing ------end of error message ________________________________________ From: Terwilliger, Thomas Charles [[email protected]] Sent: Friday, October 24, 2014 12:59 PM To: Ni, Fengyun Cc: PHENIX user mailing list; Terwilliger, Thomas Charles Subject: RE: [phenixbb] help on MIR dataset Hi Fengyun, It sounds as though your Hg dataset is very promising. Here's what I would try: 1. take the results from your Hg autosol run (overall_best_denmod_map_coeffs.mtz; overall_best_ha.pdb; overall_best_refine_data.mtz) and put them into autobuild along with hires_file=my_native_3A_data.mtz and your sequence file. (Perhaps you already tried this one). 2. 
Alternative: find the sites in the Pb using the Hg solution: Run autosol with the Pb data and the keywords: input_part_map_coeffs_file=AutoSol_run_xxxx_/overall_best_denmod_map_coeffs.mtz (xxx=your autosol run with Hg) phaser_sites_then_phase=True (find sites using that map file, then phase only using the Pb phases) Then compare the map from the Hg with the Pb map with: phenix.get_cc_mtz_mtz AutoSol_run_xxxx_/overall_best_denmod_map_coeffs.mtz AutoSol_run_yyyy_/overall_best_denmod_map_coeffs.mtz If they are similar (cc>0.2 or so) then you are on the right track with both maps. You can combine them by running autosol one more time (with a parameters file something like this): autosol { seq_file = seq.dat wavelength { group = 1 data = Hg.sca lambda = 1.1 atom_type = Hg f_prime = -7. f_double_prime = 4.5 sites_file = hg.pdb } wavelength { group = 2 data = Pb.sca lambda = 1.1 atom_type = Pb f_prime = -7. f_double_prime = 4.5 sites_file = pb.pdb } } All the best, Tom T ________________________________________ From: Ni, Fengyun [[email protected]] Sent: Friday, October 24, 2014 9:51 AM To: Terwilliger, Thomas Charles Cc: PHENIX user mailing list Subject: RE: [phenixbb] help on MIR dataset Hi Tom, Thank you very much for the suggestions! It takes some time for me to try all your suggestions. 
Now I could get a solution for a Hg-derivative dataset at very low resolution (5 A) with the following statistics, SITE ATOM OCCUP X Y Z B CURRENT VALUES: 1 Hg 0.8672 0.0160 0.7559 0.2200 48.4721 CURRENT VALUES: 2 Hg 1.0700 0.4392 0.3669 0.2884 60.0000 CURRENT VALUES: 3 Hg 0.0196 0.5968 0.9263 0.1392 1.0000 CURRENT VALUES: 4 Hg 0.1662 0.1408 0.6124 0.2366 6.4178 CURRENT VALUES: 5 Hg 0.3109 0.1818 0.3807 0.2923 56.7720 Solution # 1 BAYES-CC: 52.8 +/- 20.8 (2SD) Dataset #1 FOM: 0.62 Score type: SKEW CORR_RMS NCS_OVERLAP Raw scores: 0.23 0.85 0.00 100x EST OF CC: 52.84 49.93 31.25 The resulting map looks reasonable for the fact that several long consecutive densities could be located and it somehow looks like what my protein should be like. My native data only diffract to 3.1 A, and the model building based on the above solution (after density modification) could only build some fragment without alpha/beta secondary structures, though R/Rfree is around 0.38/0.47. I have another Pb-derivative, it could find a following solution at 4.4 A, SITE ATOM OCCUP X Y Z B CURRENT VALUES: 1 Pb 0.2317 0.0643 0.6622 0.1180 60.0000 CURRENT VALUES: 2 Pb 0.0534 0.9454 0.5850 0.2217 1.0000 Solution # 2 BAYES-CC: 54.3 +/- 19.2 (2SD) Dataset #1 FOM: 0.39 Score type: SKEW CORR_RMS NCS_OVERLAP Raw scores: 0.24 0.69 0.00 100x EST OF CC: 54.29 18.27 31.25 The map from Pb-derivative is not as good as that from Hg-derivative. And it could not be combined with Hg-solution when running autosol together for both datasets. I read from phenix that there's autobuild_parallel version for building from bad maps with some options like building with BUCCANEER, does anyone have suggestions on running this? Any suggestions for what I could try next are welcome. Thank you very much! 
Fengyun ________________________________________ From: Terwilliger, Thomas Charles [[email protected]] Sent: Tuesday, August 26, 2014 10:13 AM To: Ni, Fengyun Cc: PHENIX user mailing list; Terwilliger, Thomas Charles Subject: Re: [phenixbb] help on MIR dataset Hi Fengyun, Some things to try: 1. Are your data twinned (look at intensity moments in xtriage output)? 2. Try space groups P3121 (and P3221 and P321) in addition to P3. 3. Take your best MR solution (or any MR solutions) and calculate phases. Use those phases as input phases to autosol and it will try to find sites using those phases. If you then get back a map that has correlation with these phases (use get_cc_mtz_mtz to check) then you probably have a solution (autosol will only use those input phases to find the sites, not in phasing). 4. Try each of your datasets separately as SAD datasets. You can use MR SAD in autosol (similar to #2 above but using phaser to calculate phases). All the best, Tom T On Aug 25, 2014, at 10:07 PM, Ni, Fengyun wrote:
Hi everyone,
I am working on a MIR dataset: one native dataset, three EMTS-soaked datasets at different concentrations, and one PbAc2 dataset. My initial trials with phenix autosol did not give positive solutions. I hope I can get some advice from all of you.
I indexed and integrated the data in Mosflm, and the possible space group is P3, though Pointless reports P3121 as the most likely. For now, I have only tried P3 at different resolutions, because a potential MR solution (below 20% identity, TFZ=7.1) was found in space group P31. If P3121 is the real space group, the cell is too crowded to fit my molecules unless they have some unexpected symmetry.
For the differences between the datasets in space group P3, SCALEIT from CCP4 gives the following output:

RMS differences | RMSiso   RMSano
----------------------------------------
FP-hg           | 172.30   22.50
FP-hg2          | 159.20   24.07
FP-hg3          |  85.08   27.92
FP-pb           |  31.15   24.55
The resolution cutoff of the native data is 3.2 A; it is 3.2/3.8/3.5 A for the three EMTS datasets and 3.5 A for the Pb data. I did notice some radiation damage in the EMTS (Hg) data (large negative B-factors in SCALA), but not in the Pb data. The anomalous signal extends only to 5.6 and 4.5 A for the EMTS and Pb data, respectively, as indicated by phenix.xtriage. These are all the heavy-atom data we could obtain, so I tried all of them in phenix.autosol. The input is as follows:
autosol {
  seq_file = ../seq.dat
  crystal_info {
    solvent_fraction = 0.62 #determined from matthews coefficient to assure "reasonable" content in asymmetric unit.
  }
  native {
    data = nat.mtz
  }
  deriv {
    data = hg1.mtz
    atom_type = Hg
    inano = noinano *inano anoonly
    lambda = 1.00394
  }
  deriv {
    data = hg2.mtz
    atom_type = Hg
    inano = noinano *inano anoonly
    lambda = 1.00394
  }
  deriv {
    data = hg3.mtz
    atom_type = Hg
    inano = noinano *inano anoonly
    lambda = 1.00394
  }
  deriv {
    data = pb.mtz
    atom_type = Pb
    inano = noinano *inano anoonly
    lambda = 0.94640
  }
  model_building {
    build = False #my structure is DNA/protein complex, so i choose to stop the building process.
  }
}
I tried different resolutions: 3.6/4.0/4.4/4.8/5.2/5.6/6.0 A. One solution was given based on the Pb data at 4.8 A, with the following statistics:

Solution # 6  BAYES-CC: 59.7 +/- 17.2 (2SD)  Dataset #4  FOM: 0.24
Score type:       SKEW    CORR_RMS  NCS_OVERLAP
Raw scores:       0.31    0.69      0.00
100x EST OF CC:   59.74   17.55     31.25

Refined heavy atom sites (fractional):
       X      Y      Z
xyz    0.033  0.020  0.538
xyz    0.417  0.217  0.113
xyz    0.041  0.010  0.721
The statistics are almost the same for the inverse hand, and I read from the output that the statistics for phasing the Hg datasets from the above solution are very low (BAYES-CC lower than 20). The resulting map does not make much sense yet.
Sorry for the long post. Any suggestion is welcome.

Thank you very much!
Fengyun
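On the twinning question in point 1 of Tom's reply, here is a rough sketch of what the intensity-moment check looks at. For acentric reflections, the second moment <I^2>/<I>^2 is expected to be about 2.0 for untwinned data and about 1.5 for a perfect twin. The snippet below is only an illustration on simulated intensities, not what xtriage itself runs: the function names `second_moment` and `simulate_acentric` are made up for this example, and real data would also need resolution binning and separate handling of centric reflections.

```python
import random
from statistics import fmean


def second_moment(intensities):
    """Return <I^2>/<I>^2 for a collection of acentric intensities."""
    values = list(intensities)
    mean_i = fmean(values)
    mean_i2 = fmean(v * v for v in values)
    return mean_i2 / (mean_i ** 2)


def simulate_acentric(n, twin_fraction=0.0, seed=0):
    """Simulate n acentric intensities (hypothetical helper).

    Untwinned acentric intensities follow an exponential (Wilson)
    distribution. A twinned observation is modelled as a weighted sum
    of two independent intensities related by the twin law.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        i1 = rng.expovariate(1.0)
        i2 = rng.expovariate(1.0)
        observed.append((1.0 - twin_fraction) * i1 + twin_fraction * i2)
    return observed


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5):
        data = simulate_acentric(100000, twin_fraction=alpha, seed=1)
        # Expect values near 2.0 for alpha = 0 and near 1.5 for alpha = 0.5.
        print(f"twin fraction {alpha:.2f}: <I^2>/<I>^2 = {second_moment(data):.3f}")
```

If the value reported for your native and derivative data sits well below 2.0, twinning is worth considering before trusting a P3 versus P3121 assignment.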