Hi,
I'm cross-posting this to the Phenix-BB, because you did this calculation in Phenix, but since the site completion for SAD phasing would be done with Phaser, the answer is relevant to both packages.
Basically, I think Jim Pflugrath's suggestion is correct. When I compare your Xe coordinates with the S and Cl coordinates from one of our test SAD phasing datasets, the majority of your extra sites fall at the positions of either intrinsic S anomalous scatterers or bound halides. The Phaser completion step found them, even though you only asked for 2 sites initially, because it continues adding new sites as long as there are significant peaks in the SAD LLG maps. The refined occupancies make sense, assuming you collected your data at a wavelength of around 1 Å. At that wavelength f" is about 4-5e for Xe and about 0.25-0.3e for S or Cl, and multiplying (say) 4.5e by occupancies of 0.03-0.08 gives effective f" values of 0.14-0.36e, a range that brackets the expected S or Cl values. Of course, Phaser isn't just trying to fit the imaginary part of the scattering, so the occupancies could end up slightly higher because it's trying to account for atoms with 16-17 electrons using partially-occupied atoms with 54 electrons.
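To make the occupancy arithmetic above concrete, here is a minimal sketch. The f" value of 4.5e for Xe is the rough mid-range figure quoted above, not a tabulated value, and the function name is just for illustration.

```python
# Approximate f" for Xe near 1 A wavelength, in electrons (mid-range of the 4-5e quoted above).
F_DOUBLE_PRIME_XE = 4.5


def effective_f_double_prime(occupancy, f_double_prime=F_DOUBLE_PRIME_XE):
    """Anomalous signal carried by a partially occupied site, in electrons."""
    return occupancy * f_double_prime


# Occupancy range of the weaker sites in the posted substructure.
for occupancy in (0.03, 0.08):
    signal = effective_f_double_prime(occupancy)
    print(f"occupancy {occupancy:.2f} -> effective f\" of about {signal:.2f}e")
```

The two ends of the range can then be compared by eye with the 0.25-0.3e expected for a fully occupied S or Cl.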
Knowing that Phaser, once it has found the Xe atoms, will go on to find the weak anomalous scattering from S and Cl, you could tell it to look for S atoms in addition. (At around 1 Å wavelength, there's not enough difference between S and Cl, in either their real or anomalous scattering, to matter, so you don't need to look for Cl as well.) Phaser will reassign atom types based on refined occupancy, so it would likely get most of these right, which would very slightly improve your phasing model (because you would then have the right ratio of real to imaginary scattering). Another advantage of looking for both atom types is that, when you mix atoms with different ratios of real and imaginary scattering, Friedel's law is broken even for the substructure structure factors, which has two consequences. First, the likelihood score will probably be significantly better for the correct enantiomorph than for the incorrect one; second, you're likely to end up with a slightly more complete substructure when working with the correct enantiomorph.
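The point about Friedel's law can be illustrated with a toy calculation (this is not how Phaser computes anything). The sketch below assumes three made-up fractional coordinates, approximates the real part of each scattering factor by the atomic number, ignores f' and the resolution dependence, and uses the rough f" values quoted above.

```python
import cmath

# Illustrative scattering factors near 1 A: real part ~ atomic number, imaginary part ~ f".
F_XE = complex(54.0, 4.5)
F_S = complex(16.0, 0.28)


def structure_factor(hkl, atoms):
    """Sum f_j * exp(2*pi*i*h.x_j) over (fractional_xyz, scattering_factor) pairs."""
    h, k, l = hkl
    total = 0j
    for (x, y, z), f in atoms:
        total += f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
    return total


def bijvoet_difference(hkl, atoms):
    """|F(hkl)| - |F(-h,-k,-l)| for the given substructure."""
    h, k, l = hkl
    return abs(structure_factor((h, k, l), atoms)) - abs(structure_factor((-h, -k, -l), atoms))


# Hypothetical fractional coordinates, chosen only for illustration.
SITES = [(0.10, 0.20, 0.30), (0.40, 0.15, 0.70), (0.25, 0.60, 0.10)]
HKL = (1, 2, 3)

xe_only = [(xyz, F_XE) for xyz in SITES]                        # every site modelled as Xe
mixed = [(SITES[0], F_XE), (SITES[1], F_XE), (SITES[2], F_S)]   # third site modelled as S

print("single atom type:", bijvoet_difference(HKL, xe_only))
print("mixed atom types:", bijvoet_difference(HKL, mixed))
```

With a single atom type the common complex factor f0 + if" comes out of the sum, so the two amplitudes agree to rounding error; with mixed types they differ, and it is that difference that lets the likelihood score tell the two hands apart.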
Note that AutoSol also chose the correct enantiomorph after phasing, by looking at map quality indicators for the two hands.
Regards,
Randy Read
On 2 Feb 2012, at 16:43, Brennan Bonnet wrote:
> Hi all,
>
> I have a strange result using Phenix's AutoSol to look for xenon sites in lysozyme.
>
> For a few months now I have been trying to produce xenon derivatives of lysozyme using pressures in the range of 50-350psi and time ranges between 5-60min trying to find the "sweet spot" so that this technique can be applied to other proteins.
>
> For each dataset, I run AutoSol in Phenix and have it look for 2, 3, and 4 xenon sites. I haven't had much success, though; typical values when it is asked to look for 2 sites have been Rwork/Rfree = 0.5585/0.5892, CC=0.14, Bayes-CC=6.5, with xenon sites:
>
> (a, b, c, alpha, beta, gamma, space group)
> CRYST1 78.870 78.870 36.940 90.00 90.00 90.00 P 41 21 2
>
> (columns after numbering are a, b, c, occ, B factor)
> HETATM 1 XE XE Z 1 48.601 67.475 1.088 0.06 6.58 XE
> HETATM 2 XE XE Z 2 54.986 76.636 2.746 0.11 7.49 XE
>
> which, to me, says that there is no binding (or at least nothing obvious).
>
> I finally got a good result, though, with 350psi and 5 minutes of pressurization. After AutoSol, FOM=0.436, Bayes-CC=33.57, Model CC=0.83, Rwork/Rfree=0.2843/0.3023, 117/125 residues built and placed. And even though I told it to look for only 2 sites, Phenix went ahead and found 22 sites:
>
> (a, b, c, alpha, beta, gamma, space group)
> CRYST1 79.130 79.130 36.920 90.00 90.00 90.00 P 43 21 2
>
> (columns after numbering are a, b, c, occ, B factor)
> HETATM 1 XE XE Z 1 -36.117 -56.764 -1.825 0.18 9.43 XE
> HETATM 2 XE XE Z 2 -54.805 -54.805 0.000 0.14 6.44 XE
> HETATM 3 XE XE Z 3 -8.337 -31.695 -16.766 0.08 8.26 XE
> HETATM 4 XE XE Z 4 -5.702 -64.959 -35.134 0.05 5.21 XE
> HETATM 5 XE XE Z 5 -9.535 -22.397 -31.839 0.06 14.33 XE
> HETATM 6 XE XE Z 6 -7.319 -66.343 -35.123 0.05 6.02 XE
> HETATM 7 XE XE Z 7 -8.580 -21.153 -30.274 0.05 7.43 XE
> HETATM 8 XE XE Z 8 -1.494 -29.583 -2.339 0.06 7.36 XE
> HETATM 9 XE XE Z 9 -0.518 -16.483 -16.776 0.05 6.56 XE
> HETATM 10 XE XE Z 10 -0.464 -29.847 -4.386 0.06 6.04 XE
> HETATM 11 XE XE Z 11 -1.222 -16.234 -18.952 0.04 6.51 XE
> HETATM 12 XE XE Z 12 -6.001 -20.380 -4.532 0.05 7.22 XE
> HETATM 13 XE XE Z 13 -5.169 -27.522 -10.128 0.05 3.54 XE
> HETATM 14 XE XE Z 14 -11.931 -67.039 -4.699 0.05 7.96 XE
> HETATM 15 XE XE Z 15 -6.538 -10.408 -3.761 0.04 10.22 XE
> HETATM 16 XE XE Z 16 -0.458 -23.897 -31.221 0.05 6.85 XE
> HETATM 17 XE XE Z 17 -3.308 -65.181 -8.754 0.03 8.08 XE
> HETATM 18 XE XE Z 18 -11.080 -29.129 -2.144 0.04 5.61 XE
> HETATM 19 XE XE Z 19 -5.720 -73.029 -2.035 0.04 11.24 XE
> HETATM 20 XE XE Z 20 10.236 60.097 33.822 0.03 5.38 XE
> HETATM 21 XE XE Z 21 5.980 68.581 3.532 0.03 7.49 XE
> HETATM 22 XE XE Z 22 11.107 13.277 27.334 0.03 7.10 XE
>
> I'm pretty sure there aren't 22 sites for xenon to bind in lysozyme but it looks to me like the top 2 (maybe top 3) are actual sites which is great.
> The strange part is that when I use the same data and have Phenix look for 3 or 4 sites, it finds only 3 or 4 very low occupancy sites, and the CCs and R factors are again all very poor.
> Has anyone had this problem, or does anyone know what's going on? This is the first success I've had with xenon derivatization, and it seems to me that if Phenix can find 22 sites when asked to find 2, it should have no problem when asked to find 3 or 4.
>
> Thanks in advance,
> ~Brennan~
------
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research Tel: + 44 1223 336500
Wellcome Trust/MRC Building Fax: + 44 1223 336827
Hills Road E-mail: rjr27(a)cam.ac.uk
Cambridge CB2 0XY, U.K. www-structmed.cimr.cam.ac.uk