Python-based Hierarchical ENvironment for Integrated Xtallography
Tutorial 9: Refinement against twinned data
Introduction

This tutorial describes the detection of twinning with phenix.xtriage and subsequent refinement with phenix.refine.

Background and nomenclature

Twinning is a phenomenon in which the crystal used in data collection is composed of several distinct domains whose orientations differ but are related by known (and predictable) operators. The net effect of the presence of multiple lattices is that the recorded data are the sum of a number of diffraction patterns. This tutorial deals with the case in which the twinned crystal consists of two domains. This type of twinning is known as hemihedral twinning. The twinning can be either merohedral (M; the twin related lattices overlap exactly) or pseudo-merohedral (PM; the twin related lattices overlap almost exactly). Classification of the type of twinning (M or PM) can be performed on the basis of group theoretical arguments and is done in phenix.xtriage.

X-ray data collected from a hemihedrally twinned specimen are effectively the sum of two diffraction patterns. The size of the smaller crystal domain relative to the whole crystal is known as the twin fraction, often denoted by α. The operator that relates overlapping Miller indices is known as the twin law. As mentioned above, the effect of twinning on the X-ray data is that the intensity for a given Miller index as seen on the detector has two non-equal contributors:

    J(H)  = (1-α)*I(H)  + α*I(RH)      [eq. 1a]
    J(RH) = (1-α)*I(RH) + α*I(H)       [eq. 1b]

In these expressions, the intensities I of the Miller index H and its twin mate RH build up a single intensity J. The twin law is denoted by R and is used to find RH, the twin related index of H. The twin law R is usually written down in algebraic terms, for example (k,h,-l).

Detection of twinning with phenix.xtriage

The presence of twinning usually reveals itself by intensity statistics that do not fall within the range expected for untwinned data.
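To illustrate why twinning shifts intensity statistics, the following minimal sketch applies eq. 1a to simulated intensities. It assumes ideal acentric (Wilson) statistics, modelled as independent, exponentially distributed intensities for H and RH, and it ignores measurement error. The function name second_moment_ratio is hypothetical and is not part of phenix. Under these assumptions the ratio <I^2>/<I>^2 is expected to be 1 + (1-α)^2 + α^2, that is 2.0 for untwinned data and 1.5 for a perfect twin.

```python
import random


def second_moment_ratio(alpha, n=200_000, seed=0):
    """Return <J^2>/<J>^2 for simulated acentric intensities twinned with fraction alpha.

    Assumption: I(H) and I(RH) are independent draws from an exponential
    distribution with unit mean (ideal Wilson statistics, no errors).
    """
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        i_h = rng.expovariate(1.0)   # untwinned intensity I(H)
        i_rh = rng.expovariate(1.0)  # untwinned intensity of the twin mate I(RH)
        j = (1.0 - alpha) * i_h + alpha * i_rh  # eq. 1a
        total += j
        total_sq += j * j
    mean = total / n
    return (total_sq / n) / (mean * mean)


# Compare the simulated ratio with the analytical expectation for a few twin fractions.
for alpha in (0.0, 0.3, 0.5):
    expected = 1.0 + (1.0 - alpha) ** 2 + alpha ** 2
    print(f"alpha={alpha:.1f}  simulated={second_moment_ratio(alpha):.3f}  expected={expected:.3f}")
```

The simulated ratio falls as the twin fraction grows, which is the kind of departure from untwinned statistics that the tests in phenix.xtriage are designed to pick up.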
phenix.xtriage reports a number of intensity statistics:
Examples

    phenix.xtriage porin.cv xray_data.unit_cell=104.4,104.4,124.25,90,90,120 xray_data.space_group=R3

gives (parts omitted):

    Determining possible twin laws.

    The following twin laws have been found:

    ----------------------------------------------------------------------------------------------------------------
    | Type | Axis   | R metric (%) | delta (le Page) | delta (Lebedev) | Twin law                                  |
    ----------------------------------------------------------------------------------------------------------------
    | M    | 2-fold | 0.000        | 0.000           | 0.000           | -h-k,k,-l                                 |
    | PM   | 2-fold | 2.476        | 1.548           | 0.022           | -h,2/3*h+1/3*k-2/3*l,-2/3*h-4/3*k-1/3*l   |
    | PM   | 4-fold | 2.476        | 1.548           | 0.022           | h+k,-2/3*h-1/3*k+2/3*l,2/3*h-2/3*k+1/3*l  |
    | PM   | 2-fold | 2.476        | 1.548           | 0.022           | -h,1/3*h-1/3*k+2/3*l,2/3*h+4/3*k+1/3*l    |
    | PM   | 3-fold | 2.476        | 1.032           | 0.022           | h+k,-1/3*h-2/3*k-2/3*l,-2/3*h+2/3*k-1/3*l |
    | PM   | 4-fold | 2.476        | 1.548           | 0.022           | -k,1/3*h+2/3*k+2/3*l,-4/3*h-2/3*k+1/3*l   |
    | PM   | 3-fold | 2.476        | 1.032           | 0.022           | -k,-1/3*h+1/3*k-2/3*l,4/3*h+2/3*k-1/3*l   |
    ----------------------------------------------------------------------------------------------------------------
    M:  Merohedral twin law
    PM: Pseudomerohedral twin law

      1 merohedral twin operators found
      6 pseudo-merohedral twin operators found
    In total, 7 twin operator were found

    . . . .

    -------------------------------------------------------------------------------
    Twinning and intensity statistics summary (acentric data):

    Statistics independent of twin laws
      - <I^2>/<I>^2 : 1.667
      - <F>^2/<F^2> : 0.857
      - <|E^2-1|>   : 0.609
      - <|L|>, <L^2>: 0.401, 0.225
      Multivariate Z score L-test: 8.174
      The multivariate Z score is a quality measure of the given
      spread in intensities. Good to reasonable data is expected
      to have a Z score lower than 3.5.
      Large values can indicate twinning, but small values do not
      necessarily exclude it.
    Statistics depending on twin laws
    --------------------------------------------------------------------------------------------------
    | Operator                                  | type | R obs. | Britton alpha | H alpha | ML alpha |
    --------------------------------------------------------------------------------------------------
    | -h-k,k,-l                                 | M    | 0.195  | 0.292         | 0.315   | 0.304    |
    | -h,2/3*h+1/3*k-2/3*l,-2/3*h-4/3*k-1/3*l   | PM   | 0.420  | 0.068         | 0.069   | 0.022    |
    | h+k,-2/3*h-1/3*k+2/3*l,2/3*h-2/3*k+1/3*l  | PM   | 0.410  | 0.079         | 0.084   | 0.022    |
    | -h,1/3*h-1/3*k+2/3*l,2/3*h+4/3*k+1/3*l    | PM   | 0.419  | 0.070         | 0.069   | 0.022    |
    | h+k,-1/3*h-2/3*k-2/3*l,-2/3*h+2/3*k-1/3*l | PM   | 0.413  | 0.078         | 0.079   | 0.022    |
    | -k,1/3*h+2/3*k+2/3*l,-4/3*h-2/3*k+1/3*l   | PM   | 0.415  | 0.070         | 0.075   | 0.022    |
    | -k,-1/3*h+1/3*k-2/3*l,4/3*h+2/3*k-1/3*l   | PM   | 0.415  | 0.074         | 0.077   | 0.022    |
    --------------------------------------------------------------------------------------------------

    Patterson analysis
      - Largest peak height   : 5.689
       (corresponding p value : 7.822e-01)

    The largest off-origin peak in the Patterson function is 5.69% of the
    height of the origin peak. No significant pseudo translation is detected.

    The results of the L-test indicate that the intensity statistics
    are significantly different then is expected from good to reasonable,
    untwinned data.
    As there are twin laws possible given the crystal symmetry, twinning could
    be the reason for the departure of the intensity statistics from normality.
    It might be worthwhile carrying out refinement with a twin specific target function.
    -------------------------------------------------------------------------------

As stated clearly in the summary above, the data are suspected to be twinned. Because the tolerances used in finding twin laws are relatively large, seven twin laws are found: six are pseudo-merohedral and one is merohedral.
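The twin-law-dependent statistics can be illustrated with a small simulation. The sketch below applies the twin law -h-k,k,-l to Miller indices, twins simulated intensities with eq. 1a, and then computes a merging R-value between twin mates, R = sum|J(H)-J(RH)| / sum(J(H)+J(RH)), together with the textbook H-test estimate of the twin fraction, α = 1/2 - <|H|> with H = (J(H)-J(RH)) / (J(H)+J(RH)). These are assumptions in several respects: the exact definitions behind the "R obs." and "H alpha" columns of phenix.xtriage are not given in this tutorial, the simulation ignores crystal symmetry and measurement error, and all function names are hypothetical.

```python
import random
from itertools import product


def apply_twin_law(law, hkl):
    """Map a Miller index H to its twin mate RH for an integer twin law such as "-h-k,k,-l"."""
    h, k, l = hkl
    names = {"h": h, "k": k, "l": l}
    # eval is used only because the law string is a trusted literal in this sketch.
    return tuple(eval(expr, {"__builtins__": {}}, names) for expr in law.split(","))


def simulate_twinned(law, alpha, n=12, seed=0):
    """Return {H: J(H)} for indices in a cubic box, using eq. 1a.

    Assumption: untwinned intensities are independent exponential draws
    (ideal acentric Wilson statistics), with no symmetry equivalents.
    """
    rng = random.Random(seed)
    box = range(-n, n + 1)
    true_i = {hkl: rng.expovariate(1.0) for hkl in product(box, box, box)}
    twinned = {}
    for hkl, i_h in true_i.items():
        mate = apply_twin_law(law, hkl)
        if mate in true_i:  # keep only indices whose twin mate was simulated
            twinned[hkl] = (1.0 - alpha) * i_h + alpha * true_i[mate]
    return twinned


def twin_pair_statistics(twinned, law):
    """Return (merging R between twin mates, H-test estimate of alpha)."""
    sum_diff = sum_total = sum_abs_h = 0.0
    n_pairs = 0
    for hkl, j_h in twinned.items():
        mate = apply_twin_law(law, hkl)
        # Count each pair once and skip indices that are their own twin mate.
        if not hkl < mate or mate not in twinned:
            continue
        j_rh = twinned[mate]
        sum_diff += abs(j_h - j_rh)
        sum_total += j_h + j_rh
        sum_abs_h += abs(j_h - j_rh) / (j_h + j_rh)
        n_pairs += 1
    return sum_diff / sum_total, 0.5 - sum_abs_h / n_pairs


LAW = "-h-k,k,-l"
twinned = simulate_twinned(LAW, alpha=0.3)
r_obs, alpha_h = twin_pair_statistics(twinned, LAW)
print("twin mate of (1, 2, 3):", apply_twin_law(LAW, (1, 2, 3)))
print(f"merging R between twin mates: {r_obs:.3f}")
print(f"H-test estimate of alpha:     {alpha_h:.3f}")
```

Under these idealized assumptions both R and <|H|> tend towards (1-2α)/2, so a low merging R for a given operator goes together with a high estimated twin fraction, which is the pattern seen for the merohedral operator in the table.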
Looking at the R-value analysis in the last table of the xtriage output (column R obs.), the merohedral twin law is the most likely in this case, as its merging R-value is lower than any of the other R-values reported.

A quick model based twinning analysis

In most cases, the presence of twinning does not impede structure solution via molecular replacement (or sometimes even S/MAD). If a molecular replacement solution is available, testing which twin law is the most likely can be done via

    phenix.twin_map_utils xray_data.file=porin.cv model=porin.pdb unit_cell=104.4,104.4,124.25,90,90,120 space_group=R3

This tool performs bulk solvent scaling and R-value calculation for all possible twin laws in the given crystal setting. In this particular case, the twin law (-h-k,k,-l) is the most likely, as it produces the lowest R-value and the lowest refinement target value of the 7 twin laws listed (0.19 vs 0.25).

Selection of a cross validation set

Although the reflection file in this tutorial already has an assigned test set for cross validation purposes, assigning it properly is important. When the data are twinned, each observed reflection has contributions from two (with hemihedral twinning) non-twinned components. When a test set is designed, care must be taken that free and work reflections are not related by a twin law. The R-free set assignment in phenix.refine and phenix.reflection_file_converter is designed with this in mind: the free reflections are chosen to obey the highest possible symmetry of the lattice. Choosing a free set with phenix.refine is as simple as including the keyword

    xray_data.r_free_flags.generate=True

on the command line.

Refinement protocols

Within phenix, various refinement protocols can be followed. A few typical examples are shown below.
Restrained refinement of positional and atomic displacement parameters

Standard restrained refinement of positional and atomic displacement parameters is invoked via

    phenix.refine porin.cv porin.pdb twin_law="-h-k,k,-l" unit_cell=104.4,104.4,124.25,90,90,120 space_group=R3

Refinement is performed in macro cycles, in which either positions, atomic displacement parameters, or twin and bulk solvent parameters are refined.

Rigid body refinement

The command

    phenix.refine porin.cv porin.pdb twin_law="-h-k,k,-l" strategy=rigid_body unit_cell=104.4,104.4,124.25,90,90,120 space_group=R3

will perform rigid body refinement using the twin target function.

TLS

For TLS refinement, it is advisable to construct a small parameter file that contains TLS definitions:

    refinement.refine {
      strategy = *individual_sites rigid_body *individual_adp group_adp *tls \
                 occupancies group_anomalous none
      adp {
        tls = "chain A"
      }
    }
    refinement.twinning {
      twin_law = "-h-k,k,-l"
    }

Saving these parameters as tls.def, one can run

    phenix.refine porin.cv porin.pdb tls.def unit_cell=104.4,104.4,124.25,90,90,120 space_group=R3

Water picking

Ordered solvent can be picked as part of the refinement procedure; more details are available in the phenix.refine manual. Note that the (difference) map used by phenix.refine is constructed using detwinned data (see below). Including water picking in refinement can be carried out as follows:

    phenix.refine porin.cv porin.pdb twin_law="-h-k,k,-l" ordered_solvent=True unit_cell=104.4,104.4,124.25,90,90,120 space_group=R3

Map inspection

Electron density maps during twin refinement are constructed using detwinned data.
The choice of map coefficients and detwinning mode is controlled by a set of parameters, as shown below:

    refinement.twinning {
      twin_law = "-h-k,k,-l"
      detwin {
        mode = algebraic proportional *auto
        local_scaling = False
        map_types {
          twofofc = *two_m_dtfo_d_fc two_dtfo_fc
          fofc = *m_dtfo_d_fc gradient m_gradient
          aniso_correct = False
        }
      }
    }

By default, data are detwinned using algebraic techniques, unless the twin fraction is above 45%, in which case detwinning is performed using proportionality of twin related Icalc values. Detwinning using the proportionality option results in maps that are more biased towards the model, giving seemingly cleaner but ultimately less informative maps. The 2mFo-dFc map coefficients can be chosen to have sigmaA weighting (two_m_dtfo_d_fc) or not (two_dtfo_fc). In both cases, the map coefficients correspond to the 'untwinned' data. A difference map can be constructed using either sigmaA weighted detwinned data (m_dtfo_d_fc), a sigmaA weighted gradient map (m_gradient) or a plain gradient map (gradient). The default is m_dtfo_d_fc, but this can be changed to gradient or m_gradient if desired.
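The switch away from algebraic detwinning at high twin fractions can be motivated with a small sketch. Solving eq. 1a and eq. 1b for the untwinned intensities gives I(H) = ((1-α)*J(H) - α*J(RH)) / (1-2α), and the same expression with H and RH exchanged for I(RH). The code below implements this inversion and shows how an error on J(H) is amplified as α approaches 0.5. It follows directly from the two equations; whether phenix.refine implements algebraic detwinning in exactly this form is an assumption, and the function names are hypothetical.

```python
import math


def twin(i_h, i_rh, alpha):
    """Apply eq. 1a and eq. 1b: return the twinned pair (J(H), J(RH))."""
    return ((1.0 - alpha) * i_h + alpha * i_rh,
            (1.0 - alpha) * i_rh + alpha * i_h)


def detwin_algebraic(j_h, j_rh, alpha):
    """Invert eq. 1a and eq. 1b: return the untwinned pair (I(H), I(RH)).

    The inversion divides by (1 - 2*alpha), so it is undefined for a
    perfect twin (alpha = 0.5) and unstable close to it.
    """
    if not 0.0 <= alpha < 0.5:
        raise ValueError("algebraic detwinning requires 0 <= alpha < 0.5")
    scale = 1.0 - 2.0 * alpha
    i_h = ((1.0 - alpha) * j_h - alpha * j_rh) / scale
    i_rh = ((1.0 - alpha) * j_rh - alpha * j_h) / scale
    return i_h, i_rh


# Round trip for a few twin fractions, plus the effect of a +1.0 error on J(H).
for alpha in (0.10, 0.30, 0.45):
    j_h, j_rh = twin(100.0, 20.0, alpha)
    exact = detwin_algebraic(j_h, j_rh, alpha)
    noisy = detwin_algebraic(j_h + 1.0, j_rh, alpha)
    print(f"alpha={alpha:.2f}  I(H)={exact[0]:.2f}  I(RH)={exact[1]:.2f}  "
          f"error in I(H) from +1.0 on J(H): {noisy[0] - exact[0]:.2f}")
```

An error on J(H) is multiplied by (1-α)/(1-2α) in the detwinned I(H), which grows without bound as α approaches 0.5. This is consistent with using a different detwinning scheme once the twin fraction is very high.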