Hi Lucas,

Your P21 data just look twinned. Have a look at the last section of the xtriage output:


-------------------------------------------------------------------------------
Twinning and intensity statistics summary (acentric data):

Statistics independent of twin laws
  - <I^2>/<I>^2 : 1.915
  - <F>^2/<F^2> : 0.821
  - <|E^2-1|>   : 0.677
  - <|L|>, <L^2>: 0.442, 0.267
       Multivariate Z score L-test: 3.623
       The multivariate Z score is a quality measure of the given
       spread in intensities. Good to reasonable data is expected
       to have a Z score lower than 3.5.
       Large values can indicate twinning, but small values do not
       neccesarily exclude it.


Statistics depending on twin laws
------------------------------------------------------------------
| Operator  | type | R obs. | Britton alpha | H alpha | ML alpha |
------------------------------------------------------------------
| h,-k,-h-l |  PM  | 0.090  | 0.406         | 0.417   | 0.370    |
------------------------------------------------------------------

Patterson analyses
  - Largest peak height   : 6.379
   (correpsonding p value : 6.273e-01)


The largest off-origin peak in the Patterson function is 6.38% of the
height of the origin peak. No significant pseudotranslation is detected.

The results of the L-test indicate that the intensity statistics
are significantly different then is expected from good to reasonable,
untwinned data.
As there are twin laws possible given the crystal symmetry, twinning could
be the reason for the departure of the intensity statistics from normality.
It might be worthwhile carrying out refinement with a twin specific target function.

-------------------------------------------------------------------------------
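For what it is worth, the "H alpha" column in the table above comes from the H-test on twin-related intensity pairs. Below is a minimal Python sketch of that kind of estimate. It assumes acentric pairs and skips the selection of the strongest pairs that xtriage applies; the function names and the simulated data are purely illustrative and are not xtriage code.

```python
import math
import random
from statistics import mean


def h_values(pairs):
    """H = (I1 - I2) / (I1 + I2) for each twin-related intensity pair."""
    return [(i1 - i2) / (i1 + i2) for i1, i2 in pairs if (i1 + i2) > 0]


def twin_fraction_from_mean_abs_h(pairs):
    """Acentric data: <|H|> = 1/2 - alpha, so alpha = 1/2 - <|H|>."""
    return 0.5 - mean(abs(h) for h in h_values(pairs))


def twin_fraction_from_mean_h_sq(pairs):
    """Acentric data: <H^2> = (1 - 2*alpha)^2 / 3, solved for alpha."""
    mean_h_sq = mean(h * h for h in h_values(pairs))
    return 0.5 * (1.0 - math.sqrt(3.0 * mean_h_sq))


def simulate_twinned_pairs(alpha, n_pairs, seed=0):
    """Illustrative test data: mix two independent acentric (exponentially
    distributed) intensities with twin fraction alpha."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        j1 = rng.expovariate(1.0)
        j2 = rng.expovariate(1.0)
        pairs.append(((1.0 - alpha) * j1 + alpha * j2,
                      alpha * j1 + (1.0 - alpha) * j2))
    return pairs


if __name__ == "__main__":
    pairs = simulate_twinned_pairs(alpha=0.3, n_pairs=100000, seed=1)
    print("alpha from <|H|>:", round(twin_fraction_from_mean_abs_h(pairs), 3))
    print("alpha from <H^2>:", round(twin_fraction_from_mean_h_sq(pairs), 3))
```

As a check against the P1 log quoted further down: the mean |H| of 0.053 reported for -h,-k,l gives 0.5 - 0.053 = 0.447, which is the twin fraction listed there.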

You might have NCS parallel to the twin axis. If you run mmtbx.xtwin_map_utils on your P21 data, you get an MTZ file called obs_and_calc.mtz. Push that file through xtriage (make sure to specify the labels for both the observed and the calculated data) to get the RvsR statistic. When you do that, please also give the keyword 'perform=True' on the xtriage command line and send me the resulting log file.
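In case the RvsR statistic is unfamiliar: it compares R_abs_twin = <|I1-I2|>/<|I1+I2|> (and the squared analogue R_sq_twin) over twin-related pairs, once for the observed and once for the calculated intensities. Here is a rough Python sketch, assuming the intensities sit in a plain dictionary keyed by (h, k, l) that already contains both members of each pair under exactly those indices. Real data would first need mapping into the asymmetric unit. The helper names are made up for illustration, and this is not how xtriage does it internally.

```python
def twin_mate(hkl):
    """Apply the twin law h,-k,-h-l from the table above to one Miller index."""
    h, k, l = hkl
    return (h, -k, -h - l)


def r_twin(intensities, twin_law=twin_mate):
    """Return (R_abs_twin, R_sq_twin) over all twin-related pairs found.

    intensities: dict mapping (h, k, l) -> intensity (hypothetical layout).
    """
    sum_abs_diff = sum_abs_total = 0.0
    sum_sq_diff = sum_sq_total = 0.0
    seen = set()
    for hkl, i1 in intensities.items():
        mate = twin_law(hkl)
        if mate == hkl or mate not in intensities:
            continue  # reflection maps onto itself, or its twin mate is missing
        pair = frozenset((hkl, mate))
        if pair in seen:
            continue  # count each pair only once
        seen.add(pair)
        i2 = intensities[mate]
        sum_abs_diff += abs(i1 - i2)
        sum_abs_total += abs(i1 + i2)
        sum_sq_diff += (i1 - i2) ** 2
        sum_sq_total += (i1 + i2) ** 2
    if not seen:
        raise ValueError("no twin-related pairs found")
    return sum_abs_diff / sum_abs_total, sum_sq_diff / sum_sq_total


if __name__ == "__main__":
    toy = {
        (1, 2, 3): 10.0, (1, -2, -4): 6.0,   # twin-related pair
        (2, 0, 1): 5.0, (2, 0, -3): 5.0,     # twin-related pair
        (0, 1, 0): 7.0,                      # mate absent, skipped
    }
    print(r_twin(toy))
```

As I read Lebedev, Vagin & Murshudov (2006), a low R_twin for the observed data together with a clearly higher one for the calculated data points to twinning, whereas low values for both suggest that NCS is mimicking the twin operator.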

I expect the twin refinement will push the R values down a bit.

For future reference, note that xtriage also writes a log file, named logfile.log unless you ask for a different name. It contains some CCP4-style graphs that you can view with loggraph (such as the L, NZ, Britton and H curves).

If the above is too cryptic and you need more info, please let me know.

HTH

Peter









Lucas Bleicher wrote:
Hi, Peter!

I am sending the xtriage output for both space groups.

So, the correct approach would be to use the MTZ
processed in P21 with the twin law from the xtriage
output from this MTZ file?

Thanks in advance,
Lucas



#############################################################
##                      mmtbx.xtriage                      ##
##                                                         ##
##     P.H. Zwart, R.W. Grosse-Kunstleve & P.D. Adams      ##
##                                                         ##
#############################################################

Date 2007-07-31 Time 16:43:49 BRT -0300

##-------------------------------------------##
## WARNING:                                  ##
## Number of residues unspecified            ##
##-------------------------------------------##

Effective parameters:
scaling.input {
  parameters {
    asu_contents {
      n_residues = None
      n_bases = None
      n_copies_per_asu = None
    }
    misc_twin_parameters {
      missing_symmetry {
        tanh_location = 0.08
        tanh_slope = 50
      }
      twinning_with_ncs {
        perform_analyses = False
        n_bins = 7
      }
      twin_test_cuts {
        low_resolution = 10
        high_resolution = None
        isigi_cut = 3
        completeness_cut = 0.85
      }
    }
    reporting {
      verbose = 1
      log = "logfile.log"
      ccp4_style_graphs = True
    }
  }
  xray_data {
    file_name = "LamNatP1High_scala1.mtz"
    obs_labels = "MEAN_lnls,SIGIMEAN_lnls"
    calc_labels = None
    unit_cell = 52.16379929 64.59190369 108.2256012 89.99259949 89.99089813 \
                66.10500336
    space_group = "P 1"
    high_resolution = None
    low_resolution = None
  }
}
<scitbx_array_family_flex_ext.double object at 0xb5e96dac>

Symmetry, cell and reflection file content summary

Miller array info: LamNatP1High_scala1.mtz:IMEAN_lnls,SIGIMEAN_lnls
Observation type: xray.amplitude
Type of data: double, size=94082
Type of sigmas: double, size=94082
Number of Miller indices: 94082
Anomalous flag: False
Unit cell: (52.1638, 64.5919, 108.226, 89.9926, 89.9909, 66.105)
Space group: P 1 (No. 1)
Systematic absences: 0
Centric reflections: 0
Resolution range: 26.0819 1.85
Completeness in resolution range: 0.854134
Completeness with d_max=infinity: 0.853839

##----------------------------------------------------##
##                  Basic statistics                  ##
##----------------------------------------------------##

Matthews coefficient and Solvent content statistics

Number of residues unknown, assuming 50% solvent content

----------------------------------------------------------------
|           Best guess : 1220 residues in the asu              |
----------------------------------------------------------------

Completeness and data strength analyses

  The following table lists the completeness in various resolution
  ranges, after applying a I/sigI cut. Miller indices for which
  individual I/sigI values are larger than the value specified in
  the top row of the table, are retained, while other intensities
  are discarded. The resulting completeness profiles are an indication
  of the strength of the data.

----------------------------------------------------------------------------------------
| Res. Range   | I/sigI>1  | I/sigI>2  | I/sigI>3  | I/sigI>5  | I/sigI>10 | I/sigI>15 |
----------------------------------------------------------------------------------------
| 26.08 - 4.55 | 97.0%     | 96.3%     | 95.5%     | 93.5%     | 87.3%     | 80.3%     |
|  4.55 - 3.62 | 96.5%     | 96.1%     | 95.4%     | 93.8%     | 87.8%     | 81.2%     |
|  3.62 - 3.16 | 95.7%     | 94.5%     | 93.2%     | 89.4%     | 78.1%     | 66.3%     |
|  3.16 - 2.87 | 93.1%     | 90.7%     | 87.1%     | 79.1%     | 60.8%     | 44.5%     |
|  2.87 - 2.67 | 88.4%     | 84.0%     | 78.5%     | 66.7%     | 44.0%     | 27.9%     |
|  2.67 - 2.51 | 85.5%     | 78.6%     | 71.1%     | 57.7%     | 32.6%     | 17.9%     |
|  2.51 - 2.38 | 82.4%     | 74.1%     | 65.3%     | 49.3%     | 24.8%     | 12.2%     |
|  2.38 - 2.28 | 81.6%     | 72.3%     | 62.4%     | 46.0%     | 20.4%     | 9.0%      |
|  2.28 - 2.19 | 78.5%     | 67.6%     | 55.1%     | 37.2%     | 14.8%     | 5.7%      |
|  2.19 - 2.12 | 75.5%     | 61.9%     | 49.0%     | 30.7%     | 9.9%      | 3.5%      |
|  2.12 - 2.05 | 72.1%     | 55.6%     | 41.8%     | 23.7%     | 6.3%      | 1.9%      |
|  2.05 - 1.99 | 68.6%     | 50.2%     | 35.8%     | 17.9%     | 4.1%      | 1.0%      |
|  1.99 - 1.94 | 63.2%     | 42.9%     | 28.5%     | 13.7%     | 3.1%      | 0.5%      |
|  1.94 - 1.89 | 56.6%     | 36.7%     | 23.2%     | 10.4%     | 1.7%      | 0.3%      |
----------------------------------------------------------------------------------------

  The completeness of data for which I/sig(I)>3.00, exceeds 85% for
  for resolution ranges lower than 2.87A.
  The data is cut at this resolution for the potential twin tests
  and intensity statistics.

Maximum likelihood isotropic Wilson scaling
ML estimate of overall B value of LamNatP1High_scala1.mtz:IMEAN_lnls,SIGIMEAN_lnls:
15.59 A**(-2)
Estimated -log of scale factor of LamNatP1High_scala1.mtz:IMEAN_lnls,SIGIMEAN_lnls:
-2.45

Maximum likelihood anisotropic Wilson scaling
ML estimate of overall B_cart value of LamNatP1High_scala1.mtz:IMEAN_lnls,SIGIMEAN_lnls:
15.61,  1.39, -1.33
       11.40,  0.27
              21.10
Equivalent representation as U_cif:
 0.18, -0.04, -0.02
        0.14,  0.00
               0.27

ML estimate of -log of scale factor of LamNatP1High_scala1.mtz:IMEAN_lnls,SIGIMEAN_lnls:
-2.45
Correcting for anisotropy in the data

Some basic intensity statistics follow.

Low resolution completeness analyses

 The following table shows the completeness
 of the data to 5 Angstrom.
unused:         - 26.0822 [  0/38 ] 0.000
bin  1: 26.0822 - 10.5555 [528/541] 0.976
bin  2: 10.5555 -  8.4726 [551/564] 0.977
bin  3:  8.4726 -  7.4299 [530/547] 0.969
bin  4:  7.4299 -  6.7636 [568/582] 0.976
bin  5:  6.7636 -  6.2860 [525/540] 0.972
bin  6:  6.2860 -  5.9200 [547/559] 0.979
bin  7:  5.9200 -  5.6266 [530/546] 0.971
bin  8:  5.6266 -  5.3839 [519/532] 0.976
bin  9:  5.3839 -  5.1783 [568/584] 0.973
bin 10:  5.1783 -  5.0009 [516/532] 0.970
unused:  5.0009 -         [  0/0  ]

Mean intensity analyses
 Analyses of the mean intensity.
 Inspired by: Morris et al. (2004). J. Synch. Rad.11, 56-59.
 The following resolution shells are worrisome:
------------------------------------------------
| d_spacing | z_score | compl. | <Iobs>/<Iexp> |
------------------------------------------------
|     9.993 |    7.15 |   0.97 |         0.426 |
|     8.448 |    7.29 |   0.98 |         0.559 |
|     7.451 |    6.09 |   0.97 |         0.676 |
|     4.303 |    4.64 |   0.97 |         1.223 |
|     3.410 |    4.52 |   0.96 |         1.176 |
|     3.333 |    6.63 |   0.96 |         1.262 |
|     3.261 |    5.36 |   0.96 |         1.205 |
|     3.194 |    5.48 |   0.96 |         1.203 |
|     3.071 |    4.87 |   0.94 |         1.183 |
|     2.962 |    6.43 |   0.94 |         1.245 |
|     2.911 |    4.61 |   0.93 |         1.160 |
|     2.454 |    4.96 |   0.87 |         0.872 |
|     2.425 |    6.89 |   0.87 |         0.825 |
|     2.397 |    4.68 |   0.87 |         0.873 |
|     2.182 |    6.79 |   0.84 |         0.830 |
|     2.142 |    5.54 |   0.84 |         0.856 |
|     1.870 |    5.46 |   0.60 |         1.205 |
|     1.857 |   11.65 |   0.54 |         1.570 |
------------------------------------------------

 Possible reasons for the presence of the reported
 unexpected low or elevated mean intensity in
 a given resolution bin are :
 - missing overloaded or weak reflections
 - suboptimal data processing
 - satelite (ice) crystals
 - NCS
 - translational pseudo symmetry (detected elsewhere)
 - outliers (detected elsewhere)
 - ice rings (detected elsewhere)
 - other problems
 Note that the presence of abnormalities
 in a certain region of reciprocal space might
 confuse the data validation algorithm throughout
 a large region of reciprocal space, even though
 the data is acceptable in those areas.

Possible outliers
  Inspired by: Read, Acta Cryst. (1999). D55, 1759-1764

 Acentric reflections:

-----------------------------------------------------------------
| d_space |      H     K     L  |  |E|  | p(wilson) | p(extreme) |
-----------------------------------------------------------------
|   2.114 |    -18,   -9,   35  |  3.79 |  5.57e-07 |   5.08e-02 |
|   3.066 |    -16,   -8,   12  |  3.93 |  1.90e-07 |   1.77e-02 |
|   3.066 |     16,    8,   12  |  3.95 |  1.69e-07 |   1.57e-02 |
|   2.115 |     18,    9,   35  |  3.79 |  5.65e-07 |   5.15e-02 |
-----------------------------------------------------------------

 p(wilson)  : 1-(1-exp[-|E|^2])
 p(extreme) : 1-(1-exp[-|E|^2])^(n_acentrics)
 p(wilson) is the probability that an E-value of the specified
 value would be observed when it would selected at random from
 the given data set.
 p(extreme) is the probability that the largest |E| value is
 larger or equal then the observed largest |E| value.

 Both measures can be used for outlier detection. p(extreme)
 takes into account the size of the dataset.

 Centric reflections:

            None

Ice ring related problems

 The following statistics were obtained from ice-ring
 insensitive resolution ranges
  mean bin z_score      : 3.46
      ( rms deviation   : 1.89 )
  mean bin completeness : 0.89
     ( rms deviation   : 0.09 )

 The following table shows the z-scores
 and completeness in ice-ring sensitive areas.
 Large z-scores and high completeness in these
 resolution ranges might be a reason to re-assess
 your data processsing if ice rings were present.

------------------------------------------------
| d_spacing | z_score | compl. | Rel. Ice int. |
------------------------------------------------
|     3.897 |    0.51 |   0.97 |         1.000 |
|     3.669 |    1.87 |   0.97 |         0.750 |
|     3.441 |    4.52 |   0.96 |         0.530 |
|     2.671 |    3.84 |   0.89 |         0.170 |
|     2.249 |    3.61 |   0.87 |         0.390 |
|     2.072 |    3.23 |   0.82 |         0.300 |
|     1.948 |    0.62 |   0.76 |         0.040 |
|     1.918 |    0.44 |   0.73 |         0.180 |
|     1.883 |    5.46 |   0.60 |         0.030 |
------------------------------------------------

 Abnormalities in mean intensity or completeness at
 resolution ranges with a relative ice ring intensity
 lower then 0.10 will be ignored.

 No ice ring related problems detected.
 If ice rings were present, the data does not look
 worse at ice ring related d_spacings as compared
 to the rest of the data set

Basic analyses completed

##----------------------------------------------------##
##                 Twinning Analyses                  ##
##----------------------------------------------------##

Using data between 10.00 to 2.87 Angstrom.

Determining possible twin laws.

The following twin laws have been found:

---------------------------------------------------------------------------------
| Type | Axis   | R metric (%) | delta (le Page) | delta (Lebedev) | Twin law   |
---------------------------------------------------------------------------------
|  PM  | 2-fold | 0.009        | 0.010           | 0.000           | -h,-k,l    |
|  PM  | 2-fold | 0.090        | 0.079           | 0.001           | -h,-h+k,-l |
|  PM  | 2-fold | 0.099        | 0.080           | 0.001           | h,h-k,-l   |
---------------------------------------------------------------------------------
M:  Merohedral twin law
PM: Pseudomerohedral twin law

  0 merohedral twin operators found
  3 pseudo-merohedral twin operators found
In total,   3 twin operator were found

Number of centrics  : 0
Number of acentrics : 27599

 Largest patterson peak with length larger then 15 Angstrom

 Frac. coord.        : 0.119 -0.276 -0.000
 Distance to origin  : 16.337
 Height (origin=100) : 5.893
 p_value(height)     : 7.365e-01

   The reported p_value has the following meaning:
     The probability that a peak of the specified height
     or larger is found in a Patterson function of a
     macro molecule that does not have any translational
     pseudo symmetry is equal to 7.365e-01
     p_values smaller then 0.05 might indicate
     weak translation pseudo symmetry, or the self vector of
     a large anomalous scatterer such as Hg, whereas values
     smaller then 1e-3 are a very strong indication for
     the presence of translational pseudo symmetry.

Wilson ratio and moments

Acentric reflections
   <I^2>/<I>^2    :1.932   (untwinned: 2.000; perfect twin 1.500)
   <F>^2/<F^2>    :0.820   (untwinned: 0.785; perfect twin 0.885)
   <|E^2 - 1|>    :0.678   (untwinned: 0.736; perfect twin 0.541)

NZ test (0<=z<1) to detect twinning and possible translational NCS

-----------------------------------------------
| Z   | Nac_obs | Nac_theo | Nc_obs | Nc_theo |
-----------------------------------------------
| 0.0 | 0.000   | 0.000    | 0.000  | 0.000   |
| 0.1 | 0.046   | 0.095    | 0.000  | 0.248   |
| 0.2 | 0.127   | 0.181    | 0.000  | 0.345   |
| 0.3 | 0.210   | 0.259    | 0.000  | 0.419   |
| 0.4 | 0.287   | 0.330    | 0.000  | 0.474   |
| 0.5 | 0.361   | 0.394    | 0.000  | 0.520   |
| 0.6 | 0.427   | 0.451    | 0.000  | 0.561   |
| 0.7 | 0.486   | 0.503    | 0.000  | 0.597   |
| 0.8 | 0.544   | 0.551    | 0.000  | 0.629   |
| 0.9 | 0.597   | 0.593    | 0.000  | 0.657   |
| 1.0 | 0.643   | 0.632    | 0.000  | 0.683   |
-----------------------------------------------
| Maximum deviation acentric      :  0.055    |
| Maximum deviation centric       :  0.683    |
|                                             |
| <NZ(obs)-NZ(twinned)>_acentric  : -0.024    |
| <NZ(obs)-NZ(twinned)>_centric   : -0.467    |
-----------------------------------------------

 L test for acentric data

 using difference vectors (dh,dk,dl) of the form:
 (2hp,2kp,2lp)
 where hp, kp, and lp are random signed integers such that
 2 <= |dh| + |dk| + |dl| <= 8

 Mean |L|   :0.441  (untwinned: 0.500; perfect twin: 0.375)
 Mean  L^2  :0.267  (untwinned: 0.333; perfect twin: 0.200)

 The distribution of |L| values indicates a twin fraction of
 0.08. Note that this estimate is not as reliable as obtained
 via a Britton plot or H-test if twin laws are available.

---------------------------------------------
 Analysing possible twin law :  -h,-k,l
---------------------------------------------

Results of the H-test on a-centric data:

 (Only 50.0% of the strongest twin pairs were used)

mean |H| : 0.053   (0.50: untwinned; 0.0: 50% twinned)
mean H^2 : 0.005   (0.33: untwinned; 0.0: 50% twinned)
Estimation of twin fraction via mean |H|: 0.447
Estimation of twin fraction via cum. dist. of H: 0.449

Britton analyses

  Extrapolation performed on  0.47 < alpha < 0.495
  Estimated twin fraction: 0.442
  Correlation: 0.9953

R vs R statistic:
  R_abs_twin = <|I1-I2|>/<|I1+I2|>
  Lebedev, Vagin, Murshudov. Acta Cryst. (2006). D62, 83-95

   R_abs_twin observed data   : 0.054

  R_sq_twin = <(I1-I2)^2>/<(I1+I2)^2>
   R_sq_twin observed data    : 0.004
  No calculated data available.
  R_twin for calculated data not determined.

Maximum Likelihood twin fraction determination
    Zwart, Read, Grosse-Kunstleve & Adams, to be published.

   The estimated twin fraction is equal to 0.415

---------------------------------------------
 Analysing possible twin law :  -h,-h+k,-l
---------------------------------------------

Results of the H-test on a-centric data:

 (Only 50.0% of the strongest twin pairs were used)

mean |H| : 0.093   (0.50: untwinned; 0.0: 50% twinned)
mean H^2 : 0.015   (0.33: untwinned; 0.0: 50% twinned)
Estimation of twin fraction via mean |H|: 0.407
Estimation of twin fraction via cum. dist. of H: 0.407

Britton analyses

  Extrapolation performed on  0.44 < alpha < 0.495
  Estimated twin fraction: 0.399
  Correlation: 0.9959

R vs R statistic:
  R_abs_twin = <|I1-I2|>/<|I1+I2|>
  Lebedev, Vagin, Murshudov. Acta Cryst. (2006). D62, 83-95

   R_abs_twin observed data   : 0.093

  R_sq_twin = <(I1-I2)^2>/<(I1+I2)^2>
   R_sq_twin observed data    : 0.011
  No calculated data available.
  R_twin for calculated data not determined.

Maximum Likelihood twin fraction determination
    Zwart, Read, Grosse-Kunstleve & Adams, to be published.

   The estimated twin fraction is equal to 0.361

---------------------------------------------
 Analysing possible twin law :  h,h-k,-l
---------------------------------------------

Results of the H-test on a-centric data:

 (Only 50.0% of the strongest twin pairs were used)

mean |H| : 0.085   (0.50: untwinned; 0.0: 50% twinned)
mean H^2 : 0.013   (0.33: untwinned; 0.0: 50% twinned)
Estimation of twin fraction via mean |H|: 0.415
Estimation of twin fraction via cum. dist. of H: 0.417

Britton analyses

  Extrapolation performed on  0.46 < alpha < 0.495
  Estimated twin fraction: 0.411
  Correlation: 0.9961

R vs R statistic:
  R_abs_twin = <|I1-I2|>/<|I1+I2|>
  Lebedev, Vagin, Murshudov. Acta Cryst. (2006). D62, 83-95

   R_abs_twin observed data   : 0.086

  R_sq_twin = <(I1-I2)^2>/<(I1+I2)^2>
   R_sq_twin observed data    : 0.009
  No calculated data available.
  R_twin for calculated data not determined.

Maximum Likelihood twin fraction determination
    Zwart, Read, Grosse-Kunstleve & Adams, to be published.

   The estimated twin fraction is equal to 0.350

Exploring higher metric symmetry

The point group of data as dictated by the space group is P 1
  the point group in the niggli setting is P 1
The point group of the lattice is Hall:  C 2 2 (x+y,2*y,z)
A summary of R values for various possible point groups follow.

-----------------------------------------------------------------------------------------------
| Point group              | mean R_used | max R_used | mean R_unused | min R_unused | choice |
-----------------------------------------------------------------------------------------------
| Hall: C 2y (x-y,2*x,z)   | 0.086       | 0.086      | 0.054         | 0.054        |        |
| P 1                      | None        | None       | 0.070         | 0.054        |        |
| Hall: C 2y (x+y,2*y,z)   | 0.093       | 0.093      | 0.054         | 0.054        |        |
| Hall: C 2 2 (x+y,2*y,z)  | 0.070       | 0.086      | None          | None         |        |
| P 1 1 2                  | 0.054       | 0.054      | 0.093         | 0.093        | <---   |
-----------------------------------------------------------------------------------------------

R_used: mean and maximum R value for symmetry operators *used* in this point group
R_unused: mean and minimum R value for symmetry operators *not used* in this point group

The likely point group of the data is:  P 1 1 2

Possible space groups in this point groups are:
   Unit cell: (52.1638, 108.226, 64.5258, 90, 113.762, 90)
   Space group: P 1 2 1 (No. 3)

   Unit cell: (52.1638, 108.226, 64.5258, 90, 113.762, 90)
   Space group: P 1 21 1 (No. 4)

Note that this analyses does not take into account the effects of twinning.
If the data is (allmost) perfectly twinned, the symmetry will appear to be
higher than it actually is.

-------------------------------------------------------------------------------
Twinning and intensity statistics summary (acentric data):

Statistics independent of twin laws
  - <I^2>/<I>^2 : 1.932
  - <F>^2/<F^2> : 0.820
  - <|E^2-1|>   : 0.678
  - <|L|>, <L^2>: 0.441, 0.267
       Multivariate Z score L-test: 3.807
       The multivariate Z score is a quality measure of the given
       spread in intensities. Good to reasonable data is expected
       to have a Z score lower than 3.5.
       Large values can indicate twinning, but small values do not
       neccesarily exclude it.


Statistics depending on twin laws
-------------------------------------------------------------------
| Operator   | type | R obs. | Britton alpha | H alpha | ML alpha |
-------------------------------------------------------------------
| -h,-k,l    |  PM  | 0.054  | 0.442         | 0.449   | 0.415    |
| -h,-h+k,-l |  PM  | 0.093  | 0.399         | 0.407   | 0.361    |
| h,h-k,-l   |  PM  | 0.086  | 0.411         | 0.417   | 0.350    |
-------------------------------------------------------------------

Patterson analyses
  - Largest peak height   : 5.893
   (correpsonding p value : 7.365e-01)


The largest off-origin peak in the Patterson function is 5.89% of the
height of the origin peak. No significant pseudotranslation is detected.

The results of the L-test indicate that the intensity statistics
are significantly different then is expected from good to reasonable,
untwinned data.
As there are twin laws possible given the crystal symmetry, twinning could
be the reason for the departure of the intensity statistics from normality.
It might be worthwhile carrying refinement with a twin specific target function.

Note that the symmetry of the intensities suggest that the assumed space group
is too low. As twinning is however suspected, it is not immediuatly clear if this
is the case. Carefull reprocessing and (twin)refinement for all cases might resolve
this question.

-------------------------------------------------------------------------------