Re: [phenixbb] xtriage to double-check symmetry
So I thought I'd listen to xtriage: seemed easy since I have both data and model in P6522, and I ran:

phenix.xtriage final.mtz reference.structure.file=final.pdb
Uh-oh... what I meant is, I have both data and model in ==>P65<==. (Bad typo. It explains Peter's patient but puzzled response, which did however confirm I'd understood correctly after all :)

So, again: I have *some* lower symmetry (not P1, though); should I expect xtriage to print out tests for P6522 somewhere? Because I can't find them.

<later> That said: after updating phenix to 1.4-162, I now do find that table in the output (attached, last table), and it seems we're in between: we have probably indeed missed the two-fold, so now we must figure out why it "didn't refine" in P6522.

Thanks for your help and explanation!
phx.
But isn't P6522 already the highest symmetry? I think if you run xtriage on the data reduced as P65, it will, as you say, check the extra operations implied by P6522 and see if they are justified. If the data are already reduced in P6522, those reflections have already been merged and there is no way to tell how good the agreement was.

I don't think xtriage will look at your model, but then I've never given it a model. I would guess from your lower Rfree in P65 that it is the lower symmetry: the 2-fold operators that make it almost fit P6522 are pseudosymmetry, i.e. noncrystallographic symmetry that almost fits the P6522 operators but not quite.
Best, Ed
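To make the point about merging concrete: as long as the data are kept in P65, the reflection pairs related by the candidate two-fold (h,-h-k,-l, the operator listed as the twin law in the attached log) are still separate observations, so their agreement can be measured. The sketch below is only an illustration of that idea, not what xtriage actually computes. The function names, the simple R-value definition and the toy intensities are all made up for the example, and it assumes both mates are present explicitly in the dictionary (real data would first need mapping into the asymmetric unit).

```python
def twofold(hkl):
    """Apply the candidate two-fold h,-h-k,-l to a Miller index."""
    h, k, l = hkl
    return (h, -h - k, -l)


def r_between_mates(intensities):
    """Simple R-value between reflections related by the two-fold.

    R = sum|I1 - I2| / sum(I1 + I2) over pairs (hkl, twofold(hkl)).
    Hypothetical helper: assumes both mates are keys of `intensities`.
    """
    num = 0.0
    den = 0.0
    seen = set()
    for hkl, i1 in intensities.items():
        mate = twofold(hkl)
        # Skip self-related reflections, missing mates and pairs already counted.
        if mate == hkl or mate not in intensities or mate in seen:
            continue
        seen.add(hkl)
        i2 = intensities[mate]
        num += abs(i1 - i2)
        den += i1 + i2
    return num / den if den else float("nan")


# Toy data (invented): the first set obeys the two-fold exactly, the second does not.
exact = {(1, 0, 2): 100.0, (1, -1, -2): 100.0, (2, 1, 3): 50.0, (2, -3, -3): 50.0}
broken = {(1, 0, 2): 100.0, (1, -1, -2): 80.0, (2, 1, 3): 50.0, (2, -3, -3): 50.0}

r_exact = r_between_mates(exact)
r_broken = r_between_mates(broken)
print(r_exact, r_broken)
```

Once the data have been merged in P6522 each pair collapses into a single value, and a comparison like this is no longer possible.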
#############################################################
## phenix.xtriage ##
## ##
## P.H. Zwart, R.W. Grosse-Kunstleve & P.D. Adams ##
## ##
#############################################################
#phil __OFF__
Date 2009-09-18 Time 08:19:32 BST +0100 (1253258372.97 s)
##-------------------------------------------##
## WARNING: ##
## Number of residues unspecified ##
##-------------------------------------------##
Effective parameters:
#phil __ON__
scaling {
input {
asu_contents {
n_residues = None
n_bases = None
n_copies_per_asu = None
}
xray_data {
file_name = "final.mtz"
obs_labels = None
calc_labels = None
unit_cell = 81.38990021 81.38990021 79.47530365 90 90 120
space_group = "P 65"
high_resolution = None
low_resolution = None
reference {
data {
file_name = None
labels = None
unit_cell = None
space_group = None
}
structure {
file_name = "final.pdb"
}
}
}
parameters {
reporting {
verbose = 1
log = "logfile.log"
ccp4_style_graphs = True
}
misc_twin_parameters {
missing_symmetry {
sigma_inflation = 1.25
}
twinning_with_ncs {
perform_analyses = False
n_bins = 7
}
twin_test_cuts {
low_resolution = 10
high_resolution = None
isigi_cut = 3
completeness_cut = 0.85
}
}
}
optional {
hklout = None
hklout_type = mtz sca *mtz_or_sca
label_extension = "massaged"
aniso {
action = *remove_aniso None
final_b = *eigen_min eigen_mean user_b_iso
b_iso = None
}
outlier {
action = *extreme basic beamstop None
parameters {
basic_wilson {
level = 1e-06
}
extreme_wilson {
level = 0.01
}
beamstop {
level = 0.001
d_min = 10
}
}
}
symmetry {
action = detwin twin *None
twinning_parameters {
twin_law = None
fraction = None
}
}
}
}
gui {
result_file = None
}
}
#phil __END__
Symmetry, cell and reflection file content summary
Miller array info: final.mtz:F,SIGF
Observation type: xray.amplitude
Type of data: double, size=15235
Type of sigmas: double, size=15235
Number of Miller indices: 15235
Anomalous flag: False
Unit cell: (81.3899, 81.3899, 79.4753, 90, 90, 120)
Space group: P 65 (No. 170)
Systematic absences: 0
Centric reflections: 606
Resolution range: 35.2429 2.20034
Completeness in resolution range: 0.999344
Completeness with d_max=infinity: 0.999082
##----------------------------------------------------##
## Basic statistics ##
##----------------------------------------------------##
Matthews coefficient and Solvent content statistics
Number of residues unknown, assuming 50% solvent content
----------------------------------------------------------------
| Best guess : 278 residues in the asu |
----------------------------------------------------------------
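A rough back-of-the-envelope check of the "best guess" above, for readers who want to see where a number of that size comes from. This is a sketch under stated assumptions, not xtriage's own procedure: the 1.23 A^3/Da protein volume and the 110 Da average residue mass are generic textbook values chosen for the example, and the function names are invented. The cell and the 50% solvent assumption are taken from the log; P65 has six symmetry operators.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in A^3 from cell lengths (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def guess_residues(volume, n_symops, solvent=0.5,
                   a3_per_dalton=1.23, da_per_residue=110.0):
    """Estimate residues per asymmetric unit for a given solvent fraction.

    a3_per_dalton and da_per_residue are assumed generic values.
    """
    asu_volume = volume / n_symops
    protein_mass = asu_volume * (1.0 - solvent) / a3_per_dalton
    return protein_mass / da_per_residue


volume = cell_volume(81.3899, 81.3899, 79.4753, 90, 90, 120)
n_res = guess_residues(volume, n_symops=6)
print(round(volume), round(n_res))
```

With these constants the estimate lands within a few residues of the value xtriage reports; the small difference presumably reflects different constants.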
Completeness and data strength analyses
The following table lists the completeness in various resolution
ranges, after applying a I/sigI cut. Miller indices for which
individual I/sigI values are larger than the value specified in
the top row of the table, are retained, while other intensities
are discarded. The resulting completeness profiles are an indication
of the strength of the data.
----------------------------------------------------------------------------------------
| Res. Range | I/sigI>1 | I/sigI>2 | I/sigI>3 | I/sigI>5 | I/sigI>10 | I/sigI>15 |
----------------------------------------------------------------------------------------
| 35.25 - 5.42 | 99.9% | 98.8% | 97.7% | 96.5% | 94.7% | 92.0% |
| 5.42 - 4.30 | 100.0% | 99.1% | 97.9% | 97.1% | 94.6% | 91.8% |
| 4.30 - 3.76 | 100.0% | 99.3% | 98.0% | 96.6% | 93.4% | 89.4% |
| 3.76 - 3.42 | 100.0% | 98.9% | 96.2% | 93.5% | 87.1% | 80.7% |
| 3.42 - 3.17 | 100.0% | 95.2% | 91.3% | 87.4% | 76.8% | 62.0% |
| 3.17 - 2.99 | 100.0% | 93.7% | 86.3% | 76.9% | 60.7% | 45.2% |
| 2.99 - 2.84 | 100.0% | 88.3% | 73.8% | 60.3% | 37.7% | 19.2% |
| 2.84 - 2.71 | 100.0% | 83.5% | 66.6% | 50.2% | 24.2% | 11.4% |
| 2.71 - 2.61 | 100.0% | 75.9% | 55.7% | 41.2% | 18.0% | 6.9% |
| 2.61 - 2.52 | 99.9% | 69.8% | 44.4% | 28.7% | 10.9% | 5.2% |
| 2.52 - 2.44 | 99.5% | 61.0% | 34.9% | 19.6% | 8.2% | 3.9% |
| 2.44 - 2.37 | 98.6% | 57.6% | 28.8% | 15.8% | 5.8% | 2.6% |
| 2.37 - 2.31 | 98.8% | 44.3% | 20.0% | 11.2% | 4.7% | 2.0% |
| 2.31 - 2.25 | 99.2% | 31.5% | 14.4% | 7.6% | 3.4% | 0.8% |
----------------------------------------------------------------------------------------
The completeness of data for which I/sig(I)>3.00, exceeds 85%
for resolution ranges lower than 2.99A.
The data are cut at this resolution for the potential twin tests
and intensity statistics.
Maximum likelihood isotropic Wilson scaling
ML estimate of overall B value of None:
49.24 A**(-2)
Estimated -log of scale factor of None:
0.40
Maximum likelihood anisotropic Wilson scaling
ML estimate of overall B_cart value of None:
51.68, -0.00, 0.00
51.68, -0.00
44.95
Equivalent representation as U_cif:
0.65, 0.33, -0.00
0.65, 0.00
0.57
Eigen analyses of B-cart:
Value Vector
Eigenvector 1 : 51.675 ( 0.91, -0.42, 0.00)
Eigenvector 2 : 51.675 ( 0.42, 0.91, -0.00)
Eigenvector 3 : 44.948 (-0.00, 0.00, 1.00)
ML estimate of -log of scale factor of None:
0.40
Correcting for anisotropy in the data
Some basic intensity statistics follow.
Low resolution completeness analyses
The following table shows the completeness
of the data to 5 Angstrom.
unused: - 35.2437 [ 0/4 ] 0.000
bin 1: 35.2437 - 10.6916 [138/139] 0.993
bin 2: 10.6916 - 8.5258 [136/136] 1.000
bin 3: 8.5258 - 7.4597 [127/127] 1.000
bin 4: 7.4597 - 6.7830 [133/133] 1.000
bin 5: 6.7830 - 6.2998 [133/133] 1.000
bin 6: 6.2998 - 5.9302 [133/133] 1.000
bin 7: 5.9302 - 5.6345 [131/131] 1.000
bin 8: 5.6345 - 5.3901 [128/128] 1.000
bin 9: 5.3901 - 5.1832 [133/133] 1.000
bin 10: 5.1832 - 5.0049 [137/137] 1.000
unused: 5.0049 - [ 0/0 ]
Mean intensity analyses
Analyses of the mean intensity.
Inspired by: Morris et al. (2004). J. Synch. Rad.11, 56-59.
The following resolution shells are worrisome:
------------------------------------------------
| d_spacing | z_score | compl. | <Iobs>/<Iexp> |
------------------------------------------------
| 6.201 | 5.09 | 1.00 | 0.681 |
------------------------------------------------
Possible reasons for the presence of the reported
unexpected low or elevated mean intensity in
a given resolution bin are :
- missing overloaded or weak reflections
- suboptimal data processing
- satellite (ice) crystals
- NCS
- translational pseudo symmetry (detected elsewhere)
- outliers (detected elsewhere)
- ice rings (detected elsewhere)
- other problems
Note that the presence of abnormalities
in a certain region of reciprocal space might
confuse the data validation algorithm throughout
a large region of reciprocal space, even though
the data are acceptable in those areas.
Possible outliers
Inspired by: Read, Acta Cryst. (1999). D55, 1759-1764
Acentric reflections:
-----------------------------------------------------------------
| d_space | H K L | |E| | p(wilson) | p(extreme) |
-----------------------------------------------------------------
| 3.054 | 0, 1, 26 | 5.09 | 5.35e-12 | 7.77e-08 |
-----------------------------------------------------------------
p(wilson) : 1-(1-exp[-|E|^2])
p(extreme) : 1-(1-exp[-|E|^2])^(n_acentrics)
p(wilson) is the probability that an E-value of the specified
value would be observed if it were selected at random
the given data set.
p(extreme) is the probability that the largest |E| value is
larger or equal than the observed largest |E| value.
Both measures can be used for outlier detection. p(extreme)
takes into account the size of the dataset.
Centric reflections:
-----------------------------------------------------------------
| d_space | H K L | |E| | p(wilson) | p(extreme) |
-----------------------------------------------------------------
| 6.408 | 0, 11, 0 | 3.81 | 1.38e-04 | 7.72e-02 |
-----------------------------------------------------------------
p(wilson) : 1-(erf[|E|/sqrt(2)])
p(extreme) : 1-(erf[|E|/sqrt(2)])^(n_acentrics)
p(wilson) is the probability that an E-value of the specified
value would be observed when it would selected at random from
the given data set.
p(extreme) is the probability that the largest |E| value is
larger or equal than the observed largest |E| value.
Both measures can be used for outlier detection. p(extreme)
takes into account the size of the dataset.
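The p-value formulas quoted in the two outlier tables are easy to evaluate by hand. The sketch below does so with the Python standard library; the function names are made up, and the log1p/expm1 form is just a numerically safer way of writing 1-(1-p)^n for tiny p. The |E| values are the rounded ones printed in the tables, so the results agree with the log only approximately, and the reflection count passed to p_extreme (total minus centric reflections from the file summary) is an assumption about which n xtriage used.

```python
import math


def p_wilson_acentric(e):
    """1 - (1 - exp(-|E|^2)), which simplifies to exp(-|E|^2)."""
    return math.exp(-e * e)


def p_wilson_centric(e):
    """1 - erf(|E|/sqrt(2)), written with erfc to keep precision."""
    return math.erfc(e / math.sqrt(2.0))


def p_extreme(p_single, n):
    """1 - (1 - p)^n: chance that the largest of n values is at least this big."""
    return -math.expm1(n * math.log1p(-p_single))


p_acentric = p_wilson_acentric(5.09)   # reflection (0, 1, 26)
p_centric = p_wilson_centric(3.81)     # reflection (0, 11, 0)
# Assumed count of acentric reflections: 15235 total minus 606 centric.
p_ext = p_extreme(p_acentric, 15235 - 606)
print(p_acentric, p_centric, p_ext)
```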
Ice ring related problems
The following statistics were obtained from ice-ring
insensitive resolution ranges
mean bin z_score : 1.55
( rms deviation : 1.28 )
mean bin completeness : 1.00
( rms deviation : 0.00 )
The following table shows the z-scores
and completeness in ice-ring sensitive areas.
Large z-scores and high completeness in these
resolution ranges might be a reason to re-assess
your data processsing if ice rings were present.
------------------------------------------------
| d_spacing | z_score | compl. | Rel. Ice int. |
------------------------------------------------
| 3.897 | 0.17 | 1.00 | 1.000 |
| 3.669 | 2.93 | 1.00 | 0.750 |
| 3.441 | 2.03 | 1.00 | 0.530 |
| 2.671 | 3.02 | 1.00 | 0.170 |
| 2.249 | 2.71 | 1.00 | 0.390 |
------------------------------------------------
Abnormalities in mean intensity or completeness at
resolution ranges with a relative ice ring intensity
lower than 0.10 will be ignored.
No ice ring related problems detected.
If ice rings were present, the data does not look
worse at ice ring related d_spacings as compared
to the rest of the data set.
Basic analyses completed
##----------------------------------------------------##
## Twinning Analyses ##
##----------------------------------------------------##
Using data between 10.00 to 2.99 Angstrom.
Determining possible twin laws.
The following twin laws have been found:
--------------------------------------------------------------------------------
| Type | Axis | R metric (%) | delta (le Page) | delta (Lebedev) | Twin law |
--------------------------------------------------------------------------------
| M | 2-fold | 0.000 | 0.000 | 0.000 | h,-h-k,-l |
--------------------------------------------------------------------------------
M: Merohedral twin law
PM: Pseudomerohedral twin law
1 merohedral twin operators found
0 pseudo-merohedral twin operators found
In total, 1 twin operator were found
Details of automated twin law derivation
----------------------------------------
Below, the results of the coset decomposition are given.
Each coset represents a single twin law, and all symmetry equivalent twin laws are given.
For each coset, the operator in (x,y,z) and (h,k,l) notation are given.
The direction of the axis (in fractional coordinates), the type and possible offsets are given as well.
Furthermore, the result of combining a certain coset with the input space group is listed.
This table can be usefull when comparing twin laws generated by xtriage with those listed in lookup tables
In the table subgroup H denotes the *presumed intensity symmetry*. Group G is the symmetry of the lattice.
Left cosets of :
subgroup H: P 6
and group G: P 6 2 2
Coset number : 0 (all operators from H)
x,y,z h,k,l Rotation: 1 ; direction: (0, 0, 0) ; screw/glide: (0,0,0)
-x,-y,z -h,-k,l Rotation: 2 ; direction: (0, 0, 1) ; screw/glide: (0,0,0)
-y,x-y,z k,-h-k,l Rotation: 3 ; direction: (0, 0, 1) ; screw/glide: (0,0,0)
-x+y,-x,z -h-k,h,l Rotation: 3 ; direction: (0, 0, 1) ; screw/glide: (0,0,0)
x-y,x,z h+k,-h,l Rotation: 6 ; direction: (0, 0, 1) ; screw/glide: (0,0,0)
y,-x+y,z -k,h+k,l Rotation: 6 ; direction: (0, 0, 1) ; screw/glide: (0,0,0)
Coset number : 1 (H+coset[1] = P 6 2 2)
x-y,-y,-z h,-h-k,-l Rotation: 2 ; direction: (1, 0, 0) ; screw/glide: (0,0,0)
-y,-x,-z -k,-h,-l Rotation: 2 ; direction: (-1, 1, 0) ; screw/glide: (0,0,0)
x,x-y,-z h+k,-k,-l Rotation: 2 ; direction: (2, 1, 0) ; screw/glide: (0,0,0)
-x,-x+y,-z -h-k,k,-l Rotation: 2 ; direction: (0, 1, 0) ; screw/glide: (0,0,0)
y,x,-z k,h,-l Rotation: 2 ; direction: (1, 1, 0) ; screw/glide: (0,0,0)
-x+y,y,-z -h,h+k,-l Rotation: 2 ; direction: (1, 2, 0) ; screw/glide: (0,0,0)
Note that if group H is centered (C,P,I,F), elements corresponding to centering operators are omitted.
(This is because internally the calculations are done with the symmetry of the reduced cell)
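The coset table can be checked with a few lines of integer matrix arithmetic. The sketch below (plain Python, no cctbx; the matrix layout and names are choices made for this example) writes each operator in its (h,k,l) form as a 3x3 matrix, combines the twin law h,-h-k,-l with the six operators of subgroup H, and confirms that the products are exactly the six operators listed under coset 1.

```python
def matmul(a, b):
    """Product of two 3x3 integer matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

# Coset 0: operators of H in (h,k,l) notation; rows give the new h, k, l.
H_OPS = [
    IDENTITY,                              # h,k,l
    ((-1, 0, 0), (0, -1, 0), (0, 0, 1)),   # -h,-k,l
    ((0, 1, 0), (-1, -1, 0), (0, 0, 1)),   # k,-h-k,l
    ((-1, -1, 0), (1, 0, 0), (0, 0, 1)),   # -h-k,h,l
    ((1, 1, 0), (-1, 0, 0), (0, 0, 1)),    # h+k,-h,l
    ((0, -1, 0), (1, 1, 0), (0, 0, 1)),    # -k,h+k,l
]

TWIN_LAW = ((1, 0, 0), (-1, -1, 0), (0, 0, -1))  # h,-h-k,-l

# Coset 1 exactly as listed in the log.
COSET_1 = {
    ((1, 0, 0), (-1, -1, 0), (0, 0, -1)),  # h,-h-k,-l
    ((0, -1, 0), (-1, 0, 0), (0, 0, -1)),  # -k,-h,-l
    ((1, 1, 0), (0, -1, 0), (0, 0, -1)),   # h+k,-k,-l
    ((-1, -1, 0), (0, 1, 0), (0, 0, -1)),  # -h-k,k,-l
    ((0, 1, 0), (1, 0, 0), (0, 0, -1)),    # k,h,-l
    ((-1, 0, 0), (1, 1, 0), (0, 0, -1)),   # -h,h+k,-l
}

generated = {matmul(TWIN_LAW, g) for g in H_OPS}
twin_squared = matmul(TWIN_LAW, TWIN_LAW)
print(generated == COSET_1, twin_squared == IDENTITY)
```

That the twin law squares to the identity is what makes it a two-fold, and the single extra coset is why only one twin law is reported for this lattice.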
Splitting data in centrics and acentrics
Number of centrics : 306
Number of acentrics : 5664
Patterson analyses
------------------
Largest Patterson peak with length larger than 15 Angstrom
Frac. coord. : 0.000 0.000 0.239
Distance to origin : 18.971
Height (origin=100) : 7.557
p_value(height) : 4.031e-01
The reported p_value has the following meaning:
The probability that a peak of the specified height
or larger is found in a Patterson function of a
macro molecule that does not have any translational
pseudo symmetry is equal to 4.031e-01.
p_values smaller than 0.05 might indicate
weak translational pseudo symmetry, or the self vector of
a large anomalous scatterer such as Hg, whereas values
smaller than 1e-3 are a very strong indication for
the presence of translational pseudo symmetry.
Systematic absences
-------------------
The following table gives information about systematic absences.
For each operator, the reflections are split in three classes:
Absent : Reflections that are absent for this operator.
Non Absent: Reflection of the same type (i.e. (0,0,l)) as above, but they should be present.
Complement: All other reflections.
For each class, the <I/sigI> is reported, as well as the number of
'violations'. A 'violation' is designated as a reflection for which a
I/sigI criterion is not met. The criteria are
Absent violation : I/sigI > 3.0
Non Absent violation : I/sigI < 3.0
Complement violation : I/sigI < 3.0
Operators with low associated violations for *both* absent and non absent
reflections, are likely to be true screw axis or glide planes. Both the
number of violations and their percentages are given. The number of
violations within the 'complement' class, can be used as a comparison for
the number of violations in the non-absent class.
-------------------------------------------------------------------------------------------------------------------------------------------
| Operator | absent under operator | | not absent under operator | | all other reflections | | |
| | <I/sigI> (violations) | n absent | <I/sigI> (violations) | n not absent | <I/sigI> (violations) | n compl | Score |
-------------------------------------------------------------------------------------------------------------------------------------------
| 6_0 (c) | 0.00 (0, 0.0%) | 0 | 22.42 (0, 0.0%) | 3 | 28.44 (303, 5.1%) | 5967 | 1.14e+00 |
| 6_1 (c) | 0.00 (0, 0.0%) | 0 | 22.42 (0, 0.0%) | 3 | 28.44 (303, 5.1%) | 5967 | 1.14e+00 |
| 6_2 (c) | 0.00 (0, 0.0%) | 0 | 22.42 (0, 0.0%) | 3 | 28.44 (303, 5.1%) | 5967 | 1.14e+00 |
| 6_3 (c) | 0.00 (0, 0.0%) | 0 | 22.42 (0, 0.0%) | 3 | 28.44 (303, 5.1%) | 5967 | 1.14e+00 |
| 6_4 (c) | 0.00 (0, 0.0%) | 0 | 22.42 (0, 0.0%) | 3 | 28.44 (303, 5.1%) | 5967 | 1.14e+00 |
| 6_5 (c) | 0.00 (0, 0.0%) | 0 | 22.42 (0, 0.0%) | 3 | 28.44 (303, 5.1%) | 5967 | 1.14e+00 |
-------------------------------------------------------------------------------------------------------------------------------------------
Analyses of the absences table indicates a number of likely space group
candidates, which are listed below. For each space group, the number of
absent violations are listed under the '+++' column. The number of present
violations (weak reflections) are listed under '---'. The last column is a
likelihood based score for the particular space group. Note that
enantiomorphic spacegroups will have equal scores. Also, if absences were
removed while processing the data, they will be regarded as missing
information, rather then as enforcing that absence in the space group choices.
-----------------------------------------------------------------------------------
| space group | n absent | <Z>_absent |
Participants (2): Frank von Delft, Peter Zwart