[phenixbb] decrease number of refs in test set

Ralf W. Grosse-Kunstleve rwgk at cci.lbl.gov
Wed May 21 22:21:43 PDT 2008


> I have a data set with about 10% of the reflections in the test set.   
> I want to retain this test set for further refinement in phenix, but I  
> want to decrease the number of reflections to say 5%. Can this be done  
> in phenix?

Please try the example script below. You'll need the current CCI Apps
for this to work (Version 2008_05_03_2330).

  phenix.python mtz_convert_free_to_work.py your.mtz

Adjust the "label" argument of run() if your R-free column is not
named "R-free-flags". I hope you don't have anomalous R-free flags
since that's a little more difficult to handle.
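
The script's default convert_fraction=0.5 corresponds to going from 10%
to 5%. For a different target, the fraction of the existing free set to
convert follows from simple arithmetic. A small sketch (the helper name
fraction_to_convert is made up for illustration and is not part of the
script or of cctbx):

```python
def fraction_to_convert(current_free, target_free):
    """Fraction of the existing free set to move into the work set.

    Both arguments are fractions of all reflections, e.g. 0.10 and 0.05.
    Hypothetical helper, shown only to illustrate the arithmetic.
    """
    if not 0 < target_free <= current_free:
        raise ValueError("target must be positive and no larger than current")
    # Keeping target/current of the free set means converting the rest.
    return 1.0 - target_free / current_free
```

The result would be passed as convert_fraction to run(); for example,
going from 10% to 2.5% would mean converting three quarters of the
current free set.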

Please note that your R-free will be severely compromised if you later
go back to the 10% R-free set, since half of it will by then have been
used in refinement. In fact, it would be best to completely discard the
10% set. Once a reflection has been used in refinement, it should never
be used as a free reflection again, at least not with the same model.

Ralf


--------------------------------------------------------------------------------
import iotbx.mtz
from cctbx.array_family import flex
import sys, os

def run(args, label="R-free-flags", convert_fraction=0.5, random_seed=0):
  assert len(args) == 1
  input_file_name = args[0]
  output_file_name = "less_free_"+os.path.basename(input_file_name)
  print "Reading file:", input_file_name
  mtz_obj = iotbx.mtz.object(file_name=input_file_name)
  column = mtz_obj.get_column(label=label)
  selection_valid = column.selection_valid()
  flags = column.extract_values()
  def get_and_report(what):
    # Flags are expected to be exactly 0 (work) or 1 (free). Any other
    # valid value makes the two counts fall short of the total and
    # triggers the error below.
    free_indices = ((flags == 1) & selection_valid).iselection()
    work_indices = ((flags == 0) & selection_valid).iselection()
    if (  free_indices.size()
        + work_indices.size() != selection_valid.count(True)):
      raise RuntimeError("""\
Unexpected array of R-free flags:
  Expected: 0 for work reflections, 1 for test reflections.""")
    print what, "number of free reflections:", free_indices.size()
    print what, "number of work reflections:", work_indices.size()
    return free_indices
  free_indices = get_and_report("Input")
  # Shuffle the free indices reproducibly (fixed seed), then reset the
  # first n_convert of them to 0, i.e. move them into the work set.
  mt = flex.mersenne_twister(seed=random_seed)
  permuted_indices = free_indices.select(
    mt.random_permutation(size=free_indices.size()))
  n_convert = int(permuted_indices.size() * convert_fraction + 0.5)
  print "Number of reflections converted from free to work:", n_convert
  flags.set_selected(permuted_indices[:n_convert], 0)
  get_and_report("Output")
  column.set_values(values=flags, selection_valid=selection_valid)
  print "Writing file:", output_file_name
  mtz_obj.write(file_name=output_file_name)

if (__name__ == "__main__"):
  run(sys.argv[1:])
--------------------------------------------------------------------------------
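
For readers without a CCI Apps installation, the selection step of the
script above can be tried out in plain Python. This is only a sketch of
the same idea, assuming flags are 0 for work and 1 for free as the
script's error message states. It uses the standard random module
instead of flex.mersenne_twister, so the particular reflections chosen
will not match those picked by the script, and the function name
convert_free_to_work is made up for illustration.

```python
import random

def convert_free_to_work(flags, convert_fraction=0.5, seed=0):
    """Return a copy of flags with a random part of the free set reset to 0.

    flags: sequence of 0 (work) and 1 (free) values.
    Illustrative stand-in for the cctbx-based script, not a replacement.
    """
    free_indices = [i for i, f in enumerate(flags) if f == 1]
    rng = random.Random(seed)   # fixed seed gives a reproducible choice
    rng.shuffle(free_indices)   # random order of the free reflections
    # Same rounding as the script: nearest integer via +0.5 and int().
    n_convert = int(len(free_indices) * convert_fraction + 0.5)
    new_flags = list(flags)
    for i in free_indices[:n_convert]:
        new_flags[i] = 0        # move this reflection into the work set
    return new_flags

if __name__ == "__main__":
    old = [1] * 10 + [0] * 90   # toy data: 10% free
    new = convert_free_to_work(old)
    print("free before:", sum(old), "free after:", sum(new))
```

The reduced free set is always a subset of the original one, so no
reflection that was previously in the work set ends up flagged as free.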


