CarboLoad

Overview

CarboLoad will load carbohydrates into your model. It can be used in several modes. The modes are designed as a progression to help guide the user at the command-line. Notwithstanding, some modes are useful in their own right.

Mode 0

The help mode gives the command:

phenix.carbo_load

or:

phenix.carbo_load --help

give:

CarboLoad - A program for loading carbohydrates into a protein model

Usage:
  phenix.carbo_load protein_pdb_file_name=pdb3a37.ent \
                    carbohydrate_file_name=man.txt \
                    map_coeffs_file_name=pdb3a36_refine_001_map_coeffs.mtz \
                    residue_selection="resname LG1"

For more details on the parameters see below.

Mode 1

This mode is designed to find all possible N-linking sites.:

phenix.carbo_load model.pdb

or:

phenix.carbo_load protein_pdb_file_name=model.pdb

will output a list of all ASN residues that are in the motif for N-linking to carbohydrates. It will also print out a possible selection such as:

residue_selection="chainid A resname ASN resid 37"

Mode 2

Adding a carbohydrate polymer to the model requires a residue selection for the addition and the carbohydrate information. The addtion of e.g.:

residue_selection="chainid A resname ASN resid 37" carbohydrate_file_name=NAG-NAG

will add a chain of two NAG units to the selected ASN. THe carbohydrate parameter can be a SCaLES string (see below) or a file containing a SCaLES string or GlycoCT format (see below).

Mode 3

If you include a data file or, more efficiently and flexibly, a map file CarboLoad with perform a Real Space Refinement when each saccharide unit is placed in the model.

Input files

CarboLoad requires a number of inputs and also has some options, all of which are provided using the phil formalism. The list of required inputs:

protein_pdb_file_name: this is a file containing the protein model of current interest into which the carbohydrates will be loaded. It is best to have a CRYST1 record in place.

map_coeffs_file_name: this is a maps coefficients file output from a phenix.refine run.

data_file_name: a data file. A maps coefficients file is saved for possible use in subsequent runs as it saves time.

The input for the carbohydrates can be either a string on the command-line or a file name containing the allowable formats.

carbohydrate_file_name: this is a file containing chemical information about the carbohydrates.

GlycoCT (Herget et al.) : This can be generated using the Carbohydrate Builder in REEL.

Simplified Carbohydrate Line-Entry System (SCaLES) : The majority of carbohydrates linked to proteins a N-linked shorted polymers of high mannose. The add the first three units the parameters can be set to "NAG-NAG-MAN". More details are available here.

Optional inputs include:

output.file_name: Name of the output files.

Optional control parameters include:

allow_multiple_chain_selection: makes the residue selection work on every chain in the model. This is useful in situation where the model contains multiple identical (almost) chains.

nprocs: runs each carbohydrate polymer addition in a seperate process. Useful for adding more than one polysaccharide.

nqueues: runs jobs on a local queueing system.

Output files

By default, CarboLoad will generate a PDB file with the loaded carbohydrates and the linking files. These are no longer required by phenix.refine but are instructive.

References

Herget, S., R. Ranzinger, K. Maass, and C.-W. V. D. Lieth. 2008. “GlycoCT-a Unifying Sequence Format for Carbohydrates.” Carbohydrate Research 343 (12): 2162–71. doi:10.1016/j.carres.2008.03.011.