[phenixbb] mmCIF for deposition

Billy Poon BKPoon at lbl.gov
Tue Mar 19 10:23:44 PDT 2019


Hi everyone,

The latest build, 1.15-3448, has an updated version of
mmtbx.prepare_pdb_deposition that will add the entity, entity_poly,
entity_poly_seq, struct_ref, and struct_ref_seq loops to an existing mmCIF
model when a sequence file is provided. The struct_ref and struct_ref_seq
loops are still under development since they will require user input and
we're checking with the PDB to see if they actually need to be defined for
the file used for deposition.

The sequence should be the canonical sequence with the one-letter code for
each residue. If you have a non-standard residue, the program will align
the provided sequences with the chains in the model and replace the
one-letter code with the three-letter residue name in parentheses.
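
To make the parenthesis convention concrete, here is a rough sketch in
plain Python. It is not the code used by mmtbx.prepare_pdb_deposition: the
function name, the lookup table, and the assumption that the residue names
are already aligned position by position with the sequence are all made up
for this example.

```python
# Hypothetical illustration only; not the Phenix implementation.
# One-letter -> three-letter codes for the 20 standard amino acids.
STANDARD = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def annotate_sequence(canonical, model_resnames):
    """Return the sequence with non-standard residues written as (XXX).

    canonical      -- one-letter canonical sequence, e.g. "MSEQ"
    model_resnames -- three-letter residue names from the model, assumed to
                      be aligned one-to-one with `canonical`; use None where
                      a residue is missing from the model
    """
    if len(canonical) != len(model_resnames):
        raise ValueError("sequence and aligned residue names differ in length")
    out = []
    for letter, resname in zip(canonical, model_resnames):
        if resname is None or STANDARD.get(letter) == resname:
            # Missing or standard residue: keep the one-letter code.
            out.append(letter)
        else:
            # Anything else is treated as a modified residue in this sketch.
            out.append("(%s)" % resname)
    return "".join(out)


# A model with selenomethionine at position 1 and residue 3 not modelled.
print(annotate_sequence("MSEQ", ["MSE", "SER", None, "GLN"]))  # (MSE)SEQ
```

The real tool performs the alignment itself, so the user only supplies the
canonical sequence.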

If the input model is in mmCIF format, its existing loops will be kept
unless this program overwrites them. For example, if the input file already
has an entity_poly loop, this program will overwrite that loop.
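
The overwrite rule can be pictured with plain Python dicts standing in for
mmCIF loops. This is only a sketch: the function name and the dict
representation are assumptions for this example and are not the iotbx.cif
API.

```python
# Hypothetical sketch: loops are modelled as {loop_name: rows}.
# This is not how iotbx.cif stores them.
def merge_loops(existing_loops, generated_loops):
    """Keep every existing loop unless a generated loop has the same name."""
    merged = dict(existing_loops)    # start from what the input file had
    merged.update(generated_loops)   # generated loops replace same-named ones
    return merged


existing = {
    "_atom_site": ["...coordinates..."],
    "_entity_poly": ["old sequence rows"],
}
generated = {
    "_entity": ["new entity rows"],
    "_entity_poly": ["new sequence rows"],
}
merged = merge_loops(existing, generated)
print(sorted(merged))
print(merged["_entity_poly"])
```

Here _atom_site survives untouched, _entity is added, and the old
_entity_poly rows are replaced by the newly generated ones.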

This tool is still under development and will be incorporated into the GUI.
If you encounter any issues, please let us know and if possible, please
provide your input files.

Some known issues include:
- alignment will fail with alternate conformations where each conformation
  is a different residue
- alignment has problems with chains that contain many "UNK" residues
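
If you want to check a model for these two cases before running the tool,
something along these lines may help. It is a sketch on plain tuples rather
than on a cctbx hierarchy; the function name, the tuple layout, and the 0.5
cutoff for "many" UNK residues are assumptions made for this example.

```python
from collections import defaultdict


def find_problem_chains(residues, unk_fraction=0.5):
    """Flag the two known problem cases.

    residues     -- iterable of (chain_id, resseq, altloc, resname) tuples
    unk_fraction -- arbitrary cutoff for calling a chain UNK-heavy
    Returns (mixed_positions, unk_heavy_chains).
    """
    names_at_position = defaultdict(set)
    positions_in_chain = defaultdict(set)
    unk_in_chain = defaultdict(set)
    for chain_id, resseq, _altloc, resname in residues:
        names_at_position[(chain_id, resseq)].add(resname)
        positions_in_chain[chain_id].add(resseq)
        if resname == "UNK":
            unk_in_chain[chain_id].add(resseq)
    # More than one residue name at a position means the alternate
    # conformations are different residues.
    mixed = sorted(pos for pos, names in names_at_position.items()
                   if len(names) > 1)
    unk_heavy = sorted(
        chain for chain, positions in positions_in_chain.items()
        if len(unk_in_chain[chain]) / len(positions) >= unk_fraction
    )
    return mixed, unk_heavy


example = [
    ("A", 1, "", "MET"),
    ("A", 2, "A", "SER"),   # altloc A is SER ...
    ("A", 2, "B", "THR"),   # ... but altloc B is THR
    ("B", 1, "", "UNK"),
    ("B", 2, "", "UNK"),
    ("B", 3, "", "ALA"),
]
print(find_problem_chains(example))
```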

Thanks!

--
Billy K. Poon
Research Scientist, Molecular Biophysics and Integrated Bioimaging
Lawrence Berkeley National Laboratory
1 Cyclotron Road, M/S 33R0345
Berkeley, CA 94720
Tel: (510) 486-5709
Fax: (510) 486-5909
Web: https://phenix-online.org


On Sat, Jan 5, 2019 at 1:51 PM Billy Poon <BKPoon at lbl.gov> wrote:

> Hi Bernhard,
>
> It’s something I’ll be working on more in the first half of this year,
> and it will be related to a general reorganization of how the validation
> tools, table one, and file deposition tools work. Currently, each tool
> operates more or less independently, so the information is not handled
> consistently.
>
> The plan is to do the final statistics calculation once with the
> validation tool and then export the statistics as a table one and
> eventually as a CIF file for deposition (the sequence would be added if a
> sequence file is available). The cryo-EM comprehensive validation tool
> currently does this by exporting statistics in a table. The X-ray/neutron
> comprehensive validation tool will be changed to follow the same approach.
>
> The additional wrinkle with non-standard residues is that there are many
> of those residues. Nigel has a way of combing through our monomer library
> to build the relationships (e.g. MSE is based on M). I’m working on a tool
> to use those relationships so that the user just provides the canonical
> sequence (M) and we will fill in the appropriate non-standard residue (MSE)
> in the CIF file. This way, users do not have to manually build the
> PDB-specific format of putting non-standard residues in parentheses.
>
> The next Phenix release is planned for the end of February/early March,
> and that will have at least a beta version of this tool.
>
> On Fri, Jan 4, 2019 at 7:44 PM Bernhard Lechtenberg <
> blechtenberg at sbpdiscovery.org> wrote:
>
>> Hi Billy,
>>
>> Do you have any updates on this? I just tried to use
>> mmtbx.prepare_pdb_deposition with the .cif file from phenix.refine and
>> fasta sequences as input, followed by pdb_extract to deposit several
>> structures to the PDB. The .cif files were accepted by PDB, but the
>> refinement statistics were lost and something with the structure factors
>> also seemed wrong, as the validation reports did not contain metrics for
>> R-free and RSRZ outliers. I don’t quite understand how the second problem
>> happens, since mmtbx.prepare_pdb_deposition does not see the structure
>> factors. However, when I skipped the mmtbx.prepare_pdb_deposition and
>> directly used the output cif from phenix.refine in pdb_extract and then
>> uploaded this file to the PDB, both those issues were fixed.
>>
>> I first used an older phenix version (1.14rc2-3139) on a Mac, then
>> upgraded to the latest nightly-built version (dev-3374), but the issue
>> persisted.
>>
>> Additionally, for one of my five structures, I had the same issue as Patrick
>> described in October (see below), also with both versions of phenix.
>>
>> Bernhard
>>
>>
>> Bernhard C. Lechtenberg, PhD
>> Postdoctoral Associate
>> Riedl Lab
>> Cancer Metabolism and Signaling Networks Program
>> NCI-Designated Cancer Center
>>
>> 10901 N. Torrey Pines Road, La Jolla, CA 92037
>>
>> T 858.646.3100 ext. 4216  E blechtenberg at SBPdiscovery.org
>>
>> Science Benefiting Patients®
>>
>> On Oct 2, 2018, at 4:01 PM, Billy Poon <BKPoon at lbl.gov> wrote:
>>
>> Hi Pat,
>>
>> I'm in the process of reworking that tool since it is dropping some
>> information from phenix.refine in the process of adding the sequence.
>> Something should be available by the end of the week in a new build.
>>
>> --
>> Billy K. Poon
>> Research Scientist, Molecular Biophysics and Integrated Bioimaging
>> Lawrence Berkeley National Laboratory
>> 1 Cyclotron Road, M/S 33R0345
>> Berkeley, CA 94720
>> Tel: (510) 486-5709
>> Fax: (510) 486-5909
>> Web: https://phenix-online.org
>>
>>
>> On Mon, Oct 1, 2018 at 8:41 AM Patrick Loll <pjloll at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> Following the instructions given here:
>>>
>>>
>>> https://www.phenix-online.org/documentation/overviews/xray-structure-deposition.html
>>>
>>> I’m attempting to use mmtbx.prepare_pdb_deposition to insert sequence
>>> information into the mmCIF that contains the model coordinates.
>>> Unfortunately, the program fails with an error (shown below).
>>>
>>> The sequence file is in FASTA format and contains an entry for each of
>>> the four chains in the AU, i.e.
>>>
>>> >A
>>> MSEQNCE…
>>> >B
>>> MSEQNCE…
>>> etc.
>>>
>>> Any bright ideas?
>>>
>>>
>>> ============this is what happens (vide infra)====================================
>>>
>>>
>>> [PJL-iMac:blahblah/PJL_final] loll% mmtbx.prepare_pdb_deposition filename.cif seq_name.fasta
>>> Starting mmtbx.prepare_pdb_deposition
>>> on Mon Oct  1 11:16:23 2018 by loll
>>>
>>> ===============================================================================
>>> Processing files:
>>>
>>> -------------------------------------------------------------------------------
>>>
>>>   Found model, filename.cif
>>>   Found sequence, seq_name.fasta
>>>
>>> Processing PHIL parameters:
>>>
>>> -------------------------------------------------------------------------------
>>>   No PHIL parameters found
>>> Final processed PHIL parameters:
>>>
>>> -------------------------------------------------------------------------------
>>>   data_manager {
>>>     model {
>>>       file = "filename.cif"
>>>     }
>>>     default_model = "filename.cif"
>>>     sequence_files = "seq_name.fasta"
>>>     default_sequence = "seq_name.fasta"
>>>   }
>>>
>>> Starting job
>>>
>>> ===============================================================================
>>> Validating inputs
>>> Using model: filename.cif
>>> Using sequence: seq_name.fasta
>>> Creating mmCIF block for sequence
>>> Traceback (most recent call last):
>>>   File "/Applications/phenix-1.14-3260/build/../modules/cctbx_project/mmtbx/command_line/prepare_pdb_deposition.py", line 9, in <module>
>>>     run_program(program_class=prepare_pdb_deposition.Program)
>>>   File "/Applications/phenix-1.14-3260/modules/cctbx_project/iotbx/cli_parser.py", line 71, in run_program
>>>     task.run()
>>>   File "/Applications/phenix-1.14-3260/modules/cctbx_project/mmtbx/programs/prepare_pdb_deposition.py", line 98, in run
>>>     alignment_params=self.params.mmtbx.validation.sequence.sequence_alignment)
>>>   File "/Applications/phenix-1.14-3260/modules/cctbx_project/iotbx/pdb/hierarchy.py", line 1190, in as_cif_block_with_sequence
>>>     assert len(chain.residue_groups) + chain.n_missing_start + chain.n_missing_end == len(sequence)
>>> AssertionError
>>> (gouts of smoke, terrified squealing)
>>>
>>>
>>> The ‘No PHIL parameters found’ message is concerning, but the program
>>> clearly seems to be finding the input file names.
>>>
>>> Suggestions welcome.
>>>
>>> Thanks,
>>>
>>> Pat
>>>
>>>
>>> ---------------------------------------------------------------------------------------
>>> Patrick J. Loll, Ph. D.
>>> Professor of Biochemistry & Molecular Biology
>>> Drexel University College of Medicine
>>> Room 10-102 New College Building
>>> 245 N. 15th St., Mailstop 497
>>> Philadelphia, PA  19102-1192  USA
>>>
>>> (215) 762-7706
>>> pjloll at gmail.com
>>> pjl28 at drexel.edu
>>>
>>>
>>> _______________________________________________
>>> phenixbb mailing list
>>> phenixbb at phenix-online.org
>>> http://phenix-online.org/mailman/listinfo/phenixbb
>>> Unsubscribe: phenixbb-leave at phenix-online.org
>>
> --
> Billy K. Poon
> Research Scientist, Molecular Biophysics and Integrated Bioimaging
> Lawrence Berkeley National Laboratory
> 1 Cyclotron Road, M/S 33R0345
> Berkeley, CA 94720
> Tel: (510) 486-5709
> Fax: (510) 486-5909
> Web: https://phenix-online.org
>