[phenixbb] Dummy atoms

Pavel Afonine pafonine at lbl.gov
Wed Jul 28 15:34:18 PDT 2010


  I agree with Phil about UNK: it does seem better to label a residue 
as unknown (undefined), which is how it appears in the map, than to call 
something ALA that is in fact TYR and then get confused later by the 
mismatch between the actual sequence and the one derived from the PDB 
file. This is exactly what confuses me all the time when I look at the 
results of model-building programs, because the first thing I always do 
is compare the actual sequence with the one derived from the PDB file, 
just to validate the result of model building.
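
As a rough illustration of that check, here is a minimal Python sketch 
that derives a one-letter sequence from the ATOM records and lists the 
positions where an assigned residue disagrees with the expected one. The 
names sequence_from_pdb and mismatches are made up for this example (they 
are not PHENIX functions), and the sketch assumes the two sequences are 
already in the same register, which a real comparison would have to 
establish by alignment.

```python
# Hypothetical helpers for illustration; not part of PHENIX or any package.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def sequence_from_pdb(lines):
    """One-letter sequence from ATOM records; unrecognized names (e.g. UNK) become 'X'."""
    seq = []
    last_key = None
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        # Columns 22 and 23-27: chain ID, residue number plus insertion code.
        key = (line[21], line[22:27])
        if key != last_key:
            # Columns 18-20: residue name.
            seq.append(THREE_TO_ONE.get(line[17:20], "X"))
            last_key = key
    return "".join(seq)

def mismatches(expected, built):
    """0-based positions where the model assigns a residue that differs from
    the expected one. 'X' (unknown) is skipped. Assumes identical register."""
    return [i for i, (e, b) in enumerate(zip(expected, built))
            if b != "X" and b != e]

# Records truncated after the residue number for brevity (assumed data).
model = [
    "ATOM      1  CA  ALA A   1",
    "ATOM      2  CA  UNK A   2",
    "ATOM      3  CA  TYR A   3",
]
print(mismatches("YGY", sequence_from_pdb(model)))  # prints [0]
```

In this toy case the residue built as ALA where the sequence says TYR is 
reported as a mismatch, while the one labelled UNK is not, which is the 
practical difference between the two labelling choices.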

However, I also agree with Tom about losing identity in cases where we 
really do know what to expect: polypeptide or RNA/DNA.

Hm... interesting situation -:)

I guess UNK may still be better, but ONLY IF you go one level deeper and 
look at the atom names (and make sure you do that consistently). Say you 
name a "residue" UNK and name the atoms within it CA, N, C, O (a 
peptide-like pattern): then you have a chance to guess what it is. Of 
course, the question then is how you know where to place those CA, N, C 
and O...
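
To make the atom-name idea concrete, here is a minimal Python sketch that 
guesses the polymer type of a residue from its atom names alone, ignoring 
the residue name. The function name guess_polymer_type and the two atom 
sets are assumptions chosen for this example. The nucleic-acid set lists 
only a few sugar-phosphate backbone atoms and is not exhaustive.

```python
# Hypothetical sketch: classify a residue by its atom names only.
PEPTIDE_BACKBONE = {"N", "CA", "C", "O"}
# A few sugar-phosphate backbone atoms; deliberately not exhaustive.
NUCLEIC_BACKBONE = {"P", "O5'", "C5'", "C4'", "C3'", "O3'"}

def guess_polymer_type(atom_names):
    """Return 'peptide', 'nucleic acid' or 'unknown' for one residue."""
    # Older PDB files write primes as asterisks (O5* instead of O5').
    names = {n.strip().replace("*", "'") for n in atom_names}
    if PEPTIDE_BACKBONE <= names:
        return "peptide"
    if NUCLEIC_BACKBONE <= names:
        return "nucleic acid"
    return "unknown"

# A UNK residue built with a peptide-like atom pattern is still recognizable.
print(guess_polymer_type([" N  ", " CA ", " C  ", " O  ", " CB "]))
# A lone dummy atom named O is not.
print(guess_polymer_type([" O  "]))
```

This only works if the builder names the atoms consistently, and it says 
nothing about whether the atoms were placed correctly in the first place.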

Pavel.



On 7/28/10 3:19 PM, Tom Terwilliger wrote:
> One disadvantage of using UNK is that it is often a loss of 
> information. For example in the case Phil mentions...we do think that 
> we have a polypeptide.  By labelling protein residues UNK we no longer 
> distinguish them from DNA, or depending on HETATM vs ATOM 
> identification, from ligands.
> -Tom T
>
> On Jul 28, 2010, at 4:01 PM, Phil Evans wrote:
>
>> UNK residues have another valid use where you can see peptide but not 
>> assign a sequence register. A poly-Ala model in that case is better 
>> labelled UNK than ALA, since it isn't ALA
>>
>> Phil
>>
>>
>> On 28 Jul 2010, at 19:12, Pavel Afonine wrote:
>>
>>> Dear Ed,
>>>
>>>> I think it is very important to be able to include unknown atoms
>>>> in a deposited pdb file (with echoing the caveat about flooding
>>>> the structure with UNK's to lower the R-factor).
>>>
>>> Yes, as I wrote in my original reply, including these atoms may 
>>> improve the map, which in turn may reveal or improve other 
>>> (biologically) important regions of it. The only point is: please 
>>> define these dummy atoms properly, providing all the information, 
>>> such as the scattering element type that you or your program used 
>>> for such an approximation.
>>>
>>>> For one thing, these structures are produced not just for 
>>>> structure-factor
>>>> calculation and validation. Many of the end users will never even
>>>> bother to do a structure factor calculation.
>>>
>>> The ability to reproduce the R-factor is not just for someone's 
>>> pleasure; at the very least it serves validation. If I've got a PDB 
>>> file for which I can't compute the R-factors (or, for that matter, 
>>> the map), then I don't need the deposited Fobs either, unless I'm 
>>> going to re-determine the structure from scratch.
>>>
>>>> It is important for the
>>>> depositor to be able to refer to an unknown but likely significant
>>>> ligand and for the reader to be able to go and look at that position
>>>> (ideally surrounded by electron density).
>>>
>>> Sure, it is important.
>>>
>>>> For another thing, the structure factor calculation will give exactly
>>>> the same result whether the dummy atoms are omitted or are flagged
>>>> with zero occupancy or atom-type X to be ignored in sf calculation.
>>>
>>> If you look in the PDB you will find that very often the occupancies 
>>> are not set to 1. Plus, as I mentioned, the B-factors for these 
>>> atoms are often set to some funny numbers (it looks like they were 
>>> refined). Are we sure that these programs were ignoring these dummies 
>>> in the Fcalc calculations? If so, how were the B-factors refined, or 
>>> were they made up?
>>>
>>> Again, if it is defined properly, for example, like this:
>>>
>>> ATOM   1959  O   DUM A   1      -8.762   8.060  25.324  1.00 31.23           O
>>>
>>> or
>>>
>>> ATOM   1959  O   UNK A   1      -8.762   8.060  25.324  1.00 31.23           O
>>>
>>> then it is absolutely OK to have such entries, because it is 
>>> completely defined and can be used in any calculations without any 
>>> unnecessary guesswork. But if you start masking things with X or 
>>> blanks then I (and the software I write) will start asking all these 
>>> nasty questions...
>>>
>>> All the best!
>>> Pavel.
>>>
>>> _______________________________________________
>>> phenixbb mailing list
>>> phenixbb at phenix-online.org
>>> http://phenix-online.org/mailman/listinfo/phenixbb
>
>
> Thomas C. Terwilliger
> Mail Stop M888
> Los Alamos National Laboratory
> Los Alamos, NM 87545
>
> Tel:  505-667-0072                 email: terwilliger at LANL.gov
> Fax: 505-665-3024                 SOLVE web site: http://solve.lanl.gov
> PHENIX web site: http://www.phenix-online.org
> ISFI Integrated Center for Structure and Function Innovation web site: 
> http://techcenter.mbi.ucla.edu
> TB Structural Genomics Consortium web site: http://www.doe-mbi.ucla.edu/TB
> CBSS Center for Bio-Security Science web site: http://www.lanl.gov/cbss
>
