[phenixbb] Solving MR solution without sequence information

Tom Peat t.peat at unsw.edu.au
Fri Feb 3 14:53:15 PST 2023


Hello Eric,

You've had some good responses as to things to do already, but I'll throw in one 'old school' method.
When I had this situation (although with somewhat higher resolution data), I went through the density with Coot and tried to put in residues where I thought I could identify them (Trp, Phe, Cys, Pro, etc). I did this iteratively (with some refinement) until I came up with a stretch of say 8-10 residues where I thought the sequence fit the density reasonably well. I then did a search for that sequence. In your case, if you obtained the protein from E. coli, then I would just search the E. coli set of proteins using something like UniProt. You obviously need to take into account that you won't be able to tell the difference between Asp/Asn and Glu/Gln, so don't look for 100% matches. This allowed me to narrow down the possible proteins to just one or two and I then had a full sequence to work with.
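The search step above can be sketched in a few lines of Python. This is only an illustration, not a tool from Phenix, CCP4, or UniProt: the function names (`residues_compatible`, `find_hits`, `read_fasta`) and the `max_mismatches` parameter are made up for the example, and it assumes you have downloaded the relevant proteome as a FASTA file. It treats Asp/Asn and Glu/Gln as interchangeable, accepts `X` for positions you could not identify, and tolerates a few outright mismatches so that you are not looking for 100% matches.

```python
# Residues that usually cannot be told apart in the density at this resolution.
EQUIVALENT = {"D": "DN", "N": "DN", "E": "EQ", "Q": "EQ"}


def residues_compatible(built, actual):
    """True if the residue built into the density could be `actual`."""
    if built == "X":  # X = position that could not be identified
        return True
    return actual in EQUIVALENT.get(built, built)


def find_hits(motif, sequence, max_mismatches=1):
    """Slide the built motif along a sequence and report near matches.

    Returns (1-based start, matched window, mismatch count) tuples.
    """
    motif = motif.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = sum(
            not residues_compatible(b, a) for b, a in zip(motif, window)
        )
        if mismatches <= max_mismatches:
            hits.append((start + 1, window, mismatches))
    return hits


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

To scan a whole proteome, loop over `read_fasta(open(path))` with your own file path and call `find_hits` on each sequence. Start with `max_mismatches=0` and loosen it only if nothing comes back, since an 8-10 residue motif with several allowed mismatches will match many unrelated proteins.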
Might be worth a shot.
Best of luck, tom
________________________________
From: phenixbb-bounces at phenix-online.org <phenixbb-bounces at phenix-online.org> on behalf of Rosenberg, Eric (NIH/NCI) [F] <eric.rosenberg at nih.gov>
Sent: Saturday, February 4, 2023 7:22 AM
To: phenixbb at phenix-online.org <phenixbb at phenix-online.org>
Subject: [phenixbb] Solving MR solution without sequence information

Hi all,



I’m in a bit of a bind here and am seeking some advice. For context, a former graduate student in our lab set crystal trays of an MBP fusion protein, the fused part after MBP being ~400 amino acids long. This region is predicted to be mostly unstructured, but has a C-terminal SH3 domain. Our graduate student then graduated, and before throwing out some of her trays a year or two later, we found some hits of the MBP fusion protein that actually diffracted to 2.9 Angstrom. I spent some time working on it after we collected the data (June 2021), but because I didn’t know what specifically had crystallized, it was impossible to phase, and replicating the crystals seemed next to impossible, too. The Matthews coefficient suggested ~130 amino acids in the ASU, space group C222 or C2221. Since whatever crystallized was clearly a degradation product of the MBP fusion, I tried phasing with SH3 domains and a lot of other things, to no avail. As a last-ditch effort I eventually submitted the .mtz file to SBGrid to perform a Wide Search MR job, and lo and behold, it actually found MR solutions that had TFZ scores of ~17 in space group C2221!
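For readers unfamiliar with the Matthews estimate mentioned above, here is a rough Python sketch of the arithmetic. It is not taken from the original analysis: the function name and the cell volume are hypothetical, 110 Da per residue is the usual rule-of-thumb average, and the solvent fraction uses the common approximation 1 - 1.23/V_M. Both C222 and C2221 have 8 symmetry-equivalent positions in the unit cell.

```python
def matthews_coefficient(cell_volume_a3, n_sym_ops, n_residues,
                         copies=1, da_per_residue=110.0):
    """Return (V_M in A^3/Da, approximate solvent fraction).

    V_M = cell volume / (symmetry operators * molecular mass in the ASU).
    da_per_residue=110 is an assumed average residue mass.
    """
    mass_in_asu = n_residues * da_per_residue * copies
    v_m = cell_volume_a3 / (n_sym_ops * mass_in_asu)
    solvent_fraction = 1.0 - 1.23 / v_m
    return v_m, solvent_fraction


# Hypothetical orthorhombic cell volume (a * b * c), NOT the real cell
# from this data set; 8 operators for C222 / C2221.
hypothetical_volume = 286000.0
for n_res in (100, 130, 160):
    v_m, solvent = matthews_coefficient(hypothetical_volume, 8, n_res)
    print(f"{n_res} residues: V_M = {v_m:.2f} A^3/Da, solvent ~ {solvent:.0%}")
```

Values of V_M around 2 to 3 A^3/Da are typical for protein crystals, so the residue count that lands in that range is the plausible ASU content. The same estimate cannot distinguish one ~130-residue chain from two copies of a ~65-residue fragment, since only the total mass in the ASU enters the formula.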



So here’s my current situation: I have been able to phase the data set using the MR search model, but again, I don’t know what specifically it is that I’ve crystallized. I’m currently able to get the Rfree to ~0.4, but can’t seem to improve it. I am really at a loss as to what to do, since there are obvious backbone issues with the protein (as seen from iterative-build composite omit maps), but every time I try to manually correct them it seems to make the Rfree worse. The MR solution does not align very well at all to the MBP fusion, only ~20% identity, and again, I don’t know which ~130 amino acids I crystallized out of the ~400 of the MBP fusion. Is it one continuous stretch, two copies of a shorter stretch, etc.?



I tried phasing with a polyalanine model of the MR search model and then tried autobuilding just a polyalanine sequence to get the backbone right, but that doesn’t seem to work. Autobuild also fails when trying various fragments of the MBP fusion sequence. Other than opening Coot and manually building the entire polypeptide chain, is there an easier method? I think that once the backbone is totally right the phases will improve so I can start putting in side chains, but I’m not sure. My latest effort is to use Sculptor prior to Phaser in order to force the sequences to match, but again, I don’t know precisely what sequence was crystallized. I have tried both the Phenix and CCP4 software suites, for reference.



Any and all help would be much appreciated (and yield an acknowledgement on a paper, if this ever works).



Best,

Eric Rosenberg



CRTA Postdoctoral Fellow

Randazzo Lab

Laboratory of Cellular and Molecular Biology

National Cancer Institute, US