Hi everyone,

I wanted to thank you sincerely for all of your replies; I learned about so many useful tools and strategies for solving structures in situations like these. As it turns out, the crystals were actually an additive protein that I did not know was in the drop. We contacted our former graduate student, and apparently she added an additive to some, but not all, of the wells from which we took proteins. When I used the additive protein for phasing and then performed a single round of refinement, the Rfree was ~0.33. So it wasn’t the MBP fusion after all; the additive just crystallized on its own.

Lesson for the future: always double-check with the person who actually set the drop!

Thank you again,

Eric Rosenberg

From: Rosenberg, Eric (NIH/NCI) [F]
Sent: Friday, February 3, 2023 3:23 PM
To: phenixbb@phenix-online.org
Subject: Solving MR solution without sequence information

Hi all,

I’m in a bit of a bind here and am seeking some advice. For context, a former graduate student in our lab set crystal trays of an MBP fusion protein, the fused part after MBP being ~400 amino acids long. This region is predicted to be mostly unstructured, but has a C-terminal SH3 domain. Our graduate student then graduated, and before throwing out some of her trays a year or two later, we found some hits of the MBP fusion protein that actually diffracted to 2.9 Å. I spent some time working on it after we collected the data (June 2021), but because I didn’t know what specifically had crystallized, it was impossible to phase, and replicating the crystals seemed next to impossible, too. The Matthews coefficient predicted ~130 amino acids in the ASU, with space group C222 or C2221. Since whatever crystallized was clearly a degradation product of the MBP fusion, I tried phasing with SH3 domains and a lot of other things, to no avail. As a last-ditch effort, I eventually submitted the .mtz file to SBGrid to perform a Wide Search MR job, and lo and behold, it actually found MR solutions with TFZ scores of ~17 in space group C2221!
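For readers who want to reproduce the kind of Matthews estimate mentioned above, here is a minimal Python sketch. It assumes an orthorhombic cell (as in C222 or C2221, which have 8 symmetry operators), an average residue mass of about 110 Da, and the usual approximation that solvent fraction is 1 - 1.23/V_M. The cell dimensions in the usage lines are hypothetical placeholders, not the values from this data set, and the function names are made up for illustration.

```python
def orthorhombic_volume(a, b, c):
    """Unit-cell volume in cubic Angstroms for an orthorhombic cell."""
    return a * b * c


def matthews_coefficient(a, b, c, n_sym_ops, mw_da, n_copies=1):
    """V_M in A^3/Da: cell volume divided by total protein mass in the cell.

    n_sym_ops is the number of symmetry operators of the space group
    (8 for C222 and C2221); n_copies is the number of molecules per ASU.
    """
    return orthorhombic_volume(a, b, c) / (n_sym_ops * n_copies * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (protein crystals)."""
    return 1.0 - 1.23 / vm


def residues_for_target_vm(a, b, c, n_sym_ops, target_vm=2.5,
                           mean_residue_mass=110.0):
    """Rough number of residues per ASU that would give the target V_M."""
    mw_per_asu = orthorhombic_volume(a, b, c) / (n_sym_ops * target_vm)
    return mw_per_asu / mean_residue_mass


# Hypothetical cell (NOT the real one from this thread), 130 residues/ASU.
a, b, c = 50.0, 60.0, 80.0
vm = matthews_coefficient(a, b, c, n_sym_ops=8, mw_da=130 * 110.0)
print(f"V_M = {vm:.2f} A^3/Da, solvent = {solvent_fraction(vm):.0%}")
print(f"Residues for V_M 2.5: {residues_for_target_vm(a, b, c, 8):.0f}")
```

With real data, the cell edges and space group would come from the .mtz header, and the residue count is only a guide, since plausible V_M values span a fairly wide range.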

So here’s my current situation: I have been able to phase the data set using the MR search model, but again, I don’t know what specifically I’ve crystallized. I’m currently able to get the Rfree to ~0.4, but can’t seem to improve it. I am really at a loss as to what to do, since there are obvious backbone issues with the protein (as seen in iterative-build composite omit maps), but every time I try to correct them manually it seems to make the Rfree worse. The MR solution does not align well at all to the MBP fusion (only ~20% identity), and again, I don’t know which ~130 amino acids of the ~400 in the MBP fusion I crystallized. Is it one continuous stretch, two copies of a shorter stretch, etc.?

I tried phasing with a polyalanine version of the MR search model and then tried autobuilding just a polyalanine sequence to get the backbone right, but that doesn’t seem to work. Autobuild also fails when trying various fragments of the MBP fusion sequence. Other than opening Coot and manually building the entire polypeptide chain, is there an easier method? I think that once the backbone is totally right the phases will improve so I can start putting in side chains, but I’m not sure. My latest effort is to use Sculptor prior to Phaser in order to force the sequences to match, but again, I don’t know precisely what sequence was crystallized. I have tried both the Phenix and CCP4 software suites, for reference.

Any and all help would be much appreciated (and yield an acknowledgement on a paper, if this ever works).

Best,

Eric Rosenberg

CRTA Postdoctoral Fellow

Randazzo Lab

Laboratory of Cellular and Molecular Biology

National Cancer Institute, US