Zoom Transcript


Claudia Lucía Millán Nebot
https://www.youtube.com/c/phenixtutorials

AMIT KUMAR
thanx Claudia

Claudia Lucía Millán Nebot
And the paper :) https://journals.iucr.org/d/issues/2019/10/00/di5033/

Anil Sohail
Can you please make it full screen?

Pedro Matias
Do you need CCP4 to install PHENIX on Windows?

Anil Sohail
OK thanks

Tom T
In “VIEW OPTIONS” you can make the view of Paul’s screen bigger if you want

Gihan Ketawala
Does the current version of Phenix support the new MacOS?

Tom T
Yes

Billy Poon
Yes, I have installed on macOS 11 Big Sur. I’m still testing it, but what I have tested is working. Please let us know if you find an issue!

Jan Gebauer
Will you generally support the new Apple Silicon M1 processor(s) (it's an ARM as far as I understood)….

Billy Poon
The current installers should run, but will run in translation mode (Rosetta 2). We are getting one of the new Apple Silicon machines for more testing.

Billy Poon
We will be looking into building universal installers that work natively on both architectures.

Misbha
Do u plan to integrate data processing programs in the phenix package in future

Billy Poon
Currently, we distribute DIALS, but there is no GUI associated with it. It is run via the command-line.

Randy Read
If you report a bug, please make sure you’ve given your real email address in the preferences! If you don’t, we can’t ask questions or work with you to solve the problem.

Billy Poon
We are shipping DIALS 2.2 (https://dials.github.io).

student
No

Amal SEFFOUH
no

gundeep kaur
no

Rosario Recacha
Yes

Veronica Delsoglio
no

Arputha Latha Leo Xavier Raj
yes

Pedro Matias
Use the icons in the participants list

Misbha
Stop the counting :)

Gourab Basu Choudhury
no

Claudia Lucía Millán Nebot
That one button :)

Muthu
no

Claudia Lucía Millán Nebot
Something like this will appear, and you just have to click yes or no

AMIT KUMAR
can you share start file for this workshop?

AMIT KUMAR
to use hands on practice on phenix

Billy Poon
Do you have Phenix 1.18.2 installed?

Jan Gebauer
Stop your video for a while?

AMIT KUMAR
yes

Billy Poon
The tutorial dataset can be created by clicking on “New project".

Billy Poon
Then “Set up tutorial data”

Claudia Lucía Millán Nebot
The phenix interface has tutorial data integrated that you can use for the project.

Billy Poon
P9-sad is the first Experimental Phasing dataset.

AMIT KUMAR
thanks

RAGHVENDRA SINGH
yes

RAGHVENDRA SINGH
yes

Monika Bjelcic
can you make your screen bigger, it's hard to see

Monika Bjelcic
tnx

Claudia Lucía Millán Nebot
there, you can zoom in on the shared screen

Claudia Lucía Millán Nebot
Say, to 150% for example, and it will be much larger :)

Andrea Pica
How do you (we) know which fields are mandatory and need to be filled and which ones can be skipped?

Pedro Matias
what about the Se white line?

prasunkumar
how will we plan experiments in case of soaking

RAGHVENDRA SINGH
what is the minimum I/sigma which is sufficient to solve the structure?

Phenix team: Dorothee
Soaking: choose the inputs according to your anomalous scatterer: type of anomalous scatterer, number of sites, wavelength etc

Phenix team: Dorothee
Soaking: typical number of sites: 1 site per 10 or 20 protein atoms.

Hongyi Xu
Is there a similar planner for SIR/MIR?

Irwin Selvam
Another way of checking which fields are necessary is to go to the online documentation, the instructions for running a program via the command line will often say which parameters are mandatory and which are optional

Pedro Matias
How would you reach an I/sigI of 98 in practice?

Claudia Lucía Millán Nebot
By the way, for those of you zooming: if you need to recenter the zoomed region to see the Phenix interface better, just put the cursor there, click, and drag until it is where you want it

Pedro Matias
Radiation damage problems?

Misbha
It is surprising because I was always told that for S-SAD you need data better than 2A. That doesn’t seem to be true..

Misbha
Sulfur

Pedro Matias
He wrote S-SAD not Se-SAD

Misbha
how do we find whether the Met’s are ordered or not….

Jingjing Zhao
How strong is the isomorphous signal that we normally get to do SIR or MIR successfully? Compared with SAD, does SIR/MIR require higher data accuracy?

Pedro Matias
what about disulphide bridges

Anil Sohail
What if we have a large number of sulphurs (disulfide bonds), i.e. 56 sulphurs in 800 residues? According to the formula, should the number of sites be less?

Jingjing Zhao
thank you a lot!

Jiang Yin
In practice, I found that high multiplicity helps improve the odds of successful phasing. Is this mostly due to an improved I/sigma or is there more/other contribution? one has to strike a balance between collecting more data to improve I/sigma and not letting radiation damage ruin the prospects of solving the structure. In difficult cases, I just collect until I see clear signs of crystal decay and truncate my data later in various ways to find the proper cutoff by looking at CCano. Is there a better, real-time metric during data collection to guide our experiment on the fly?

Anil Sohail
Many thanks

student
How do you calculate the sites?

Jiang Yin
on the subject of systematic errors, what are the impacts of ice rings? it is hard to preserve xtals in the summer without ice.

Jacqueline Vitali
I have to go. Where do we find the recording afterwards?

Claudia Lucía Millán Nebot
In the phenix website

Phenix team: Dorothee
https://www.phenix-online.org/user-workshops.html

Phenix team: Dorothee
It’ll take a couple of days to upload the videos and the chat transcript.

Andrea Pica
Just to clarify... when we say I/sigma(I), are we talking mean(I)/sd(I) ?

Lan Guan
How does the “add to group” work? thanks

Randy Read
Yes, I/sigma(I) here refers to the mean. So you can get sufficiently accurate data by merging data from a number of isomorphous crystals, and it’s the final accuracy that matters.

Andrea Pica
Thanks Randy. I guess you need to collect from at least 4-5 crystals. I/sigma(I) of 98 is very high...
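
A rough way to sanity-check the "4-5 crystals" estimate is sketched below. It assumes that all datasets are isomorphous, of equal quality, and affected only by independent random errors, so that the merged mean I/sigma(I) grows as the square root of the number of datasets. That is an idealisation: systematic errors and radiation damage, both raised in this chat, are ignored, and the function names are illustrative only, not part of Phenix.

```python
import math


def merged_i_over_sigma(single_i_over_sigma, n_datasets):
    """Idealised mean I/sigma(I) after merging n equal-quality datasets.

    Assumes independent random errors only, so the signal-to-noise
    improves by sqrt(n). Real data will do worse than this.
    """
    return single_i_over_sigma * math.sqrt(n_datasets)


def datasets_needed(target_i_over_sigma, single_i_over_sigma):
    """Smallest number of equal-quality datasets reaching the target."""
    ratio = target_i_over_sigma / single_i_over_sigma
    return max(1, math.ceil(ratio ** 2))


if __name__ == "__main__":
    # Hypothetical numbers: target of 98, single-crystal I/sigma(I) of 45.
    print(datasets_needed(98, 45))
```

Under these assumptions the required number of crystals depends strongly on the single-crystal I/sigma(I), so the estimate is only a lower bound for planning.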

Camilla Donnelly
What is meant by NCS copies?

Misbha
Could you please explain a bit more what “Data accuracy” means? What are the things which we should be looking at?

RAGHVENDRA SINGH
what are the values or parameters we need to look at to confirm that we got the sub-structure solution?

Eta Isiorho
If you run Auto_sol and the FOM is around .209 should you kill it or keep it running?

Jiang Yin
I suppose data precision is being used as proxy for data accuracy here?

RAGHVENDRA SINGH
what is systematic error and how is it different from common error? what is the link between systematic error and SAD?

LT Zhai
Is it possible to solve the structure with a low FOM?

LT Zhai
And also are there more chances to solve the structure using MAD than SAD?

Jan Gebauer
connectivity and map skew

RAGHVENDRA SINGH
because of less noise on the right side

student
There is continuous density

Thomas Flower
solvent is flat on right

Nic Steussy
connectivity

Eta Isiorho
Contiguous density

Jose Artur Brito
Clear difference between solvent and non-solvent regions.

RAGHVENDRA SINGH
error is less

Najeeb Ullah
continuity

Pedro Matias
continuous main chains, side chains visible

student
In a right map

Lothar
clear separation of protein and solvent density

jenniferfleming
Continuous density, shape of density, can see aas

Eta Isiorho
Solvent region obvious

Tatjana von Rosen
the density is continuing

Maria Hrmova
connected features

Almudena Ponce Salvatierra
Clear solvent boundary

Vatsal Purohit
protein density easily discernable

Vatsal Purohit
Can AutoSol also accommodate a molecular replacement solution (pdb file) into its phase determination in cases where you're just using it to determine some heavy atom binding sites?

prasunkumar
How to use Autosol if the protein has significant number of non-natural residues

prasunkumar
say 3 nonnatural every 7 residues

RAGHVENDRA SINGH
what are the values or parameters we need to look at in the Phenix log to confirm that we got the sub-structure solution?

prasunkumar
yes

prasunkumar
Thanx

Andrea Pica
Following up on the latest question, how do we know that the dataset we have collected has sufficient anomalous signal? DANO/sigDANO? CC(ano)?

Anil Sohail
Thanks

Francisco Murphy
Thanks

Hongyi Xu
thanks for the great lecture

Jiang Yin
in the P9 example, a cis peptide in front of non-proline was built at residue 19. How stringently are the geometric restraints used at this stage? Can one enforce a tighter restraint?

Andrea Flores
So, we have a problem with a beamline scientist where we collect data, who insists on trimming the data way too much and he refuses to deliver the raw data. Is there any way around this to solve the structure?

Billy Poon
The “beta-blip” is the first tutorial dataset under “Molecular Replacement”.

Monika Bjelcic
how do we know ASU number?

Jacob W
Can % identity simply be found through aligning the sequence files?

LT Zhai
Could we put 2 protein sequence into one file?

Nika Žibrat
Is simple one-component interface suitable for solving twinned crystals if the crystal is almost identical to its PDB model, or is this irrelevant at this point?

massimosammito
You can check the probability associated with the Matthews coefficient that is plotted by Phaser once it is running

Najeeb Ullah
Monika, from the Matthews coefficient

Lan Guan
If we use half model for search, do we use identity 50%?

Thandeka Moyo
Is there a use for searching for one component at a time rather than both at once?

Monika Bjelcic
ok then follow up question, how to calculate the Matthews coefficient?

Stephan Rempel
For a homomer, would you have a sequence file with only one sequence or the entire homomer? Also, would you make an ensemble for every component of the homomer?

Claudia Lucía Millán Nebot
We will have time for questions later on during the general discussion, so if we do not reply to all of the questions in the chat, we will have a look at them later

massimosammito
Jacob the sequence identity that you get from the alignment is good enough for Phaser. There are more sophisticated ways and I will show one later.
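
As a minimal sketch of the simple approach Jacob asks about, the snippet below computes percent identity from two sequences that have already been aligned (gaps written as "-"). The definition used here, identical residues divided by columns where both sequences have a residue, is an assumption; alignment programs differ in the denominator they use, and the function name is illustrative only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumed definition: identical residues / columns in which
    neither sequence has a gap. Other tools may divide by the
    full alignment length or by the shorter sequence length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned_columns = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        aligned_columns += 1
        if res_a == res_b:
            matches += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * matches / aligned_columns


if __name__ == "__main__":
    # Toy alignment, not real protein sequences.
    print(percent_identity("MKT-AY", "MRTLAY"))
```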

Claudia Lucía Millán Nebot
Ensembling also will be discussed now :)

Eta Isiorho
Alternatively, could you search for Blip run MR then use it as a partial solution for MR with beta?

Jacob W
Thank you massimosammito :)

massimosammito
Monika, Phaser computes the Matthews coefficient and plots the probability distribution in the Graph tab of the results

Monika Bjelcic
ok, so does that mean we don't need to know the number before running the phaser?

Irwin Selvam
You can also get the Matthews Coefficient from xtriage
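
For Monika's question on how the number is calculated, here is a minimal sketch of the textbook formula, V_M = V_cell / (Z x n x MW), with the usual protein approximation for solvent fraction, 1 - 1.23 / V_M. It is not the Phaser or xtriage implementation, and it does not reproduce the probability estimate those programs report. The cell, space group and sequence length in the example are hypothetical, and the figure of about 110 Da per residue is a rough average used only when the exact molecular weight is not known.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstroms (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_symops, n_copies, mw_da):
    """V_M in cubic Angstroms per Dalton.

    n_symops: number of symmetry operations of the space group (Z).
    n_copies: assumed number of molecules in the asymmetric unit.
    """
    return volume / (n_symops * n_copies * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction for a protein crystal."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    # Hypothetical example: orthorhombic cell, 4 symmetry operations,
    # 300 residues at roughly 110 Da each (rough estimate).
    volume = cell_volume(60.0, 60.0, 100.0, 90.0, 90.0, 90.0)
    mw = 300 * 110.0
    for n in (1, 2, 3):
        vm = matthews_coefficient(volume, 4, n, mw)
        print(n, round(vm, 2), round(solvent_fraction(vm), 2))
```

Trying several values of n_copies this way shows which ASU contents give a physically plausible solvent fraction before starting a Phaser run.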

massimosammito
If you see that your initial assumption does not match the best probability output from Phaser, it is worth stopping the run and restarting with the correct number. This parameter can be difficult to predict, though, so in borderline cases it is worth trying several runs with different possibilities

Maria Hrmova
What is the minimum recommended %sequence identity number for Phaser to work?

prasunkumar
How small can be the search model

massimosammito
Maria, I would say 30% indicates at least homology

Maria Hrmova
Thank you.

Randy Read
As Claudia is explaining, the eLLG is a complicated combination of sequence identity, model completeness and data resolution, so none of these in isolation tells you whether MR will work.

massimosammito
The model can be small but it should at least be 10% of the total scattering in the crystal and only if the resolution is very good. eLLG is the key parameter to look at to guide you in choosing the model

Randy Read
Some models with less than 15% sequence identity can work, especially when used as trimmed ensembles (see Massimo’s presentation in a few minutes), but the success rate at this end of the spectrum will be low!

Claudia Lucía Millán Nebot
phenix dot tools are super handy :)

Claudia Lucía Millán Nebot
https://toolkit.tuebingen.mpg.de/tools/hhpred

Claudia Lucía Millán Nebot
Really relevant this point Massimo is making: the RMSD for the ensemble is already set by Ensembler! No need to type a seq id

massimosammito
Eta Isiorho, phaser has sophisticated algorithms to choose the correct order in which the two molecules (beta and blip) should be searched, based on the eLLG. Unless, you have a serious reason, I will just configure the run with both molecules and let Phaser choose the order automatically

Claudia Lucía Millán Nebot
Randy already pointed out at the very beginning how powerful it is that, once you get a component placed correctly, you increase the signal a lot. Therefore, you want to get your best (larger, lower RMSD) model placed first

Claudia Lucía Millán Nebot
So that your background is better before placing the next one

massimosammito
And usually that decision is taken automatically by Phaser

massimosammito
In the Graph tab of the results of a Phaser run in the Phenix GUI you can find a nice plot of the intensity distribution that might be worth looking at to check whether twinning is occurring, as Airlie was mentioning

Claudia Lucía Millán Nebot
https://youtu.be/QlWahDGBTbE

Claudia Lucía Millán Nebot
That is the tutorial Airlie is referring to

Claudia Lucía Millán Nebot
Q1) When I run Phaser, I get low LLGs and the run is taking forever. What does it mean? What can I do?

Eta Isiorho
When I run phaser, I get high LLGs but a poor map and no structure solution

massimosammito
Eta, look at the TFZ scores, and is it a single solution?

LT Zhai
I want to know when we do the MR for a complex, could we put the two protein sequences into one file instead of two separate file?

prasunkumar
The Matthews coefficient gives more than 5 probable oligomeric states. What does it mean? Should I try to run Phaser with all oligomeric states?

Claudia Lucía Millán Nebot
Q2) In a multicomponent search, which starts out well with a clear signal, at a certain point for a new component I do not get an increase in the LLG and Z-scores are low. What does this mean?

Andrea Pica
In case of twinning, how to determine the real space group? (P1 in case of the orange juice boxes)

LC
How does Phaser deal with an anisotropically scaled dataset (from StarAniso)?

massimosammito
The sequences are used to calculate the molecular weight. You have to specify what is the full content of the crystal. So put a fasta file with the whole thing.

Eta Isiorho
I looked at the TFZ, it’s ~7 but 6 MR solutions

Claudia Lucía Millán Nebot
You should not use anisotropically truncated files from StarAniso in Phaser. As Airlie mentioned, all the data should be available to Phaser so it can do its best correction

massimosammito
LC, it is better not to process data with StarAniso before using it in Phaser, as we have a sophisticated way of dealing with anisotropy

massimosammito
A good solution is indicated by a TFZ > 8

Jan Gebauer
What would be the best strategy for highly helical structures? (In my case a collagen triple helix, with potentially pseudotranslational tNCS due to the internal rotation axis)

Randy Read
Phaser can use StarAniso data, but it will do a better job of correcting the data for anisotropy and tNCS if it gets a complete set of data, so we really prefer data that have not been through StarAniso.

LC
Ok, Thanks.

Stephan Rempel
For a 11-mer of identical subunits, should the sequence file contain the 11 sequences or 1 sequence?

Silvia Napolitano
After a phaser run, with a high LLG (1455) and a high TFZ (39.9), I get these warnings: "the top solution from a FTF did not pack" and "the top solution from a TF rescoring did not pack". What does it mean? Do I have to worry about them?

Jose Artur Brito
Imagine, hypothetically, you have a two-subunit complex and you have a MR solution for one of them. How big should it be relatively to the "unknown" subunit in order to see the latter in electron density without the need for further phasing? Hypothetically, let's say data to 2.7A. And also hypothetically, P41212...

Claudia Lucía Millán Nebot
Q3) What does it mean when you get multiple solutions with good LLG and TFZ scores, but not a unique solution. Should I just take the top one?

Laura Pacoste
Is there a feature in Phenix where you can modify the .mtz file, for instance cut off a certain wedge from the reciprocal lattice and exclude it from the data?

Claudia Lucía Millán Nebot
Q4) Should I split my model into different domains? If yes, how?

Claudia Lucía Millán Nebot
I’m saving the chat :) so we can keep unanswered questions for the general discussion after our time slot in which all the speakers will be replying to both experimental phasing and molecular replacement related questions.

Claudia Lucía Millán Nebot
SCEDS

Claudia Lucía Millán Nebot
https://www.phenix-online.org/documentation/reference/sceds.html

LT Zhai
When I ran the MR, I got a high LLG and TFZ over 8, but the map didn’t fit well, so was this solution right?

Claudia Lucía Millán Nebot
Q6) I have a convincing MR solution but my Rfree is too high

Claudia Lucía Millán Nebot
Silvia, the warning messages you refer to are telling that the translation with the highest LLG is not passing a packing test. This might or not be problematic depending on whether you are then getting solutions that are sensible or not. Is your structure a coiled coil or any other kind of highly repetitive structure? Do you get a single solution or multiple ones?

Najeeb Ullah
SAXS, EM, DLS and SEC suggest a hexamer, but the diffraction data suggest a hexamer with 70% solvent, 8-mers with 60%, 10-mers with 50% and 12-mers with 40%. The final MR solution is an 8-mer, but when I generate symmetry mates in PyMOL, it shows a dodecamer. Further, the unit cell has 18 symmetry mates with the dodecamer seen. A hexamer-dodecamer equilibrium is reported, with the dodecamer more active enzymatically.

kyunghyeonlee
I tried to use the twin law that I got from Xtriage to do refinement, but it didn’t work. Is there any better way to fix twinning problem?

massimosammito
LT, is the solution a unique solution in Phaser? Are you sure you have found all the copies in the ASU? Or maybe there is space for something more, or something else? When you say “the map does not fit well”, if you are referring to the Rfree, there are several reasons why that could happen, including twinning.

Claudia Lucía Millán Nebot
Jose Artur Brito, you can compute eLLG for the first component search and compare with the ones you already get. This would not tell you whether you will see density or not, though. You might need to do some density modification and map improvement.

massimosammito
Jose, yours is a very difficult question to answer. I would say that the larger the better, although it really depends on the quality of the placed model and the B-factors of the missing one. I would suggest trying density modification after placing the only model that you have, to see if the density in the second region improves enough for tracing with AutoBuild.

Nika Žibrat
can one do something at this stage of structure solving about twinned data that display a perfect twin in the statistics, but also a space group in which no twin laws exist, such as space group 23/24?

Avradip
is a 2 Ang model better than a 3 Ang of the same one?

Claudia Lucía Millán Nebot
Jan Gebauer do you mean a coiled-coil structure?

Claudia Lucía Millán Nebot
Sorry Jan, I read it again, you mean collagen, but yes, it suffers from the same problems as coiled coils

Jan Gebauer
It's not a coiled coil (in terms of alpha helical) but very similar as it is a collagen triple helix (three polyproline type II helices...)

Tristan Croll
@Avradip all else being equal, yes. In practice, though, it can depend. If the domain you're interested in is well resolved in the 3A structure but poorly resolved in the 2A one, the 3A may be better. If the two models are in quite different conformations, it would pay to try both (or just the common core of the two)

Avradip
Thank you!

LC
if I scale my data with StarAniso (anisotropic cut-off), should I use the full data (without the anisotropic cut-off) for Phaser and then the anisotropically cut data for refining?

Claudia Lucía Millán Nebot
yes LC

LC
ok thanks

Claudia Lucía Millán Nebot
Yes, as general advice (summary tips for the ‘’good’’ model search): do try to find the models using HHpred, and if there are multiple models available, after appropriately sculpting them, use them in Phaser runs (both as single models and as ensembles, as Massimo showed)

LC
What do you think/advise on using RMSD instead of identity % when the model has low identity % but you are pretty sure the structure is highly similar?

massimosammito
LC, in the end what matters is the RMSD, so if you know that number it will be a better estimate of the error

Claudia Lucía Millán Nebot
LC phaser will refine the value later on anyway

Claudia Lucía Millán Nebot
Thanks to all the participants

Francisco Murphy
thanks to you

Vatsal Purohit
thank you.