The Phenix AI Agent automates macromolecular structure determination. You give it your experimental data and (optionally) some guidance, and it figures out which PHENIX programs to run, in what order, with what settings — and then runs them for you automatically.
Think of it as an experienced crystallographer sitting next to you at the computer. It looks at your data, decides what to do first, checks the results, and decides what to do next. If something goes wrong, it tries a different approach. When it's done, it tells you what it found and where to look.
The fastest way to see the AI Agent in action is to run one of the built-in PHENIX tutorials.
All you need to do is click Run. The agent reads the tutorial's README file, extracts the experiment parameters (wavelength, atom type, resolution, etc.), selects a plan template, and runs the entire workflow automatically. No advice, no settings changes, no file selection needed.
The p9-sad tutorial, for example, completes in 5 cycles with 0 failures, producing a model with R-free 0.247 at 1.74 Å resolution.
The AI Agent expects to be given data files from crystallography (mtz files) or from cryo-EM (mrc or ccp4 maps), a sequence file, and optional model files or ligand files. You can tell it what files to use by specifying a directory containing the files or by loading the files in the AI Agent GUI window.
The AI Agent also accepts instructions. You can say things like "first run xtriage and then molecular replacement", or "solve the structure", or "use main.number_of_macro_cycles=1 in phenix refine", or whatever else you want to tell it to do with your X-ray or cryo-EM data.
Note on instructions: if you are supplying PHIL keywords for the agent, specify them exactly as you would on the command line. For example:
use main.number_of_macro_cycles=1 in refinement
but not:
use one macro_cycle in refinement
(macro_cycles happens to be a special case that the agent recognizes, but most parameters are not converted like this.)
You can supply instructions in the GUI. If your directory contains a README file, the agent can read it and follow any instructions it contains.
Structure solution pathways. The AI Agent selects from 17 plan templates covering X-ray MR, SAD, MAD, cryo-EM, ligand fitting, and other workflows. A typical pathway for X-ray data might be:
xtriage → autosol → autobuild → refine → molprobity
if the data have anomalous signal, or:
xtriage → predict_and_build → refine
if they do not. For cryo-EM data:
mtriage → resolve_cryo_em → dock_in_map → real_space_refine
If you supply a ligand, the pathway will end with ligand fitting and model refinement. You can modify most aspects of the pathway with your instructions, as long as the necessary data is available.
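The branching above can be sketched as a small lookup. This is an illustration only; the pathway names and selection rules here are simplified stand-ins, not the agent's actual 17-template library:

```python
def select_pathway(experiment, has_anomalous_signal=False, has_ligand=False):
    """Pick an illustrative program pathway for the experiment type.
    (Simplified sketch; the real agent selects from 17 plan templates.)"""
    if experiment == "xray":
        if has_anomalous_signal:
            pathway = ["xtriage", "autosol", "autobuild", "refine", "molprobity"]
        else:
            pathway = ["xtriage", "predict_and_build", "refine"]
    elif experiment == "cryoem":
        pathway = ["mtriage", "resolve_cryo_em", "dock_in_map", "real_space_refine"]
    else:
        raise ValueError("unknown experiment type: %s" % experiment)
    if has_ligand:
        # A supplied ligand appends fitting and a final refinement step.
        pathway += ["ligand_fit", "refine"]
    return pathway
```

For example, X-ray data with anomalous signal yields the SAD pathway shown above, while a ligand file extends whichever pathway was chosen.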
How decisions are made. Each cycle, the agent runs a six-node decision pipeline: PERCEIVE (extract metrics, categorize files) → THINK (analyze logs with crystallographic expertise) → PLAN (select next program) → BUILD (construct the command) → VALIDATE (check workflow rules) → OUTPUT (update session). The LLM participates in two nodes (THINK and PLAN); the other four are deterministic. The LLM sets the intent; deterministic code enforces the accuracy.
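As a rough sketch (not the Phenix implementation), one cycle might look like this, with the two LLM nodes passed in as callables and the other four nodes as plain deterministic code:

```python
def run_cycle(session, think, plan):
    """One decision cycle: deterministic nodes bracket the two LLM nodes.
    `think` and `plan` stand in for the LLM calls; all other steps are
    ordinary code. (Illustrative sketch only, not the Phenix source.)"""
    state = {"metrics": session.get("metrics", {}),           # PERCEIVE
             "files": session.get("files", [])}
    analysis = think(state)                                   # THINK (LLM)
    choice = plan(state, analysis)                            # PLAN (LLM)
    command = [choice["program"]] + sorted(                   # BUILD
        "%s=%s" % (k, v) for k, v in choice.get("flags", {}).items())
    allowed = session.get("allowed_programs", [])             # VALIDATE
    if allowed and choice["program"] not in allowed:
        raise ValueError("program not permitted by workflow rules")
    session.setdefault("history", []).append(command)         # OUTPUT
    return command
```

Note how the LLM only chooses the program and flags; the command itself is assembled and checked deterministically.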
In Expert mode (the default), the agent also creates a multi-stage strategy plan at the start of the session. After each cycle, a deterministic gate compares metrics to the stage's success criteria and decides whether to advance, retreat, or stop.
What the LLM does not control. The LLM cannot write arbitrary command-line strings. It sets strategy flags (e.g. atom_type=Se, resolution=2.5) which the BUILD node expands through programs.yaml into validated PHIL parameters. File paths, resolution values, R-free flags, and output prefixes are all injected by deterministic code. Any parameter not in the program's allowlist is stripped by the command sanitizer.
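A minimal sketch of the allowlist idea, assuming a simple dict in place of programs.yaml (the function and key names here are hypothetical):

```python
def sanitize_params(program, params, allowlist):
    """Keep only parameters the program's allowlist declares; report the rest.
    (Minimal sketch of the concept; the real sanitizer reads programs.yaml.)"""
    permitted = set(allowlist.get(program, []))
    kept, stripped = {}, []
    for key, value in params.items():
        if key in permitted:
            kept[key] = value
        else:
            stripped.append(key)
    return kept, stripped
```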
Once you click Run, the PHENIX GUI switches to the Agent Progress tab. Here is what you'll see, illustrated with real output from the p9-sad tutorial.
The plan. In Expert mode, the first thing you see is the strategy plan:
==================================================
STRATEGY PLAN: SAD/MAD experimental phasing (X-ray)
==================================================
○ Stage 1: Analyze data quality and anomalous signal
Goal: Data quality analysis complete
○ Stage 2: SAD/MAD phasing and initial model building
Goal: Experimental phasing complete
○ Stage 3: Rebuild and refine model
Goal: R-free <0.30
○ Stage 4: Final refinement with ordered solvent
Goal: R-free <0.25
○ Stage 5: Final model validation
==================================================
Each stage has a goal. The symbols show status: ○ pending, ● active, ✓ complete, ⊘ skipped, ✗ failed.
Cycle output. Each program the agent runs appears as a numbered cycle with its decision, reasoning, and result. Here is cycle 2 (autosol) from the same run:
Cycle 2: phenix.autosol
Decision: phenix.autosol
Reasoning: User requested SeMet SAD with additional
sulfur search using AutoSol, truncating to 2.5 Å,
wavelength 0.9792 Å, and ~5 Se sites. Xtriage
indicates usable anomalous signal, so experimental
phasing is appropriate.
Source: llm (openai)
File Selection:
Data_Mtz: p9.sca (Reason: llm_selected)
Sequence: seq.dat (Reason: llm_selected)
Command:
phenix.autosol autosol.data=p9.sca seq_file=seq.dat
autosol.lambda=0.9792 resolution=2.5
autosol.atom_type=Se mad_ha_add_list=S
autosol.sites=5 nproc=4
Running: phenix.autosol ... [OK]
[GATE] Phase complete: experimental_phasing →
build_and_refine
The Reasoning field shows exactly why the agent made this choice. The Source field tells you whether the LLM or the rules engine made the decision. The GATE line shows the plan advancing to the next stage.
Stage transitions. When the agent completes or retreats from a stage:
✓ STAGE COMPLETE: Analyze data quality
  All steps completed (1/1 cycles) — advancing

⚠ RETREAT: initial_refinement → molecular_replacement
  R-free stuck above 0.45 after 3 cycles
A retreat is not a failure — it means the agent recognized that its current approach isn't working and is trying something different.
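The advance/retreat gate is deterministic. Here is a simplified sketch using only R-free and an invented stall rule; the real gate checks each stage's own success criteria:

```python
def gate_decision(stage_goal_r_free, r_free_history, stall_cycles=3):
    """Decide whether to advance, retreat, or continue a stage.
    (Illustrative only; thresholds and the stall rule are assumptions.)"""
    current = r_free_history[-1]
    if current < stage_goal_r_free:
        return "advance"
    # Retreat if R-free has not improved over the last few cycles.
    recent = r_free_history[-stall_cycles:]
    if len(recent) == stall_cycles and min(recent) >= recent[0]:
        return "retreat"
    return "continue"
```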
Here is the complete output from the p9-sad tutorial (5 cycles, 0 failures):
Cycle 1: phenix.xtriage → Data quality OK
  Resolution: 1.74 Å, Space group: I 4
  Anomalous measurability: 0.198
Cycle 2: phenix.autosol → SAD phasing succeeds
Cycle 3: phenix.autobuild → R-free: 0.230
Cycle 4: phenix.molprobity → Clashscore: 12.46
Cycle 5: STOP
  R-free = 0.2466 (< 0.25 target). Stopping.

Final Quality:
  R-free 0.2466 Good
  R-work 0.2295
  Clashscore 12.5 Acceptable
  Rama Outliers 2.4%
  Rotamer Outliers 1.9%

Key Output Files:
  overall_best_final_refine_001.pdb (model)
  overall_best_refine_data.mtz (data)
  overall_best_refine_map_coeffs.mtz (map coefficients)
| Metric | What it means | Good | Concerning |
|---|---|---|---|
| R-free | Agreement between model and data (lower is better) | <0.25 | >0.35 |
| R-work | Like R-free, but computed on the working set (always lower) | <0.20 | >0.30 |
| Clashscore | Atomic clashes per 1000 atoms (lower is better) | <5 | >20 |
| Ramachandran favored | Residues in ideal backbone geometry | >97% | <90% |
| Map CC | Map-model correlation, cryo-EM (higher is better) | >0.7 | <0.5 |
If R-free is going down cycle by cycle, things are working.
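The table's rules of thumb can be folded into a small helper. This is an informal convenience sketch, not a Phenix validator:

```python
def judge_metric(name, value):
    """Classify a metric against the rule-of-thumb thresholds in the table.
    Returns "good", "concerning", or "in between". (Informal guide only.)"""
    # (good_bound, concerning_bound, higher_is_better)
    thresholds = {
        "r_free":       (0.25, 0.35, False),
        "r_work":       (0.20, 0.30, False),
        "clashscore":   (5,    20,   False),
        "rama_favored": (97,   90,   True),   # percent of residues
        "map_cc":       (0.7,  0.5,  True),
    }
    good, bad, higher_better = thresholds[name]
    if higher_better:
        if value > good: return "good"
        if value < bad:  return "concerning"
    else:
        if value < good: return "good"
        if value > bad:  return "concerning"
    return "in between"
```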
All output files are in the agent's working directory (usually ai_agent_directory/ inside your project folder).
Your refined model: The .pdb file from the last successful refinement cycle. The Results panel in the GUI shows you exactly where each output file is.
Map coefficients: The .mtz file from the last refinement, containing 2Fo-Fc and Fo-Fc maps. Open this in Coot alongside the refined model to inspect the electron density.
HTML structure report: structure_report.html — a self-contained report with final metrics, an R-free trajectory chart, a workflow timeline, and output file locations. Click the Open Structure Report button in the Results tab to view it.
Text structure report: structure_determination_report.txt — a human-readable summary of the entire determination.
Session summary: session_summary.json — a machine-readable summary with final metrics, stage outcomes, and metric trajectory. Useful for comparing multiple runs programmatically.
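For example, you might pull the final R-free out of a session summary with a few lines of Python. The key names "final_metrics" and "r_free" are assumptions; check an actual session_summary.json for the real schema:

```python
import json

def load_final_r_free(path):
    """Read a session summary and return its final R-free, or None if absent.
    (Sketch only: "final_metrics" and "r_free" are assumed key names.)"""
    with open(path) as f:
        summary = json.load(f)
    return summary.get("final_metrics", {}).get("r_free")
```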
The agent produces a model, but you are the scientist. Before depositing or publishing, always inspect the model manually.
If the AI Agent encounters a fatal error (such as a crystal symmetry mismatch between your files, or a missing SHELX installation), it stops the run and produces an HTML diagnosis page that describes the problem and what you can do to fix it. The diagnosis contains three sections: "What went wrong," "Most likely cause," and "How to fix it."
If the agent is unable to produce an LLM-generated diagnosis (e.g. in rules-only mode), a simpler deterministic diagnosis is produced instead.
Error message saying "Daily API usage limit reached. Please try again tomorrow or run with your own API key." This happens after you run the ai_agent or the ai_analysis tool on the server several times on the same day. The quotas reset each day, so you can try the next day. Alternatively, you can set up your own API key (see section below) and run locally instead of on the server.
Error message saying "API quota exceeded, please try another provider (eg provider=openai) or wait for quota reset". This happens if you run on the server and the quota for all users for a particular provider is exceeded. The best solution for this situation is to select a different provider (if you were running with provider=google, try provider=openai or provider=ollama). Alternatively you can wait for this quota to reset (typically the next day).
"R-free stuck above 0.40." The model isn't fitting the data. Check the xtriage output for warnings about space group or twinning. Try a different search model or let the agent use AlphaFold (provide only the sequence).
"Trying a different approach" (RETREAT). The agent recognized that its current strategy isn't working and is backtracking. This is expected behavior — not a failure.
"No matching plan template." The agent couldn't find a strategy for your combination of files and advice. It will still run in reactive mode (choosing programs one at a time). Check that you've provided the right files for your experiment type.
"Safety Stop" or "Agent stopped." Something is fundamentally wrong that the agent cannot fix by trying different programs. The stop message will explain the issue. Check the xtriage output, verify your input files, and look at the failure diagnosis if one was produced.
Analysis Depth (thinking_level). The most important setting:
| Setting | What it does |
|---|---|
| None | Fastest. No expert reasoning. |
| Basic | Adds AI reasoning about each step. |
| Advanced | Full structural validation, expert knowledge base. |
| Expert | (Default) Everything in Advanced, plus a multi-stage strategy plan with goal tracking. |
For most runs, Expert (the default) is the best choice.
Max cycles. Maximum programs to run (default: 20). A typical structure determination takes 5–15 cycles.
Restart mode. Fresh (start new) or Resume (continue from where the previous run left off).
Provider. Which AI to use: Google (default), OpenAI, or Ollama (local).
Verbosity. Quiet (errors only), Normal (decisions and metrics), or Verbose (full detail including file selection).
If you have already run an AI Agent job, you can restore it in the GUI by clicking on Job History and then clicking on the run. At the bottom of the Configure panel, you can adjust the settings described above before resuming.
You can change the advice when you resume. The agent will try to follow your new instructions.
If you use the AI Agent, your log files are sent to the Phenix server, and from there to OpenAI or Google Gemini. That means the data could potentially be used by those providers in whatever ways they use other data sent to them.
Running locally with Ollama keeps all data on your machine. See below for setup instructions.
Note that access to OpenAI and Gemini through Phenix is non-commercial only.
Normally you will use Google (Gemini) as the LLM provider. This requires an API key for Google (supplied with Phenix). You can also use Ollama (run on the Phenix server) or OpenAI (also uses a supplied API key).
A shared set of keys is supplied, so you can run without getting your own key. The number of analyses with shared keys is limited, however.
You can run on your own machine (not using the Phenix server) by unchecking "Run on server" in the GUI or setting run_on_server=False on the command line.
Your options when running locally:
If you install Ollama on your machine (typically with a GPU), you can run everything locally without sending any data outside your machine.
After installing Ollama, set up the required models:
ollama pull llama3.1:70b
ollama pull llama3.1:8b
ollama pull nomic-embed-text
Set the environment variables (csh syntax shown; use export under bash):
setenv CUDA_VISIBLE_DEVICES 0,1,2
setenv OLLAMA_NUM_PARALLEL 6
setenv OLLAMA_HOST 0.0.0.0:11434
Then run the agent with:
run_on_server=False provider=ollama
Note: the required models may change with Phenix versions.
If you have a Google API key (set GOOGLE_API_KEY) or an OpenAI API key (set OPENAI_API_KEY), you can run locally with:
run_on_server=False provider=google (or provider=openai)
You can test your API keys with:
phenix.python $PHENIX/modules/cctbx_project/libtbx/langchain/tests/test_api_keys.py
(Note: the path to this test could change.)
Getting a Google API key requires several steps:
Google gives you credits good for 90 days. Beyond that you pay for access.
Setting up an OpenAI key is simpler, but there is no free trial.