Hi Eric,

We will definitely have a look at model_vs_data speed. 

In the meantime, if you have some Python skills, you may want to look at a Python pickle file in your Phenix distribution:
modules/chem_data/polygon_data/all_mvd.pickle.
It contains various statistics (more than 70 characteristics, such as R-factors) for ~70,000 PDB files. If the values you want from model_vs_data are already there, you may be able to skip the calculations and read the numbers directly.

If you are interested, I can write a small script illustrating how to get some numbers.
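
As a starting point, here is a minimal sketch of how one might inspect that file. It assumes nothing about the internal layout of all_mvd.pickle: it only reports the top-level type and, if the object turns out to be a dictionary, the first few keys with the type and length of each value. The helper names (load_pickle, summarize) are made up for this illustration and are not part of Phenix. The file may also reference cctbx classes, in which case it probably has to be loaded with the Python interpreter that ships with Phenix rather than a system Python.

```python
import pickle


def load_pickle(path):
    """Unpickle the file at `path` (hypothetical helper, not a Phenix API)."""
    # Assumption: the standard pickle module can read the file. If it stores
    # cctbx objects, those classes must be importable in this interpreter.
    with open(path, "rb") as handle:
        return pickle.load(handle)


def summarize(obj, max_items=10):
    """Return a short, layout-agnostic description of an unpickled object."""
    lines = ["top-level type: %s" % type(obj).__name__]
    if isinstance(obj, dict):
        keys = sorted(obj.keys(), key=str)
        lines.append("number of keys: %d" % len(keys))
        # Show only the first few entries so the output stays readable.
        for key in keys[:max_items]:
            value = obj[key]
            try:
                size = len(value)
            except TypeError:
                size = None  # value has no length (e.g. a plain number)
            lines.append("  %r: %s, length %s"
                         % (key, type(value).__name__, size))
    else:
        try:
            lines.append("length: %d" % len(obj))
        except TypeError:
            pass  # not a sized container; the type line is all we can say
    return lines
```

To use it, call summarize(load_pickle(path)) with the full path to all_mvd.pickle and print the returned lines. Once the layout is known, pulling out a particular statistic should be a matter of indexing into the loaded object.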

Best regards,
Oleg Sobolev.

On Sun, Oct 15, 2017 at 10:10 PM, Pavel Afonine <pafonine@lbl.gov> wrote:
Eric,

I'm sorry it is slow, or slower than you expected. I'm sure that technically it can run in 10 seconds at most, even for the largest structure in the PDB, as advertised in

http://journals.iucr.org/d/issues/2013/04/00/dz5273/dz5273.pdf

if the corresponding code is appropriately refactored (making the most of the above paper), plus the time taken by phenix.molprobity.

Unfortunately, I'm away until October 26, and then will be at the workshop until November 6.

Others from the team (Dorothee, Oleg, Nigel, Billy, Vincent, Chris?) should feel free to volunteer to address the obvious runtime bottlenecks in this code.

All the best,
Pavel

On 10/15/17 04:35, Eric Williams wrote:
I'm having a devil of a time getting model_vs_data to run in a timely fashion. Running the latest stable Linux build on Ubuntu 14.04 takes on the order of 3 minutes per structure. That wouldn't be so bad if I weren't trying to run it on every entry in the PDB. Is it just a limitation of Python? SFCheck, which I think is written mostly in FORTRAN, runs in a few seconds. Any help anyone could offer would be appreciated. Thanks. :)

Eric


_______________________________________________
phenixbb mailing list
phenixbb@phenix-online.org
http://phenix-online.org/mailman/listinfo/phenixbb
Unsubscribe: phenixbb-leave@phenix-online.org