[cctbxbb] Space-saving tips
Tristan Croll
tic20 at cam.ac.uk
Mon Jan 25 06:36:55 PST 2021
Hi all,
Following on from Graeme's email about .o files: another way to dramatically reduce the size of the distribution, with little impact on code complexity or performance, would be to keep the various text-based data files in .zip format. With modern Python, fetching data from a zip archive is little different from working with uncompressed files (1-2 extra lines of code, and generally negligible runtime cost). Here are a couple of snippets from ISOLDE, where I keep all MD ligand definition files in a .zip:
To get a list of the file contents:
def _ligand_db_from_zip(self, ligand_zip):
    from zipfile import ZipFile
    import os
    namelist = []
    with ZipFile(ligand_zip) as zf:
        for fname in zf.namelist():
            name, ext = os.path.splitext(fname)
            if ext.lower() == '.xml':
                namelist.append(name)
    return namelist
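For anyone who wants to try the listing step outside ISOLDE, here is a minimal standalone sketch of the same idea. The function name list_members_with_ext and the member names in the demo archive are made up for illustration, and the archive is built in memory so nothing needs to exist on disk.

```python
import io
import os
from zipfile import ZipFile


def list_members_with_ext(zip_source, ext='.xml'):
    """Return archive member names (extension stripped) whose extension matches ext.

    zip_source may be a path or a seekable file-like object.
    The extension comparison is case-insensitive.
    """
    names = []
    with ZipFile(zip_source) as zf:
        for fname in zf.namelist():
            stem, fext = os.path.splitext(fname)
            if fext.lower() == ext:
                names.append(stem)
    return names


# Build a tiny in-memory archive (hypothetical contents) for demonstration.
buf = io.BytesIO()
with ZipFile(buf, 'w') as zf:
    zf.writestr('ATP.xml', '<ForceField/>')
    zf.writestr('GTP.XML', '<ForceField/>')
    zf.writestr('README.txt', 'not a template')
buf.seek(0)

demo_names = list_members_with_ext(buf)
```

One thing to keep in mind with this pattern: member names inside a zip are case-sensitive, so a file stored as GTP.XML passes the lowercased extension check here but would not be found later under the name GTP.xml.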
To read a given file from the zip as needed:
if ligand_db is not None:
    zip, namelist = ligand_db
    if name in namelist:
        from zipfile import ZipFile
        logger.info('Loading residue template for {} from internal database'.format(name))
        with ZipFile(zip) as zf:
            with zf.open(name+'.xml') as xf:
                forcefield.loadFile(xf)
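The snippet above hands the open member straight to a loader that accepts a file-like object. Note that ZipFile.open() yields a binary stream, so code that expects text needs a decoding wrapper. A rough sketch of how that might look, assuming UTF-8 data (read_text_member and the demo member name are hypothetical):

```python
import io
from zipfile import ZipFile, ZIP_DEFLATED


def read_text_member(zip_source, member, encoding='utf-8'):
    """Read one archive member and return its contents as a str."""
    with ZipFile(zip_source) as zf:
        # zf.open() returns a binary file-like object; wrap it to decode text.
        with zf.open(member) as raw:
            with io.TextIOWrapper(raw, encoding=encoding) as text:
                return text.read()


# Build a small compressed archive in memory (hypothetical contents).
buf = io.BytesIO()
with ZipFile(buf, 'w', compression=ZIP_DEFLATED) as zf:
    zf.writestr('data/example.txt', 'first line\nsecond line\n')
buf.seek(0)

demo_lines = read_text_member(buf, 'data/example.txt').splitlines()
```

Compared with a plain open(path), the only additions are the ZipFile context manager and the text wrapper, which is the "1-2 extra lines" in practice.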
Even just doing this for the CaBLAM contour data cuts about 160 MB off the size of the uncompressed distribution.
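To check what a given data directory would save, something along these lines should do. This is only a sketch: zip_text_files, the directory layout and the synthetic file are all invented for the example, and the highly repetitive demo data compresses far better than real data would, so it says nothing about the actual CaBLAM numbers.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED


def zip_text_files(src_dir, zip_path):
    """Pack every file under src_dir into zip_path.

    Returns (total uncompressed bytes, size of the resulting zip in bytes).
    zip_path should live outside src_dir so the archive does not include itself.
    """
    raw_bytes = 0
    with ZipFile(zip_path, 'w', compression=ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for fname in files:
                full = os.path.join(root, fname)
                raw_bytes += os.path.getsize(full)
                # Store paths relative to src_dir so the archive is relocatable.
                zf.write(full, arcname=os.path.relpath(full, src_dir))
    return raw_bytes, os.path.getsize(zip_path)


# Demo on a throwaway directory containing one synthetic text file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'data')
    os.makedirs(src)
    with open(os.path.join(src, 'grid.txt'), 'w') as f:
        f.write('0.000 0.000 0.000\n' * 10000)
    raw_size, zipped_size = zip_text_files(src, os.path.join(tmp, 'data.zip'))
```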
Best regards,
Tristan