[phenixbb] local BLAST server
rdo20 at cam.ac.uk
Fri Aug 8 10:24:44 PDT 2014
Thanks very much!
On 2014-08-08 16:19, Zhijie Li wrote:
> Hi Rob,
> You may want to check the pdbaa.gz first. It seems this is the PDB
> subset of the nr database.
> ftp://ftp.ncbi.nlm.nih.gov/blast/db/README says:
> pdbaa.*tar.gz | protein sequences from pdb protein structures,
>               | its parent database is nr.
> Hi Rob,
> The BLAST nr database (fasta format) can be downloaded from the NCBI
> As I remember it is the nr.gz file. When unzipped the file is called
> According to BLAST the nr database does contain PDB entries.
> It is significantly larger than the PDB data file you are currently
> You might consider extracting all the PDB sequences from it so that you do
> not need to go through all the non-PDB sequences.
> -----Original Message----- From: R.D. Oeffner
> Sent: Friday, August 08, 2014 10:14 AM
> To: phenixbb at phenix-online.org
> Subject: [phenixbb] local BLAST server
> I'm in the process of installing a local BLAST server for doing blast
> protein queries. As I understand it I need a file with all the FASTA
> sequences as input for initially generating my local BLAST database.
> The one present in
> ftp://ftp.rcsb.org/pub/pdb/derived_data/pdb_seqres.txt seems to
> contain redundant entries. Querying it produces many extra PDB
> chain-ids when compared to a BLAST query on the NCBI web server.
> Does anyone know where to get a non-redundant version of the FASTA records
> so that I can create a database similar to the one used by NCBI?
> Many thanks,
> Robert Oeffner, Ph.D.
> Research Associate, The Read Group
> Department of Haematology,
> Cambridge Institute for Medical Research
> University of Cambridge
> Cambridge Biomedical Campus
> Wellcome Trust/MRC Building
> Hills Road
> Cambridge CB2 0XY
> tel: +44(0)1223 763234
> mobile: +44(0)7712 887162
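
For anyone reading this thread later: the extra chain-ids described in the original question are what one would expect if pdb_seqres.txt lists every chain separately while a non-redundant set keeps each distinct sequence only once. Below is a minimal sketch of collapsing identical sequences before building a local database. It is not what NCBI runs; the function names and the "identical_chains=" header tag are made up for illustration, and it assumes the first whitespace-separated token of each header is the chain identifier.

```python
from collections import OrderedDict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse_identical(records):
    """Group records that share exactly the same sequence.

    Returns a list of (ids, sequence) tuples in order of first appearance.
    Assumption: the first token of the header is the chain identifier.
    """
    groups = OrderedDict()
    for header, seq in records:
        parts = header.split()
        seq_id = parts[0] if parts else "unnamed"
        groups.setdefault(seq.upper(), []).append(seq_id)
    return [(ids, seq) for seq, ids in groups.items()]


def write_fasta(groups, out):
    """Write one record per distinct sequence.

    The 'identical_chains=' tag is a hypothetical convention, used here
    only so that the merged chain-ids are not lost.
    """
    for ids, seq in groups:
        out.write(">" + ids[0] + " identical_chains=" + ",".join(ids) + "\n")
        out.write(seq + "\n")
```

A possible way to use it, with a hypothetical output name pdb_nr.fasta: open pdb_seqres.txt, pass the file object to read_fasta, feed the result to collapse_identical, and write the groups out with write_fasta. The resulting file could then be given to makeblastdb with -in pdb_nr.fasta -dbtype prot. Note that pdb_seqres.txt may also contain nucleic acid chains, which this sketch does not filter out.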