Faster PDB batch download script

bioinformatics
bash
Published

June 20, 2025

This is a rewrite of batch download script shared by RCSB (link).

  1. It’s much faster than the original
  2. It’s reverse compatible
  3. -C let’s you pass a text file with 1 ID per line as input (insted of coma separated list)
  4. All dependencies should be included in most linux distros
$ ./PDB_fast_batch.sh -h

Usage: ./PDB_fast_batch.sh -f <file> [-o <dir>] [-CcpaAxsmr]

 -h       : get this help message
 -f <file>: the input file containing a comma-separated list of PDB ids
 -o  <dir>: the output dir, default: current dir
 -C       : Column input mode, the input file contains one PDB id per line
 -c       : download cif.gz file for each PDB id
 -p       : download pdb.gz file for each PDB id (not available for large structures)
 -a       : download pdb1.gz file (1st bioassembly) for each PDB id (not available for large structures)
 -A       : download assembly1.cif.gz file (1st bioassembly) for each PDB id
 -x       : download xml.gz file for each PDB id
 -s       : download sf.cif.gz file for each PDB id (diffraction only)
 -m       : download mr.gz file for each PDB id (NMR only)
 -r       : download mr.str.gz for each PDB id (NMR only)

==> Download <==