MACROMOLECULAR CHARGE FLIPPING
Updated February 2013
The charge-flipping algorithm, introduced by Oszlányi and Sütő for single crystals in 2004, has been adapted to protein-crystal diffraction data in the computer program SUPERFLIP. A flow diagram of the procedure is given below.
Two main applications are described:
* ab initio procedure for the determination of protein crystal structures using diffraction data at atomic resolution;
* procedure for determining heavy-atom or anomalous-scatterer substructures from isomorphous or anomalous differences.
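The basic charge-flipping iteration can be illustrated with a small one-dimensional toy model. The sketch below is not taken from SUPERFLIP: the function names, the naive Fourier transform, the grid size and the threshold value are all assumptions chosen for illustration, and refinements such as the special treatment of weak reflections are left out. Each cycle reverses the sign of the density below a threshold delta, Fourier-transforms the result, keeps the phases and restores the observed amplitudes.

```python
import cmath
import random


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a toy grid)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * j / n)
                    for j, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def flip_charges(rho, delta):
    """Reverse the sign of every density value below the threshold delta."""
    return [r if r >= delta else -r for r in rho]


def impose_amplitudes(g_coeffs, f_obs):
    """Keep the phases of G, restore the observed moduli, and set F(0) = G(0)."""
    f_new = [g_coeffs[0]]
    for g, amp in zip(g_coeffs[1:], f_obs[1:]):
        phase = cmath.phase(g) if abs(g) > 0 else 0.0
        f_new.append(cmath.rect(amp, phase))
    return f_new


def charge_flipping_cycle(rho, f_obs, delta):
    """One cycle: flip, transform, impose amplitudes, back-transform."""
    g = flip_charges(rho, delta)
    f_new = impose_amplitudes(dft(g), f_obs)
    return [c.real for c in dft(f_new, inverse=True)]


def random_start(f_obs, rng):
    """Starting density from observed amplitudes and random phases."""
    n = len(f_obs)
    coeffs = [0j] * n  # F(0) is unknown at the start and set to zero
    for k in range(1, n // 2 + 1):
        # The Nyquist term (2k == n) must be real for a real density.
        phase = 0.0 if 2 * k == n else rng.uniform(-cmath.pi, cmath.pi)
        coeffs[k] = cmath.rect(f_obs[k], phase)
        coeffs[n - k] = coeffs[k].conjugate()  # Friedel symmetry
    return [c.real for c in dft(coeffs, inverse=True)]


if __name__ == "__main__":
    # Hypothetical toy "structure": three point atoms on a 32-point grid.
    true_rho = [0.0] * 32
    true_rho[3], true_rho[10], true_rho[21] = 5.0, 3.0, 4.0
    f_obs = [abs(c) for c in dft(true_rho)]

    rho = random_start(f_obs, random.Random(0))
    for _ in range(200):
        rho = charge_flipping_cycle(rho, f_obs, delta=0.2)
    print([round(r, 2) for r in rho])
```

Whatever the starting phases, every cycle returns a density whose Fourier amplitudes equal the observed ones. Whether and how fast the iteration converges depends on delta, which is why the scripts described further down expose it through the ked option.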
References:
1/. Application of charge flipping to protein crystallography:
Dumas, C. & van der Lee, A. (2008). «Macromolecular structure solution by charge flipping», Acta Cryst. D64, 864-873.
2/. SUPERFLIP program:
Palatinus, L. & Chapuis, G. (2007). « SUPERFLIP - a computer program for the solution of crystal structures by charge flipping in arbitrary dimensions», J. Appl. Cryst. 40, 786-790.
3/. Review on Charge flipping:
Oszlányi, G. & Sütő, A. (2008). «The charge flipping algorithm», Acta Cryst. A64, 123-134.
4/. Symmetry determination following structure solution in P1:
Palatinus, L. & van der Lee, A. (2008). J. Appl. Cryst. 41, 975-984.
SUPERFLIP program and utilities:
We refer to the official SUPERFLIP site at the Department of Structure Analysis, Institute of Physics, Praha and the École Polytechnique Fédérale de Lausanne (EPFL) for source files, documentation, and license agreement.
Download the source code or the appropriate binaries for your system (current version: 09/05/11 14:22):
Source code, zipped executables for MacOSX (Intel) or Windows, GNU-Linux x86, GNU-Linux x86-64 (fc13)
Uncompress the binary, rename it to superflip, make it executable (chmod +x superflip) and move it to a directory listed in your $PATH (/usr/local/bin or ~/bin are good places).
Macromolecular structures can be solved by SUPERFLIP in two ways:
* by setting up an input file to be used with a user-provided hkl file and running the superflip program ($ superflip example.inflip).
Example input and log files can be found here:
◊ heavy-atom substructure solution:
anomalous differences data, 40 sites, P1 space group: substructure40.inflip, substructure40.sflog
anomalous differences data, 120 sites, C2 space group: substructure120.inflip, substructure120.sflog
◊ ab initio structure solution at atomic resolution: protein.inflip, protein.sflog
* by using C-shell scripts: flipsub for heavy-atom substructure solution, and fliprot for ab initio structure solution at atomic resolution.
These scripts create the SUPERFLIP input file on the fly from a limited number of command-line options. They read input files in CCP4 MTZ format and write output files in CCP4 map format and PDB format (heavy-atom sites).
Download the C-shell scripts, which run in the CCP4 environment (version 6.3.x or 6.2.x): fliprot (version 02/28/2013), flipsub (version 02/28/2013)
Install the fliprot and flipsub files in a directory listed in your $PATH and make them executable (chmod +x flipsub fliprot).
Various application examples follow here.
Ab initio protein structure solution at atomic resolution
usage: fliprot mydata.mtz FP=Fobs name=mytest
or fliprot 2anv-sf.cif name=2anv SG=5
where:
mydata.mtz (or pdbcode-sf.cif) ...... input structure factor file, in MTZ or mmCIF (PDB) format
optional keywords:
SG=18 ...... space group number (read from the MTZ file; required for some PDB structure-factor files)
FP=Fobs ...... MTZ label assignment for amplitude (default FP=FP)
name=flip ...... generic name for output files (default fliprot)
1.05A ...... dmin resolution (default: all reflections)
ked=1.25 ...... coefficient for delta threshold parameter (default 1.3)
weak=0.1 ...... weak reflection threshold (default 0.05)
trial=5 ...... number of repeated trials (default 1, repeat=never)
maxcycl=5000 ...... maximum number of cycles per trial (default 20000)
mode=peakiness ...... convergence detection mode, peakiness or symmetry (symmetry by default, except for SG=P1)
conv=4.0 ...... convergence criterion threshold (default 3.0 for peakiness, 80.0 for symmetry)
Test data used: PDB code 1mfm [PubMed]
1152 non-H protein atoms, 283 waters and Cd/Cu/Zn atoms in the asymmetric unit, space group P212121
Ab initio phasing of superoxide dismutase using charge flipping:
Download 1mfm-sf.cif and 1MFM.pdb from the PDB site and use 1mfm-sf.cif as the input file for the fliprot script.
Command: fliprot 1mfm-sf.cif SG=19 name=mfm
The procedure asks for the unit cell parameters (not in the cif file):
CRYST1 from pdb file: 34.99 48.11 81.08 90.0 90.0 90.0
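The cell parameters requested by the script can be read from the CRYST1 record of the PDB file instead of being copied by hand. The following is a minimal sketch that relies on the fixed-column layout of the PDB format; the function name parse_cryst1 is hypothetical and is not part of fliprot.

```python
def parse_cryst1(line):
    """Return (a, b, c, alpha, beta, gamma) from a PDB CRYST1 record.

    Fixed columns of the PDB format: a, b, c occupy columns 7-15, 16-24
    and 25-33; alpha, beta, gamma occupy columns 34-40, 41-47 and 48-54.
    """
    if not line.startswith("CRYST1"):
        raise ValueError("not a CRYST1 record")
    a, b, c = (float(line[i:i + 9]) for i in (6, 15, 24))
    alpha, beta, gamma = (float(line[i:i + 7]) for i in (33, 40, 47))
    return a, b, c, alpha, beta, gamma


# Record rebuilt from the cell parameters quoted above for 1MFM
# (space group and Z fields omitted for brevity).
example = "CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f" % (
    34.99, 48.11, 81.08, 90.0, 90.0, 90.0)

if __name__ == "__main__":
    # Print the six values in the order a b c alpha beta gamma.
    print(" ".join(str(value) for value in parse_cryst1(example)))
```

For a real file, pass the first line of 1MFM.pdb that starts with CRYST1 to the same function.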
Annotated log file (typical CPU time: 3 to 5 minutes on a 2.4 GHz Intel processor)
After convergence, the reference model (1MFM.pdb) can be superimposed on the CF density map and the correct phase enantiomorph selected.
Run the following PHENIX commands, compare the overall correlation coefficients, and display mfm.map and offset.pdb:
phenix.get_cc_mtz_pdb mfm.mtz 1MFM.pdb any_offset=true labin="FP=Fobs PHIB=PHIcf"
phenix.get_cc_mtz_pdb mfm.mtz 1MFM.pdb any_offset=true labin="FP=Fobs PHIB=PHIcfi"

Test data used: PDB code 2anv [PubMed]
Download 2anv-sf.cif and 2ANV.pdb from the PDB site and use 2anv-sf.cif as the input file for the fliprot script.
Heavy-atom substructure solution
usage: flipsub data.mtz DANO=Dano_peak name=HAtest
where:
data.mtz ...... input reflection file in merged MTZ format, with DANO or FA label(s)
or
data.sca ...... input reflection file in merged scalepack format
optional keywords:
SG=18 ...... space group number (default: read from the .mtz or .sca file)
DANO=Dano_pk ...... MTZ label assignment for anomalous amplitude (default DANO=DANO)
name=HAtest ...... generic name for output files (default flipsub###)
2.5A ...... high resolution cutoff (default: all reflections)
conv=4.0 ...... convergence criterion threshold (default 2.5 for peakiness, 85.0 for symmetry)
norm=no ...... no local normalization of amplitude differences (default norm=yes)
ked=1.25 ...... coefficient for delta flipping parameter (default 1.2)
weak=0.35 ...... weak reflection threshold (default 0.3)
trial=5 ...... number of repeated trials (default 1)
maxcycl=3000 ...... maximum number of cycles per trial (default 2000)
output files:
CCP4 CF heavy-atom map in P1 space group ......... HAtest.map
CCP4 CF heavy-atom map (asymmetric unit) ......... HAtest-au.map
PDB file for heavy-atom positions (in P1 unit cell) ......... HAtest.pdb
PDB file for heavy-atom positions (asymmetric unit) ......... HAtest-au.pdb
Heavy-atom positions in fractional coordinates ......... HAtest-au.ha
The resulting coordinate files can be used as input for your favorite phasing program: SHARP, PHENIX (AutoSol or Phaser-EP), CCP4, ... Typically, edit the xxx-au.pdb file (or the xxx-au.ha file, in fractional units) to select the appropriate number of heavy-atom sites in the asymmetric unit and remove non-significant sites.
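Trimming the site list can also be scripted. The sketch below keeps only the first N atom records of a PDB file and leaves all other records untouched. It assumes that the sites are written in decreasing order of significance, which should be checked against the actual flipsub output before relying on it; the function name keep_first_sites and the file names in the usage example are hypothetical.

```python
def keep_first_sites(pdb_lines, n_sites):
    """Keep the first n_sites ATOM/HETATM records and all other records.

    Assumption (not confirmed by the flipsub documentation): sites are
    listed in decreasing order of significance, so the first records are
    the ones worth keeping.
    """
    kept = []
    seen = 0
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            seen += 1
            if seen > n_sites:
                continue  # drop sites beyond the requested number
        kept.append(line)
    return kept


def trim_pdb_file(src_path, dst_path, n_sites):
    """Read src_path, keep the first n_sites sites, write dst_path."""
    with open(src_path) as src:
        lines = src.readlines()
    with open(dst_path, "w") as dst:
        dst.writelines(keep_first_sites(lines, n_sites))
```

For example, trim_pdb_file("HAtest-au.pdb", "HAtest-au-8sites.pdb", 8) would write a copy restricted to eight sites.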
Various test datasets for MAD and SAD phasing are available here:
example 1: Locating heavy-atom substructure containing 20-22 bromide sites used for SAD phasing
Download sfdata-haptbr.tgz (AUTOSTRUCT / CCP4 site), untar the archive and use haptbr.mtz as input data for the flipsub script.
Command: flipsub haptbr.mtz 2.0A name=haptbr
(using normalized anomalous differences up to 2 Å resolution, using the symmetry score to detect convergence)
flipsub haptbr.mtz 2.0A name=haptbr mode=peakiness conv=2.6
(using normalized anomalous differences up to 2 Å resolution, using the peakiness convergence criterion)
example 2: Locating heavy-atom substructure containing 40 selenium sites used for SAD/MAD phasing
Download sfdata-cynsemet.tgz (AUTOSTRUCT / CCP4 site), untar the archive and use the cynsemet.mtz file as input data for the flipsub script.
Commands:
flipsub cynsemet.mtz 2.4A DANO=DANO_SE3 name=cyn-pk
(using normalized anomalous differences in the peak wavelength dataset, up to 2.4 Å resolution)
flipsub cynsemet.mtz 2.6A DANO=DANO_SE2 name=cyn-ip
(using normalized anomalous differences in the inflection wavelength dataset, up to 2.6 Å resolution)
flipsub cynsemet.mtz DANO=DANO_SE3
(using anomalous differences in the peak wavelength dataset, no resolution cutoff)
example 3: Locating heavy-atom substructure containing 8 selenium sites and determination of the correct space-group:
Download sfdata-jia.tgz (AUTOSTRUCT / CCP4 site), untar the archive and use the jia_peak.sca file (scalepack format) as input data for the flipsub script:
Commands:
flipsub jia_peak.sca name=jia_peak
(using normalized anomalous differences in the peak wavelength dataset).
flipsub jia_peak.sca 5A
(using normalized anomalous differences in the peak wavelength dataset and a 5 Å resolution cutoff).
Simulation of a wrong space-group assignment, C222 instead of C2221 (the true space group). This is a typical ambiguity in symmetry determination when an axial reflection row or the systematic extinctions are missing:
copy the file jia_peak.sca to jia_peak-C222.sca,
edit this file to remove the crucial (00l) reflections (0,0,14 to 0,0,66) and change the space group from c2221 to c222.
All information on this screw axis (systematic extinctions) is now removed and the space group is assigned to #21 (C222).
Now test the flipsub procedure with this wrong space group:
flipsub jia_peak-C222.sca SG=21 name=wrongC222 mode=peakiness conv=6.0
SUPERFLIP solves the substructure in P1 and proposes the correct space group C2221 (success rate ~90%), based on a symmetry analysis of the structure-factor phases obtained: the symmetry agreement factors for the 2(1,0,0), 2(0,1,0) and 2_1(0,0,1) symmetry operations have a good score (less than 5). The output map wrongC222.map still incorporates the wrong input symmetry operators of C222 (default option searchsymmetry average), as shown by the poor "Overall agreement factor" (>70-80).
Now restart the script with the correct SG (C2221, #20) determined by SUPERFLIP.
flipsub jia_peak-C222.sca SG=20 name=newC2221
example 4: Locating the heavy-atom substructure in CSN5 crystal SAD data, with 20 SeMet residues and 2 zinc atoms (4F7O.pdb) in the asymmetric unit.
Download the SAD dataset 4F7O.mtz (MTZ format, 2.6 Å resolution) and use it as input data for the flipsub script:
Command: flipsub 4F7O.mtz DANO=DANO_x1 weak=0.2 name=CSN5 trial=10
A Java applet illustrating the charge flipping algorithm
RCSB Protein Data Bank: atomic coordinate files and structure factors of biological macromolecules;
CCP4: software suite for macromolecular crystallography;
Last modifications: 28 February 2013