Usage¶
SPyCi-PDB runs entirely from the command line. In principle, it is compatible with any machine that can run Python. However, it has only been thoroughly tested on WSL 2.0 on Windows 11 and on Ubuntu Linux LTS versions 24.04, 22.04, 20.04, and 18.04.
Please follow the explanations on this page, together with the built-in documentation of each command-line module, available via:
spycipdb <MODULE> -h
Please note that SPyCi-PDB can also be used as a library, as the back-calculator functions are completely modular:
import spycipdb
Command-Lines¶
To execute the spycipdb command-line interface, run spycipdb in your
terminal window after installation:
spycipdb
or:
spycipdb -h
Both will output the help menu.
Note
All subclients have the -h option to show help information.
Formatting for Input and Output Files¶
Conformers will be required in the PDB format with the .pdb file extension.
For tarballs and folders, only the .pdb files in the folder/tarball will
be used. Accepted tarballs include .tar, .tar.gz, and .tar.xz file
extensions.
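The filtering rule above (only .pdb members of a tarball are used) can be illustrated with a minimal sketch. This is not SPyCi-PDB's own code: the helper names `list_pdb_members` and `build_demo_tarball` and the member file names are hypothetical, and the example only relies on Python's standard `tarfile` module.

```python
import io
import tarfile


def list_pdb_members(tar_bytes: bytes) -> list[str]:
    """Return the names of the .pdb members inside an in-memory tarball."""
    # mode "r:*" lets tarfile detect plain, gzip, or xz compression by itself.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".pdb")
        )


def build_demo_tarball() -> bytes:
    """Build a tiny .tar.xz in memory with two .pdb files and one stray file."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
        for name in ("conf_1.pdb", "conf_2.pdb", "notes.txt"):
            payload = b"REMARK demo\n"
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


pdb_names = list_pdb_members(build_demo_tarball())
print(pdb_names)
```

Running a check like this on your own archive before a long calculation is a cheap way to confirm that the conformers are where you expect them to be.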
Most modules will require a sample experimental results template to base
the back-calculations off of. Please note that experimental result values
are not required. For the mathematical equations for the default internal calculators
(pre, noe, jc, smfret) please refer to the paper.
All input files can be in the .txt file format with comma-delimitation
for values. Examples can be found in the example/drksh3_exp_data/ folder.
The required header formatting for the .txt file for each experimental
module is highlighted below.
Output files are saved in a standard human-readable .JSON format.
In most cases, the first key-value pair gives the format for each of the values
in subsequent key-value pairs.
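As a rough illustration of this layout, the sketch below reads a JSON document shaped like a PRE-style output and averages the values over conformers. The file names and numbers are invented, and it is an assumption that the lists under 'format' carry the template columns and that every other key maps a PDB file name to a list of values.

```python
import json

# Hypothetical output in the style of the PRE module: a 'format' entry first,
# then one entry per PDB file. File names and distances are made up.
raw = json.dumps({
    "format": {"res1": [10, 25], "atom1": ["CB", "CB"],
               "res2": [40, 51], "atom2": ["H", "H"]},
    "conf_1.pdb": [12.5, 20.0],
    "conf_2.pdb": [14.5, 18.0],
})

results = json.loads(raw)
layout = results.pop("format")  # describes what each position in a value list means
n_pairs = len(layout["res1"])

# Average each back-calculated value across all conformers.
ensemble_mean = [
    sum(values[i] for values in results.values()) / len(results)
    for i in range(n_pairs)
]
print(ensemble_mean)
```

Removing the 'format' entry first keeps the remaining dictionary purely per-conformer, which makes ensemble-level statistics straightforward.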
Paramagnetic Relaxation Enhancement (PRE) module¶
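| Input: res1,atom1,res2,atom2
| Output: 'format': {'res1': [], 'atom1': [], 'res2': [], 'atom2': []}
| Subsequent keys are the names of the PDB file with a value of: [dist_values].
Nuclear Overhauser Effect (NOE) module¶
| Input: res1,atom1,atom1_multiple_assignments,res2,atom2,atom2_multiple_assignments
| Output: 'format': {'res1': [], 'atom1': [], 'atom1_multiple_assignments': [], 'res2': [], 'atom2': [], 'atom2_multiple_assignments': []}
| Subsequent keys are the names of the PDB file with a value of: [dist_values].
3J-HNHA coupling (JC) module¶
| Input: resnum
| resnum indicates the JC for a specific residue.
| Output: 'format': [resnum]
| Subsequent keys are the names of the PDB file with a value of: [jc_values].
Single-molecule Fluorescence Resonance Energy Transfer (smFRET) module¶
| Input: res1,res2,scaler
| Output: 'format': { 'res1': [], 'res2': [], 'scale': []}
| Subsequent keys are the names of the PDB file with a value of: [smfret_values].
Residual Dipolar Coupling (RDC) module¶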
Input: The default back-calculator for RDCs is PALES. To run PALES, however, you
will need to provide an experimental file template in the same format that PALES uses.
For more information, please see the Data Format
section.
| Output: 'format': {resnum1: [], resname1: [], atomname1: [], resnum2: [], resname2: [], atomname2: []}
| Subsequent keys are the names of the PDB file with a value of: [rh_values].
| About: The default back-calculator uses the third-party program PALES, which uses the steric obstruction model to derive the RDC. It has also been chosen due to its popularity in the field for RDC back-calculations.
Note
SAXS, Rh, and CS modules do not require input files. The following is formatting for the output.
Small Angle X-ray Scattering (SAXS) module¶
index and value
where index represents the X-axis and value represents the Y-axis with
units I_abs(s)[cm^-1]/c[mg/ml].
Hydrodynamic Radius (Rh) module¶
Chemical Shift (CS) module¶
'format': {'res': [], 'resname': []}.
Subsequent keys are the names of the PDB file with a value of:
{'H': [], 'HA': [], 'C': [], 'CA': [], 'CB': [], 'N': []}.
Basic Usage Examples¶
The example/ folder contains instructions to test native back-calculators
on 100 test conformers of the unfolded state of the drkN SH3 domain.
To get started with using the different modules for back-calculating experimental data, we will be using the unfolded state of drkN SH3, an intensively studied intrinsically disordered protein.
The goal of these examples is to walk you through expected experimental data
formatting styles, as well as how to get started with each module. The
experimental files can be found in the example/drksh3_exp_data
folder. A set of 100 conformers generated using IDPConformerGenerator has
been provided as a tarball: example/drksh3_csss_100.tar.xz.
Note that these experimental data files are comma-delimited, following CSV
formatting, for ease of use with pandas DataFrames as well as
Microsoft Excel.
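A minimal sketch of reading such a comma-delimited template is given below. It uses only the standard `csv` module; the header matches the PRE format listed earlier, while the two data rows are invented for illustration. With pandas installed, `pandas.read_csv` on the same text should yield an equivalent table.

```python
import csv
import io

# Hypothetical PRE-style template: the header row plus two made-up rows.
template_text = """res1,atom1,res2,atom2
10,CB,40,H
25,CB,51,H
"""

reader = csv.DictReader(io.StringIO(template_text))
rows = list(reader)

# Turn the row-wise records into per-column lists.
columns = {name: [row[name] for row in rows] for name in reader.fieldnames}

# csv yields strings, so residue numbers need an explicit conversion.
res1 = [int(value) for value in columns["res1"]]
print(reader.fieldnames, res1)
```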
In the bare-bones version of spycipdb, the PRE, NOE, JC, Rh,
and smFRET modules do not require any third-party software to be installed.
Every module has a --help section with detailed usage
examples and documentation for each functionality. To customize the output
file name, the -o (--output) flag can be used with every module.
Please note that for large PDB ensembles, it is recommended to specify
the number of CPU cores you are comfortable using in order to maximize speed.
Passing -n on its own will use all but one CPU thread.
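To estimate how many workers a bare -n would use on your machine, the "all but one" rule can be sketched as follows. This is an assumption about the arithmetic only, not SPyCi-PDB's actual implementation, and `default_worker_count` is a hypothetical helper name.

```python
import os


def default_worker_count() -> int:
    """Return every available CPU thread but one, and never fewer than one."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, available - 1)


print(default_worker_count())
```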
NOE and JC Modules¶
To perform the back-calculation using the default internal calculators, give the tarball of provided structures as the first argument as well as the path to the experimental data file to use as a template. For example, using the NOE module:
spycipdb noe ./drksh3_csss_100.tar.xz -e ./drksh3_exp_data/drksh3_NOE.txt -n
As with the NOE module, the JC module requires the tarball of the provided structures and a sample JC experimental data file:
spycipdb jc ./drksh3_csss_100.tar.xz -e ./drksh3_exp_data/drksh3_JC.txt -n
PRE Module¶
There are two methods of back-calculating PRE values within SPyCi-PDB. The first uses a distance-based approach as seen in the original X-EISD paper. The second method uses DEERPREdict with adjustable experimental parameters and yields intensity ratios.
Using the default method, the usage is akin to the NOE and JC modules above:
spycipdb pre ./drksh3_csss_100.tar.xz -e ./drksh3_exp_data/drksh3_PRE.txt -n
To use DEERPREdict within SPyCi-PDB, you must specify --method deerpredict
as well as a parameters text file with the following default parameters:
atom=H
temp=298
tau_c=2*1e-9
tau_t=0.5*1e-9
delay=10e-3
r_2=10
wh=750
Here, temp is the integer temperature of the experiment in Kelvin,
tau_c and tau_t are the rotational tumbling time and internal
correlation time, respectively, delay is the INEPT delay within the pulse
sequence, r_2 is the transverse relaxation rate in the diamagnetic
molecule, and wh is the strength of the magnetic field. These parameters can be
changed and specified by using --parameters ./deerpredict_params.txt. An example
of usage could be:
spycipdb pre ./PATH_TO_CONFORMERS -m deerpredict --parameters ./PATH_TO_TXT -e ./PATH_TO_EXP_DATA -n
The output will represent intensity ratios as Ipara/Idia.
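Before launching a long run, it can help to check that a parameters file reads the way you intend. The sketch below parses the key=value format shown above; it is an illustration under stated assumptions, not the parser used by SPyCi-PDB or DEERPREdict, and the helper names `parse_value` and `parse_parameters` are hypothetical. Products such as 2*1e-9 are evaluated by splitting on the asterisk rather than with eval.

```python
def parse_value(text: str):
    """Convert '298', '10e-3' or '2*1e-9' to a float; keep other text as-is."""
    try:
        product = 1.0
        for factor in text.split("*"):
            product *= float(factor)
    except ValueError:
        return text  # non-numeric entries, e.g. the atom name 'H'
    return product


def parse_parameters(text: str) -> dict:
    """Parse 'key=value' lines, skipping blank lines."""
    parameters = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        parameters[key.strip()] = parse_value(value.strip())
    return parameters


defaults = """atom=H
temp=298
tau_c=2*1e-9
tau_t=0.5*1e-9
delay=10e-3
r_2=10
wh=750
"""
params = parse_parameters(defaults)
print(params)
```

Note that this sketch returns every numeric entry as a float, including temp, so cast to int where an integer is required.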
CS Module - Using UCBShift¶
Once UCBShift is installed, the CS module can be run without an experimental file template. You can also adjust the pH value to be considered; the default pH value is 5.0. A sample command is as follows:
spycipdb cs ./drksh3_csss_100.tar.xz --ph 7 -n
The above command sets a custom pH of 7.0. Please also note that UCBShift is fairly
RAM intensive, so it is recommended to run it with no more than 10 CPUs (this can be set
with the -n flag, for example -n 10).
SAXS Module - Using CRYSOLv3¶
After ensuring the proper ATSAS/CRYSOL version is installed, the following command can be used to run the SAXS module. Please note again that an experimental template is not required:
spycipdb saxs ./drksh3_csss_100.tar.xz -n
The SAXS module is equipped with a --lm flag, as CRYSOLv3 uses an adjustable
number of harmonics to perform the SAXS back-calculation. The default
value is 20 (allowed range: 1-100). Increasing the number of harmonics will also
increase the computation time.
Rh Module - Using HullRadSAS v3.1¶
HullRadSAS should work out of the box with the basic installation instructions. An experimental file template is not required:
spycipdb rh ./drksh3_csss_100.tar.xz -n
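To run several modules over the same ensemble, the commands above could be assembled from a script. The sketch below only builds and prints the command lines, using the paths and flags shown on this page; `build_command` is a hypothetical helper, and nothing is executed.

```python
import shlex
from typing import Optional

CONFORMERS = "./drksh3_csss_100.tar.xz"
TEMPLATES = {  # modules that take an experimental template via -e
    "noe": "./drksh3_exp_data/drksh3_NOE.txt",
    "jc": "./drksh3_exp_data/drksh3_JC.txt",
    "pre": "./drksh3_exp_data/drksh3_PRE.txt",
}


def build_command(module: str, cores: Optional[int] = None) -> list[str]:
    """Assemble the argument list for one spycipdb module."""
    command = ["spycipdb", module, CONFORMERS]
    if module in TEMPLATES:
        command += ["-e", TEMPLATES[module]]
    # A bare -n uses all but one CPU thread; -n N requests N cores.
    command += ["-n"] if cores is None else ["-n", str(cores)]
    return command


commands = [build_command(module) for module in ("noe", "jc", "pre", "rh")]
for command in commands:
    print(shlex.join(command))
```

Each argument list can then be passed to `subprocess.run(command, check=True)` once spycipdb is installed and the example files are in the working directory.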