Using the PELE Web Server
You have to sign up with the server to be able to submit jobs and to have records of all your submitted jobs in one place. One advantage of this way is that you may gain access to more resources (more CPUs per jobs, longer wall-clocks, or even additional compute clusters) later by contacting the server administrator. For submitting in this way, you must first register with the server, then you will be able to login. Upon login, you will be automatically directed to your jobs page. From there (and also from submit page), you can submit new jobs. But all the jobs submitted while you were logined, are only accessible by going to your jobs page and not with bookmarks.
As input, the only mandatory file that must be provided by the user is the structure(s) in PDB format (with all hydrogen atoms added). Then a control file can be made based on the selected ready-made script or uploaded by the user. You can find more information about the format of inputs and different running parameters below.
Requirements (input files)
The user needs to have a PDB file for the structure(s), including all hydrogens. The server does not modify any protonation state or add any ligand atoms (i.e. the user should know the system and decide the best protonation state).
Additionally, in order to recognize the right protonation state, the user should change the name of some residues in the PDB file:
HIS/HIE/HIP (for delta, epsilon or both protonated histidines)
LYN/ARN (neutral lysine or argenine)
ASH/GLH (protonated aspartic or glutamic)
CYT/CYX (deprotonated negative/neutral cysteine)
Water residues (if present) should be named SPC or HOH and defined as HETATM. For example:
HETATM 4957 OW HOH A 605 -1.397 5.896 2.606 1.00 0.00 O HETATM 4958 HW1 HOH A 606 -1.036 6.564 1.955 1.00 0.00 H HETATM 4959 HW2 HOH A 607 -2.357 6.099 2.795 1.00 0.00 H
If a ligand is present (checking the ligand box), the server will automatically create a template for the ligand, assigning bond order and charges (OPLS-2005 charges).
In near future, all these assignment will be performed interactively with a protein viewer.
Control files and ready-made scripts
There are several optimized scripts to perform routine tasks. By selecting any of them, only few parameters should be entered. More information about each script can be found in the submit page. Hovering pointer over each option shows a descriptive text about its purpose.
The last two scripts are actually special cases:
- Use an uploaded control file. This option allows you to upload a ready-made PELE's control file as a script and use it to run the job (for advanced users).
- Set all script parameters manually. This option allows you to set all the parameters manually by going through a long list of all possible parameters.
For manual fine tuning of the job, it is also possible to edit the generated control file by clicking on Create and edit button. In this case you should have a good understanding of PELE's control file syntax. You may consult information below to have some insights about those commands.
Short tutorial on editing control file
Briefly, the pele control file uses several control lines. The first lines define the data directories. Then there is a section were we can define energy parameters, solvent models, ionic strength, etc. For example,
energy params solvent vdgbnp
energy params ionic 0.3
where we selected a variable dielectric SGB model and an ionic strength of 0.3 (M).
Next we specify the input file to be used.
load pdb filename ions yes waters no het yes
The server automatically assigns (or changes in case you upload a control file) filename to the job ID. For instance, if the job ID is 597, filename becomes 597.pdb.
After providing an input file, you can specify geometrical constraints. Here are some options:
constraints atom A:1:_CA_ atom _:3:_FE_ 100 2.4
constraints atom A:1:_CA_ xyz 1.0 2.0 3.0 100 2.4
constraints atom A:1:_CA_ current 100 2.4
constraints angle A:1:_CA_ A:1:_CB_ C:1:_H1_ 10.0 1.5708
constraints torsion A:1:_CA_ A:1:_CB_ A:1:_CG_ A:1:_CD_ 10.0 0
constraints atom A:76:_CA_ AFM 87.59 78.72 112.2 0.01 7.0 0.0
The first constraint setups a harmonic constraint between two atoms: the Cα of the residue number 1 in the A chain and the FE atom of residue number 3 in the chain with no name (_), with a 100 kcal/mol and a distance of 2.4 A. Similarly the next lines constraint to a point or to its current position. Angle and torsions constraint units are in radians.
Last constraint example (AFM) is used to pull atoms. Following the example, an harmonic constraint is generated between A:76:_CA_ and a virtual point at the beginnig in the same position of the atom. Then, virtual bead is moved at a 0.01 A/step in the direction between the position introduced 87.59 78.72 112.2 and the atom selected. The harmonic constant value is 7.0 and the equilibrium length is 0.0.
Notice: constraints, for example, are important in the case that you have an ion (Ca2+, Mg2+..) which is coordinated to the protein and/or waters. Not adding the constraints will most likely result in the ion flying away or changing its coordination ligands.
Additionally, a native input file can be provided. The native file can be used to compute ligand or overall rmsd. Optionally, the rmsd to the native file can then be used to follow or to bias the simulation (see below the sapwn commands).
load native 1iep_CUT.pdb het yes
Next the main PELE line, defining the run tasks and parameters is defined. Within the PELE line the order of parameters does not matter. We usually start, however to define the task, then the global parameters, the the side chain parameters, then the minimization and finally the ANM parameters. The task section defines the flow of the program: how to end, what to do, using mpi to interchange trajectories, etc.. For example:
show atom 1 A:154:FE__ &
show bind_ene 1 &
show rmsd_lig 1 heavy 0 &
show adist A:3:_CA_ A:55:_CA_ &
exit steps gt 1000 &
Where we ask PELE to perform 1000 Monte Carlo steps (for each processor!), showing us at each step the distance of the center of mass of ligand 1 ((the ligand we are perturbing--at present time always 1) to the iron atom in the residue 154 of chain A. similarly it shows the binding energy of the ligand and its rmsd to the native structure. The rmsd includes all heavy atoms and no ligands around (0Å around --i.e. only the ligand). The following line
show rmsd_lig 1 heavy 5 &
would compute the rmsd to the native file of the ligand and everything around 5 Å. If no native file is entered or if rmsd_ini_lig is used, the rmsd is computed to the initial structure. The task also asks PELE to print the distance between two protein atoms, the Cα's between residues 3 and 55 from the chain A.
A more complex task might include changing some variables depending on the current step number. For example, we added here a change in the translation of the ligand. Every fifth step it will do 2.75Å, otherwise only 1.0Å. We also change the number of tries to perturb the ligand in each step. Each try, PELE does a full ligand perturbation, choosing at the end the best of them (based on the potential energy after each try). Thus, increasing the tries we might guide better the ligand towards "preferred" electrostatic areas...
if modulos 5 = 0 then &
tra_r 2.75 &
tries 10 &
tra_r 1.0 &
tries 1 &
exit steps gt 1000 &
spawn atom 1 A:3:_CA_ gt 2 &
exit steps gt 1000 &
spawn point 1 15.8 53.7 13.4 lt 6 &
spawn bind_ene 1 lt 10 &
spawn atom 1 A:3:_CA_ within 10.0 &
spawn rmsd_sys calpha lt 2.0 &
Global Control Variables
The following variables indicate how often a given procedure is performed within the Metropolis cycles. A value of 1 means to perform the procedure in each cycle.
- LIG freq - How often a ligand perturbation is produced.
- ANM freq - How often a backbone ANM perturbation is produced.
- MIN freq - How often the final minimization is produced.
- SP freq - How often the side chain sampling is produced.
- WR freq - How often to write an accepted structure (pdb).
- Spawn WR freq - How often to write a structure that advances in the spawning criteria (if a spawning criteria is defined).
Advanced Global Control Variables
- Metropolis temperature - Metropolis temperature. Default is 1000. We should not confuse this temperature with the molecular dynamics temperature.
- Init Min - Whether to perform an initial minimization or not before running PELE.
- Rem Bulk - Level of removing overall translation and rotation motion. Default 3, meaning to remove overall translation and rotation at the end of each Monte Carlo step.
Ligand Control Variables
- Ligand Chain - Chain name of the ligand to be perturbed.
- Ligand Res - Residue number of the ligand to be perturbed.
- Tra range - Ligand translation range in the perturbation. In Angstroms.
- Rot range - Ligand rotational range in the perturbation. In radians.
Advanced Ligand Control Variables
- Waitfor - Number of steps to keep the same ligand translation perturbation direction. Increasing the waitfor we can further force the escape from cavities, etc.
- lcom con - Force constant to constrain the ligand after translation and rotation. Thus, we can constrain the position of the ligand after the initial perturbation. In this way we will minimize the ligand motion in the following minimizations.
- tries - Number of independent ligand perturbation in each cycle. We can perform several independent perturbations and the program will choose the one with best ligand-protein interaction energy.
ANM Control Variables
- ANM eig freq - How often, within the Monte Carlo cycles, to compute and diagonalize the ANM matrix.
- ANM eig num - Number of ANM modes to include in the calculation.
- ANM displacement - Maximum Cα displacement in the ANM perturbation. The maximum displacement will correspond to the largest eigenvector coefficient. In Angstroms.
- ANM type - Whether to perform a calpha ANM or all atom ANM. Keep in mind that all atom diagonalization might require several minutes.
Advanced ANM Control Variables
- Ualig - Whether to include the ligand atoms in the ANM matrix. The program automatically assigns some ligand atoms depending on its size.
- Mix mod - A float number indicating the level of mixing the modes. For example If 03 is selected then the chosen mode will contribute 0.3 and the rest of modes will be the remaining 0.7.
- Belly - A float number indicating the radius around the ligand to include in the ANM matrix. By default all the protein is included in the ANM but for quicker induced fit studies we can select only a smaller region around the ligand.
- rmsg - Root mean square gradient for the ANM minimization.
- Alphaup - Whether to update the solvent Born alpha during the ANM minimization. Since the ANM minimization is only an approximate way to move the backbone usually we prefer not to update the alpha (makes it much faster). If no final minimization is present, however, it should be updated.
Side Chain Control Variables
- SP radius - A float number indicating the radius around the ligand to include in the side chain prediction.
- Top Side - The number of excited side chains to include in the prediction. PELE computes the energy for each side chain before and after the ANM and can include the residues more excited as a result of the backbone displacement in the side chain prediction step.
- Randomize - Whether to randomize the side chain chain conformation before the prediction.
- inccur - Include the current side chain dihedrals in the rotamer list.
- SP RES - Side chain rotamer library resolution.
- SP Algorithm - Side Chain prediction algorithm [Heuristic/Monte Carlo].
Minimization Control Variables
- Rmsg - Root mean square gradient for the minimization.
Advanced Minimization Control Variables
- caconst - To keep the conformational sampling provided by the ANM perturbation, the final minimization can have constraints in the Cα positions.
- alphaup - Whether to update the solvent Born alpha during the minimization.
- iterations - Max number of iterations.
- MIN radius - If we are doing ligand perturbation, we can also define a region to minimize only around the perturbed ligand.